Sample records for binding site sequence

  1. Discovery and information-theoretic characterization of transcription factor binding sites that act cooperatively.

    PubMed

    Clifford, Jacob; Adami, Christoph

    2015-09-02

    Transcription factor binding to the surface of DNA regulatory regions is one of the primary causes of regulating gene expression levels. A probabilistic approach to model protein-DNA interactions at the sequence level is through position weight matrices (PWMs) that estimate the joint probability of a DNA binding site sequence by assuming positional independence within the DNA sequence. Here we construct conditional PWMs that depend on the motif signatures in the flanking DNA sequence, by conditioning known binding site loci on the presence or absence of additional binding sites in the flanking sequence of each site's locus. Pooling known sites with similar flanking sequence patterns allows for the estimation of the conditional distribution function over the binding site sequences. We apply our model to the Dorsal transcription factor binding sites active in patterning the Dorsal-Ventral axis of Drosophila development. We find that those binding sites that cooperate with nearby Twist sites on average contain about 0.5 bits of information about the presence of Twist transcription factor binding sites in the flanking sequence. We also find that Dorsal binding site detectors conditioned on flanking sequence information make better predictions about what is a Dorsal site relative to background DNA than detection without information about flanking sequence features.
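
    The idea of a PWM conditioned on flanking context can be illustrated with a minimal sketch. It is not the authors' implementation: the site sequences are invented toy data (not real Dorsal sites), and the pseudocount and uniform background are assumptions made only for illustration. One PWM is built per pool of sites, and a candidate sequence is scored against each.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Column-wise base probabilities from aligned, equal-length sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        pwm.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return pwm

def log_odds_score(pwm, seq, background=0.25):
    """Sum of per-position log2 odds; the sum reflects positional independence."""
    return sum(math.log2(col[b] / background) for col, b in zip(pwm, seq))

def information_content(pwm, background=0.25):
    """Total relative entropy (bits) of the PWM against a uniform background."""
    return sum(p * math.log2(p / background) for col in pwm for p in col.values())

# Hypothetical toy pools: sites with and without a second motif in the flank.
sites_with_flank_motif = ["GGGAAAACCC", "GGGAAATCCC", "GGGAATTCCC", "GGGAAAACCC"]
sites_without = ["GGGTTTTCCA", "TGGAATTCCC", "GGGATTTCCA", "GGAAAATCCC"]

pwm_given_flank = build_pwm(sites_with_flank_motif)
pwm_no_flank = build_pwm(sites_without)

candidate = "GGGAAAACCC"
print(log_odds_score(pwm_given_flank, candidate))
print(log_odds_score(pwm_no_flank, candidate))
print(information_content(pwm_given_flank))
```

    Pooling sites by flanking pattern before counting is what makes each matrix an estimate of a conditional distribution; the scoring step itself is the same as for an ordinary PWM.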

  2. In silico evolution of the Drosophila gap gene regulatory sequence under elevated mutational pressure.

    PubMed

    Chertkova, Aleksandra A; Schiffman, Joshua S; Nuzhdin, Sergey V; Kozlov, Konstantin N; Samsonova, Maria G; Gursky, Vitaly V

    2017-02-07

    Cis-regulatory sequences are often composed of many low-affinity transcription factor binding sites (TFBSs). Determining the evolutionary and functional importance of regulatory sequence composition is impeded without a detailed knowledge of the genotype-phenotype map. We simulate the evolution of regulatory sequences involved in Drosophila melanogaster embryo segmentation during early development. Natural selection evaluates gene expression dynamics produced by a computational model of the developmental network. We observe a dramatic decrease in the total number of transcription factor binding sites through the course of evolution. Despite a decrease in average sequence binding energies through time, the regulatory sequences tend towards organisations containing increased high affinity transcription factor binding sites. Additionally, the binding energies of separate sequence segments demonstrate ubiquitous mutual correlations through time. Fewer than 10% of initial TFBSs are maintained throughout the entire simulation, deemed 'core' sites. These sites have increased functional importance as assessed under wild-type conditions and their binding energy distributions are highly conserved. Furthermore, TFBSs within close proximity of core sites exhibit increased longevity, reflecting functional regulatory interactions with core sites. In response to elevated mutational pressure, evolution tends to sample regulatory sequence organisations with fewer, albeit on average, stronger functional transcription factor binding sites. These organisations are also shaped by the regulatory interactions among core binding sites with sites in their local vicinity.

  3. CaMELS: In silico prediction of calmodulin binding proteins and their binding sites.

    PubMed

    Abbasi, Wajid Arshad; Asif, Amina; Andleeb, Saiqa; Minhas, Fayyaz Ul Amir Afsar

    2017-09-01

    Due to Ca2+-dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet-lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet-lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state-of-the-art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large-margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm which allows more effective learning over imprecisely annotated CaM-binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome-wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif-based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub-sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels. © 2017 Wiley Periodicals, Inc.

  4. Isolation and characterization of target sequences of the chicken CdxA homeobox gene.

    PubMed Central

    Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A

    1993-01-01

    The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich, which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the genomic fragments isolated were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and it also shows that this protein can further activate transcription in cells in culture. PMID:7909943
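
    A degenerate consensus such as A, A/T, T, A/T, A, T, A/G can be turned into a pattern for scanning a sequence. The sketch below is only an illustration: the helper names and the test sequence are hypothetical, and only the forward strand is searched.

```python
import re

# Consensus positions as listed in the abstract; "/" separates alternatives.
CONSENSUS = ["A", "A/T", "T", "A/T", "A", "T", "A/G"]

def consensus_to_regex(positions):
    """Translate per-position alternatives into a regular expression."""
    parts = []
    for pos in positions:
        bases = pos.split("/")
        parts.append(bases[0] if len(bases) == 1 else "[" + "".join(bases) + "]")
    return "".join(parts)

def find_sites(seq, pattern):
    """Return (start, match) pairs; the lookahead also reports overlapping hits."""
    return [(m.start(), m.group(1)) for m in re.finditer("(?=(" + pattern + "))", seq)]

pattern = consensus_to_regex(CONSENSUS)
# Hypothetical test sequence, not taken from the study.
print(pattern)
print(find_sites("GGATTTATGCC", pattern))
```

    A full scan would also search the reverse complement, and an exact-match consensus ignores the affinity differences the abstract reports between individual target sequences.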

  5. Patterns and plasticity in RNA-protein interactions enable recruitment of multiple proteins through a single site

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Valley, Cary T.; Porter, Douglas F.; Qiu, Chen

    2012-06-28

    mRNA control hinges on the specificity and affinity of proteins for their RNA binding sites. Regulatory proteins must bind their own sites and reject even closely related noncognate sites. In the PUF [Pumilio and fem-3 binding factor (FBF)] family of RNA binding proteins, individual proteins discriminate differences in the length and sequence of binding sites, allowing each PUF to bind a distinct battery of mRNAs. Here, we show that despite these differences, the pattern of RNA interactions is conserved among PUF proteins: the two ends of the PUF protein make critical contacts with the two ends of the RNA sites. Despite this conserved 'two-handed' pattern of recognition, the RNA sequence is flexible. Among the binding sites of yeast Puf4p, RNA sequence dictates the pattern in which RNA bases are flipped away from the binding surface of the protein. Small differences in RNA sequence allow new modes of control, recruiting Puf5p in addition to Puf4p to a single site. This embedded information adds a new layer of biological meaning to the connections between RNA targets and PUF proteins.

  6. Position specific variation in the rate of evolution in transcription factor binding sites

    PubMed Central

    Moses, Alan M; Chiang, Derek Y; Kellis, Manolis; Lander, Eric S; Eisen, Michael B

    2003-01-01

    Background: The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution. Results: Here we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms. Conclusion: As has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA. PMID:12946282

  7. Architecture of a Fur Binding Site: a Comparative Analysis

    PubMed Central

    Lavrrar, Jennifer L.; McIntosh, Mark A.

    2003-01-01

    Fur is an iron-binding transcriptional repressor that recognizes a 19-bp consensus site of the sequence 5′-GATAATGATAATCATTATC-3′. This site can be defined as three adjacent hexamers of the sequence 5′-GATAAT-3′, with the third being slightly imperfect (an F-F-F configuration), or as two hexamers in the forward orientation separated by one base pair from a third hexamer in the reverse orientation (an F-F-x-R configuration). Although Fur can bind synthetic DNA sequences containing the F-F-F arrangement, most natural binding sites are variations of the F-F-x-R arrangement. The studies presented here compared the ability of Fur to recognize synthetic DNA sequences containing two to four adjacent hexamers with binding to sequences containing variations of the F-F-x-R arrangement (including natural operator sequences from the entS and fepB promoter regions of Escherichia coli). Gel retardation assays showed that the F-F-x-R architecture was necessary for high-affinity Fur-DNA interactions and that contiguous hexamers were not recognized as effectively. In addition, the stoichiometry of Fur at each binding site was determined, showing that Fur interacted with its minimal 19-bp binding site as two overlapping dimers. These data confirm the proposed overlapping-dimer binding model, where the unit of interaction with a single Fur dimer is two inverted hexamers separated by a C:G base pair, with two overlapping units comprising the 19-bp consensus binding site required for the high-affinity interaction with two Fur dimers. PMID:12644489
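
    The two readings of the 19-bp consensus can be checked directly on the sequence given above. This is a small sketch with hypothetical helper names; it only slices the consensus string and does not model binding.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]

def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

FUR_SITE = "GATAATGATAATCATTATC"  # 19-bp consensus from the abstract
HEXAMER = "GATAAT"

# F-F-F reading: three adjacent hexamers (the last base of the site is left over).
fff = [FUR_SITE[i:i + 6] for i in (0, 6, 12)]

# F-F-x-R reading: two forward hexamers, one spacer base, one reversed hexamer.
ffxr = (FUR_SITE[0:6], FUR_SITE[6:12], FUR_SITE[12], FUR_SITE[13:19])

print(fff, [mismatches(h, HEXAMER) for h in fff])
print(ffxr, reverse_complement(ffxr[3]) == HEXAMER)
```

    In the F-F-F reading the third hexamer differs from GATAAT at two positions, whereas in the F-F-x-R reading the final six bases are an exact reverse complement of the hexamer, with C as the single spacer base.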

  8. Fold independent structural comparisons of protein-ligand binding sites for exploring functional relationships.

    PubMed

    Gold, Nicola D; Jackson, Richard M

    2006-02-03

    The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that given a known protein-ligand binding site allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.

  9. Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

    2004-08-06

    The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Measuring conservation of sequence features closely linked to function--such as binding-site clustering--makes better use of comparative sequence data than commonly used methods that examine only sequence identity.

  10. TmiRUSite and TmiROSite scripts: searching for mRNA fragments with miRNA binding sites with encoded amino acid residues.

    PubMed

    Berillo, Olga; Régnier, Mireille; Ivashchenko, Anatoly

    2014-01-01

    microRNAs are small RNA molecules that inhibit the translation of target genes. microRNA binding sites are located in the untranslated regions as well as in the coding domains. We describe the TmiRUSite and TmiROSite scripts, developed in Python as tools for extracting the nucleotide sequences of miRNA binding sites together with their encoded amino acid residue sequences. The scripts allow retrieval of a set of additional sequences to the left and right of the binding site. The scripts present all retrieved data in table formats that are easy to analyse further. The predicted data are useful in molecular and evolutionary biology studies, including studies of miRNA binding sites in animals and plants. The TmiRUSite and TmiROSite scripts are freely available from the authors upon request and for download at https://sites.google.com/site/malaheenee/downloads.

  11. How proteins bind to DNA: target discrimination and dynamic sequence search by the telomeric protein TRF1

    PubMed Central

    2017-01-01

    Abstract Target search as performed by DNA-binding proteins is a complex process, in which multiple factors contribute to both thermodynamic discrimination of the target sequence from overwhelmingly abundant off-target sites and kinetic acceleration of dynamic sequence interrogation. TRF1, the protein that binds to telomeric tandem repeats, faces an intriguing variant of the search problem where target sites are clustered within short fragments of chromosomal DNA. In this study, we use extensive (>0.5 ms in total) MD simulations to study the dynamical aspects of sequence-specific binding of TRF1 at both telomeric and non-cognate DNA. For the first time, we describe the spontaneous formation of a sequence-specific native protein–DNA complex in atomistic detail, and study the mechanism by which proteins avoid off-target binding while retaining high affinity for target sites. Our calculated free energy landscapes reproduce the thermodynamics of sequence-specific binding, while statistical approaches allow for a comprehensive description of intermediate stages of complex formation. PMID:28633355

  12. Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

    2004-08-06

    Background: The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. Results: We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Conclusions: Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity.

  13. Purification and sequencing of the active site tryptic peptide from penicillin-binding protein 1b of Escherichia coli

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nicholas, R.A.; Suzuki, H.; Hirota, Y.

    This paper reports the sequence of the active site peptide of penicillin-binding protein 1b from Escherichia coli. Purified penicillin-binding protein 1b was labeled with [¹⁴C]penicillin G, digested with trypsin, and partially purified by gel filtration. Upon further purification by high-pressure liquid chromatography, two radioactive peaks were observed, and the major peak, representing over 75% of the applied radioactivity, was submitted to amino acid analysis and sequencing. The sequence Ser-Ile-Gly-Ser-Leu-Ala-Lys was obtained. The active site nucleophile was identified by digesting the purified peptide with aminopeptidase M and separating the radioactive products on high-pressure liquid chromatography. Amino acid analysis confirmed that the serine residue in the middle of the sequence was covalently bonded to the [¹⁴C]penicilloyl moiety. A comparison of this sequence to active site sequences of other penicillin-binding proteins and beta-lactamases is presented.

  14. HMG-D is an architecture-specific protein that preferentially binds to DNA containing the dinucleotide TG.

    PubMed Central

    Churchill, M E; Jones, D N; Glaser, T; Hefner, H; Searles, M A; Travers, A A

    1995-01-01

    The high mobility group (HMG) protein HMG-D from Drosophila melanogaster is a highly abundant chromosomal protein that is closely related to the vertebrate HMG domain proteins HMG1 and HMG2. In general, chromosomal HMG domain proteins lack sequence specificity. However, using both NMR spectroscopy and standard biochemical techniques we show that binding of HMG-D to a single DNA site is sequence selective. The preferred duplex DNA binding site comprises at least 5 bp and contains the deformable dinucleotide TG embedded in A/T-rich sequences. The TG motif constitutes a common core element in the binding sites of the well-characterized sequence-specific HMG domain proteins. We show that a conserved aromatic residue in helix 1 of the HMG domain may be involved in recognition of this core sequence. In common with other HMG domain proteins HMG-D binds preferentially to DNA sites that are stably bent and underwound, therefore HMG-D can be considered an architecture-specific protein. Finally, we show that HMG-D bends DNA and may confer a superhelical DNA conformation at a natural DNA binding site in the Drosophila fushi tarazu scaffold-associated region. PMID:7720717

  15. Specific DNA binding of the two chicken Deformed family homeodomain proteins, Chox-1.4 and Chox-a.

    PubMed Central

    Sasaki, H; Yokoyama, E; Kuroiwa, A

    1990-01-01

    The cDNA clones encoding two chicken Deformed (Dfd) family homeobox containing genes Chox-1.4 and Chox-a were isolated. Comparison of their amino acid sequences with another chicken Dfd family homeodomain protein and with those of mouse homologues revealed that strong homologies are located in the amino terminal regions and around the homeodomains. Although homologies in other regions were relatively low, some short conserved sequences were also identified. E. coli-made full length proteins were purified and used for the production of specific antibodies and for DNA binding studies. The binding profiles of these proteins to the 5'-leader and 5'-upstream sequences of Chox-1.4 and Chox-a coding regions were analyzed by immunoprecipitation and DNase I footprint assays. These two Chox proteins bound to the same sites in the 5'-flanking sequences of their coding regions with various affinities and their binding affinities to each site were nearly the same. The consensus sequences of the high and low affinity binding sites were TAATGA(C/G) and CTAATTTT, respectively. A clustered binding site was identified in the 5'-upstream of the Chox-a gene, suggesting that this clustered binding site works as a cis-regulatory element for auto- and/or cross-regulation of Chox-a gene expression. PMID:1970866

  16. DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.

  17. DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

    DOE PAGES

    Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...

    2016-03-09

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.

  18. Microfluidic affinity and ChIP-seq analyses converge on a conserved FOXP2-binding motif in chimp and human, which enables the detection of evolutionarily novel targets.

    PubMed

    Nelson, Christopher S; Fuller, Chris K; Fordyce, Polly M; Greninger, Alexander L; Li, Hao; DeRisi, Joseph L

    2013-07-01

    The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein's DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2's binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved.

  19. Microfluidic affinity and ChIP-seq analyses converge on a conserved FOXP2-binding motif in chimp and human, which enables the detection of evolutionarily novel targets

    PubMed Central

    Nelson, Christopher S.; Fuller, Chris K.; Fordyce, Polly M.; Greninger, Alexander L.; Li, Hao; DeRisi, Joseph L.

    2013-01-01

    The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein’s DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2’s binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved. PMID:23625967

  20. The structural basis of actinomycin D–binding induces nucleotide flipping out, a sharp bend and a left-handed twist in CGG triplet repeats

    PubMed Central

    Lo, Yu-Sheng; Tseng, Wen-Hsuan; Chuang, Chien-Ying; Hou, Ming-Hon

    2013-01-01

    The potent anticancer drug actinomycin D (ActD) functions by intercalating into DNA at GpC sites, thereby interrupting essential biological processes including replication and transcription. Certain neurological diseases are correlated with the expansion of (CGG)n trinucleotide sequences, which contain many contiguous GpC sites separated by a single G:G mispair. To characterize the binding of ActD to CGG triplet repeat sequences, the structural basis for the strong binding of ActD to neighbouring GpC sites flanking a G:G mismatch has been determined based on the crystal structure of ActD bound to ATGCGGCAT, which contains a CGG triplet sequence. The binding of ActD molecules to GCGGC causes many unexpected conformational changes including nucleotide flipping out, a sharp bend and a left-handed twist in the DNA helix via a two site-binding model. Heat denaturation, circular dichroism and surface plasmon resonance analyses showed that adjacent GpC sequences flanking a G:G mismatch are preferred ActD-binding sites. In addition, ActD was shown to bind the hairpin conformation of (CGG)16 in a pairwise combination and with greater stability than that of other DNA intercalators. Our results provide evidence of a possible biological consequence of ActD binding to CGG triplet repeat sequences. PMID:23408860

  1. Rapid comparison of protein binding site surfaces with Property Encoded Shape Distributions (PESD)

    PubMed Central

    Das, Sourav; Kokardekar, Arshad

    2009-01-01

    Patterns in shape and property distributions on the surface of binding sites are often conserved across functional proteins without significant conservation of the underlying amino-acid residues. To explore similarities of these sites from the viewpoint of a ligand, a sequence and fold-independent method was created to rapidly and accurately compare binding sites of proteins represented by property-mapped triangulated Gauss-Connolly surfaces. Within this paradigm, signatures for each binding site surface are produced by calculating their property-encoded shape distributions (PESD), a measure of the probability that a particular property will be at a specific distance to another on the molecular surface. Similarity between the signatures can then be treated as a measure of similarity between binding sites. As postulated, the PESD method rapidly detected high levels of similarity in binding site surface characteristics even in cases where there was very low similarity at the sequence level. In a screening experiment involving each member of the PDBBind 2005 dataset as a query against the rest of the set, PESD was able to retrieve a binding site with identical E.C. (Enzyme Commission) numbers as the top match in 79.5% of cases. The ability of the method to detect similarity in binding sites with low sequence conservation was compared with that of state-of-the-art binding site comparison methods. PMID:19919089

  2. Molecular cloning and analysis of Schizosaccharomyces pombe Reb1p: sequence-specific recognition of two sites in the far upstream rDNA intergenic spacer.

    PubMed Central

    Zhao, A; Guo, A; Liu, Z; Pape, L

    1997-01-01

    The coding sequences for a Schizosaccharomyces pombe sequence-specific DNA binding protein, Reb1p, have been cloned. The predicted S. pombe Reb1p is 24-29% identical to mouse TTF-1 (transcription termination factor-1) and Saccharomyces cerevisiae REB1 protein, both of which direct termination of RNA polymerase I catalyzed transcripts. The S. pombe Reb1 cDNA encodes a predicted polypeptide of 504 amino acids with a predicted molecular weight of 58.4 kDa. The S. pombe Reb1p is unusual in that the bipartite DNA binding motif identified originally in S. cerevisiae and Kluyveromyces lactis REB1 proteins is uninterrupted and thus S. pombe Reb1p may contain the smallest natural REB1 homologous DNA binding domain. Its genomic coding sequences were shown to be interrupted by two introns. A recombinant histidine-tagged Reb1 protein bearing the rDNA binding domain has two homologous, sequence-specific binding sites in the S. pombe rDNA intergenic spacer, located between 289 and 480 nt downstream of the end of the approximately 25S rRNA coding sequences. Each binding site is 13-14 bp downstream of two of the three proposed in vivo termination sites. The core of this 17 bp site, AGGTAAGGGTAATGCAC, is specifically protected by Reb1p in footprinting analysis. PMID:9016645

  3. Nucleotide Interdependency in Transcription Factor Binding Sites in the Drosophila Genome.

    PubMed

    Dresch, Jacqueline M; Zellers, Rowan G; Bork, Daniel K; Drewell, Robert A

    2016-01-01

    A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development.

  4. Nucleotide Interdependency in Transcription Factor Binding Sites in the Drosophila Genome

    PubMed Central

    Dresch, Jacqueline M.; Zellers, Rowan G.; Bork, Daniel K.; Drewell, Robert A.

    2016-01-01

    A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development. PMID:27330274
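
    The "traditional models" contrasted in this abstract treat each position of a binding site independently. A minimal sketch of such a position weight matrix and its log-odds score follows, assuming hypothetical aligned sites, a uniform background, and a pseudocount of 1. The helper names `build_pwm` and `log_odds` are illustrative and are not part of MARZ or any other tool named in the record. By construction, this model cannot capture the nucleotide interdependencies the study investigates.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Estimate per-position base probabilities from equal-length aligned sites."""
    length = len(sites[0])
    pwm = []
    for i in range(length):
        # Start every base at the pseudocount so unseen bases keep nonzero probability.
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({b: counts[b] / total for b in BASES})
    return pwm

def log_odds(seq, pwm, background=0.25):
    """Sum of per-position log2(p / background); positions are assumed independent."""
    return sum(math.log2(pwm[i][b] / background) for i, b in enumerate(seq))

# Hypothetical aligned sites, for illustration only.
sites = ["ACGT", "ACGA", "ACCT", "ATGA"]
pwm = build_pwm(sites)
site_score = log_odds("ACGT", pwm)        # resembles the training sites
background_score = log_odds("TTTT", pwm)  # does not
```

    A sequence resembling the training sites scores above zero and an unrelated one scores below zero. A model with dependencies would replace the per-position product with terms conditioned on other positions.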

  5. Predicting protein-binding regions in RNA using nucleotide profiles and compositions.

    PubMed

    Choi, Daesik; Park, Byungkyu; Chae, Hanju; Lee, Wook; Han, Kyungsook

    2017-03-14

    Motivated by the increased amount of data on protein-RNA interactions and the availability of complete genome sequences of several organisms, many computational methods have been proposed to predict binding sites in protein-RNA interactions. However, most computational methods are limited to finding RNA-binding sites in proteins instead of protein-binding sites in RNAs. Predicting protein-binding sites in RNA is more challenging than predicting RNA-binding sites in proteins. Recent computational methods for finding protein-binding sites in RNAs have several drawbacks for practical use. We developed a new support vector machine (SVM) model for predicting protein-binding regions in mRNA sequences. The model uses sequence profiles constructed from log-odds scores of mono- and di-nucleotides and nucleotide compositions. The model was evaluated by standard 10-fold cross validation, leave-one-protein-out (LOPO) cross validation and independent testing. Since actual mRNA sequences have more non-binding regions than protein-binding regions, we tested the model on several datasets with different ratios of protein-binding regions to non-binding regions. The best performance of the model was obtained in a balanced dataset of positive and negative instances. 10-fold cross validation with a balanced dataset achieved a sensitivity of 91.6%, a specificity of 92.4%, an accuracy of 92.0%, a positive predictive value (PPV) of 91.7%, a negative predictive value (NPV) of 92.3% and a Matthews correlation coefficient (MCC) of 0.840. LOPO cross validation showed a lower performance than the 10-fold cross validation, but the performance remains high (87.6% accuracy and 0.752 MCC). In testing the model on independent datasets, it achieved an accuracy of 82.2% and an MCC of 0.656. Testing of our model and other state-of-the-art methods on the same dataset showed that our model is better than the others. Sequence profiles of log-odds scores of mono- and di-nucleotides were much more powerful features than nucleotide compositions in finding protein-binding regions in RNA sequences. However, a slight performance gain was obtained when using the sequence profiles along with nucleotide compositions. These are preliminary results of ongoing research, but demonstrate the potential of our approach as a powerful predictor of protein-binding regions in RNA. The program and supporting data are available at http://bclab.inha.ac.kr/RBPbinding.
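
    The evaluation measures reported in this record (sensitivity, specificity, accuracy, PPV, NPV and MCC) all derive from the four counts of a confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and chosen for easy arithmetic (they are not the study's data), and `binary_metrics` is an illustrative helper name.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    # Denominator of the Matthews correlation coefficient; zero if any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Hypothetical balanced dataset: 100 binding and 100 non-binding regions.
metrics = binary_metrics(tp=90, fp=10, tn=90, fn=10)
```

    On a balanced dataset such as this one, accuracy is the mean of sensitivity and specificity. On imbalanced data, MCC is usually the more informative single number.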

  6. The FOXP2 forkhead domain binds to a variety of DNA sequences with different rates and affinities.

    PubMed

    Webb, Helen; Steeb, Olga; Blane, Ashleigh; Rotherham, Lia; Aron, Shaun; Machanick, Philip; Dirr, Heini; Fanucchi, Sylvia

    2017-07-01

    FOXP2 is a member of the P subfamily of FOX transcription factors, the DNA-binding domain of which is the winged helix forkhead domain (FHD). In this work we show that the FOXP2 FHD is able to bind to various DNA sequences, including a novel sequence identified in this work, with different affinities and rates as detected using surface plasmon resonance. Combining the experimental work with molecular docking, we show that high-affinity sequences remain bound to the protein for longer, form a greater number of interactions with the protein and induce a greater structural change in the protein than low-affinity sequences. We propose a binding model for the FOXP2 FHD that involves three types of binding sequence: low affinity sites which allow for rapid scanning of the genome by the protein in a partially unstructured state; moderate affinity sites which serve to locate the protein near target sites and high-affinity sites which secure the protein to the DNA and induce a conformational change necessary for functional binding and the possible initiation of downstream transcriptional events. © The Authors 2017. Published by Oxford University Press on behalf of the Japanese Biochemical Society. All rights reserved.

  7. Identification of a p53-response element in the promoter of the proline oxidase gene

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maxwell, Steve A.; Kochevar, Gerald J.

    2008-05-02

    Proline oxidase (POX) is a p53-induced proapoptotic gene. We investigated whether p53 could bind directly to the POX gene promoter. Chromatin immunoprecipitation (ChIP) assays detected p53 bound to POX upstream gene sequences. In support of the ChIP results, sequence analysis of the POX gene and its 5' flanking sequences revealed a potential p53-binding site, GGGCTTGTCTTCGTGTGACTTCTGTCT, located at 1161 base pairs (bp) upstream of the transcriptional start site. A 711-bp DNA fragment containing the candidate p53-binding site exhibited reporter gene activity that was induced by p53. In contrast, the same DNA region lacking the candidate p53-binding site did not show significant p53-response activity. Electrophoretic mobility shift assay (EMSA) in ACHN renal carcinoma cell nuclear lysates confirmed that p53 could bind to the 711-bp POX DNA fragment. We concluded from these experiments that a p53-binding site is positioned at -1161 to -1188 bp upstream of the POX transcriptional start site.

  8. In vivo binding of PRDM9 reveals interactions with noncanonical genomic sites

    PubMed Central

    Grey, Corinne; Clément, Julie A.J.; Buard, Jérôme; Leblanc, Benjamin; Gut, Ivo; Gut, Marta; Duret, Laurent

    2017-01-01

    In mouse and human meiosis, DNA double-strand breaks (DSBs) initiate homologous recombination and occur at specific sites called hotspots. The localization of these sites is determined by the sequence-specific DNA binding domain of the PRDM9 histone methyl transferase. Here, we performed an extensive analysis of PRDM9 binding in mouse spermatocytes. Unexpectedly, we identified a noncanonical recruitment of PRDM9 to sites that lack recombination activity and the PRDM9 binding consensus motif. These sites include gene promoters, where PRDM9 is recruited in a DSB-dependent manner. Another subset reveals DSB-independent interactions between PRDM9 and genomic sites, such as the binding sites for the insulator protein CTCF. We propose that these DSB-independent sites result from interactions between hotspot-bound PRDM9 and genomic sequences located on the chromosome axis. PMID:28336543

  9. Direct association of Csk homologous kinase (CHK) with the diphosphorylated site Tyr568/570 of the activated c-KIT in megakaryocytes.

    PubMed

    Price, D J; Rivnay, B; Fu, Y; Jiang, S; Avraham, S; Avraham, H

    1997-02-28

    The Csk homologous kinase (CHK), formerly MATK, has previously been shown to bind to activated c-KIT. In this report, we characterize the binding of SH2(CHK) to specific phosphotyrosine sites on the c-KIT protein sequence. Phosphopeptide inhibition of the in vitro interaction of SH2(CHK)-glutathione S-transferase fusion protein/c-KIT from SCF/KL-treated Mo7e megakaryocytic cells indicated that two sites on c-KIT were able to bind SH2(CHK). These sites were the Tyr568/570 diphosphorylated sequence and the monophosphorylated Tyr721 sequence. To confirm this, we precipitated native CHK from cellular extracts using phosphorylated peptides linked to Affi-Gel 15. In addition, purified SH2(CHK)-glutathione S-transferase fusion protein was precipitated with the same peptide beads. All of the peptide bead-binding studies were consistent with the direct binding of SH2(CHK) to phosphorylated Tyr568/570 and Tyr721 sites. Binding of FYN and SHC to the diphosphorylated Tyr568/570 site was observed, while binding of Csk to this site was not observed. The SH2(CHK) binding to the two sites is direct and not through phosphorylated intermediates such as FYN or SHC. Site-directed mutagenesis of the full-length c-KIT cDNA followed by transient transfection indicated that only the Tyr568/570, and not the Tyr721, is able to bind SH2(CHK). This indicates that CHK binds to the same site on c-KIT to which FYN binds, possibly bringing the two into proximity on associated c-KIT subunits and leading to the down-regulation of FYN by CHK.

  10. Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation.

    PubMed

    Schneider, T D

    2001-12-01

    The sequence logo for DNA binding sites of the bacteriophage P1 replication protein RepA shows unusually high sequence conservation (approximately 2 bits) at a minor groove that faces RepA. However, B-form DNA can support only 1 bit of sequence conservation via contacts into the minor groove. The high conservation in RepA sites therefore implies a distorted DNA helix with direct or indirect contacts to the protein. Here I show that a high minor groove conservation signature also appears in sequence logos of sites for other replication origin binding proteins (Rts1, DnaA, P4 alpha, EBNA1, ORC) and promoter binding proteins (sigma(70), sigma(D) factors). This finding implies that DNA binding proteins generally use non-B-form DNA distortion such as base flipping to initiate replication and transcription.
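
    The conservation values quoted in bits are per-position information contents, the quantity plotted as stack height in a sequence logo. A minimal sketch of the basic calculation (2 bits minus the Shannon entropy of the observed base frequencies) follows, assuming hypothetical aligned sites and omitting the small-sample correction that a full logo calculation would apply. The name `information_content` is illustrative.

```python
import math
from collections import Counter

def information_content(sites):
    """Per-position information (bits) for equal-length aligned DNA sites.

    Uses 2 - H, where H is the Shannon entropy of the observed base
    frequencies; no small-sample correction is applied in this sketch.
    """
    bits = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        bits.append(2.0 - entropy)
    return bits

# Hypothetical aligned sites, for illustration only.
sites = ["ACGT", "ACGA", "ACCT", "ATGA"]
bits = information_content(sites)
```

    In this toy alignment the first position is invariant and carries the full 2 bits, while the last position is split evenly between two bases and carries 1 bit. These are the two levels the abstract contrasts.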

  11. DNA breathing dynamics distinguish binding from nonbinding consensus sites for transcription factor YY1 in cells.

    PubMed

    Alexandrov, Boian S; Fukuyo, Yayoi; Lange, Martin; Horikoshi, Nobuo; Gelev, Vladimir; Rasmussen, Kim Ø; Bishop, Alan R; Usheva, Anny

    2012-11-01

    The genome-wide mapping of the major gene expression regulators, the transcription factors (TFs) and their DNA binding sites, is of great importance for describing cellular behavior and phenotypic diversity. Presently, the methods for prediction of genomic TF binding produce a large number of false positives, most likely due to insufficient description of the physicochemical mechanisms of protein-DNA binding. Growing evidence suggests that, in the cell, the double-stranded DNA (dsDNA) is subject to local transient strand separations (breathing) that contribute to genomic functions. By using site-specific chromatin immunoprecipitations, gel shifts, BIOBASE data, and our model that accurately describes the melting behavior and breathing dynamics of dsDNA, we report a specific DNA breathing profile found at YY1 binding sites in cells. We find that the genomic flanking sequence variations and SNPs may exert long-range effects on DNA dynamics and predetermine YY1 binding. The ubiquitous TF YY1 has a fundamental role in essential biological processes by activating, initiating or repressing transcription depending upon the sequence context it binds. We anticipate that consensus binding sequences together with the related DNA dynamics profile may significantly improve the accuracy of genomic TF binding sites and TF binding-related functional SNPs.

  12. Phyloscan: locating transcription-regulating binding sites in mixed aligned and unaligned sequence data.

    PubMed

    Palumbo, Michael J; Newberg, Lee A

    2010-07-01

    The transcription of a gene from its DNA template into an mRNA molecule is the first, and most heavily regulated, step in gene expression. Especially in bacteria, regulation is typically achieved via the binding of a transcription factor (protein) or small RNA molecule to the chromosomal region upstream of a regulated gene. The protein or RNA molecule recognizes a short, approximately conserved sequence within a gene's promoter region and, by binding to it, either enhances or represses expression of the nearby gene. Since the sought-for motif (pattern) is short and accommodating to variation, computational approaches that scan for binding sites have trouble distinguishing functional sites from look-alikes. Many computational approaches are unable to find the majority of experimentally verified binding sites without also finding many false positives. Phyloscan overcomes this difficulty by exploiting two key features of functional binding sites: (i) these sites are typically more conserved evolutionarily than are non-functional DNA sequences; and (ii) these sites often occur two or more times in the promoter region of a regulated gene. The website is free and open to all users, and there is no login requirement. Address: (http://bayesweb.wadsworth.org/phyloscan/).

  13. Sequences Flanking the Gephyrin-Binding Site of GlyRβ Tune Receptor Stabilization at Synapses

    PubMed Central

    Grünewald, Nora; Salvatico, Charlotte; Kress, Vanessa

    2018-01-01

    The efficacy of synaptic transmission is determined by the number of neurotransmitter receptors at synapses. Their recruitment depends upon the availability of postsynaptic scaffolding molecules that interact with specific binding sequences of the receptor. At inhibitory synapses, gephyrin is the major scaffold protein that mediates the accumulation of heteromeric glycine receptors (GlyRs) via the cytoplasmic loop in the β-subunit (β-loop). This binding involves high- and low-affinity interactions, but the molecular mechanism of this bimodal binding and its implication in GlyR stabilization at synapses remain unknown. We have approached this question using a combination of quantitative biochemical tools and high-density single molecule tracking in cultured rat spinal cord neurons. The high-affinity binding site could be identified and was shown to rely on the formation of a 3₁₀-helix C-terminal to the β-loop core gephyrin-binding motif. This site plays a structural role in shaping the core motif and represents the major contributor to the synaptic confinement of GlyRs by gephyrin. The N-terminal flanking sequence promotes lower affinity interactions by occupying newly identified binding sites on gephyrin. Despite its low affinity, this binding site plays a modulatory role in tuning the mobility of the receptor. Together, the GlyR β-loop sequences flanking the core-binding site differentially regulate the affinity of the receptor for gephyrin and its trapping at synapses. Our experimental approach thus bridges the gap between thermodynamic aspects of receptor-scaffold interactions and functional receptor stabilization at synapses in living cells. PMID:29464196

  14. Evaluation of simultaneous binding of Chromomycin A3 to the multiple sites of DNA by the new restriction enzyme assay.

    PubMed

    Murase, Hirotaka; Noguchi, Tomoharu; Sasaki, Shigeki

    2018-06-01

    Chromomycin A3 (CMA3) is an aureolic acid-type antitumor antibiotic. CMA3 forms dimeric complexes with divalent cations, such as Mg2+, which bind strongly to the GC-rich sequence of DNA to inhibit DNA replication and transcription. In this study, the binding property of CMA3 to the DNA sequence containing multiple GC-rich binding sites was investigated by measuring the protection from hydrolysis by the restriction enzymes, AccII and Fnu4HI, for the center of the CGCG site and the 5'-GC↓GGC site, respectively. In contrast to the standard DNase I footprinting method, the DNA substrates are fully hydrolyzed by the restriction enzymes; therefore, the full protection of DNA at all the cleavable sites indicates that CMA3 simultaneously binds to all the binding sites. The restriction enzyme assay has suggested that CMA3 has a high tendency to bind the successive CGCG sites and the CGG repeat. Copyright © 2018 Elsevier Ltd. All rights reserved.

  15. Activation of erythropoietin receptor in the absence of hormone by a peptide that binds to a domain different from the hormone binding site

    PubMed Central

    Naranda, Tatjana; Wong, Kenneth; Kaufman, R. Ilene; Goldstein, Avram; Olsson, Lennart

    1999-01-01

    Applying a homology search method previously described, we identified a sequence in the extracellular dimerization site of the erythropoietin receptor, distant from the hormone binding site. A peptide identical to that sequence was synthesized. Remarkably, it activated receptor signaling in the absence of erythropoietin. Neither the peptide nor the hormone altered the affinity of the other for the receptor; thus, the peptide does not bind to the hormone binding site. The combined activation of signal transduction by hormone and peptide was strongly synergistic. In mice, the peptide acted like the hormone, protecting against the decrease in hematocrit caused by carboplatin. PMID:10377456

  16. Theory on the mechanism of site-specific DNA-protein interactions in the presence of traps

    NASA Astrophysics Data System (ADS)

    Niranjani, G.; Murugan, R.

    2016-08-01

    The speed of site-specific binding of transcription factor (TF) proteins with genomic DNA seems to be strongly retarded by the randomly occurring sequence traps. Traps are those DNA sequences sharing significant similarity with the original specific binding sites (SBSs). It is an intriguing question how the naturally occurring TFs and their SBSs are designed to manage the retarding effects of such randomly occurring traps. We develop a simple random walk model on the site-specific binding of TFs with genomic DNA in the presence of sequence traps. Our dynamical model predicts that (a) the retarding effects of traps will be minimum when the traps are arranged around the SBS such that there is a negative correlation between the binding strength of TFs with traps and the distance of traps from the SBS and (b) the retarding effects of sequence traps can be appeased by the condensed conformational state of DNA. Our computational analysis results on the distribution of sequence traps around the putative binding sites of various TFs in the mouse and human genomes agree well with the theoretical predictions. We propose that the distribution of traps can be used as an additional metric to efficiently identify the SBSs of TFs on genomic DNA.

  17. Transcription initiation from the dihydrofolate reductase promoter is positioned by HIP1 binding at the initiation site.

    PubMed

    Means, A L; Farnham, P J

    1990-02-01

    We have identified a sequence element that specifies the position of transcription initiation for the dihydrofolate reductase gene. Unlike the functionally analogous TATA box that directs RNA polymerase II to initiate transcription 30 nucleotides downstream, the positioning element of the dihydrofolate reductase promoter is located directly at the site of transcription initiation. By using DNase I footprint analysis, we have shown that a protein binds to this initiator element. Transcription initiated at the dihydrofolate reductase initiator element when 28 nucleotides were inserted between it and all other upstream sequences, or when it was placed on either side of the DNA helix, suggesting that there is no strict spatial requirement between the initiator and an upstream element. Although neither a single Sp1-binding site nor a single initiator element was sufficient for transcriptional activity, the combination of one Sp1-binding site and the dihydrofolate reductase initiator element cloned into a plasmid vector resulted in transcription starting at the initiator element. We have also shown that the simian virus 40 late major initiation site has striking sequence homology to the dihydrofolate reductase initiation site and that the same, or a similar, protein binds to both sites. Examination of the sequences at other RNA polymerase II initiation sites suggests that we have identified an element that is important in the transcription of other housekeeping genes. We have thus named the protein that binds to the initiator element HIP1 (Housekeeping Initiator Protein 1).

  18. GenProBiS: web server for mapping of sequence variants to protein binding sites.

    PubMed

    Konc, Janez; Skrlj, Blaz; Erzen, Nika; Kunej, Tanja; Janezic, Dusanka

    2017-07-03

    Discovery of potentially deleterious sequence variants is important and has wide implications for research and generation of new hypotheses in human and veterinary medicine, and drug discovery. The GenProBiS web server maps sequence variants to protein structures from the Protein Data Bank (PDB), and further to protein-protein, protein-nucleic acid, protein-compound, and protein-metal ion binding sites. The concept of a protein-compound binding site is understood in the broadest sense, which includes glycosylation and other post-translational modification sites. Binding sites were defined by local structural comparisons of whole protein structures using the Protein Binding Sites (ProBiS) algorithm and transposition of ligands from the similar binding sites found to the query protein using the ProBiS-ligands approach with new improvements introduced in GenProBiS. Binding site surfaces were generated as three-dimensional grids encompassing the space occupied by predicted ligands. The server allows intuitive visual exploration of comprehensively mapped variants, such as human somatic mis-sense mutations related to cancer and non-synonymous single nucleotide polymorphisms from 21 species, within the predicted binding sites regions for about 80 000 PDB protein structures using fast WebGL graphics. The GenProBiS web server is open and free to all users at http://genprobis.insilab.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Nuclear proteins that bind the human gamma-globin gene promoter: alterations in binding produced by point mutations associated with hereditary persistence of fetal hemoglobin.

    PubMed Central

    Gumucio, D L; Rood, K L; Gray, T A; Riordan, M F; Sartor, C I; Collins, F S

    1988-01-01

    The molecular mechanisms responsible for the human fetal-to-adult hemoglobin switch have not yet been elucidated. Point mutations identified in the promoter regions of gamma-globin genes from individuals with nondeletion hereditary persistence of fetal hemoglobin (HPFH) may mark cis-acting sequences important for this switch, and the trans-acting factors which interact with these sequences may be integral parts in the puzzle of gamma-globin gene regulation. We have used gel retardation and footprinting strategies to define nuclear proteins which bind to the normal gamma-globin promoter and to determine the effect of HPFH mutations on the binding of a subset of these proteins. We have identified five proteins in human erythroleukemia cells (K562 and HEL) which bind to the proximal promoter region of the normal gamma-globin gene. One factor, gamma CAAT, binds the duplicated CCAAT box sequences; the -117 HPFH mutation increases the affinity of interaction between gamma CAAT and its cognate site. Two proteins, gamma CAC1 and gamma CAC2, bind the CACCC sequence. These proteins require divalent cations for binding. The -175 HPFH mutation interferes with the binding of a fourth protein, gamma OBP, which binds an octamer sequence (ATGCAAAT) in the normal gamma-globin promoter. The HPFH phenotype of the -175 mutation indicates that the octamer-binding protein may play a negative regulatory role in this setting. A fifth protein, EF gamma a, binds to sequences which overlap the octamer-binding site. The erythroid-specific distribution of EF gamma a and its close approximation to an apparent repressor-binding site suggest that it may be important in gamma-globin regulation. PMID:2468996

  20. Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing

    PubMed Central

    Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2016-01-01

    Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039

  1. Preferential binding of daunomycin to 5'ATCG and 5'ATGC sequences revealed by footprinting titration experiments.

    PubMed

    Chaires, J B; Herrera, J E; Waring, M J

    1990-07-03

    Results from a high-resolution deoxyribonuclease I (DNase I) footprinting titration procedure are described that identify preferred daunomycin binding sites within the 160 bp tyr T DNA fragment. We have obtained single-bond resolution at 65 of the 160 potential binding sites within the tyr T fragment and have examined the effect of 0-3.0 microM total daunomycin concentration on the susceptibility of these sites toward digestion by DNase I. Four types of behavior are observed: (i) protection from DNase I cleavage; (ii) protection, but only after reaching a critical total daunomycin concentration; (iii) enhanced cleavage; (iv) no effect of added drug. Ten sites were identified as the most strongly protected on the basis of the magnitude of the reduction of their digestion product band areas in the presence of daunomycin. These were identified as the preferred daunomycin binding sites. Seven of these 10 sites are found at the end of the triplet sequences 5'(A/T)GC and 5'(A/T)CG, where the notation (A/T) indicates that either A or T may occupy the position. The remaining three strongly protected sites are found at the ends of the triplet sequence 5'(A/T)C(A/T). Of the preferred daunomycin binding sites we identify in this study, the sequence 5'(A/T)CG is consistent with the specificity predicted by the theoretical studies of Chen et al. [Chen, K.-X., Gresh, N., & Pullman, B. (1985) J. Biomol. Struct. Dyn. 3, 445-466] and is the very sequence to which daunomycin is observed to be bound in two recent X-ray crystallographic studies. Solution studies, theoretical studies, and crystallographic studies have thus converged to provide a consistent and coherent picture of the sequence preference of this important anticancer antibiotic.

  2. Cooperative DNA binding and sequence discrimination by the Opaque2 bZIP factor.

    PubMed Central

    Yunes, J A; Vettore, A L; da Silva, M J; Leite, A; Arruda, P

    1998-01-01

    The maize Opaque2 (O2) protein is a basic leucine zipper transcription factor that controls the expression of distinct classes of endosperm genes through the recognition of different cis-acting elements in their promoters. The O2 target region in the promoter of the alpha-coixin gene was analyzed in detail and shown to comprise two closely adjacent binding sites, named O2u and O2d, which are related in sequence to the GCN4 binding site. Quantitative DNase footprint analysis indicated that O2 binding to alpha-coixin target sites is best described by a cooperative model. Transient expression assays showed that the two adjacent sites act synergistically. This synergy is mediated in part by cooperative DNA binding. In tobacco protoplasts, O2 binding at the O2u site is more important for enhancer activity than is binding at the O2d site, suggesting that the architecture of the O2-DNA complex is important for interaction with the transcriptional machinery. PMID:9811800
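
    The cooperative model mentioned in this abstract can be illustrated with a generic two-site binding scheme. The sketch below is not the authors' fitting procedure; it assumes a standard statistical-weight formulation in which `k1` and `k2` are hypothetical association constants for the two sites and `omega` is a hypothetical cooperativity factor (`omega = 1` means independent binding, `omega > 1` means positive cooperativity).

```python
def two_site_occupancy(conc, k1, k2, omega=1.0):
    """Fractional occupancy of two adjacent sites under a simple cooperative model.

    conc  -- free protein concentration (same units as 1/k1 and 1/k2)
    k1,k2 -- association constants of site 1 and site 2 (illustrative values)
    omega -- cooperativity factor applied when both sites are bound
    """
    w1 = k1 * conc               # statistical weight: only site 1 bound
    w2 = k2 * conc               # statistical weight: only site 2 bound
    w12 = omega * w1 * w2        # statistical weight: both sites bound
    z = 1.0 + w1 + w2 + w12      # partition function over the four states
    return (w1 + w12) / z, (w2 + w12) / z


# Independent versus cooperative binding at the same protein concentration.
independent = two_site_occupancy(conc=1.0, k1=1.0, k2=1.0, omega=1.0)
cooperative = two_site_occupancy(conc=1.0, k1=1.0, k2=1.0, omega=10.0)
print(independent, cooperative)
```

    With these illustrative numbers, raising `omega` increases the occupancy of both sites at a fixed concentration, which is the qualitative signature a quantitative footprint titration would be fitted against.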

  3. Cooperative DNA binding and sequence discrimination by the Opaque2 bZIP factor.

    PubMed

    Yunes, J A; Vettore, A L; da Silva, M J; Leite, A; Arruda, P

    1998-11-01

    The maize Opaque2 (O2) protein is a basic leucine zipper transcription factor that controls the expression of distinct classes of endosperm genes through the recognition of different cis-acting elements in their promoters. The O2 target region in the promoter of the alpha-coixin gene was analyzed in detail and shown to comprise two closely adjacent binding sites, named O2u and O2d, which are related in sequence to the GCN4 binding site. Quantitative DNase footprint analysis indicated that O2 binding to alpha-coixin target sites is best described by a cooperative model. Transient expression assays showed that the two adjacent sites act synergistically. This synergy is mediated in part by cooperative DNA binding. In tobacco protoplasts, O2 binding at the O2u site is more important for enhancer activity than is binding at the O2d site, suggesting that the architecture of the O2-DNA complex is important for interaction with the transcriptional machinery.

  4. Direct inhibition of the DNA-binding activity of POU transcription factors Pit-1 and Brn-3 by selective binding of a phenyl-furan-benzimidazole dication.

    PubMed

    Peixoto, Paul; Liu, Yang; Depauw, Sabine; Hildebrand, Marie-Paule; Boykin, David W; Bailly, Christian; Wilson, W David; David-Cordonnier, Marie-Hélène

    2008-06-01

    The development of small molecules to control gene expression could be the spearhead of future-targeted therapeutic approaches in multiple pathologies. Among heterocyclic dications developed with this aim, a phenyl-furan-benzimidazole dication DB293 binds AT-rich sites as a monomer and 5'-ATGA sequence as a stacked dimer, both in the minor groove. Here, we used a protein/DNA array approach to evaluate the ability of DB293 to specifically inhibit transcription factors DNA-binding in a single-step, competitive mode. DB293 inhibits two POU-domain transcription factors Pit-1 and Brn-3 but not IRF-1, despite the presence of an ATGA and AT-rich sites within all three consensus sequences. EMSA, DNase I footprinting and surface-plasmon-resonance experiments determined the precise binding site, affinity and stoichiometry of DB293 interaction to the consensus targets. Binding of DB293 occurred as a cooperative dimer on the ATGA part of Brn-3 site but as two monomers on AT-rich sites of IRF-1 sequence. For Pit-1 site, ATGA or AT-rich mutated sequences identified the contribution of both sites for DB293 recognition. In conclusion, DB293 is a strong inhibitor of two POU-domain transcription factors through a cooperative binding to ATGA. These findings are the first to show that heterocyclic dications can inhibit major groove transcription factors and they open the door to the control of transcription factors activity by those compounds.

  5. Predicting protein-binding RNA nucleotides with consideration of binding partners.

    PubMed

    Tuvshinjargal, Narankhuu; Lee, Wook; Park, Byungkyu; Han, Kyungsook

    2015-06-01

    In recent years several computational methods have been developed to predict RNA-binding sites in protein. Most of these methods do not consider interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in protein, the problem of predicting protein-binding sites in RNA has received little attention mainly because it is much more difficult and shows a lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. In an effort to improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and Matthews correlation coefficient (MCC) of 0.69 in a 10-fold cross validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, a NPV of 84.2% and MCC of 0.48 in independent testing. For comparative purpose, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In a 10-fold cross validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, a NPV of 92.2% and MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, a NPV of 85.2% and MCC of 0.45.
In both cross-validations and independent testing, the new model that used both RNA and protein sequences showed a better performance than the model that used RNA sequence data alone in most performance measures. To the best of our knowledge, this is the first sequence-based prediction of protein-binding nucleotides in RNA which considers the binding partner of RNA. The new model will provide valuable information for designing biochemical experiments to find putative protein-binding sites in RNA with unknown structure. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
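
    The performance measures quoted in this abstract (sensitivity, specificity, PPV, NPV and MCC) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the counts passed to it are made up for illustration and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true binding nucleotides found
    specificity = tn / (tn + fp)   # fraction of non-binding nucleotides rejected
    ppv = tp / (tp + fp)           # precision of positive predictions
    npv = tn / (tn + fn)           # precision of negative predictions
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=80, fp=30, tn=170, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Because binding nucleotides are usually the minority class, PPV and MCC tend to be more informative than sensitivity or specificity alone, which is consistent with the abstract reporting all five.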

  6. Systematic optimization model and algorithm for binding sequence selection in computational enzyme design

    PubMed Central

    Huang, Xiaoqiang; Han, Kehang; Zhu, Yushan

    2013-01-01

    A systematic optimization model for binding sequence selection in computational enzyme design was developed based on the transition state theory of enzyme catalysis and graph-theoretical modeling. The saddle point on the free energy surface of the reaction system was represented by catalytic geometrical constraints, and the binding energy between the active site and transition state was minimized to reduce the activation energy barrier. The resulting hyperscale combinatorial optimization problem was tackled using a novel heuristic global optimization algorithm, which was inspired and tested by the protein core sequence selection problem. The sequence recapitulation tests on native active sites for two enzyme catalyzed hydrolytic reactions were applied to evaluate the predictive power of the design methodology. The results of the calculation show that most of the native binding sites can be successfully identified if the catalytic geometrical constraints and the structural motifs of the substrate are taken into account. Reliably predicting active site sequences may have significant implications for the creation of novel enzymes that are capable of catalyzing targeted chemical reactions. PMID:23649589

  7. Genome-wide identification and characterization of Notch transcription complex-binding sequence paired sites in leukemia cells

    PubMed Central

    Severson, Eric; Arnett, Kelly L.; Wang, Hongfang; Zang, Chongzhi; Taing, Len; Liu, Hudan; Pear, Warren S.; Liu, X. Shirley; Blacklow, Stephen C.; Aster, Jon C.

    2018-01-01

    Notch transcription complexes (NTCs) drive target gene expression by binding to two distinct types of genomic response elements, NTC monomer-binding sites and sequence-paired sites (SPSs) that bind NTC dimers. SPSs are conserved and are linked to the Notch-responsiveness of a few genes, but their overall contribution to Notch-dependent gene regulation is unknown. To address this issue, we determined the DNA sequence requirements for NTC dimerization using a fluorescence resonance energy transfer (FRET) assay, and applied insights from these in vitro studies to Notch-“addicted” leukemia cells. We find that SPSs contribute to the regulation of approximately a third of direct Notch target genes. While originally described in promoters, SPSs are present mainly in long-range enhancers, including an enhancer containing a newly described SPS that regulates HES5. Our work provides a general method for identifying sequence-paired sites in genome-wide data sets and highlights the widespread role of NTC dimerization in Notch-transformed leukemia cells. PMID:28465412

  8. Transcriptional activation of the Escherichia coli adaptive response gene aidB is mediated by binding of methylated Ada protein. Evidence for a new consensus sequence for Ada-binding sites.

    PubMed

    Landini, P; Volkert, M R

    1995-04-07

    The Escherichia coli aidB gene is part of the adaptive response to DNA methylation damage. Genes belonging to the adaptive response are positively regulated by the ada gene; the Ada protein acts as a transcriptional activator when methylated in one of its cysteine residues at position 69. Through DNaseI protection assays, we show that methylated Ada (meAda) is able to bind a DNA sequence between 40 and 60 base pairs upstream of the aidB transcriptional startpoint. Binding of meAda is necessary to activate transcription of the adaptive response genes; accordingly, in vitro transcription of aidB is dependent on the presence of meAda. Unmethylated Ada protein shows no protection against DNaseI digestion in the aidB promoter region nor does it promote aidB in vitro transcription. The aidB Ada-binding site shows only weak homology to the proposed consensus sequences for Ada-binding sites in E. coli (AAANNAA and AAAGCGCA) but shares a higher degree of similarity with the Ada-binding regions from other bacterial species, such as Salmonella typhimurium and Bacillus subtilis. Based on the comparison of five different Ada-dependent promoter regions, we suggest that a possible recognition sequence for meAda might be AATnnnnnnG-CAA. Higher concentrations of Ada are required for the binding of aidB than for the ada promoter, suggesting lower affinity of the protein for the aidB Ada-binding site. Common features in the Ada-binding regions of ada and aidB are a high A/T content, the presence of an inverted repeat structure, and their position relative to the transcriptional start site. We propose that these elements, in addition to the proposed recognition sequence, are important for binding of the Ada protein.

  9. Accurate and sensitive quantification of protein-DNA binding affinity.

    PubMed

    Rastogi, Chaitanya; Rube, H Tomas; Kribelbauer, Judith F; Crocker, Justin; Loker, Ryan E; Martini, Gabriella D; Laptenko, Oleg; Freed-Pastor, William A; Prives, Carol; Stern, David L; Mann, Richard S; Bussemaker, Harmen J

    2018-04-17

    Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. Copyright © 2018 the Author(s). Published by PNAS.

  10. Accurate and sensitive quantification of protein-DNA binding affinity

    PubMed Central

    Rastogi, Chaitanya; Rube, H. Tomas; Kribelbauer, Judith F.; Crocker, Justin; Loker, Ryan E.; Martini, Gabriella D.; Laptenko, Oleg; Freed-Pastor, William A.; Prives, Carol; Stern, David L.; Mann, Richard S.; Bussemaker, Harmen J.

    2018-01-01

    Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. PMID:29610332

  11. Two DNA-binding factors recognize specific sequences at silencers, upstream activating sequences, autonomously replicating sequences, and telomeres in Saccharomyces cerevisiae

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buchman, A.R.; Kimmerly, W.J.; Rine, J.

    1988-01-01

    Two DNA-binding factors from Saccharomyces cerevisiae have been characterized, GRFI (general regulatory factor I) and ABFI (ARS-binding factor I), that recognize specific sequences within diverse genetic elements. GRFI bound to sequences at the negative regulatory elements (silencers) of the silent mating type loci HML E and HMR E and to the upstream activating sequence (UAS) required for transcription of the MAT α genes. A putative conserved UAS located at genes involved in translation (RPG box) was also recognized by GRFI. In addition, GRFI bound with high affinity to sequences within the (C1-3A)-repeat region at yeast telomeres. Binding sites for GRFI with the highest affinity appeared to be of the form 5'-(A/G)(A/C)ACCCANNCA(T/C)(T/C)-3', where N is any nucleotide. ABFI-binding sites were located next to autonomously replicating sequences (ARSs) at controlling elements of the silent mating type loci HMR E, HMR I, and HML I and were associated with ARS1, ARS2, and the 2μm plasmid ARS. Two tandem ABFI binding sites were found between the HIS3 and DED1 genes, several kilobase pairs from any ARS, indicating that ABFI-binding sites are not restricted to ARSs. The sequences recognized by ABFI showed partial dyad-symmetry and appeared to be variations of the consensus 5'-TATCATTNNNNACGA-3'. GRFI and ABFI were both abundant DNA-binding factors and did not appear to be encoded by the SIR genes, whose products are required for repression of the silent mating type loci. Together, these results indicate that both GRFI and ABFI play multiple roles within the cell.
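
    Consensus strings of the kind quoted in this abstract, with alternatives such as (A/G) and the wildcard N, can be turned into regular expressions for a quick scan of a sequence. The sketch below is only an illustration of that idea: the helper names and the test sequences are hypothetical, only the forward strand is searched, and exact consensus matching is far cruder than the binding assays the abstract describes.

```python
import re

# Consensus strings as written in the abstract (N = any nucleotide).
GRFI_CONSENSUS = "(A/G)(A/C)ACCCANNCA(T/C)(T/C)"
ABFI_CONSENSUS = "TATCATTNNNNACGA"


def consensus_to_regex(consensus):
    """Translate '(A/G)'-style alternatives and 'N' wildcards into a regex."""
    pattern = re.sub(
        r"\(([ACGT](?:/[ACGT])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )
    return pattern.replace("N", "[ACGT]")


def find_sites(seq, consensus):
    """Return (start, matched_site) pairs, including overlapping matches."""
    # A lookahead with a capture group lets finditer report overlapping hits.
    regex = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(seq.upper())]


# Made-up sequences that each embed one site matching a consensus.
print(find_sites("GGAAACCCATGCATCTT", GRFI_CONSENSUS))
print(find_sites("TATCATTGGCCACGA", ABFI_CONSENSUS))
```

    A fuller scan would also search the reverse complement and tolerate mismatches, since the abstract describes real sites as variations of the consensus.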

  12. Specific minor groove solvation is a crucial determinant of DNA binding site recognition

    PubMed Central

    Harris, Lydia-Ann; Williams, Loren Dean; Koudelka, Gerald B.

    2014-01-01

    The DNA sequence preferences of nearly all sequence specific DNA binding proteins are influenced by the identities of bases that are not directly contacted by protein. Discrimination between non-contacted base sequences is commonly based on the differential abilities of DNA sequences to allow narrowing of the DNA minor groove. However, the factors that govern the propensity of minor groove narrowing are not completely understood. Here we show that the differential abilities of various DNA sequences to support formation of a highly ordered and stable minor groove solvation network are a key determinant of non-contacted base recognition by a sequence-specific binding protein. In addition, disrupting the solvent network in the non-contacted region of the binding site alters the protein's ability to recognize contacted base sequences at positions 5–6 bases away. This observation suggests that DNA solvent interactions link contacted and non-contacted base recognition by the protein. PMID:25429976

  13. Novel DNA packaging recognition in the unusual bacteriophage N15

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Feiss, Michael; Geyer, Henriette, E-mail: henriettegeyer@gmail.com; Division of Viral Infections, Robert Koch Institute, Berlin

    Phage lambda's cosB packaging recognition site is tripartite, consisting of 3 TerS binding sites, called R sequences. TerS binding to the critical R3 site positions the TerL endonuclease for nicking cosN to generate cohesive ends. The N15 cos (cos^N15) is closely related to cos^λ, but whereas the cosB^N15 subsite has R3, it lacks the R2 and R1 sites and the IHF binding site of cosB^λ. A bioinformatic study of N15-like phages indicates that cosB^N15 also has an accessory, remote rR2 site, which is proposed to increase packaging efficiency, like R2 and R1 of lambda. N15 plus five prophages all have the rR2 sequence, which is located in the TerS-encoding 1 gene, approximately 200 bp distal to R3. An additional set of four highly related prophages, exemplified by Monarch, has R3 sequence, but also has R2 and R1 sequences characteristic of cosB^λ. The DNA binding domain of TerS-N15 is a dimer. - Highlights: • There are two classes of DNA packaging signals in N15-related phages. • Phage N15's TerS binding site: a critical site and a possible remote accessory site. • Viral DNA recognition signals by the λ-like bacteriophages: the odd case of N15.

  14. CENP-B binds a novel centromeric sequence in the Asian mouse Mus caroli.

    PubMed Central

    Kipling, D; Mitchell, A R; Masumoto, H; Wilson, H E; Nicol, L; Cooke, H J

    1995-01-01

    Minor satellite DNA, found at Mus musculus centromeres, is not present in the genome of the Asian mouse Mus caroli. This repetitive sequence family is speculated to have a role in centromere function by providing an array of binding sites for the centromere-associated protein CENP-B. The apparent absence of CENP-B binding sites in the M. caroli genome poses a major challenge to this hypothesis. Here we describe two abundant satellite DNA sequences present at M. caroli centromeres. These satellites are organized as tandem repeat arrays, over 1 Mb in size, of either 60- or 79-bp monomers. All autosomes carry both satellites and small amounts of a sequence related to the M. musculus major satellite. The Y chromosome contains small amounts of both major satellite and the 60-bp satellite, whereas the X chromosome carries only major satellite sequences. M. caroli chromosomes segregate in M. caroli x M. musculus interspecific hybrid cell lines, indicating that the two sets of chromosomes can interact with the same mitotic spindle. Using a polyclonal CENP-B antiserum, we demonstrate that M. caroli centromeres can bind murine CENP-B in such an interspecific cell line, despite the absence of canonical 17-bp CENP-B binding sites in the M. caroli genome. Sequence analysis of the 79-bp M. caroli satellite reveals a 17-bp motif that contains all nine bases previously shown to be necessary for in vitro binding of CENP-B. This M. caroli motif binds CENP-B from HeLa cell nuclear extract in vitro, as indicated by gel mobility shift analysis. We therefore suggest that this motif also causes CENP-B to associate with M. caroli centromeres in vivo. Despite the sequence differences, M. caroli presents a third, novel mammalian centromeric sequence producing an array of binding sites for CENP-B. PMID:7623797

  15. PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context

    PubMed Central

    Zhou, Jiyun; Xu, Ruifeng; He, Yulan; Lu, Qin; Wang, Hongpeng; Kong, Bing

    2016-01-01

    Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machines (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context and the combination of them gives more performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web-server of our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely accessible to the biological research community. PMID:27282833

  16. SSMART: Sequence-structure motif identification for RNA-binding proteins.

    PubMed

    Munteanu, Alina; Mukherjee, Neelanjan; Ohler, Uwe

    2018-06-11

    RNA-binding proteins (RBPs) regulate every aspect of RNA metabolism and function. There are hundreds of RBPs encoded in the eukaryotic genomes, and each recognizes its RNA targets through a specific mixture of RNA sequence and structure properties. For most RBPs, however, only a primary sequence motif has been determined, while the structure of the binding sites is uncharacterized. We developed SSMART, an RNA motif finder that simultaneously models the primary sequence and the structural properties of the RNA target sites. The sequence-structure motifs are represented as consensus strings over a degenerate alphabet, extending the IUPAC codes for nucleotides to account for secondary structure preferences. Evaluation on synthetic data showed that SSMART is able to recover both sequence and structure motifs implanted into 3'UTR-like sequences, for various degrees of structured/unstructured binding sites. In addition, we successfully used SSMART on high-throughput in vivo and in vitro data, showing that we not only recover the known sequence motif, but also gain insight into the structural preferences of the RBP. Availability: SSMART is freely available at https://ohlerlab.mdc-berlin.de/software/SSMART_137/. Supplementary data are available at Bioinformatics online.

  17. Predicting RNA-protein binding sites and motifs through combining local and global deep convolutional neural networks.

    PubMed

    Pan, Xiaoyong; Shen, Hong-Bin

    2018-05-02

    RNA-binding proteins (RBPs) make up 5–10% of the eukaryotic proteome and play key roles in many biological processes, e.g. gene regulation. Experimental detection of RBP binding sites is still time-consuming and costly. Instead, computational prediction of the RBP binding sites using patterns learned from existing annotation knowledge is a fast approach. From the biological point of view, the local structure context derived from local sequences will be recognized by specific RBPs. However, in computational modeling using deep learning, to the best of our knowledge, only global representations of entire RNA sequences are employed. So far, the local sequence information is ignored in the deep model construction process. In this study, we present a computational method iDeepE to predict RNA-protein binding sites from RNA sequences by combining global and local convolutional neural networks (CNNs). For the global CNN, we pad the RNA sequences to the same length. For the local CNN, we split an RNA sequence into multiple overlapping fixed-length subsequences, where each subsequence is a signal channel of the whole sequence. Next, we train deep CNNs for multiple subsequences and the padded sequences to learn high-level features, respectively. Finally, the outputs from local and global CNNs are combined to improve the prediction. iDeepE demonstrates a better performance over state-of-the-art methods on two large-scale datasets derived from CLIP-seq. We also find that the local CNN runs 1.8 times faster than the global CNN with comparable performance when using GPUs. Our results show that iDeepE has captured experimentally verified binding motifs. Availability: https://github.com/xypan1232/iDeepE. Contact: xypan172436@gmail.com or hbshen@sjtu.edu.cn. Supplementary data are available at Bioinformatics online.
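
    The two input preparations described in this abstract, padding whole sequences to a common length for the global CNN and cutting each sequence into overlapping fixed-length subsequences for the local CNN, can be sketched as plain string operations. This is not the iDeepE code: the function names, the pad character "N", and the window and stride values are assumptions made for illustration.

```python
def pad_sequence(seq, length, pad_char="N"):
    """Truncate or right-pad a sequence so it has exactly `length` characters."""
    return seq[:length].ljust(length, pad_char)


def split_overlapping(seq, window, stride, pad_char="N"):
    """Split a sequence into overlapping fixed-length windows.

    Consecutive windows start `stride` positions apart; the final window is
    padded when it runs past the end, so every position is covered.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(seq) <= window:
        return [pad_sequence(seq, window, pad_char)]
    chunks = []
    for start in range(0, len(seq) - window + stride, stride):
        chunks.append(pad_sequence(seq[start:start + window], window, pad_char))
    return chunks


rna = "AUGGCUACGU"                                   # toy sequence
print(pad_sequence(rna, 12))                         # global-style input
print(split_overlapping(rna, window=4, stride=2))    # local-style inputs
```

    In a real pipeline each window would then be one-hot encoded before being fed to a network; that step is omitted here.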

  18. Incorporating evolution of transcription factor binding sites into annotated alignments.

    PubMed

    Bais, Abha S; Grossmann, Stefen; Vingron, Martin

    2007-08-01

    Identifying transcription factor binding sites (TFBSs) is essential to elucidate putative regulatory mechanisms. A common strategy is to combine cross-species conservation with single sequence TFBS annotation to yield "conserved TFBSs". Most current methods in this field adopt a multi-step approach that segregates the two aspects. Again, it is widely accepted that the evolutionary dynamics of binding sites differ from those of the surrounding sequence. Hence, it is desirable to have an approach that explicitly takes this factor into account. Although a plethora of approaches have been proposed for the prediction of conserved TFBSs, very few explicitly model TFBS evolutionary properties, while additionally being multi-step. Recently, we introduced a novel approach to simultaneously align and annotate conserved TFBSs in a pair of sequences. Building upon the standard Smith-Waterman algorithm for local alignments, SimAnn introduces additional states for profiles to output extended alignments or annotated alignments. That is, alignments with parts annotated as gaplessly aligned TFBSs (pair-profile hits) are generated. Moreover, the pair-profile related parameters are derived in a sound statistical framework. In this article, we extend this approach to explicitly incorporate evolution of binding sites in the SimAnn framework. We demonstrate the extension in the theoretical derivations through two position-specific evolutionary models, previously used for modelling TFBS evolution. In a simulated setting, we provide a proof of concept that the approach works given the underlying assumptions, as compared to the original work. Finally, using a real dataset of experimentally verified binding sites in human-mouse sequence pairs, we compare the new approach (eSimAnn) to an existing multi-step tool that also considers TFBS evolution.
Although it is widely accepted that binding sites evolve differently from the surrounding sequences, most comparative TFBS identification methods do not explicitly consider this. Additionally, prediction of conserved binding sites is carried out in a multi-step approach that segregates alignment from TFBS annotation. In this paper, we demonstrate how the simultaneous alignment and annotation approach of SimAnn can be further extended to incorporate TFBS evolutionary relationships. We study how alignments and binding site predictions interplay at varying evolutionary distances and for various profile qualities.

  19. Using RSAT to scan genome sequences for transcription factor binding sites and cis-regulatory modules.

    PubMed

    Turatsinze, Jean-Valery; Thomas-Chollier, Morgane; Defrance, Matthieu; van Helden, Jacques

    2008-01-01

    This protocol shows how to detect putative cis-regulatory elements and regions enriched in such elements with the regulatory sequence analysis tools (RSAT) web server (http://rsat.ulb.ac.be/rsat/). The approach applies to known transcription factors, whose binding specificity is represented by position-specific scoring matrices, using the program matrix-scan. The detection of individual binding sites is known to return many false predictions. However, results can be strongly improved by estimating P value, and by searching for combinations of sites (homotypic and heterotypic models). We illustrate the detection of sites and enriched regions with a study case, the upstream sequence of the Drosophila melanogaster gene even-skipped. This protocol is also tested on random control sequences to evaluate the reliability of the predictions. Each task requires a few minutes of computation time on the server. The complete protocol can be executed in about one hour.
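
    The general idea of scanning a sequence with a position-specific scoring matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the matrix-scan implementation: the count matrix, background frequencies, pseudocount and threshold are hypothetical toy values, only one strand is scanned, and no P value is estimated.

```python
import math

# Hypothetical 4-column count matrix favouring the word TATA (toy numbers).
COUNTS = [
    {"A": 1, "C": 1, "G": 1, "T": 7},
    {"A": 7, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 1, "G": 1, "T": 7},
    {"A": 7, "C": 1, "G": 1, "T": 1},
]
# Assumed uniform background model.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_matrix(counts, background, pseudocount=1.0):
    """Convert per-position counts into log2 odds scores against the background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background[base])
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, word, score) for every window scoring at least threshold.

    Assumes the sequence contains only A, C, G and T.
    """
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        word = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(word))
        if score >= threshold:
            hits.append((start, word, score))
    return hits


if __name__ == "__main__":
    for hit in scan("GGTATAGG", log_odds_matrix(COUNTS, BACKGROUND), threshold=3.0):
        print(hit)
```

    A fixed score threshold is used here for brevity; the abstract's point is that replacing it with a P value estimate, and looking for combinations of sites, reduces false predictions.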

  20. NMR studies of DNA oligomers and their interactions with minor groove binding ligands

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fagan, Patricia A.

    1996-05-01

    The cationic peptide ligands distamycin and netropsin bind noncovalently to the minor groove of DNA. The binding site, orientation, stoichiometry, and qualitative affinity of distamycin binding to several short DNA oligomers were investigated by NMR spectroscopy. The oligomers studied contain A,T-rich or I,C-rich binding sites, where I = 2-desaminodeoxyguanosine. I•C base pairs are functional analogs of A•T base pairs in the minor groove. The different behaviors exhibited by distamycin and netropsin binding to various DNA sequences suggested that these ligands are sensitive probes of DNA structure. For sites of five or more base pairs, distamycin can form 1:1 or 2:1 ligand:DNA complexes. Cooperativity in distamycin binding is low in sites such as AAAAA which has narrow minor grooves, and is higher in sites with wider minor grooves such as ATATAT. The distamycin binding and base pair opening lifetimes of I,C-containing DNA oligomers suggest that the I,C minor groove is structurally different from the A,T minor groove. Molecules which direct chemistry to a specific DNA sequence could be used as antiviral compounds, diagnostic probes, or molecular biology tools. The author studied two ligands in which reactive groups were tethered to a distamycin to increase the sequence specificity of the reactive agent.

  1. Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

    PubMed

    Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

    2014-11-01

    As many structures of protein-DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, its inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. 
To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
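
    The three figures of merit quoted above (accuracy, F-measure and MCC) follow from a binary confusion matrix by their standard definitions. The sketch below shows those definitions only; the function name and the toy counts are illustrative assumptions and are not the counts behind the reported values.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc


if __name__ == "__main__":
    # Toy counts, chosen only to exercise the formulas.
    print(binary_metrics(tp=25, fp=25, tn=25, fn=25))
```

    An MCC of 0 corresponds to chance-level prediction and 1 to perfect prediction, which is why values near 0.33 to 0.38 can accompany accuracies near 67 to 70%.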

  2. Selection of the simplest RNA that binds isoleucine

    PubMed Central

    LOZUPONE, CATHERINE; CHANGAYIL, SHANKAR; MAJERFELD, IRENE; YARUS, MICHAEL

    2003-01-01

    We have identified the simplest RNA binding site for isoleucine using selection-amplification (SELEX), by shrinking the size of the randomized region until affinity selection is extinguished. Such a protocol can be useful because selection does not necessarily make the simplest active motif most prominent, as is often assumed. We find an isoleucine binding site that behaves exactly as predicted for the site that requires fewest nucleotides. This UAUU motif (16 highly conserved positions; 27 total), is also the most abundant site in successful selections on short random tracts. The UAUU site, now isolated independently at least 63 times, is a small asymmetric internal loop. Conserved loop sequences include isoleucine codon and anticodon triplets, whose nucleotides are required for amino acid binding. This reproducible association between isoleucine and its coding sequences supports the idea that the genetic code is, at least in part, a stereochemical residue of the most easily isolated RNA–amino acid binding structures. PMID:14561881

  3. Endogenous Hot Spots of De Novo Telomere Addition in the Yeast Genome Contain Proximal Enhancers That Bind Cdc13

    PubMed Central

    Obodo, Udochukwu C.; Epum, Esther A.; Platts, Margaret H.; Seloff, Jacob; Dahlson, Nicole A.; Velkovsky, Stoycho M.; Paul, Shira R.

    2016-01-01

    DNA double-strand breaks (DSBs) pose a threat to genome stability and are repaired through multiple mechanisms. Rarely, telomerase, the enzyme that maintains telomeres, acts upon a DSB in a mutagenic process termed telomere healing. The probability of telomere addition is increased at specific genomic sequences termed sites of repair-associated telomere addition (SiRTAs). By monitoring repair of an induced DSB, we show that SiRTAs on chromosomes V and IX share a bipartite structure in which a core sequence (Core) is directly targeted by telomerase, while a proximal sequence (Stim) enhances the probability of de novo telomere formation. The Stim and Core sequences are sufficient to confer a high frequency of telomere addition to an ectopic site. Cdc13, a single-stranded DNA binding protein that recruits telomerase to endogenous telomeres, is known to stimulate de novo telomere addition when artificially recruited to an induced DSB. Here we show that the ability of the Stim sequence to enhance de novo telomere addition correlates with its ability to bind Cdc13, indicating that natural sites at which telomere addition occurs at high frequency require binding by Cdc13 to a sequence 20 to 100 bp internal from the site at which telomerase acts to initiate de novo telomere addition. PMID:27044869

  4. BIPAD: A web server for modeling bipartite sequence elements

    PubMed Central

    Bi, Chengpeng; Rogan, Peter K

    2006-01-01

    Background: Many dimeric protein complexes bind cooperatively to families of bipartite nucleic acid sequence elements, which consist of pairs of conserved half-site sequences separated by intervening distances that vary among individual sites. Results: We introduce the Bipad Server [1], a web interface to predict sequence elements embedded within unaligned sequences. Either a bipartite model, consisting of a pair of one-block position weight matrices (PWMs) with a gap distribution, or a single PWM for contiguous single-block motifs may be produced. The Bipad program performs multiple local alignment by entropy minimization and cyclic refinement using a stochastic greedy search strategy. The best models are refined by maximizing incremental information contents among a set of potential models with varying half-site and gap lengths. Conclusion: The web service generates information positional weight matrices, identifies binding site motifs, graphically represents the set of discovered elements as a sequence logo, and depicts the gap distribution as a histogram. Server performance was evaluated by generating a collection of bipartite models for distinct DNA binding proteins. PMID:16503993
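
    As background for the information content mentioned above, the usual per-position measure for a DNA motif is 2 bits minus the Shannon entropy of the base frequencies at that position. The sketch below computes that quantity and sums it over two half-sites; it is a generic illustration under that assumption (no small-sample correction, no gap-distribution term) and is not the Bipad objective function. The half-site frequencies are hypothetical.

```python
import math


def column_information(freqs):
    """Information content (bits) of one motif position: 2 - Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return 2.0 - entropy


def motif_information(columns):
    """Sum the per-position information over all columns of a motif block."""
    return sum(column_information(column) for column in columns)


if __name__ == "__main__":
    # Hypothetical half-sites: two fully conserved positions each.
    left = [{"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0}] * 2
    right = [{"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0}] * 2
    print(motif_information(left) + motif_information(right))
```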

  5. Understanding the mechanisms of protein-DNA interactions

    NASA Astrophysics Data System (ADS)

    Lavery, Richard

    2004-03-01

    Structural, biochemical and thermodynamic data on protein-DNA interactions show that specific recognition cannot be reduced to a simple set of binary interactions between the partners (such as hydrogen bonds, ion pairs or steric contacts). The mechanical properties of the partners also play a role and, in the case of DNA, variations in both conformation and flexibility as a function of base sequence can be a significant factor in guiding a protein to the correct binding site. All-atom molecular modeling offers a means of analyzing the role of different binding mechanisms within protein-DNA complexes of known structure. This however requires estimating the binding strengths for the full range of sequences with which a given protein can interact. Since this number grows exponentially with the length of the binding site it is necessary to find a method to accelerate the calculations. We have achieved this by using a multi-copy approach (ADAPT) which allows us to build a DNA fragment with a variable base sequence. The results obtained with this method correlate well with experimental consensus binding sequences. They enable us to show that indirect recognition mechanisms involving the sequence dependent properties of DNA play a significant role in many complexes. This approach also offers a means of predicting protein binding sites on the basis of binding energies, which is complementary to conventional lexical techniques.

  6. Widespread evidence of cooperative DNA binding by transcription factors in Drosophila development

    PubMed Central

    Kazemian, Majid; Pham, Hannah; Wolfe, Scot A.; Brodsky, Michael H.; Sinha, Saurabh

    2013-01-01

    Regulation of eukaryotic gene transcription is often combinatorial in nature, with multiple transcription factors (TFs) regulating common target genes, often through direct or indirect mutual interactions. Many individual examples of cooperative binding by directly interacting TFs have been identified, but it remains unclear how pervasive this mechanism is during animal development. Cooperative TF binding should be manifest in genomic sequences as biased arrangements of TF-binding sites. Here, we explore the extent and diversity of such arrangements related to gene regulation during Drosophila embryogenesis. We used the DNA-binding specificities of 322 TFs along with chromatin accessibility information to identify enriched spacing and orientation patterns of TF-binding site pairs. We developed a new statistical approach for this task, specifically designed to accurately assess inter-site spacing biases while accounting for the phenomenon of homotypic site clustering commonly observed in developmental regulatory regions. We observed a large number of short-range distance preferences between TF-binding site pairs, including examples where the preference depends on the relative orientation of the binding sites. To test whether these binding site patterns reflect physical interactions between the corresponding TFs, we analyzed 27 TF pairs whose binding sites exhibited short distance preferences. In vitro protein–protein binding experiments revealed that >65% of these TF pairs can directly interact with each other. For five pairs, we further demonstrate that they bind cooperatively to DNA if both sites are present with the preferred spacing. This study demonstrates how DNA-binding motifs can be used to produce a comprehensive map of sequence signatures for different mechanisms of combinatorial TF action. PMID:23847101

  7. Widespread evidence of cooperative DNA binding by transcription factors in Drosophila development.

    PubMed

    Kazemian, Majid; Pham, Hannah; Wolfe, Scot A; Brodsky, Michael H; Sinha, Saurabh

    2013-09-01

    Regulation of eukaryotic gene transcription is often combinatorial in nature, with multiple transcription factors (TFs) regulating common target genes, often through direct or indirect mutual interactions. Many individual examples of cooperative binding by directly interacting TFs have been identified, but it remains unclear how pervasive this mechanism is during animal development. Cooperative TF binding should be manifest in genomic sequences as biased arrangements of TF-binding sites. Here, we explore the extent and diversity of such arrangements related to gene regulation during Drosophila embryogenesis. We used the DNA-binding specificities of 322 TFs along with chromatin accessibility information to identify enriched spacing and orientation patterns of TF-binding site pairs. We developed a new statistical approach for this task, specifically designed to accurately assess inter-site spacing biases while accounting for the phenomenon of homotypic site clustering commonly observed in developmental regulatory regions. We observed a large number of short-range distance preferences between TF-binding site pairs, including examples where the preference depends on the relative orientation of the binding sites. To test whether these binding site patterns reflect physical interactions between the corresponding TFs, we analyzed 27 TF pairs whose binding sites exhibited short distance preferences. In vitro protein-protein binding experiments revealed that >65% of these TF pairs can directly interact with each other. For five pairs, we further demonstrate that they bind cooperatively to DNA if both sites are present with the preferred spacing. This study demonstrates how DNA-binding motifs can be used to produce a comprehensive map of sequence signatures for different mechanisms of combinatorial TF action.

  8. Localization and characterization of an alpha-thrombin-binding site on platelet glycoprotein Ib alpha.

    PubMed

    De Marco, L; Mazzucato, M; Masotti, A; Ruggeri, Z M

    1994-03-04

    Glycoprotein (GP) Ib alpha is required for expression of the highest affinity alpha-thrombin-binding site on platelets, possibly contributing to platelet activation through a pathway involving cleavage of a specific receptor. This function may be important for the initiation of hemostasis and may also play a role in the development of pathological vascular occlusion. We have now identified a discrete sequence in the extracytoplasmic domain of GP Ib alpha, including residues 271-284 of the mature protein, which appears to be part of the high affinity alpha-thrombin-binding site. Synthetic peptidyl mimetics of this sequence inhibit alpha-thrombin binding to GP Ib as well as platelet activation and aggregation induced by subnanomolar concentrations of the agonist; they also inhibit alpha-thrombin binding to purified glycocalicin, the isolated extracytoplasmic portion of GP Ib alpha. The inhibitory peptides interfere with the clotting of fibrinogen by alpha-thrombin but not with the amidolytic activity of the enzyme on a small synthetic substrate, a finding compatible with the concept that the identified GP Ib alpha sequence interacts with the anion-binding exosite of alpha-thrombin but not with its active proteolytic site. The crucial structural elements of this sequence necessary for thrombin binding appear to be a cluster of negatively charged residues as well as three tyrosine residues that, in the native protein, may be sulfated. GP Ib alpha has no significant overall sequence homology with the thrombin inhibitor, hirudin, nor with the specific thrombin receptor on platelets; all three molecules, however, possess a distinct region rich in negatively charged residues that appear to be involved in thrombin binding. This may represent a case of convergent evolution of unrelated proteins for high affinity interaction with the same ligand.

  9. Identification of high-specificity H-NS binding site in LEE5 promoter of enteropathogenic Esherichia coli (EPEC).

    PubMed

    Bhat, Abhay Prasad; Shin, Minsang; Choy, Hyon E

    2014-07-01

    Histone-like nucleoid structuring protein (H-NS) is a small but abundant protein present in enteric bacteria and is involved in compaction of the DNA and regulation of transcription. Recent reports have suggested that H-NS binds to a specific AT-rich DNA sequence rather than to intrinsically curved DNA in a sequence-independent manner. We detected two high-specificity H-NS binding sites in the LEE5 promoter of EPEC centered at -110 and -138, which were close to the proposed consensus H-NS binding motif. To identify the H-NS binding sequence in the LEE5 promoter, we took a random mutagenesis approach and found that mutations at around -138 were specifically defective in regulation by H-NS. It was concluded that H-NS exerts maximum repression via the specific sequence at around -138 and subsequently contacts a subunit of RNAP through oligomerization.

  10. Characterization of a protein that binds multiple sequences in mammalian type C retrovirus enhancers.

    PubMed Central

    Sun, W; O'Connell, M; Speck, N A

    1993-01-01

    Mammalian type C retrovirus enhancer factor 1 (MCREF-1) is a nuclear protein that binds several directly repeated sequences (CNGGN6CNGG) in the Moloney and Friend murine leukemia virus (MLV) enhancers (N. R. Manley, M. O'Connell, W. Sun, N. A. Speck, and N. Hopkins, J. Virol. 67:1967-1975, 1993). In this paper, we describe the partial purification of MCREF-1 from calf thymus nuclei and further characterize the binding properties of MCREF-1. MCREF-1 binds four sites in the Moloney MLV enhancer and three sites in the Friend MLV enhancer. Ethylation interference analysis suggests that the MCREF-1 binding site spans two adjacent minor grooves of DNA. PMID:8445719
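
    The repeat pattern CNGGN6CNGG quoted above can be read as CNGG, six arbitrary bases, then CNGG again (14 bases in total). A minimal way to search a sequence for it is a regular expression, sketched below; the test sequence is hypothetical, only the given strand is searched, and exact pattern matching says nothing about actual MCREF-1 binding affinity.

```python
import re

# CNGG, six arbitrary bases, CNGG; the lookahead lets overlapping sites be found.
MOTIF = re.compile(r"(?=(C[ACGT]GG[ACGT]{6}C[ACGT]GG))")


def find_sites(sequence):
    """Return (start, site) for every CNGGN6CNGG match on the given strand."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence containing one site.
    print(find_sites("TTCAGGACGTACCTGGAA"))
```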

  11. A mammary cell-specific enhancer in mouse mammary tumor virus DNA is composed of multiple regulatory elements including binding sites for CTF/NFI and a novel transcription factor, mammary cell-activating factor.

    PubMed Central

    Mink, S; Härtig, E; Jennewein, P; Doppler, W; Cato, A C

    1992-01-01

    Mouse mammary tumor virus (MMTV) is a milk-transmitted retrovirus involved in the neoplastic transformation of mouse mammary gland cells. The expression of this virus is regulated by mammary cell type-specific factors, steroid hormones, and polypeptide growth factors. Sequences for mammary cell-specific expression are located in an enhancer element in the extreme 5' end of the long terminal repeat region of this virus. This enhancer, when cloned in front of the herpes simplex thymidine kinase promoter, endows the promoter with mammary cell-specific response. Using functional and DNA-protein-binding studies with constructs mutated in the MMTV long terminal repeat enhancer, we have identified two main regulatory elements necessary for the mammary cell-specific response. These elements consist of binding sites for a transcription factor in the family of CTF/NFI proteins and the transcription factor mammary cell-activating factor (MAF) that recognizes the sequence G Pu Pu G C/G A A G G/T. Combinations of CTF/NFI- and MAF-binding sites or multiple copies of either one of these binding sites but not solitary binding sites mediate mammary cell-specific expression. The functional activities of these two regulatory elements are enhanced by another factor that binds to the core sequence ACAAAG. Interdigitated binding sites for CTF/NFI, MAF, and/or the ACAAAG factor are also found in the 5' upstream regions of genes encoding whey milk proteins from different species. These findings suggest that mammary cell-specific regulation is achieved by a concerted action of factors binding to multiple regulatory sites. PMID:1328867

  12. Regulation of the alpha-glucuronidase-encoding gene ( aguA) from Aspergillus niger.

    PubMed

    de Vries, R P; van de Vondervoort, P J I; Hendriks, L; van de Belt, M; Visser, J

    2002-09-01

    The alpha-glucuronidase gene aguA from Aspergillus niger was cloned and characterised. Analysis of the promoter region of aguA revealed the presence of four putative binding sites for the major carbon catabolite repressor protein CREA and one putative binding site for the transcriptional activator XLNR. In addition, a sequence motif was detected which differed only in the last nucleotide from the XLNR consensus site. A construct in which part of the aguA coding region was deleted still resulted in production of a stable mRNA upon transformation of A. niger. The putative XLNR binding sites and two of the putative CREA binding sites were mutated individually in this construct and the effects on expression were examined in A. niger transformants. Northern analysis of the transformants revealed that the consensus XLNR site is not actually functional in the aguA promoter, whereas the sequence that diverges from the consensus at a single position is functional. This indicates that XLNR is also able to bind to the sequence GGCTAG, and the XLNR binding site consensus should therefore be changed to GGCTAR. Both CREA sites are functional, indicating that CREA has a strong influence on aguA expression. A detailed expression analysis of aguA in four genetic backgrounds revealed a second regulatory system involved in activation of aguA gene expression. This system responds to the presence of glucuronic and galacturonic acids, and is not dependent on XLNR.
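
    In the IUPAC nucleotide code, R stands for a purine (A or G), so the revised consensus GGCTAR covers both GGCTAG, which the abstract reports as functional, and GGCTAA, which is presumably the earlier consensus given that the two motifs differ only in the last nucleotide. A small sketch of expanding such a degenerate consensus into a regular expression is shown below; the helper name is an illustrative assumption.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(consensus):
    """Compile a degenerate IUPAC consensus into a regular expression."""
    return re.compile("".join(IUPAC[code] for code in consensus.upper()))


if __name__ == "__main__":
    pattern = iupac_to_regex("GGCTAR")
    for site in ("GGCTAA", "GGCTAG", "GGCTAC"):
        print(site, bool(pattern.fullmatch(site)))
```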

  13. Changes in tau phosphorylation in hibernating rodents.

    PubMed

    León-Espinosa, Gonzalo; García, Esther; García-Escudero, Vega; Hernández, Félix; Defelipe, Javier; Avila, Jesús

    2013-07-01

    Tau is a cytoskeletal protein present mainly in the neurons of vertebrates. By comparing the sequence of tau molecule among different vertebrates, it was found that the variability of the N-terminal sequence in tau protein is higher than that of the C-terminal region. The N-terminal region is involved mainly in the binding of tau to cellular membranes, whereas the C-terminal region of the tau molecule contains the microtubule-binding sites. We have compared the sequence of Syrian hamster tau with the sequences of other hibernating and nonhibernating rodents and investigated how differences in the N-terminal region of tau could affect the phosphorylation level and tau binding to cell membranes. We also describe a change, in tau phosphorylation, on a casein kinase 1 (ck1)-dependent site that is found only in hibernating rodents. This ck1 site seems to play an important role in the regulation of tau binding to membranes. Copyright © 2013 Wiley Periodicals, Inc.

  14. Fibronectin tetrapeptide is target for syphilis spirochete cytadherence

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thomas, D.D.; Baseman, J.B.; Alderete, J.F.

    1985-11-01

    The syphilis bacterium, Treponema pallidum, parasitizes host cells through recognition of fibronectin (Fn) on cell surfaces. The active site of the Fn molecule has been identified as a four-amino acid sequence, arg-gly-asp-ser (RGDS), located on each monomer of the cell-binding domain. The synthetic heptapeptide gly-arg-gly-asp-ser-pro-cys (GRGDSPC), with the active site sequence RGDS, specifically competed with SVI-labeled cell-binding domain acquisition by T. pallidum. Additionally, the same heptapeptide with the RGDS sequence diminished treponemal attachment to HEp-2 and HT1080 cell monolayers. Related heptapeptides altered in one key amino acid within the RGDS sequence failed to inhibit Fn cell-binding domain acquisition or parasitism of host cells by T. pallidum. The data support the view that T. pallidum cytadherence of host cells is through recognition of the RGDS sequence also important for eukaryotic cell-Fn binding.

  15. Isolation from genomic DNA of sequences binding specific regulatory proteins by the acceleration of protein electrophoretic mobility upon DNA binding.

    PubMed

    Subrahmanyam, S; Cronan, J E

    1999-01-21

    We report an efficient and flexible in vitro method for the isolation of genomic DNA sequences that are the binding targets of a given DNA binding protein. This method takes advantage of the fact that binding of a protein to a DNA molecule generally increases the rate of migration of the protein in nondenaturing gel electrophoresis. By the use of a radioactively labeled DNA-binding protein and nonradioactive DNA coupled with PCR amplification from gel slices, we show that specific binding sites can be isolated from Escherichia coli genomic DNA. We have applied this method to isolate a binding site for FadR, a global regulator of fatty acid metabolism in E. coli. We have also isolated a second binding site for BirA, the biotin operon repressor/biotin ligase, from the E. coli genome that has a very low binding efficiency compared with the bio operator region.

  16. Genetic dissection of the consensus sequence for the class 2 and class 3 flagellar promoters

    PubMed Central

    Wozniak, Christopher E.; Hughes, Kelly T.

    2008-01-01

    Summary Computational searches for DNA binding sites often utilize consensus sequences. These search models make assumptions that the frequency of a base pair in an alignment relates to the base pair’s importance in binding and presume that base pairs contribute independently to the overall interaction with the DNA binding protein. These two assumptions have generally been found to be accurate for DNA binding sites. However, these assumptions are often not satisfied for promoters, which are involved in additional steps in transcription initiation after RNA polymerase has bound to the DNA. To test these assumptions for the flagellar regulatory hierarchy, class 2 and class 3 flagellar promoters were randomly mutagenized in Salmonella. Important positions were then saturated for mutagenesis and compared to scores calculated from the consensus sequence. Double mutants were constructed to determine how mutations combined for each promoter type. Mutations in the binding site for FlhD4C2, the activator of class 2 promoters, better satisfied the assumptions for the binding model than did mutations in the class 3 promoter, which is recognized by the σ28 transcription factor. These in vivo results indicate that the activator sites within flagellar promoters can be modeled using simple assumptions but that the DNA sequences recognized by the flagellar sigma factor require more complex models. PMID:18486950

  17. Adjacent DNA sequences modulate Sox9 transcriptional activation at paired Sox sites in three chondrocyte-specific enhancer elements

    PubMed Central

    Bridgewater, Laura C.; Walker, Marlan D.; Miller, Gwen C.; Ellison, Trevor A.; Holsinger, L. Daniel; Potter, Jennifer L.; Jackson, Todd L.; Chen, Reuben K.; Winkel, Vicki L.; Zhang, Zhaoping; McKinney, Sandra; de Crombrugghe, Benoit

    2003-01-01

    Expression of the type XI collagen gene Col11a2 is directed to cartilage by at least three chondrocyte-specific enhancer elements, two in the 5′ region and one in the first intron of the gene. The three enhancers each contain two heptameric sites with homology to the Sox protein-binding consensus sequence. The two sites are separated by 3 or 4 bp and arranged in opposite orientation to each other. Targeted mutational analyses of these three enhancers showed that in the intronic enhancer, as in the other two enhancers, both Sox sites in a pair are essential for enhancer activity. The transcription factor Sox9 binds as a dimer at the paired sites, and the introduction of insertion mutations between the sites demonstrated that physical interactions between the adjacently bound proteins are essential for enhancer activity. Additional mutational analyses demonstrated that although Sox9 binding at the paired Sox sites is necessary for enhancer activity, it alone is not sufficient. Adjacent DNA sequences in each enhancer are also required, and mutation of those sequences can eliminate enhancer activity without preventing Sox9 binding. The data suggest a new model in which adjacently bound proteins affect the DNA bend angle produced by Sox9, which in turn determines whether an active transcriptional enhancer complex is assembled. PMID:12595563

  18. Analysis of sequencing data for probing RNA secondary structures and protein-RNA binding in studying posttranscriptional regulations.

    PubMed

    Hu, Xihao; Wu, Yang; Lu, Zhi John; Yip, Kevin Y

    2016-11-01

    High-throughput sequencing has been used to study posttranscriptional regulations, where the identification of protein-RNA binding is a major and fast-developing sub-area, which is in turn benefited by the sequencing methods for whole-transcriptome probing of RNA secondary structures. In the study of RNA secondary structures using high-throughput sequencing, bases are modified or cleaved according to their structural features, which alter the resulting composition of sequencing reads. In the study of protein-RNA binding, methods have been proposed to immuno-precipitate (IP) protein-bound RNA transcripts in vitro or in vivo. By sequencing these transcripts, the protein-RNA interactions and the binding locations can be identified. For both types of data, read counts are affected by a combination of confounding factors, including expression levels of transcripts, sequence biases, mapping errors and the probing or IP efficiency of the experimental protocols. Careful processing of the sequencing data and proper extraction of important features are fundamentally important to a successful analysis. Here we review and compare different experimental methods for probing RNA secondary structures and binding sites of RNA-binding proteins (RBPs), and the computational methods proposed for analyzing the corresponding sequencing data. We suggest how these two types of data should be integrated to study the structural properties of RBP binding sites as a systematic way to better understand posttranscriptional regulations. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  19. Studies on the regulation of the human E1 subunit of the 2-oxoglutarate dehydrogenase complex, including the identification of a novel calcium-binding site.

    PubMed

    Armstrong, Craig T; Anderson, J L Ross; Denton, Richard M

    2014-04-15

    The regulation of the 2-oxoglutarate dehydrogenase complex is central to intramitochondrial energy metabolism. In the present study, the active full-length E1 subunit of the human complex has been expressed and shown to be regulated by Ca2+, adenine nucleotides and NADH, with NADH exerting a major influence on the K0.5 value for Ca2+. We investigated two potential Ca2+-binding sites on E1, which we term site 1 (D114ADLD) and site 2 (E139SDLD). Comparison of sequences from vertebrates with those from Ca2+-insensitive non-vertebrate complexes suggests that site 1 may be the more important. Consistent with this view, a mutated form of E1, D114A, shows a 6-fold decrease in sensitivity for Ca2+, whereas variant ∆site1 (in which the sequence of site 1 is replaced by A114AALA) exhibits an almost complete loss of Ca2+ activation. Variant ∆site2 (in which the sequence is replaced with A139SALA) shows no measurable change in Ca2+ sensitivity. We conclude that site 1, but not site 2, forms part of a regulatory Ca2+-binding site, which is distinct from other previously described Ca2+-binding sites.

  20. Human Lineage-Specific Transcriptional Regulation through GA-Binding Protein Transcription Factor Alpha (GABPa)

    PubMed Central

    Perdomo-Sabogal, Alvaro; Nowick, Katja; Piccini, Ilaria; Sudbrak, Ralf; Lehrach, Hans; Yaspo, Marie-Laure; Warnatz, Hans-Jörg; Querfurth, Robert

    2016-01-01

    A substantial fraction of phenotypic differences between closely related species are likely caused by differences in gene regulation. While this has already been postulated over 30 years ago, only few examples of evolutionary changes in gene regulation have been verified. Here, we identified and investigated binding sites of the transcription factor GA-binding protein alpha (GABPa) aiming to discover cis-regulatory adaptations on the human lineage. By performing chromatin immunoprecipitation-sequencing experiments in a human cell line, we found 11,619 putative GABPa binding sites. Through sequence comparisons of the human GABPa binding regions with orthologous sequences from 34 mammals, we identified substitutions that have resulted in 224 putative human-specific GABPa binding sites. To experimentally assess the transcriptional impact of those substitutions, we selected four promoters for promoter-reporter gene assays using human and African green monkey cells. We compared the activities of wild-type promoters to mutated forms, where we have introduced one or more substitutions to mimic the ancestral state devoid of the GABPa consensus binding sequence. Similarly, we introduced the human-specific substitutions into chimpanzee and macaque promoter backgrounds. Our results demonstrate that the identified substitutions are functional, both in human and nonhuman promoters. In addition, we performed GABPa knock-down experiments and found 1,215 genes as strong candidates for primary targets. Further analyses of our data sets link GABPa to cognitive disorders, diabetes, KRAB zinc finger (KRAB-ZNF), and human-specific genes. Thus, we propose that differences in GABPa binding sites played important roles in the evolution of human-specific phenotypes. PMID:26814189

  1. Functional genetic selection of Helix 66 in Escherichia coli 23S rRNA identified the eukaryotic-binding sequence for ribosomal protein L2

    PubMed Central

    Kitahara, Kei; Kajiura, Akimasa; Sato, Neuza Satomi; Suzuki, Tsutomu

    2007-01-01

    Ribosomal protein L2 is a highly conserved primary 23S rRNA-binding protein. L2 specifically recognizes the internal bulge sequence in Helix 66 (H66) of 23S rRNA and is localized to the intersubunit space through formation of bridge B7b with 16S rRNA. The L2-binding site in H66 is highly conserved in prokaryotic ribosomes, whereas the corresponding site in eukaryotic ribosomes has evolved into distinct classes of sequences. We performed a systematic genetic selection of randomized rRNA sequences in Escherichia coli, and isolated 20 functional variants of the L2-binding site. The isolated variants consisted of eukaryotic sequences, in addition to prokaryotic sequences. These results suggest that L2/L8e does not recognize a specific base sequence of H66, but rather a characteristic architecture of H66. The growth phenotype of the isolated variants correlated well with their ability of subunit association. Upon continuous cultivation of a deleterious variant, we isolated two spontaneous mutations within domain IV of 23S rRNA that compensated for its weak subunit association, and alleviated its growth defect, implying that functional interactions between intersubunit bridges compensate ribosomal function. PMID:17553838

  2. One recognition sequence, seven restriction enzymes, five reaction mechanisms

    PubMed Central

    Gowers, Darren M.; Bellamy, Stuart R.W.; Halford, Stephen E.

    2004-01-01

    The diversity of reaction mechanisms employed by Type II restriction enzymes was investigated by analysing the reactions of seven endonucleases at the same DNA sequence. NarI, KasI, Mly113I, SfoI, EgeI, EheI and BbeI cleave DNA at several different positions in the sequence 5′-GGCGCC-3′. Their reactions on plasmids with one or two copies of this sequence revealed five distinct mechanisms. These differ in terms of the number of sites the enzyme binds, and the number of phosphodiester bonds cleaved per turnover. NarI binds two sites, but cleaves only one bond per DNA-binding event. KasI also cuts only one bond per turnover but acts at individual sites, preferring intact to nicked sites. Mly113I cuts both strands of its recognition sites, but shows full activity only when bound to two sites, which are then cleaved concertedly. SfoI, EgeI and EheI cut both strands at individual sites, in the manner historically considered as normal for Type II enzymes. Finally, BbeI displays an absolute requirement for two sites in close physical proximity, which are cleaved concertedly. The range of reaction mechanisms for restriction enzymes is thus larger than commonly imagined, as is the number of enzymes needing two recognition sites. PMID:15226412

  3. Two distinct DNA sequences recognized by transcription factors represent enthalpy and entropy optima

    PubMed Central

    Yin, Yimeng; Das, Pratyush K; Jolma, Arttu; Zhu, Fangjie; Popov, Alexander; Xu, You; Nilsson, Lennart

    2018-01-01

    Most transcription factors (TFs) can bind to a population of sequences closely related to a single optimal site. However, some TFs can bind to two distinct sequences that represent two local optima in the Gibbs free energy of binding (ΔG). To determine the molecular mechanism behind this effect, we solved the structures of human HOXB13 and CDX2 bound to their two optimal DNA sequences, CAATAAA and TCGTAAA. Thermodynamic analyses by isothermal titration calorimetry revealed that both sites were bound with similar ΔG. However, the interaction with the CAA sequence was driven by change in enthalpy (ΔH), whereas the TCG site was bound with similar affinity due to smaller loss of entropy (ΔS). This thermodynamic mechanism that leads to at least two local optima likely affects many macromolecular interactions, as ΔG depends on two partially independent variables ΔH and ΔS according to the central equation of thermodynamics, ΔG = ΔH - TΔS. PMID:29638214
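    The relation ΔG = ΔH - TΔS quoted in the abstract above can be illustrated with a short numerical sketch. The enthalpy and entropy values below are hypothetical placeholders chosen only to show how an enthalpy-driven and an entropy-favoured binding mode can reach a similar ΔG; they are not the measurements reported for HOXB13 or CDX2, and the temperature of 298.15 K is likewise an assumption.

```python
# Illustrative only: the numbers are invented, not taken from the paper.
T_KELVIN = 298.15  # assumed temperature


def gibbs_free_energy(delta_h: float, delta_s: float, temperature: float) -> float:
    """Return dG = dH - T*dS (dH in kJ/mol, dS in kJ/(mol*K), T in K)."""
    return delta_h - temperature * delta_s


# Hypothetical "enthalpy-driven" site: large favourable dH, large entropy loss.
dg_enthalpic = gibbs_free_energy(delta_h=-60.0, delta_s=-0.100, temperature=T_KELVIN)

# Hypothetical "entropy-favoured" site: smaller dH, smaller entropy loss.
dg_entropic = gibbs_free_energy(delta_h=-45.0, delta_s=-0.050, temperature=T_KELVIN)

print(f"enthalpy-driven site:  dG = {dg_enthalpic:.2f} kJ/mol")
print(f"entropy-favoured site: dG = {dg_entropic:.2f} kJ/mol")
```

    Because ΔH and ΔS enter the equation separately, two quite different pairs of values can give nearly the same ΔG, which is the sense in which the abstract speaks of two local optima.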

  4. Structural analysis of DNA binding by C.Csp231I, a member of a novel class of R-M controller proteins regulating gene expression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shevtsov, M. B.; Streeter, S. D.; Thresh, S.-J.

    2015-02-01

    The structure of the new class of controller proteins (exemplified by C.Csp231I) in complex with its 21 bp DNA-recognition sequence is presented, and the molecular basis of sequence recognition in this class of proteins is discussed. An unusual extended spacer between the dimer binding sites suggests a novel interaction between the two C-protein dimers. In a wide variety of bacterial restriction–modification systems, a regulatory ‘controller’ protein (or C-protein) is required for effective transcription of its own gene and for transcription of the endonuclease gene found on the same operon. We have recently turned our attention to a new class of controller proteins (exemplified by C.Csp231I) that have quite novel features, including a much larger DNA-binding site with an 18 bp (∼60 Å) spacer between the two palindromic DNA-binding sequences and a very different recognition sequence from the canonical GACT/AGTC. Using X-ray crystallography, the structure of the protein in complex with its 21 bp DNA-recognition sequence was solved to 1.8 Å resolution, and the molecular basis of sequence recognition in this class of proteins was elucidated. An unusual aspect of the promoter sequence is the extended spacer between the dimer binding sites, suggesting a novel interaction between the two C-protein dimers when bound to both recognition sites correctly spaced on the DNA. A U-bend model is proposed for this tetrameric complex, based on the results of gel-mobility assays, hydrodynamic analysis and the observation of key contacts at the interface between dimers in the crystal.

  5. Contacts between the factor TUF and RPG sequences.

    PubMed

    Vignais, M L; Huet, J; Buhler, J M; Sentenac, A

    1990-08-25

    The yeast TUF factor binds specifically to RPG-like sequences involved in multiple functions at enhancers, silencers, and telomeres. We have characterized the interaction of TUF with its optimal binding sequence, rpg-1 (1-ACACCCATACATTT-14), using a gel DNA-binding assay in combination with methylation protection and mutagenesis experiments. As many as 10 base pairs appear to be engaged in factor binding. Analysis of a collection of 30 different RPG mutants demonstrated the importance of 8 base pairs at positions 2, 3, 4, 5, 6, 7, 10, and 12 and the critical role of the central GC pair at position 5. Methylation protection data on four different natural sites confirmed a close contact at positions 4, 5, 6, and 10 and suggested additional contacts at base pairs 8, 12, and 13. The derived consensus sequence was RCAAYCCRYNCAYY. A quantitative band shift analysis was used to determine the equilibrium dissociation constant for the complex of TUF and its optimal binding site rpg-1. The specific dissociation constant (Ks) was found to be 1.3 x 10^-11 M. The comparison of the Ks value with the dissociation constant obtained for nonspecific DNA sites (Kns = 8.7 x 10^-6 M) shows the high binding selectivity of TUF for its specific RPG target.
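    The selectivity claimed in the abstract above can be made concrete by taking the ratio of the two reported dissociation constants. The sketch below uses only those two values; the conversion to a free-energy difference additionally assumes a temperature of 298.15 K, which the abstract does not state, and the variable and function names are illustrative.

```python
import math

# Dissociation constants as reported in the abstract (molar).
K_SPECIFIC = 1.3e-11     # TUF bound to its optimal site rpg-1
K_NONSPECIFIC = 8.7e-6   # TUF bound to nonspecific DNA sites

GAS_CONSTANT = 8.314462618  # J/(mol*K)
ASSUMED_TEMPERATURE = 298.15  # K; an assumption, not given in the abstract


def selectivity_ratio(k_nonspecific: float, k_specific: float) -> float:
    """How many times more tightly the specific site is bound."""
    return k_nonspecific / k_specific


def free_energy_gap_kj(ratio: float, temperature: float) -> float:
    """Binding free-energy difference RT*ln(ratio), in kJ/mol."""
    return GAS_CONSTANT * temperature * math.log(ratio) / 1000.0


ratio = selectivity_ratio(K_NONSPECIFIC, K_SPECIFIC)
gap = free_energy_gap_kj(ratio, ASSUMED_TEMPERATURE)

print(f"selectivity ratio: {ratio:.2e}")
print(f"free-energy gap at {ASSUMED_TEMPERATURE} K: {gap:.1f} kJ/mol")
```

    The ratio comes out at several hundred thousand, which is the quantitative content behind the phrase "high binding selectivity".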

  6. Base substitutions at scissile bond sites are sufficient to alter RNA-binding and cleavage activity of RNase III.

    PubMed

    Kim, Kyungsub; Sim, Se-Hoon; Jeon, Che Ok; Lee, Younghoon; Lee, Kangseok

    2011-02-01

    RNase III, a double-stranded RNA-specific endoribonuclease, degrades bdm mRNA via cleavage at specific sites. To better understand the mechanism of cleavage site selection by RNase III, we performed a genetic screen for sequences containing mutations at the bdm RNA cleavage sites that resulted in altered mRNA stability using a transcriptional bdm'-'cat fusion construct. While most of the isolated mutants showed the increased bdm'-'cat mRNA stability that resulted from the inability of RNase III to cleave the mutated sequences, one mutant sequence (wt-L) displayed in vivo RNA stability similar to that of the wild-type sequence. In vivo and in vitro analyses of the wt-L RNA substrate showed that it was cut only once on the RNA strand to the 5'-terminus by RNase III, while the binding constant of RNase III to this mutant substrate was moderately increased. A base substitution at the uncleaved RNase III cleavage site in wt-L mutant RNA found in another mutant lowered the RNA-binding affinity by 11-fold and abolished the hydrolysis of scissile bonds by RNase III. Our results show that base substitutions at sites forming the scissile bonds are sufficient to alter RNA cleavage as well as the binding activity of RNase III. © 2010 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  7. Comparative analysis and molecular characterization of genomic sequences and proteins of FABP4 and FABP5 from the giant panda (Ailuropoda melanoleuca).

    PubMed

    Song, B; Hou, Y L; Ding, X; Wang, T; Wang, F; Zhong, J C; Xu, T; Zhong, J; Hou, W R; Shuai, S R

    2014-02-20

    Fatty acid binding proteins (FABPs) are a family of small, highly conserved cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. In this study, cDNA and genomic sequences of FABP4 and FABP5 were cloned successfully from the giant panda (Ailuropoda melanoleuca) using reverse transcription polymerase chain reaction (RT-PCR) technology and touchdown-PCR. The cDNAs of FABP4 and FABP5 cloned from the giant panda were 400 and 413 bp in length, containing an open reading frame of 399 and 408 bp, encoding 132 and 135 amino acids, respectively. The genomic sequences of FABP4 and FABP5 were 3976 and 3962 bp, respectively, which each contained four exons and three introns. Sequence alignment indicated a high degree of homology with reported FABP sequences of other mammals at both the amino acid and DNA levels. Topology prediction revealed seven protein kinase C phosphorylation sites, two casein kinase II phosphorylation sites, two N-myristoylation sites, and one cytosolic fatty acid-binding protein signature in the FABP4 protein, and three N-glycosylation sites, three protein kinase C phosphorylation sites, one casein kinase II phosphorylation site, one N-myristoylation site, one amidation site, and one cytosolic fatty acid-binding protein signature in the FABP5 protein. The FABP4 and FABP5 genes were overexpressed in Escherichia coli BL21 and they produced the expected 16.8- and 17.0-kDa polypeptides. The results obtained in this study provide information for further in-depth research of this system, which has great value of both theoretical and practical significance.

  8. Influence of quasi-specific sites on kinetics of target DNA search by a sequence-specific DNA-binding protein.

    PubMed

    Kemme, Catherine A; Esadze, Alexandre; Iwahara, Junji

    2015-11-10

    Functions of transcription factors require formation of specific complexes at particular sites in cis-regulatory elements of genes. However, chromosomal DNA contains numerous sites that are similar to the target sequences recognized by transcription factors. The influence of such "quasi-specific" sites on functions of the transcription factors is not well understood at present by experimental means. In this work, using fluorescence methods, we have investigated the influence of quasi-specific DNA sites on the efficiency of target location by the zinc finger DNA-binding domain of the inducible transcription factor Egr-1, which recognizes a 9 bp sequence. By stopped-flow assays, we measured the kinetics of Egr-1's association with a target site on 143 bp DNA in the presence of various competitor DNAs, including nonspecific and quasi-specific sites. The presence of quasi-specific sites on competitor DNA significantly decelerated the target association by the Egr-1 protein. The impact of the quasi-specific sites depended strongly on their affinity, their concentration, and the degree of their binding to the protein. To quantitatively describe the kinetic impact of the quasi-specific sites, we derived an analytical form of the apparent kinetic rate constant for the target association and used it for fitting to the experimental data. Our kinetic data with calf thymus DNA as a competitor suggested that there are millions of high-affinity quasi-specific sites for Egr-1 among the 3 billion bp of genomic DNA. This study quantitatively demonstrates that naturally abundant quasi-specific sites on DNA can considerably impede the target search processes of sequence-specific DNA-binding proteins.

  9. Influence of Quasi-Specific Sites on Kinetics of Target DNA Search by a Sequence-Specific DNA-Binding Protein

    PubMed Central

    2015-01-01

    Functions of transcription factors require formation of specific complexes at particular sites in cis-regulatory elements of genes. However, chromosomal DNA contains numerous sites that are similar to the target sequences recognized by transcription factors. The influence of such “quasi-specific” sites on functions of the transcription factors is not well understood at present by experimental means. In this work, using fluorescence methods, we have investigated the influence of quasi-specific DNA sites on the efficiency of target location by the zinc finger DNA-binding domain of the inducible transcription factor Egr-1, which recognizes a 9 bp sequence. By stopped-flow assays, we measured the kinetics of Egr-1’s association with a target site on 143 bp DNA in the presence of various competitor DNAs, including nonspecific and quasi-specific sites. The presence of quasi-specific sites on competitor DNA significantly decelerated the target association by the Egr-1 protein. The impact of the quasi-specific sites depended strongly on their affinity, their concentration, and the degree of their binding to the protein. To quantitatively describe the kinetic impact of the quasi-specific sites, we derived an analytical form of the apparent kinetic rate constant for the target association and used it for fitting to the experimental data. Our kinetic data with calf thymus DNA as a competitor suggested that there are millions of high-affinity quasi-specific sites for Egr-1 among the 3 billion bp of genomic DNA. This study quantitatively demonstrates that naturally abundant quasi-specific sites on DNA can considerably impede the target search processes of sequence-specific DNA-binding proteins. PMID:26502071

  10. microRNA-122 target sites in the hepatitis C virus RNA NS5B coding region and 3' untranslated region: function in replication and influence of RNA secondary structure.

    PubMed

    Gerresheim, Gesche K; Dünnes, Nadia; Nieder-Röhrmann, Anika; Shalamova, Lyudmila A; Fricke, Markus; Hofacker, Ivo; Höner Zu Siederdissen, Christian; Marz, Manja; Niepmann, Michael

    2017-02-01

    We have analyzed the binding of the liver-specific microRNA-122 (miR-122) to three conserved target sites of hepatitis C virus (HCV) RNA, two in the non-structural protein 5B (NS5B) coding region and one in the 3' untranslated region (3'UTR). miR-122 binding efficiency strongly depends on target site accessibility under conditions when the range of flanking sequences available for the formation of local RNA secondary structures changes. Our results indicate that the particular sequence feature that contributes most to the correlation between target site accessibility and binding strength varies between different target sites. This suggests that the dynamics of miRNA/Ago2 binding not only depends on the target site itself but also on flanking sequence context to a considerable extent, in particular in a small viral genome in which strong selection constraints act on coding sequence and overlapping cis-signals and model the accessibility of cis-signals. In full-length genomes, single and combination mutations in the miR-122 target sites reveal that site 5B.2 is positively involved in regulating overall genome replication efficiency, whereas mutation of site 5B.3 showed a weaker effect. Mutation of the 3'UTR site and double or triple mutants showed no significant overall effect on genome replication, whereas in a translation reporter RNA, the 3'UTR target site inhibits translation directed by the HCV 5'UTR. Thus, the miR-122 target sites in the 3'-region of the HCV genome are involved in a complex interplay in regulating different steps of the HCV replication cycle.

  11. Onco-Regulon: an integrated database and software suite for site specific targeting of transcription factors of cancer genes

    PubMed Central

    Tomar, Navneet; Mishra, Akhilesh; Mrinal, Nirotpal; Jayaram, B.

    2016-01-01

    Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an integrated database of regulatory motifs of cancer genes clubbed with Unique Sequence-Predictor (USP) a software suite that identifies unique sequences for each of these regulatory DNA motifs at the specified position in the genome. USP works by extending a given DNA motif, in 5′→3′, 3′ →5′ or both directions by adding one nucleotide at each step, and calculates the frequency of each extended motif in the genome by Frequency Counter programme. This step is iterated till the frequency of the extended motif becomes unity in the genome. Thus, for each given motif, we get three possible unique sequences. Closest Sequence Finder program predicts off-target drug binding in the genome. Inclusion of DNA-Protein structural information further makes Onco-Regulon a highly informative repository for gene specific drug development. We believe that Onco-Regulon will help researchers to design drugs which will bind to an exclusive site in the genome with no off-target effects, theoretically. Database URL: http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm PMID:27515825
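    The extension procedure described in the abstract above (add one nucleotide per step, recount, stop when the extended motif occurs exactly once) can be sketched as follows. This is a minimal illustration under stated assumptions, not the USP or Frequency Counter code: the genome is a plain in-memory string, only one strand is searched, and the function names and the toy sequence are hypothetical.

```python
from typing import Optional, Tuple


def count_occurrences(genome: str, motif: str) -> int:
    """Count occurrences of motif in genome, allowing overlaps (one strand only)."""
    count = 0
    start = genome.find(motif)
    while start != -1:
        count += 1
        start = genome.find(motif, start + 1)
    return count


def extend_until_unique(
    genome: str, start: int, end: int, direction: str
) -> Optional[Tuple[int, int]]:
    """Grow the window genome[start:end] one base per step until it is unique.

    direction is "right" (5'->3'), "left" (3'->5') or "both".
    Returns the (start, end) of the unique window, or None if the window
    cannot grow any further in the requested direction while still repeated.
    """
    while count_occurrences(genome, genome[start:end]) > 1:
        grew = False
        if direction in ("left", "both") and start > 0:
            start -= 1
            grew = True
        if direction in ("right", "both") and end < len(genome):
            end += 1
            grew = True
        if not grew:
            return None
    return start, end


# Toy sequence (hypothetical): the motif TTGAC occurs at positions 3 and 10.
toy_genome = "ACGTTGACCGTTGACA"
for direction in ("right", "left", "both"):
    window = extend_until_unique(toy_genome, 3, 8, direction)
    if window is not None:
        print(direction, window, toy_genome[window[0]:window[1]])
```

    Running the three directions separately mirrors the abstract's remark that each motif yields three possible unique sequences. A real genome would need an indexed search rather than repeated string scans.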

  12. Identification of multiple binding sites for the THAP domain of the Galileo transposase in the long terminal inverted-repeats

    PubMed Central

    Marzo, Mar; Liu, Danxu; Ruiz, Alfredo; Chalmers, Ronald

    2013-01-01

    Galileo is a DNA transposon responsible for the generation of several chromosomal inversions in Drosophila. In contrast to other members of the P-element superfamily, it has unusually long terminal inverted-repeats (TIRs) that resemble those of Foldback elements. To investigate the function of the long TIRs we derived consensus and ancestral sequences for the Galileo transposase in three species of Drosophilids. Following gene synthesis, we expressed and purified their constituent THAP domains and tested their binding activity towards the respective Galileo TIRs. DNase I footprinting located the most proximal DNA binding site about 70 bp from the transposon end. Using this sequence we identified further binding sites in the tandem repeats that are found within the long TIRs. This suggests that the synaptic complex between Galileo ends may be a complicated structure containing higher-order multimers of the transposase. We also attempted to reconstitute Galileo transposition in Drosophila embryos but no events were detected. Thus, although the limited numbers of Galileo copies in each genome were sufficient to provide functional consensus sequences for the THAP domains, they do not specify a fully active transposase. Since the THAP recognition sequence is short, and will occur many times in a large genome, it seems likely that the multiple binding sites within the long, internally repetitive, TIRs of Galileo and other Foldback-like elements may provide the transposase with its binding specificity. PMID:23648487

  13. Identification of multiple binding sites for the THAP domain of the Galileo transposase in the long terminal inverted-repeats.

    PubMed

    Marzo, Mar; Liu, Danxu; Ruiz, Alfredo; Chalmers, Ronald

    2013-08-01

    Galileo is a DNA transposon responsible for the generation of several chromosomal inversions in Drosophila. In contrast to other members of the P-element superfamily, it has unusually long terminal inverted-repeats (TIRs) that resemble those of Foldback elements. To investigate the function of the long TIRs we derived consensus and ancestral sequences for the Galileo transposase in three species of Drosophilids. Following gene synthesis, we expressed and purified their constituent THAP domains and tested their binding activity towards the respective Galileo TIRs. DNase I footprinting located the most proximal DNA binding site about 70 bp from the transposon end. Using this sequence we identified further binding sites in the tandem repeats that are found within the long TIRs. This suggests that the synaptic complex between Galileo ends may be a complicated structure containing higher-order multimers of the transposase. We also attempted to reconstitute Galileo transposition in Drosophila embryos but no events were detected. Thus, although the limited numbers of Galileo copies in each genome were sufficient to provide functional consensus sequences for the THAP domains, they do not specify a fully active transposase. Since the THAP recognition sequence is short, and will occur many times in a large genome, it seems likely that the multiple binding sites within the long, internally repetitive, TIRs of Galileo and other Foldback-like elements may provide the transposase with its binding specificity. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.

  14. Sequence-specific DNA binding by MYC/MAX to low-affinity non-E-box motifs.

    PubMed

    Allevato, Michael; Bolotin, Eugene; Grossman, Mark; Mane-Padros, Daniel; Sladek, Frances M; Martinez, Ernest

    2017-01-01

    The MYC oncoprotein regulates transcription of a large fraction of the genome as an obligatory heterodimer with the transcription factor MAX. The MYC:MAX heterodimer and MAX:MAX homodimer (hereafter MYC/MAX) bind Enhancer box (E-box) DNA elements (CANNTG) and have the greatest affinity for the canonical MYC E-box (CME) CACGTG. However, MYC:MAX also recognizes E-box variants and was reported to bind DNA in a "non-specific" fashion in vitro and in vivo. Here, in order to identify potential additional non-canonical binding sites for MYC/MAX, we employed high throughput in vitro protein-binding microarrays, along with electrophoretic mobility-shift assays and bioinformatic analyses of MYC-bound genomic loci in vivo. We identified all hexameric motifs preferentially bound by MYC/MAX in vitro, which include the low-affinity non-E-box sequence AACGTT, and found that the vast majority (87%) of MYC-bound genomic sites in a human B cell line contain at least one of the top 21 motifs bound by MYC:MAX in vitro. We further show that high MYC/MAX concentrations are needed for specific binding to the low-affinity sequence AACGTT in vitro and that elevated MYC levels in vivo more markedly increase the occupancy of AACGTT sites relative to CME sites, especially at distal intergenic and intragenic loci. Hence, MYC binds diverse DNA motifs with a broad range of affinities in a sequence-specific and dose-dependent manner, suggesting that MYC overexpression has more selective effects on the tumor transcriptome than previously thought.
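    The three sequence classes named in the abstract above (the general E-box CANNTG, the canonical MYC E-box CACGTG, and the low-affinity non-E-box hexamer AACGTT) can be located in a DNA string with a simple scan. The sketch below is only an illustration of the motif definitions; it is not the authors' analysis pipeline, and the function name, labels and example sequence are hypothetical.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping E-boxes are all reported.
EBOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")
CANONICAL_EBOX = "CACGTG"   # canonical MYC E-box (CME)
NON_EBOX_SITE = "AACGTT"    # low-affinity non-E-box sequence


def find_myc_max_sites(sequence: str) -> List[Tuple[int, str, str]]:
    """Return (position, hexamer, class) for each motif hit, sorted by position."""
    sequence = sequence.upper()
    hits = []
    for match in EBOX_PATTERN.finditer(sequence):
        hexamer = match.group(1)
        label = "canonical E-box" if hexamer == CANONICAL_EBOX else "variant E-box"
        hits.append((match.start(), hexamer, label))
    start = sequence.find(NON_EBOX_SITE)
    while start != -1:
        hits.append((start, NON_EBOX_SITE, "non-E-box"))
        start = sequence.find(NON_EBOX_SITE, start + 1)
    return sorted(hits)


# Hypothetical example sequence containing one hit of each class.
example = "GGCACGTGTTCAGCTGAAACGTTCC"
for position, hexamer, label in find_myc_max_sites(example):
    print(position, hexamer, label)
```

    CANNTG is its own reverse complement as a pattern, and CACGTG and AACGTT are palindromic, so scanning one strand finds the sites on both.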

  15. Rapid evolution of cis-regulatory sequences via local point mutations

    NASA Technical Reports Server (NTRS)

    Stone, J. R.; Wray, G. A.

    2001-01-01

    Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.

  16. Mapping Interaction Sites on Human Chemokine Receptors by Deep Mutational Scanning.

    PubMed

    Heredia, Jeremiah D; Park, Jihye; Brubaker, Riley J; Szymanski, Steven K; Gill, Kevin S; Procko, Erik

    2018-06-01

    Chemokine receptors CXCR4 and CCR5 regulate WBC trafficking and are engaged by the HIV-1 envelope glycoprotein gp120 during infection. We combine a selection of human CXCR4 and CCR5 libraries comprising nearly all of ∼7000 single amino acid substitutions with deep sequencing to define sequence-activity landscapes for surface expression and ligand interactions. After consideration of sequence constraints for surface expression, known interaction sites with HIV-1-blocking Abs were appropriately identified as conserved residues following library sorting for Ab binding, validating the use of deep mutational scanning to map functional interaction sites in G protein-coupled receptors. Chemokine CXCL12 was found to interact with residues extending asymmetrically into the CXCR4 ligand-binding cavity, similar to the binding surface of CXCR4 recognized by an antagonistic viral chemokine previously observed crystallographically. CXCR4 mutations distal from the chemokine binding site were identified that enhance chemokine recognition. This included disruptive mutations in the G protein-coupling site that diminished calcium mobilization, as well as conservative mutations to a membrane-exposed site (CXCR4 residues H79^2.45 and W161^4.50) that increased ligand binding without loss of signaling. Compared with CXCR4-CXCL12 interactions, CCR5 residues conserved for gp120 (HIV-1 BaL strain) interactions map to a more expansive surface, mimicking how the cognate chemokine CCL5 makes contacts across the entire CCR5 binding cavity. Acidic substitutions in the CCR5 N terminus and extracellular loops enhanced gp120 binding. This study demonstrates how comprehensive mutational scanning can define functional interaction sites on receptors, and novel mutations that enhance receptor activities can be found simultaneously. Copyright © 2018 by The American Association of Immunologists, Inc.

  17. Modeling the evolution of regulatory elements by simultaneous detection and alignment with phylogenetic pair HMMs.

    PubMed

    Majoros, William H; Ohler, Uwe

    2010-12-16

    The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.

  18. MicroRNAs Form Triplexes with Double Stranded DNA at Sequence-Specific Binding Sites; a Eukaryotic Mechanism via which microRNAs Could Directly Alter Gene Expression

    PubMed Central

    Grace, Christy R.; Ferreira, Antonio M.; Waddell, M. Brett; Ridout, Granger; Naeve, Deanna; Leuze, Michael; LoCascio, Philip F.; Panetta, John C.; Wilkinson, Mark R.; Pui, Ching-Hon; Naeve, Clayton W.; Uberbacher, Edward C.; Bonten, Erik J.; Evans, William E.

    2016-01-01

    MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA) and typically down-regulating their stability or translation. Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence (i.e., NMR, FRET, SPR) that purine or pyrimidine-rich microRNAs of appropriate length and sequence form triple-helical structures with purine-rich sequences of duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3 fold, p < 2.2 × 10^−16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs could interact with gene promoter regions to modify gene transcription. PMID:26844769

  19. Sequence of ligand binding and structure change in the diphtheria toxin repressor upon activation by divalent transition metals.

    PubMed

    Rangachari, Vijayaraghavan; Marin, Vedrana; Bienkiewicz, Ewa A; Semavina, Maria; Guerrero, Luis; Love, John F; Murphy, John R; Logan, Timothy M

    2005-04-19

    The diphtheria toxin repressor (DtxR) is an Fe(II)-activated transcriptional regulator of iron homeostatic and virulence genes in Corynebacterium diphtheriae. DtxR is a two-domain protein that contains two structurally and functionally distinct metal binding sites. Here, we investigate the molecular steps associated with activation by Ni(II)Cl(2) and Cd(II)Cl(2). Equilibrium binding energetics for Ni(II) were obtained from isothermal titration calorimetry, indicating apparent metal dissociation constants of 0.2 and 1.7 microM for two independent sites. The binding isotherms for Ni(II) and Cd(II) exhibited a characteristic exothermic-endothermic pattern that was used to infer the metal binding sequence by comparing the wild-type isotherm with those of several binding site mutants. These data were complemented by measuring the distance between specific backbone amide nitrogens and the first equivalent of metal through heteronuclear NMR relaxation measurements. Previous studies indicated that metal binding affects a disordered to ordered transition in the metal binding domain. The coupling between metal binding and structure change was investigated using near-UV circular dichroism spectroscopy. Together, the data show that the first equivalent of metal is bound by the primary metal binding site. This binding orients the DNA binding helices and begins to fold the N-terminal domain. Subsequent binding at the ancillary site completes the folding of this domain and formation of the dimer interface. This model is used to explain the behavior of several mutants.

  20. Human immunodeficiency virus type 1 LTR TATA and TAR region sequences required for transcriptional regulation.

    PubMed Central

    Garcia, J A; Harrich, D; Soultanakis, E; Wu, F; Mitsuyasu, R; Gaynor, R B

    1989-01-01

    The human immunodeficiency virus (HIV) type 1 LTR is regulated at the transcriptional level by both cellular and viral proteins. Using HeLa cell extracts, multiple regions of the HIV LTR were found to serve as binding sites for cellular proteins. An untranslated region binding protein UBP-1 has been purified and fractions containing this protein bind to both the TAR and TATA regions. To investigate the role of cellular proteins binding to both the TATA and TAR regions and their potential interaction with other HIV DNA binding proteins, oligonucleotide-directed mutagenesis of both these regions was performed followed by DNase I footprinting and transient expression assays. In the TATA region, two direct repeats TC/AAGC/AT/AGCTGC surround the TATA sequence. Mutagenesis of both of these direct repeats or of the TATA sequence interrupted binding over the TATA region on the coding strand, but only a mutation of the TATA sequence affected in vivo assays for tat-activation. In addition to TAR serving as the site of binding of cellular proteins, RNA transcribed from TAR is capable of forming a stable stem-loop structure. To determine the relative importance of DNA binding proteins as compared to secondary structure, oligonucleotide-directed mutations in the TAR region were studied. Local mutations that disrupted either the stem or loop structure were defective in gene expression. However, compensatory mutations which restored base pairing in the stem resulted in complete tat-activation. This indicated a significant role for the stem-loop structure in HIV gene expression. To determine the role of TAR binding proteins, mutations were constructed which extensively changed the primary structure of the TAR region, yet left stem base pairing, stem energy and the loop sequence intact. These mutations resulted in decreased protein binding to TAR DNA and defects in tat-activation, and revealed factor binding specifically to the loop DNA sequence. Further mutagenesis which inverted this stem and loop mutation relative to the HIV LTR mRNA start site resulted in even larger decreases in tat-activation. This suggests that multiple determinants, including protein binding, the loop sequence, and RNA or DNA secondary structure, are important in tat-activation and suggests that tat may interact with cellular proteins binding to DNA to increase HIV gene expression. PMID:2721501

  1. Mapping Argonaute and conventional RNA-binding protein interactions with RNA at single-nucleotide resolution using HITS-CLIP and CIMS analysis

    PubMed Central

    Moore, Michael; Zhang, Chaolin; Gantman, Emily Conn; Mele, Aldo; Darnell, Jennifer C.; Darnell, Robert B.

    2014-01-01

    Summary Identifying sites where RNA binding proteins (RNABPs) interact with target RNAs opens the door to understanding the vast complexity of RNA regulation. UV-crosslinking and immunoprecipitation (CLIP) is a transformative technology in which RNAs purified from in vivo cross-linked RNA-protein complexes are sequenced to reveal footprints of RNABP:RNA contacts. CLIP combined with high throughput sequencing (HITS-CLIP) is a generalizable strategy to produce transcriptome-wide RNA binding maps with higher accuracy and resolution than standard RNA immunoprecipitation (RIP) profiling or purely computational approaches. Applying CLIP to Argonaute proteins has expanded the utility of this approach to mapping binding sites for microRNAs and other small regulatory RNAs. Finally, recent advances in data analysis take advantage of crosslinked-induced mutation sites (CIMS) to refine RNA-binding maps to single-nucleotide resolution. Once IP conditions are established, HITS-CLIP takes approximately eight days to prepare RNA for sequencing. Established pipelines for data analysis, including for CIMS, take 3-4 days. PMID:24407355

  2. The Interaction of Integrin αIIbβ3 with Fibrin Occurs through Multiple Binding Sites in the αIIb β-Propeller Domain

    PubMed Central

    Podolnikova, Nataly P.; Yakovlev, Sergiy; Yakubenko, Valentin P.; Wang, Xu; Gorkun, Oleg V.; Ugarova, Tatiana P.

    2014-01-01

    The currently available antithrombotic agents target the interaction of platelet integrin αIIbβ3 (GPIIb-IIIa) with fibrinogen during platelet aggregation. Platelets also bind fibrin formed early during thrombus growth. It was proposed that inhibition of platelet-fibrin interactions may be a necessary and important property of αIIbβ3 antagonists; however, the mechanisms by which αIIbβ3 binds fibrin are uncertain. We have previously identified the γ370–381 sequence (P3) in the γC domain of fibrinogen as the fibrin-specific binding site for αIIbβ3 involved in platelet adhesion and platelet-mediated fibrin clot retraction. In the present study, we have demonstrated that P3 can bind to several discontinuous segments within the αIIb β-propeller domain of αIIbβ3 enriched with negatively charged and aromatic residues. By screening peptide libraries spanning the sequence of the αIIb β-propeller, several sequences were identified as candidate contact sites for P3. Synthetic peptides duplicating these segments inhibited platelet adhesion and clot retraction but not platelet aggregation, supporting the role of these regions in fibrin recognition. Mutant αIIbβ3 receptors in which residues identified as critical for P3 binding were substituted for homologous residues in the I-less integrin αMβ2 exhibited reduced cell adhesion and clot retraction. These residues are different from those that are involved in the coordination of the fibrinogen γ404–411 sequence and from auxiliary sites implicated in binding of soluble fibrinogen. These results map the binding of fibrin to multiple sites in the αIIb β-propeller and further indicate that recognition specificity of αIIbβ3 for fibrin differs from that for soluble fibrinogen. PMID:24338009

  3. Transcription Factor Information System (TFIS): A Tool for Detection of Transcription Factor Binding Sites.

    PubMed

    Narad, Priyanka; Kumar, Abhishek; Chakraborty, Amlan; Patni, Pranav; Sengupta, Abhishek; Wadhwa, Gulshan; Upadhyaya, K C

    2017-09-01

    Transcription factors are trans-acting proteins that interact with specific nucleotide sequences known as transcription factor binding sites (TFBSs), and these interactions are implicated in the regulation of gene expression. Regulation of transcriptional activation of a gene often involves multiple interactions of transcription factors with various sequence elements. Identification of these sequence elements is the first step in understanding the underlying molecular mechanism(s) that regulate gene expression. For in silico identification of these sequence elements, we have developed an online computational tool named transcription factor information system (TFIS) for detecting TFBSs, for the first time using a collection of Java programs; it is mainly based on TFBS detection using position weight matrices (PWMs). The position frequency matrices (PFMs) are obtained from JASPAR and HOCOMOCO, which are open-access databases of transcription factor binding profiles. Pseudo-counts are used while converting PFMs to PWMs, and TFBS detection is carried out on the basis of a percent score taken as the threshold value. TFIS is equipped with advanced features such as direct sequence retrieval from the NCBI database using gene identification number and accession number, detection of binding sites for a common TF in a batch of gene sequences, and TFBS detection after generating a PWM from known raw binding sequences, in addition to general detection methods. TFIS can detect potential TFBSs in both orientations at the same time, which increases its efficiency; the results of this dual detection are presented in different colors specific to the orientation of the binding site. Results obtained by TFIS are more detailed and specific to the detected TFs because the result pages integrate informative links to related web servers such as Gene Ontology, the PAZAR database and the Transcription Factor Encyclopedia, in addition to NCBI and UniProt. Common TFs such as SP1, AP1 and NF-KB of the amyloid beta precursor gene are easily detected using TFIS, along with multiple binding sites. In another scenario, an embryonic developmental process, TFs of the FOX family (FOXL1 and FOXC1) were also identified. TFIS is platform-independent and publicly available, along with its support and documentation, at http://tfistool.appspot.com and http://www.bioinfoplus.com/tfis/ . TFIS is licensed under the GNU General Public License, version 3 (GPL-3.0).
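
    The detection scheme described in this abstract (PFM to PWM conversion with pseudo-counts, a percent-score threshold, and scanning of both orientations) can be illustrated with a minimal Python sketch. This is not the TFIS implementation: the function names, the pseudo-count of 0.8, the uniform background of 0.25, the percent-score formula, and the toy matrix are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert a PFM (list of {base: count} columns) to log-odds scores.

    The pseudocount value and the uniform background are assumptions.
    """
    pwm = []
    for column in pfm:
        total = sum(column[b] for b in BASES) + pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount * background) / total) / background)
            for b in BASES
        })
    return pwm


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over ACGT."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def percent_score(pwm, site):
    """Rescale the raw score to 0..100 between the worst and best possible site."""
    score = sum(col[b] for col, b in zip(pwm, site))
    lowest = sum(min(col.values()) for col in pwm)
    highest = sum(max(col.values()) for col in pwm)
    return 100.0 * (score - lowest) / (highest - lowest)


def scan(pwm, seq, threshold=80.0):
    """Report (start, strand, site, percent score) for windows at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # "+" scores the window as read; "-" scores its reverse complement.
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            pct = percent_score(pwm, site)
            if pct >= threshold:
                hits.append((start, strand, site, round(pct, 1)))
    return hits


# Hypothetical toy matrix with consensus TGAC (not taken from JASPAR or HOCOMOCO).
toy_pfm = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]
toy_hits = scan(pfm_to_pwm(toy_pfm), "AATGACAAGTCAAA", threshold=99.0)
```

    Scoring the reverse complement of each window, rather than building a second matrix, keeps the two orientations on the same percent scale, so one threshold applies to both.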

  4. Discovery and validation of information theory-based transcription factor and cofactor binding site motifs.

    PubMed

    Lu, Ruipeng; Mucaki, Eliseos J; Rogan, Peter K

    2017-03-17

    Data from ChIP-seq experiments can derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach, while obviating the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity and detects adjacent cofactor binding sites that coordinate with the targets of primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, discovered 23 cofactor motifs for 127 TFs and revealed six high-confidence novel motifs. The reliability and accuracy of these iPWMs were determined via four independent validation methods, including the detection of experimentally proven binding sites, explanation of effects of characterized SNPs, comparison with previously published motifs and statistical analyses. We also predict previously unreported TF coregulatory interactions (e.g. TF complexes). These iPWMs constitute a powerful tool for predicting the effects of sequence variants in known binding sites, performing mutation analysis on regulatory SNPs and predicting previously unrecognized binding sites and target genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
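
    As a rough illustration of what an information theory-based weight matrix measures, the following Python sketch computes per-site information in bits from a set of aligned sites. It uses the standard individual-information form (weight = 2 + log2 of the base frequency, for a uniform four-letter background) and omits small-sample corrections. It is not the authors' motif discovery pipeline; the function names, the pseudocount, and the toy sites are assumptions.

```python
import math
from collections import Counter

BASES = "ACGT"


def column_frequencies(sites, pseudocount=1.0):
    """Per-position base frequencies from equal-length aligned sites.

    A positive pseudocount keeps every frequency non-zero so log2 is defined.
    """
    width = len(sites[0])
    total = len(sites) + 4 * pseudocount
    freqs = []
    for i in range(width):
        counts = Counter(site[i] for site in sites)
        freqs.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return freqs


def information_weight_matrix(freqs):
    """Weight of base b at a position is 2 + log2 f(b), in bits."""
    return [{b: 2.0 + math.log2(p) for b, p in col.items()} for col in freqs]


def site_information(weights, site):
    """Sum the weights along one candidate site (its individual information)."""
    return sum(col[b] for col, b in zip(weights, site))


def average_information(freqs):
    """Total information of the motif: sum over positions of 2 - H(position)."""
    return sum(2.0 + sum(p * math.log2(p) for p in col.values()) for col in freqs)


# Hypothetical aligned sites, for illustration only.
toy_sites = ["TGAC", "TGAC", "TGAC", "TGAT"]
toy_freqs = column_frequencies(toy_sites)
toy_weights = information_weight_matrix(toy_freqs)
consensus_bits = site_information(toy_weights, "TGAC")
mismatch_bits = site_information(toy_weights, "AAAA")
```

    Under this definition a site's score is directly interpretable: sites with more bits resemble the training set more closely, and the frequency-weighted mean of the site scores equals the total information of the motif.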

  5. Mass Spectrometric Determination of ILPR G-quadruplex Binding Sites in Insulin and IGF-2

    PubMed Central

    Xiao, JunFeng

    2009-01-01

    The insulin-linked polymorphic region (ILPR) of the human insulin gene promoter region forms G-quadruplex structures in vitro. Previous studies show that insulin and insulin-like growth factor-2 (IGF-2) exhibit high affinity binding in vitro to 2-repeat sequences of ILPR variants a and h, but negligible binding to variant i. Two-repeat sequences of variants a and h form intramolecular G-quadruplex structures that are not evidenced for variant i. Here we report on the use of protein digestion combined with affinity capture and MALDI-MS detection to pinpoint ILPR binding sites in insulin and IGF-2. Peptides captured by ILPR variants a and h were sequenced by MALDI-MS/MS, LC-MS and in silico digestion. On-bead digestion of insulin-ILPR variant a complexes supported the conclusions. The results indicate that the sequence VCG(N)RGF is generally present in the captured peptides and is likely involved in the affinity binding interactions of the proteins with the ILPR G-quadruplexes. The significance of arginine in the interactions was studied by comparing the affinities of synthesized peptides VCGERGF and VCGEAGF with ILPR variant a. Peptides from other regions of the proteins that are connected through disulfide linkages were also detected in some capture experiments. Identification of binding sites could facilitate design of DNA binding ligands for capture and detection of insulin and IGF-2. The interactions may have biological significance as well. PMID:19747845

  6. Recognition of AT-Rich DNA Binding Sites by the MogR Repressor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shen, Aimee; Higgins, Darren E.; Panne, Daniel

    2009-07-22

    The MogR transcriptional repressor of the intracellular pathogen Listeria monocytogenes recognizes AT-rich binding sites in promoters of flagellar genes to downregulate flagellar gene expression during infection. We describe here the 1.8 Å resolution crystal structure of MogR bound to the recognition sequence 5' ATTTTTTAAAAAAAT 3' present within the flaA promoter region. Our structure shows that MogR binds as a dimer. Each half-site is recognized in the major groove by a helix-turn-helix motif and in the minor groove by a loop from the symmetry-related molecule, resulting in a 'crossover' binding mode. This oversampling through minor groove interactions is important for specificity. The MogR binding site has structural features of A-tract DNA and is bent by approximately 52 degrees away from the dimer. The structure explains how MogR achieves binding specificity in the AT-rich genome of L. monocytogenes and explains the evolutionary conservation of A-tract sequence elements within promoter regions of MogR-regulated flagellar genes.

  7. A peek into tropomyosin binding and unfolding on the actin filament.

    PubMed

    Singh, Abhishek; Hitchcock-Degregori, Sarah E

    2009-07-24

    Tropomyosin is a prototypical coiled coil along its length with subtle variations in structure that allow interactions with actin and other proteins. Actin binding globally stabilizes tropomyosin. Tropomyosin-actin interaction occurs periodically along the length of tropomyosin. However, it is not well understood how tropomyosin binds actin. Tropomyosin's periodic binding sites make differential contributions to two components of actin binding, cooperativity and affinity, and can be classified as primary or secondary sites. We show through mutagenesis and analysis of recombinant striated muscle alpha-tropomyosins that primary actin binding sites have a destabilizing coiled-coil interface, typically alanine-rich, embedded within a non-interface recognition sequence. Introduction of an Ala cluster in place of the native, more stable interface in period 2 and/or period 3 sites (of seven) increased the affinity or cooperativity of actin binding, analysed by cosedimentation and differential scanning calorimetry. Replacement of period 3 with period 5 sequence, an unstable region of known importance for cooperative actin binding, increased the cooperativity of binding. Introduction of the fluorescent probe, pyrene, near the mutation sites in periods 2 and 3 reported local instability, stabilization by actin binding, and local unfolding before or coincident with dissociation from actin (measured using light scattering), and chain dissociation (analyzed using circular dichroism). This, and previous work, suggests that regions of tropomyosin involved in binding actin have non-interface residues specific for interaction with actin and an unstable interface that is locally stabilized upon binding. The destabilized interface allows residues on the coiled-coil surface to obtain an optimal conformation for interaction with actin by increasing the number of local substates that the side chains can sample. We suggest that local disorder is a property typical of coiled coil binding sites and proteins that have multiple binding partners, of which tropomyosin is one type.

  8. Sequence analysis of serum albumins reveals the molecular evolution of ligand recognition properties.

    PubMed

    Fanali, Gabriella; Ascenzi, Paolo; Bernardi, Giorgio; Fasano, Mauro

    2012-01-01

    Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up by three repeated domains (about 600 residues), as extensively characterized for human SA. On the contrary, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar of higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.

  9. TFBSshape: a motif database for DNA shape features of transcription factor binding sites.

    PubMed

    Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W; Gordân, Raluca; Rohs, Remo

    2014-01-01

    Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein-DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone.

  10. TFBSshape: a motif database for DNA shape features of transcription factor binding sites

    PubMed Central

    Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W.; Gordân, Raluca; Rohs, Remo

    2014-01-01

    Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone. PMID:24214955

  11. Binding site size limit of the 2:1 pyrrole-imidazole polyamide-DNA motif.

    PubMed Central

    Kelly, J J; Baird, E E; Dervan, P B

    1996-01-01

    Polyamides containing N-methylimidazole (Im) and N-methylpyrrole (Py) amino acids can be combined in antiparallel side-by-side dimeric complexes for sequence-specific recognition in the minor groove of DNA. Six polyamides containing three to eight rings bind DNA sites 5-10 bp in length, respectively. Quantitative DNase I footprint titration experiments demonstrate that affinity maximizes and is similar at ring sizes of five, six, and seven. Sequence specificity decreases as the length of the polyamides increases beyond five rings. These results provide useful guidelines for the design of new polyamides that bind longer DNA sites with enhanced affinity and specificity. PMID:8692930

  12. Informative priors based on transcription factor structural class improve de novo motif discovery.

    PubMed

    Narlikar, Leelavati; Gordân, Raluca; Ohler, Uwe; Hartemink, Alexander J

    2006-07-15

    An important problem in molecular biology is to identify the locations at which a transcription factor (TF) binds to DNA, given a set of DNA sequences believed to be bound by that TF. In previous work, we showed that information in the DNA sequence of a binding site is sufficient to predict the structural class of the TF that binds it. In particular, this suggests that we can predict which locations in any DNA sequence are more likely to be bound by certain classes of TFs than others. Here, we argue that traditional methods for de novo motif finding can be significantly improved by adopting an informative prior probability that a TF binding site occurs at each sequence location. To demonstrate the utility of such an approach, we present priority, a powerful new de novo motif finding algorithm. Using data from TRANSFAC, we train three classifiers to recognize binding sites of basic leucine zipper, forkhead, and basic helix loop helix TFs. These classifiers are used to equip priority with three class-specific priors, in addition to a default prior to handle TFs of other classes. We apply priority and a number of popular motif finding programs to sets of yeast intergenic regions that are reported by ChIP-chip to be bound by particular TFs. priority identifies motifs the other methods fail to identify, and correctly predicts the structural class of the TF recognizing the identified binding sites. Supplementary material and code can be found at http://www.cs.duke.edu/~amink/.
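
    The central idea of this abstract, replacing a uniform assumption about where a site occurs with an informative prior over sequence positions, can be sketched in a few lines of Python. This is only a single Bayes-rule step under a fixed motif model, not the priority algorithm itself; the function name, the uniform background of 0.25, and the toy motif and priors are assumptions.

```python
def position_posterior(seq, motif_probs, prior, background=0.25):
    """Posterior probability that the single site starts at each position.

    posterior(start) is proportional to prior(start) times the likelihood
    ratio of the window under the motif model versus the background.
    """
    width = len(motif_probs)
    n_starts = len(seq) - width + 1
    if len(prior) != n_starts:
        raise ValueError("prior must have one entry per candidate start")
    weights = []
    for start in range(n_starts):
        ratio = 1.0
        for offset in range(width):
            ratio *= motif_probs[offset][seq[start + offset]] / background
        weights.append(prior[start] * ratio)
    total = sum(weights)
    return [w / total for w in weights]


def toy_column(consensus):
    """Hypothetical motif column: 0.85 for the consensus base, 0.05 otherwise."""
    return {b: (0.85 if b == consensus else 0.05) for b in "ACGT"}


toy_motif = [toy_column(b) for b in "TGAC"]
toy_seq = "AATGACAA"
flat_prior = [0.2] * 5
skewed_prior = [0.96, 0.01, 0.01, 0.01, 0.01]  # hypothetical class-specific prior
flat_post = position_posterior(toy_seq, toy_motif, flat_prior)
skewed_post = position_posterior(toy_seq, toy_motif, skewed_prior)
```

    With a flat prior the posterior depends on the sequence alone; a skewed prior shifts probability toward the positions it favors, and that shift is how a class-specific prior can steer motif search.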

  13. CsrA Participates in a PNPase Autoregulatory Mechanism by Selectively Repressing Translation of pnp Transcripts That Have Been Previously Processed by RNase III and PNPase

    PubMed Central

    Park, Hongmarn; Yakhnin, Helen; Connolly, Michael; Romeo, Tony

    2015-01-01

    ABSTRACT Csr is a conserved global regulatory system that represses or activates gene expression posttranscriptionally. CsrA of Escherichia coli is a homodimeric RNA binding protein that regulates transcription elongation, translation initiation, and mRNA stability by binding to the 5′ untranslated leader or initial coding sequence of target transcripts. pnp mRNA, encoding the 3′ to 5′ exoribonuclease polynucleotide phosphorylase (PNPase), was previously identified as a CsrA target by transcriptome sequencing (RNA-seq). Previous studies also showed that RNase III and PNPase participate in a pnp autoregulatory mechanism in which RNase III cleavage of the untranslated leader, followed by PNPase degradation of the resulting 5′ fragment, leads to pnp repression by an undefined translational repression mechanism. Here we demonstrate that CsrA binds to two sites in pnp leader RNA but only after the transcript is fully processed by RNase III and PNPase. In the absence of processing, both of the binding sites are sequestered in an RNA secondary structure, which prevents CsrA binding. The CsrA dimer bridges the upstream high-affinity site to the downstream site that overlaps the pnp Shine-Dalgarno sequence such that bound CsrA causes strong repression of pnp translation. CsrA-mediated translational repression also leads to a small increase in the pnp mRNA decay rate. Although CsrA has been shown to regulate translation and mRNA stability of numerous genes in a variety of organisms, this is the first example in which prior mRNA processing is required for CsrA-mediated regulation. IMPORTANCE CsrA protein represses translation of numerous mRNA targets, typically by binding to multiple sites in the untranslated leader region preceding the coding sequence. We found that CsrA represses translation of pnp by binding to two sites in the pnp leader transcript but only after it is processed by RNase III and PNPase. Processing by these two ribonucleases alters the mRNA secondary structure such that it becomes accessible to the ribosome for translation as well as to CsrA. As one of the CsrA binding sites overlaps the pnp ribosome binding site, bound CsrA prevents ribosome binding. This is the first example in which regulation by CsrA requires prior mRNA processing and should link pnp expression to conditions affecting CsrA activity. PMID:26438818

  14. A Sequence in the loop domain of hepatitis C virus E2 protein identified in silico as crucial for the selective binding to human CD81

    PubMed Central

    Chang, Chun-Chun; Hsu, Hao-Jen; Yen, Jui-Hung; Lo, Shih-Yen

    2017-01-01

    Hepatitis C virus (HCV) is a species-specific pathogenic virus that infects only humans and chimpanzees. Previous studies have indicated that interactions between the HCV E2 protein and CD81 on host cells are required for HCV infection. To determine the crucial factors for species-specific interactions at the molecular level, this study employed in silico molecular docking involving molecular dynamic simulations of the binding of HCV E2 onto human and rat CD81s. In vitro experiments including surface plasmon resonance measurements and cellular binding assays were applied for simple validations of the in silico results. The in silico studies identified two binding regions on the HCV E2 loop domain, namely E2-site1 and E2-site2, as being crucial for the interactions with CD81s, with the E2-site2 as the determinant factor for human-specific binding. Free energy calculations indicated that the E2/CD81 binding process might follow a two-step model involving (i) the electrostatic interaction-driven initial binding of human-specific E2-site2, followed by (ii) changes in the E2 orientation to facilitate the hydrophobic and van der Waals interaction-driven binding of E2-site1. The sequence of the human-specific, stronger-binding E2-site2 could serve as a candidate template for the future development of HCV-inhibiting peptide drugs. PMID:28481946

  15. MicroRNAs form triplexes with double stranded DNA at sequence-specific binding sites; a eukaryotic mechanism via which microRNAs could directly alter gene expression

    DOE PAGES

    Paugh, Steven W.; Coss, David R.; Bao, Ju; ...

    2016-02-04

    MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA). Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence that microRNAs form triple-helical structures with duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3 fold, p < 2.2 × 10^−16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. As a result, this work has thus revealed a new mechanism by which microRNAs can interact with gene promoter regions to modify gene transcription.

  16. MicroRNAs form triplexes with double stranded DNA at sequence-specific binding sites; a eukaryotic mechanism via which microRNAs could directly alter gene expression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Paugh, Steven W.; Coss, David R.; Bao, Ju

    MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA). Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence that microRNAs form triple-helical structures with duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3 fold, p < 2.2 × 10^−16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. As a result, this work has thus revealed a new mechanism by which microRNAs can interact with gene promoter regions to modify gene transcription.

  17. Site-specific cleavage of the transactivation response site of human immunodeficiency virus RNA with a tat-based chemical nuclease.

    PubMed Central

    Jayasena, S D; Johnston, B H

    1992-01-01

    tat, an essential transactivator of gene transcription in the human immunodeficiency virus (HIV), is believed to activate viral gene expression by binding to the transactivation response (TAR) site located at the 5' end of all viral mRNAs. The TAR element forms a stem-loop structure containing a 3-nucleotide bulge that is the site for tat binding and is required for transactivation. Here we report the synthesis of a site-specific chemical ribonuclease based on the TAR binding domain of the HIV type 1 (HIV-1) tat. A peptide consisting of this 24-amino acid domain plus an additional C-terminal cysteine residue was chemically synthesized and covalently linked to 1,10-phenanthroline at the cysteine residue. The modified peptide binds to TAR sequences of both HIV-1 and HIV-2 and, in the presence of cupric ions and a reducing agent, cleaves these RNAs at specific sites. Cleavage sites on TAR sequences are consistent with peptide binding to the 3-nucleotide bulge, and the relative displacement of cleavage sites on the two strands suggests peptide binding to the major groove of the RNA. These results and existing evidence of the rapid cellular uptake of tat-derived peptides suggest that chemical nucleases based on tat may be useful for inactivating HIV mRNA in vivo. PMID:1565648

  18. Analysis of drug binding pockets and repurposing opportunities for twelve essential enzymes of ESKAPE pathogens

    PubMed Central

    Naz, Sadia; Ngo, Tony; Farooq, Umar

    2017-01-01

    Background: The rapid increase in antibiotic resistance by various bacterial pathogens underlies the significance of developing new therapies and exploring different drug targets. A fraction of bacterial pathogens abbreviated as ESKAPE by the European Center for Disease Prevention and Control have been considered a major threat due to the rise in nosocomial infections. Here, we compared putative drug binding pockets of twelve essential and mostly conserved metabolic enzymes in numerous bacterial pathogens including those of the ESKAPE group and Mycobacterium tuberculosis. The comparative analysis will provide guidelines for the likelihood of transferability of the inhibitors from one species to another. Methods: Nine bacterial species including six ESKAPE pathogens, Mycobacterium tuberculosis along with Mycobacterium smegmatis and Escherichia coli, two non-pathogenic bacteria, have been selected for drug binding pocket analysis of twelve essential enzymes. The amino acid sequences were obtained from UniProt, aligned using ICM v3.8-4a and matched against the Pocketome encyclopedia. We used known co-crystal structures of selected target enzyme orthologs to evaluate the location of their active sites and binding pockets and to calculate a matrix of pairwise sequence identities for each target enzyme across the different species. This was used to generate sequence maps. Results: High sequence identity of enzyme binding pockets, derived from experimentally determined co-crystallized structures, was observed among various species. Comparison at both full sequence level and for drug binding pockets of key metabolic enzymes showed that binding pockets are highly conserved (sequence similarity up to 100%) among various ESKAPE pathogens as well as Mycobacterium tuberculosis. Enzyme orthologs having conserved binding sites may have potential to interact with inhibitors in a similar way and might be helpful for design of a similar class of inhibitors for a particular species. The derived pocket alignments and distance-based maps provide guidelines for drug discovery and repurposing. In addition they also provide recommendations for the relevant model bacteria that may be used for initial drug testing. Discussion: Comparing ligand binding sites through sequence identity calculation could be an effective approach to identify conserved orthologs as drug binding pockets have shown higher level of conservation among various species. By using this approach we could avoid the problems associated with full sequence comparison. We identified essential metabolic enzymes among ESKAPE pathogens that share high sequence identity in their putative drug binding pockets (up to 100%), of which known inhibitors can potentially antagonize these identical pockets in the various species in a similar manner. PMID:28948099
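
    The matrix of pairwise sequence identities mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' ICM/Pocketome pipeline: the function names, the convention of skipping gapped columns, and the pocket residue strings are all assumptions made up for illustration.

```python
from __future__ import annotations
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns containing a gap ('-') in either sequence are ignored.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def identity_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Symmetric matrix of pairwise percent identities, keyed by name pairs."""
    names = list(aligned)
    matrix = {(n, n): 100.0 for n in names}
    for p, q in combinations(names, 2):
        value = percent_identity(aligned[p], aligned[q])
        matrix[(p, q)] = matrix[(q, p)] = value
    return matrix


# Hypothetical aligned pocket residues, invented for illustration only.
pockets = {
    "species_A": "GSHKDE",
    "species_B": "GSHKDE",
    "species_C": "GTHRDE",
}
matrix = identity_matrix(pockets)
```

    Restricting the input strings to pocket-lining residues rather than full sequences is what would distinguish a pocket-level comparison from a full-sequence one in this sketch.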

  19. Analysis of drug binding pockets and repurposing opportunities for twelve essential enzymes of ESKAPE pathogens.

    PubMed

    Naz, Sadia; Ngo, Tony; Farooq, Umar; Abagyan, Ruben

    2017-01-01

    The rapid increase in antibiotic resistance by various bacterial pathogens underlies the significance of developing new therapies and exploring different drug targets. A fraction of bacterial pathogens abbreviated as ESKAPE by the European Center for Disease Prevention and Control have been considered a major threat due to the rise in nosocomial infections. Here, we compared putative drug binding pockets of twelve essential and mostly conserved metabolic enzymes in numerous bacterial pathogens including those of the ESKAPE group and Mycobacterium tuberculosis. The comparative analysis will provide guidelines for the likelihood of transferability of the inhibitors from one species to another. Nine bacterial species including six ESKAPE pathogens, Mycobacterium tuberculosis along with Mycobacterium smegmatis and Escherichia coli, two non-pathogenic bacteria, have been selected for drug binding pocket analysis of twelve essential enzymes. The amino acid sequences were obtained from UniProt, aligned using ICM v3.8-4a and matched against the Pocketome encyclopedia. We used known co-crystal structures of selected target enzyme orthologs to evaluate the location of their active sites and binding pockets and to calculate a matrix of pairwise sequence identities for each target enzyme across the different species. This was used to generate sequence maps. High sequence identity of enzyme binding pockets, derived from experimentally determined co-crystallized structures, was observed among various species. Comparison at both full sequence level and for drug binding pockets of key metabolic enzymes showed that binding pockets are highly conserved (sequence similarity up to 100%) among various ESKAPE pathogens as well as Mycobacterium tuberculosis. Enzyme orthologs having conserved binding sites may have potential to interact with inhibitors in a similar way and might be helpful for design of a similar class of inhibitors for a particular species. The derived pocket alignments and distance-based maps provide guidelines for drug discovery and repurposing. In addition they also provide recommendations for the relevant model bacteria that may be used for initial drug testing. Comparing ligand binding sites through sequence identity calculation could be an effective approach to identify conserved orthologs as drug binding pockets have shown higher level of conservation among various species. By using this approach we could avoid the problems associated with full sequence comparison. We identified essential metabolic enzymes among ESKAPE pathogens that share high sequence identity in their putative drug binding pockets (up to 100%), of which known inhibitors can potentially antagonize these identical pockets in the various species in a similar manner.

  20. Crystallization and preliminary X-ray diffraction analysis of the Bacillus subtilis replication termination protein in complex with the 37-base-pair TerI-binding site

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vivian, J. P.; Porter, C.; Wilce, J. A.

    2006-11-01

    A preparation of replication terminator protein (RTP) of B. subtilis and a 37-base-pair TerI sequence (comprising two binding sites for RTP) has been purified and crystallized. The replication terminator protein (RTP) of Bacillus subtilis binds to specific DNA sequences that halt the progression of the replisome in a polar manner. These terminator complexes flank a defined region of the chromosome into which they allow replication forks to enter but not exit. Forcing the fusion of replication forks in a specific zone is thought to allow the coordination of post-replicative processes. The functional terminator complex comprises two homodimers each of 29 kDa bound to overlapping binding sites. A preparation of RTP and a 37-base-pair TerI sequence (comprising two binding sites for RTP) has been purified and crystallized. A data set to 3.9 Å resolution with 97.0% completeness and an R_sym of 12% was collected from a single flash-cooled crystal using synchrotron radiation. The diffraction data are consistent with space group P622, with unit-cell parameters a = b = 118.8, c = 142.6 Å.

  1. Detecting cooperative sequences in the binding of RNA Polymerase-II

    NASA Astrophysics Data System (ADS)

    Glass, Kimberly; Rozenberg, Julian; Girvan, Michelle; Losert, Wolfgang; Ott, Ed; Vinson, Charles

    2008-03-01

    Regulation of the expression level of genes is a key biological process controlled largely by the 1000 base pair (bp) sequence preceding each gene (the promoter region). Within that region transcription factor binding sites (TFBS), 5-10 bp long sequences, act individually or cooperate together in the recruitment of, and therefore subsequent gene transcription by, RNA Polymerase-II (RNAP). We have measured the binding of RNAP to promoters on a genome-wide basis using Chromatin Immunoprecipitation (ChIP-on-Chip) microarray assays. Using all 8-base pair long sequences as a test set, we have identified the DNA sequences that are enriched in promoters with high RNAP binding values. We are able to demonstrate that virtually all sequences enriched in such promoters contain a CpG dinucleotide, indicating that TFBS that contain the CpG dinucleotide are involved in RNAP binding to promoters. Further analysis shows that the presence of pairs of CpG containing sequences cooperate to enhance the binding of RNAP to the promoter.
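
    The idea of using all 8-base-pair sequences as a test set and checking which ones contain a CpG dinucleotide can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not specify the enrichment statistic, so the simple fraction comparison, the function names, and the toy promoter strings below are hypothetical, and reverse complements are not considered.

```python
from itertools import product


def all_kmers(k):
    """Enumerate every DNA sequence of length k (4**k strings)."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def contains_cpg(kmer):
    """True if the sequence contains a CpG dinucleotide (C followed by G)."""
    return "CG" in kmer


def fraction_with(kmer, promoters):
    """Fraction of promoter sequences that contain the k-mer at least once."""
    return sum(kmer in seq for seq in promoters) / len(promoters)


kmers = all_kmers(8)
cpg_kmers = [k for k in kmers if contains_cpg(k)]

# Toy promoter sets, invented for illustration; real promoters are ~1000 bp.
high_binding = ["TTACGCGTAA", "CTACGCGTAG"]
low_binding = ["TTATATATAA", "CTACGCGTAG"]

candidate = "TACGCGTA"
high_fraction = fraction_with(candidate, high_binding)
low_fraction = fraction_with(candidate, low_binding)
```

    A k-mer whose fraction is clearly higher among high-binding promoters than among the rest would be a candidate enriched sequence; a real analysis would need a proper significance test on genome-wide data.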

  2. A rapid, generally applicable method to engineer zinc fingers illustrated by targeting the HIV-1 promoter.

    PubMed

    Isalan, M; Klug, A; Choo, Y

    2001-07-01

    DNA-binding domains with predetermined sequence specificity are engineered by selection of zinc finger modules using phage display, allowing the construction of customized transcription factors. Despite remarkable progress in this field, the available protein-engineering methods are deficient in many respects, thus hampering the applicability of the technique. Here we present a rapid and convenient method that can be used to design zinc finger proteins against a variety of DNA-binding sites. This is based on a pair of pre-made zinc finger phage-display libraries, which are used in parallel to select two DNA-binding domains each of which recognizes given 5 base pair sequences, and whose products are recombined to produce a single protein that recognizes a composite (9 base pair) site of predefined sequence. Engineering using this system can be completed in less than two weeks and yields proteins that bind sequence-specifically to DNA with Kd values in the nanomolar range. To illustrate the technique, we have selected seven different proteins to bind various regions of the human immunodeficiency virus 1 (HIV-1) promoter.

  3. SELMAP - SELEX affinity landscape MAPping of transcription factor binding sites using integrated microfluidics

    PubMed Central

    Chen, Dana; Orenstein, Yaron; Golodnitsky, Rada; Pellach, Michal; Avrahami, Dorit; Wachtel, Chaim; Ovadia-Shochat, Avital; Shir-Shapira, Hila; Kedmi, Adi; Juven-Gershon, Tamar; Shamir, Ron; Gerber, Doron

    2016-01-01

    Transcription factors (TFs) alter gene expression in response to changes in the environment through sequence-specific interactions with the DNA. These interactions are best portrayed as a landscape of TF binding affinities. Current methods to study sequence-specific binding preferences suffer from limited dynamic range, sequence bias, lack of specificity and limited throughput. We have developed a microfluidic-based device for SELEX Affinity Landscape MAPping (SELMAP) of TF binding, which allows high-throughput measurement of 16 proteins in parallel. We used it to measure the relative affinities of Pho4, AtERF2 and Btd full-length proteins to millions of different DNA binding sites, and detected both high and low-affinity interactions in equilibrium conditions, generating a comprehensive landscape of the relative TF affinities to all possible DNA 6-mers, and even DNA 10-mers with increased sequencing depth. Low quantities of both the TFs and DNA oligomers were sufficient for obtaining high-quality results, significantly reducing experimental costs. SELMAP allows in-depth screening of hundreds of TFs, and provides a means for better understanding of the regulatory processes that govern gene expression. PMID:27628341

  4. Impact of germline and somatic missense variations on drug binding sites.

    PubMed

    Yan, C; Pattabiraman, N; Goecks, J; Lam, P; Nayak, A; Pan, Y; Torcivia-Rodriguez, J; Voskanian, A; Wan, Q; Mazumder, R

    2017-03-01

    Advancements in next-generation sequencing (NGS) technologies are generating a vast amount of data. This exacerbates the current challenge of translating NGS data into actionable clinical interpretations. We have comprehensively combined germline and somatic nonsynonymous single-nucleotide variations (nsSNVs) that affect drug binding sites in order to investigate their prevalence. The integrated data thus generated in conjunction with exome or whole-genome sequencing can be used to identify patients who may not respond to a specific drug because of alterations in drug binding efficacy due to nsSNVs in the target protein's gene. To identify the nsSNVs that may affect drug binding, protein-drug complex structures were retrieved from Protein Data Bank (PDB) followed by identification of amino acids in the protein-drug binding sites using an occluded surface method. Then, the germline and somatic mutations were mapped to these amino acids to identify which of these alter protein-drug binding sites. Using this method we identified 12 993 amino acid-drug binding sites across 253 unique proteins bound to 235 unique drugs. The integration of amino acid-drug binding sites data with both germline and somatic nsSNVs data sets revealed 3133 nsSNVs affecting amino acid-drug binding sites. In addition, a comprehensive drug target discovery was conducted based on protein structure similarity and conservation of amino acid-drug binding sites. Using this method, 81 paralogs were identified that could serve as alternative drug targets. In addition, non-human mammalian proteins bound to drugs were used to identify 142 homologs in humans that can potentially bind to drugs. In the current protein-drug pairs that contain somatic mutations within their binding site, we identified 85 proteins with significant differential gene expression changes associated with specific cancer types. 
Information on protein-drug binding, predicted drug target proteins, and the prevalence of both somatic and germline nsSNVs that disrupt these binding sites can provide valuable knowledge for personalized medicine treatment. A web portal is available where nsSNVs from an individual patient can be checked by scanning against DrugVar to determine whether any of the SNVs affect the binding of any drug in the database.

  5. Dimeric PROP1 binding to diverse palindromic TAAT sequences promotes its transcriptional activity.

    PubMed

    Nakayama, Michie; Kato, Takako; Susa, Takao; Sano, Akiko; Kitahara, Kousuke; Kato, Yukio

    2009-08-13

    Mutations in the Prop1 gene are responsible for murine Ames dwarfism and human combined pituitary hormone deficiency with hypogonadism. Recently, we reported that PROP1 is a possible transcription factor for gonadotropin subunit genes through plural cis-acting sites composed of AT-rich sequences containing a TAAT motif which differs from its consensus binding sequence known as PRDQ9 (TAATTGAATTA). This study aimed to verify the binding specificity and sequence of PROP1 by applying the method of SELEX (Systematic Evolution of Ligands by EXponential enrichment), EMSA (electrophoretic mobility shift assay) and transient transfection assay. SELEX, after 5, 7 and 9 generations of selection using a random sequence library, showed that nucleotides containing one or two TAAT motifs were accumulated and accounted for 98.5% at the 9th generation. Aligned sequences and EMSA demonstrated that PROP1 binds preferentially to 11 nucleotides composed of an inverted TAAT motif separated by 3 nucleotides with variation in the half site of palindromic TAAT motifs and with preferential requirement of T at the nucleotide number 5 immediately 3' to a TAAT motif. Transient transfection assay demonstrated first that dimeric binding of PROP1 to an inverted TAAT motif and its cognates resulted in transcriptional activation, whereas monomeric binding of PROP1 to a single TAAT motif and an inverted ATTA motif did not mediate activation. Thus, this study demonstrated that dimeric binding of PROP1 is able to recognize diverse palindromic TAAT sequences separated by 3 nucleotides and to exhibit its transcriptional activity.
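
    The 11-nucleotide site described in this abstract, an inverted TAAT motif separated by 3 nucleotides, can be written as the pattern TAATNNNATTA. The following Python sketch scans a sequence for that pattern; the function name and the test sequence are assumptions for illustration, it searches one strand only, and it does not model the half-site variation that the abstract reports.

```python
import re

# TAAT, any 3 nucleotides, then ATTA (the inverted TAAT): 11 nucleotides total.
# A lookahead is used so that overlapping candidate sites are all reported.
INVERTED_TAAT = re.compile(r"(?=(TAAT[ACGT]{3}ATTA))")


def find_inverted_taat(seq):
    """Return (start, site, has_t5) for each TAATNNNATTA match in seq.

    has_t5 flags a T at nucleotide 5, immediately 3' to the first TAAT,
    which the abstract describes as preferred.
    """
    seq = seq.upper()
    hits = []
    for match in INVERTED_TAAT.finditer(seq):
        site = match.group(1)
        hits.append((match.start(), site, site[4] == "T"))
    return hits


# The PRDQ9 consensus from the abstract, padded with arbitrary flanking bases.
hits = find_inverted_taat("ggTAATTGAATTAcc")
```

    Here the PRDQ9 sequence TAATTGAATTA itself fits the pattern, with TGA as the 3-nucleotide spacer and a T at position 5.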

  6. Positive selection in octopus haemocyanin indicates functional links to temperature adaptation.

    PubMed

    Oellermann, Michael; Strugnell, Jan M; Lieb, Bernhard; Mark, Felix C

    2015-07-05

    Octopods have successfully colonised the world's oceans from the tropics to the poles. Yet, successful persistence in these habitats has required adaptations of their advanced physiological apparatus to compensate impaired oxygen supply. Their oxygen transporter haemocyanin plays a major role in cold tolerance and accordingly has undergone functional modifications to sustain oxygen release at sub-zero temperatures. However, it remains unknown how molecular properties evolved to explain the observed functional adaptations. We thus aimed to assess whether natural selection affected molecular and structural properties of haemocyanin that explains temperature adaptation in octopods. Analysis of 239 partial sequences of the haemocyanin functional units (FU) f and g of 28 octopod species of polar, temperate, subtropical and tropical origin revealed natural selection was acting primarily on charge properties of surface residues. Polar octopods contained haemocyanins with higher net surface charge due to decreased glutamic acid content and higher numbers of basic amino acids. Within the analysed partial sequences, positive selection was present at site 2545, positioned between the active copper binding centre and the FU g surface. At this site, methionine was the dominant amino acid in polar octopods and leucine was dominant in tropical octopods. Sites directly involved in oxygen binding or quaternary interactions were highly conserved within the analysed sequence. This study has provided the first insight into molecular and structural mechanisms that have enabled octopods to sustain oxygen supply from polar to tropical conditions. Our findings imply modulation of oxygen binding via charge-charge interaction at the protein surface, which stabilize quaternary interactions among functional units to reduce detrimental effects of high pH on venous oxygen release. 
Of the observed partial haemocyanin sequence, residue 2545 formed a close link between the FU g surface and the active centre, suggesting a role as allosteric binding site. The prevalence of methionine at this site in polar octopods, implies regulation of oxygen affinity via increased sensitivity to allosteric metal binding. High sequence conservation of sites directly involved in oxygen binding indicates that functional modifications of octopod haemocyanin rather occur via more subtle mechanisms, as observed in this study.

  7. De-novo discovery of differentially abundant transcription factor binding sites including their positional preference.

    PubMed

    Keilwagen, Jens; Grau, Jan; Paponov, Ivan A; Posch, Stefan; Strickert, Marc; Grosse, Ivo

    2011-02-10

    Transcription factors are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in promoters. The de-novo discovery of transcription factor binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not been fully solved yet. Here, we present a de-novo motif discovery tool called Dispom for finding differentially abundant transcription factor binding sites that models existing positional preferences of binding sites and adjusts the length of the motif in the learning process. Evaluating Dispom, we find that its prediction performance is superior to existing tools for de-novo motif discovery for 18 benchmark data sets with planted binding sites, and for a metazoan compendium based on experimental data from micro-array, ChIP-chip, ChIP-DSL, and DamID as well as Gene Ontology data. Finally, we apply Dispom to find binding sites differentially abundant in promoters of auxin-responsive genes extracted from Arabidopsis thaliana microarray data, and we find a motif that can be interpreted as a refined auxin responsive element predominately positioned in the 250-bp region upstream of the transcription start site. Using an independent data set of auxin-responsive genes, we find in genome-wide predictions that the refined motif is more specific for auxin-responsive genes than the canonical auxin-responsive element. In general, Dispom can be used to find differentially abundant motifs in sequences of any origin. However, the positional distribution learned by Dispom is especially beneficial if all sequences are aligned to some anchor point like the transcription start site in case of promoter sequences. We demonstrate that the combination of searching for differentially abundant motifs and inferring a position distribution from the data is beneficial for de-novo motif discovery. 
Hence, we make the tool freely available as a component of the open-source Java framework Jstacs and as a stand-alone application at http://www.jstacs.de/index.php/Dispom.

  8. Transposable Elements and DNA Methylation Create in Embryonic Stem Cells Human-Specific Regulatory Sequences Associated with Distal Enhancers and Noncoding RNAs

    PubMed Central

    Glinsky, Gennadi V.

    2015-01-01

    Despite significant progress in the structural and functional characterization of the human genome, understanding of the mechanisms underlying the genetic basis of human phenotypic uniqueness remains limited. Here, I report that transposable element-derived sequences, most notably LTR7/HERV-H, LTR5_Hs, and L1HS, harbor 99.8% of the candidate human-specific regulatory loci (HSRL) with putative transcription factor-binding sites in the genome of human embryonic stem cells (hESC). A total of 4,094 candidate HSRL display selective and site-specific binding of critical regulators (NANOG [Nanog homeobox], POU5F1 [POU class 5 homeobox 1], CCCTC-binding factor [CTCF], Lamin B1), and are preferentially located within the matrix of transcriptionally active DNA segments that are hypermethylated in hESC. hESC-specific NANOG-binding sites are enriched near the protein-coding genes regulating brain size, pluripotency long noncoding RNAs, hESC enhancers, and 5-hydroxymethylcytosine-harboring regions immediately adjacent to binding sites. Sequences of only 4.3% of hESC-specific NANOG-binding sites are present in Neanderthals’ genome, suggesting that a majority of these regulatory elements emerged in Modern Humans. Comparisons of estimated creation rates of novel TF-binding sites revealed that there was 49.7-fold acceleration of creation rates of NANOG-binding sites in genomes of Chimpanzees compared with the mouse genomes and further 5.7-fold acceleration in genomes of Modern Humans compared with the Chimpanzees genomes. Preliminary estimates suggest that emergence of one novel NANOG-binding site detectable in hESC required 466 years of evolution. 
Pathway analysis of coding genes that have hESC-specific NANOG-binding sites within gene bodies or near gene boundaries revealed their association with physiological development and functions of nervous and cardiovascular systems, embryonic development, behavior, as well as development of a diverse spectrum of pathological conditions such as cancer, diseases of cardiovascular and reproductive systems, metabolic diseases, multiple neurological and psychological disorders. A proximity placement model is proposed explaining how a 33–47% excess of NANOG, CTCF, and POU5F1 proteins immobilized on a DNA scaffold may play a functional role at distal regulatory elements. PMID:25956794

  9. On the connection between inherent DNA flexure and preferred binding of hydroxymethyluracil-containing DNA by the type II DNA-binding protein TF1.

    PubMed

    Grove, A; Galeone, A; Mayol, L; Geiduschek, E P

    1996-07-12

    TF1 is a member of the family of type II DNA-binding proteins, which also includes the bacterial HU proteins and the Escherichia coli integration host factor (IHF). Distinctive to TF1, which is encoded by the Bacillus subtilis bacteriophage SPO1, is its preferential binding to DNA in which thymine is replaced by 5-hydroxymethyluracil (hmU), as it is in the phage genome. TF1 binds to preferred sites within the phage genome and generates pronounced DNA bending. The extent to which DNA flexibility contributes to the sequence-specific binding of TF1, and the connection between hmU preference and DNA flexibility have been examined. Model flexible sites, consisting of consecutive mismatches, increase the affinity of thymine-containing DNA for TF1. In particular, tandem mismatches separated by nine base-pairs generate an increase, by orders of magnitude, in the affinity of TF1 for T-containing DNA with the sequence of a preferred TF1 binding site, and fully match the affinity of TF1 for this cognate site in hmU-containing DNA (Kd approximately 3 nM). Other placements of loops generate suboptimal binding. This is consistent with a significant contribution of site-specific DNA flexibility to complex formation. Analysis of complexes with hmU-DNA of decreasing length shows that a major part of the binding affinity is generated within a central 19 bp segment (delta G0 = 41.7 kJ mol-1) with more-distal DNA contributing modestly to the affinity (delta delta G = -0.42 kJ mol-1 bp-1 on increasing duplex length to 37 bp). However, a previously characterised thermostable and more tightly binding mutant TF1, TF1(E15G/T32I), derives most of its extra affinity from interaction with flanking DNA. We propose that inherent but sequence-dependent deformability of hmU-containing DNA underlies the preferential binding of TF1 and that TF1-induced DNA bending is a result of distortions at two distinct sites separated by 9 bp of duplex DNA.

  10. Characterisation of a DNA sequence element that directs Dictyostelium stalk cell-specific gene expression.

    PubMed

    Ceccarelli, A; Zhukovskaya, N; Kawata, T; Bozzaro, S; Williams, J

    2000-12-01

    The ecmB gene of Dictyostelium is expressed at culmination both in the prestalk cells that enter the stalk tube and in ancillary stalk cell structures such as the basal disc. Stalk tube-specific expression is regulated by sequence elements within the cap-site proximal part of the promoter, the stalk tube (ST) promoter region. Dd-STATa, a member of the STAT transcription factor family, binds to elements present in the ST promoter region and represses transcription prior to entry into the stalk tube. We have characterised an activatory DNA sequence element that lies distal to the repressor elements and that is both necessary and sufficient for expression within the stalk tube. We have mapped this activator to a 28 nucleotide region (the 28-mer) within which we have identified a GA-containing sequence element that is required for efficient gene transcription. The Dd-STATa protein binds to the 28-mer in an in vitro binding assay, and binding is dependent upon the GA-containing sequence. However, the ecmB gene is expressed in a Dd-STATa null mutant; therefore Dd-STATa cannot be responsible for activating the 28-mer in vivo. Instead, we identified a distinct 28-mer binding activity in nuclear extracts from the Dd-STATa null mutant, this GA binding activity being largely masked in wild type extracts by the high affinity binding of the Dd-STATa protein. We suggest that, in addition to the long range repression exerted by binding to the two known repressor sites, Dd-STATa inhibits transcription by direct competition with this putative activator for binding to the GA sequence.

  11. aPPRove: An HMM-Based Method for Accurate Prediction of RNA-Pentatricopeptide Repeat Protein Binding Events

    PubMed Central

    Harrison, Thomas; Ruiz, Jaime; Sloan, Daniel B.; Ben-Hur, Asa; Boucher, Christina

    2016-01-01

    Pentatricopeptide repeat containing proteins (PPRs) bind to RNA transcripts originating from mitochondria and plastids. There are two classes of PPR proteins. The P class contains tandem P-type motif sequences, and the PLS class contains alternating P, L and S type sequences. In this paper, we describe a novel tool that predicts PPR-RNA interaction; specifically, our method, which we call aPPRove, determines where and how a PLS-class PPR protein will bind to RNA when given a PPR and one or more RNA transcripts by using a combinatorial binding code for site specificity proposed by Barkan et al. Our results demonstrate that aPPRove successfully locates how and where a PPR protein belonging to the PLS class can bind to RNA. For each binding event it outputs the binding site, the amino-acid-nucleotide interaction, and its statistical significance. Furthermore, we show that our method can be used to predict binding events for PLS-class proteins using a known edit site and the statistical significance of aligning the PPR protein to that site. In particular, we use our method to make a conjecture regarding an interaction between CLB19 and the second intronic region of ycf3. The aPPRove web server can be found at www.cs.colostate.edu/~approve. PMID:27560805

  12. Electrostatically Biased Binding of Kinesin to Microtubules

    PubMed Central

    Zheng, Wenjun; Alonso, Maria; Huber, Gary; Dlugosz, Maciej; McCammon, J. Andrew; Cross, Robert A.

    2011-01-01

    The minimum motor domain of kinesin-1 is a single head. Recent evidence suggests that such minimal motor domains generate force by a biased binding mechanism, in which they preferentially select binding sites on the microtubule that lie ahead in the progress direction of the motor. A specific molecular mechanism for biased binding has, however, so far been lacking. Here we use atomistic Brownian dynamics simulations combined with experimental mutagenesis to show that incoming kinesin heads undergo electrostatically guided diffusion-to-capture by microtubules, and that this produces directionally biased binding. Kinesin-1 heads are initially rotated by the electrostatic field so that their tubulin-binding sites face inwards, and then steered towards a plus-endwards binding site. In tethered kinesin dimers, this bias is amplified. A 3-residue sequence (RAK) in kinesin helix alpha-6 is predicted to be important for electrostatic guidance. Real-world mutagenesis of this sequence powerfully influences kinesin-driven microtubule sliding, with one mutant producing a 5-fold acceleration over wild type. We conclude that electrostatic interactions play an important role in the kinesin stepping mechanism, by biasing the diffusional association of kinesin with microtubules. PMID:22140358

  13. A 5′ Splice Site-Proximal Enhancer Binds SF1 and Activates Exon Bridging of a Microexon

    PubMed Central

    Carlo, Troy; Sierra, Rebecca; Berget, Susan M.

    2000-01-01

    Internal exon size in vertebrates occurs over a narrow size range. Experimentally, exons shorter than 50 nucleotides are poorly included in mRNA unless accompanied by strengthened splice sites or accessory sequences that act as splicing enhancers, suggesting steric interference between snRNPs and other splicing factors binding simultaneously to the 3′ and 5′ splice sites of microexons. Despite these problems, very small naturally occurring exons exist. Here we studied the factors and mechanism involved in recognizing a constitutively included six-nucleotide exon from the cardiac troponin T gene. Inclusion of this exon is dependent on an enhancer located downstream of the 5′ splice site. This enhancer contains six copies of the simple sequence GGGGCUG. The enhancer activates heterologous microexons and will work when located either upstream or downstream of the target exon, suggesting an ability to bind factors that bridge splicing units. A single copy of this sequence is sufficient for in vivo exon inclusion and is the binding site for the known bridging mammalian splicing factor 1 (SF1). The enhancer and its bound SF1 act to increase recognition of the upstream exon during exon definition, such that competition of in vitro reactions with RNAs containing the GGGGCUG repeated sequence depresses splicing of the upstream intron, assembly of the spliceosome on the 3′ splice site of the exon, and cross-linking of SF1. These results suggest a model in which SF1 bridges the small exon during initial assembly, thereby effectively extending the domain of the exon. PMID:10805741

  14. Digital Biological Converter

    DTIC Science & Technology

    2013-06-28

    of cuts that each fragment should be cut into so the fragments are no greater than a specific length threshold. Additionally, vector sequences and...restriction sites are attached to each fragment while ensuring the restriction sites are unique to each sequence. The vector sequences serve as hooks...for assembly into vector for cloning purposes, and also as primer binding domains for PCR amplification. The restriction sites are added to

  15. Flexible DNA binding of the BTB/POZ-domain protein FBI-1.

    PubMed

    Pessler, Frank; Hernandez, Nouria

    2003-08-01

    POZ-domain transcription factors are characterized by the presence of a protein-protein interaction domain called the POZ or BTB domain at their N terminus and zinc fingers at their C terminus. Despite the large number of POZ-domain transcription factors that have been identified to date and the significant insights that have been gained into their cellular functions, relatively little is known about their DNA binding properties. FBI-1 is a BTB/POZ-domain protein that has been shown to modulate HIV-1 Tat trans-activation and to repress transcription of some cellular genes. We have used various viral and cellular FBI-1 binding sites to characterize the interaction of a POZ-domain protein with DNA in detail. We find that FBI-1 binds to inverted sequence repeats downstream of the HIV-1 transcription start site. Remarkably, it binds efficiently to probes carrying these repeats in various orientations and spacings with no particular rotational alignment, indicating that its interaction with DNA is highly flexible. Indeed, FBI-1 binding sites in the adenovirus 2 major late promoter, the c-fos gene, and the c-myc P1 and P2 promoters reveal variously spaced direct, inverted, and everted sequence repeats with the consensus sequence G(A/G)GGG(T/C)(C/T)(T/C)(C/T) for each repeat.
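
    The repeat consensus quoted above, G(A/G)GGG(T/C)(C/T)(T/C)(C/T), maps directly onto a regular expression. The following is a minimal sketch, assuming a plain-string DNA input and a scan of one strand only; the function name and the demo sequence are hypothetical and are not taken from the paper.

```python
import re

# Direct translation of the consensus G(A/G)GGG(T/C)(C/T)(T/C)(C/T)
# into a character-class pattern (nine positions).
FBI1_REPEAT = re.compile(r"G[AG]GGG[TC][CT][TC][CT]")


def find_repeats(seq):
    """Return (start, matched_text) for each non-overlapping match on the given strand."""
    return [(m.start(), m.group()) for m in FBI1_REPEAT.finditer(seq.upper())]


# Hypothetical demo sequence, made up for illustration only.
demo = "ttGAGGGTCTCaaaGGGGGCTCTtt"
print(find_repeats(demo))
```

    Because the abstract reports direct, inverted, and everted repeats, a fuller scan would presumably also search the reverse complement; that step is left out here to keep the sketch short.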

  16. Identification of the HrpS binding site in the hrpL promoter and effect of the RpoN binding site of HrpS on the regulation of the type III secretion system in Erwinia amylovora.

    PubMed

    Lee, Jae Hoon; Sundin, George W; Zhao, Youfu

    2016-06-01

    The type III secretion system (T3SS) is a key pathogenicity factor in Erwinia amylovora. Previous studies have demonstrated that the T3SS in E. amylovora is transcriptionally regulated by an RpoN-HrpL sigma factor cascade, which is activated by the bacterial alarmone (p)ppGpp. In this study, the binding site of HrpS, an enhancer binding protein, was identified for the first time in plant-pathogenic bacteria. Complementation of the hrpL mutant with promoter deletion constructs of the hrpL gene and promoter activity analyses using various lengths of the hrpL promoter fused to a promoter-less green fluorescent protein (gfp) reporter gene delineated the upstream region for HrpS binding. Sequence analysis revealed a dyad symmetry sequence between -138 and -125 nucleotides (TGCAA-N4-TTGCA) as the potential HrpS binding site, which is conserved in the promoter of the hrpL gene among plant enterobacterial pathogens. Results of quantitative real-time reverse transcription-polymerase chain reaction (qRT-PCR) and electrophoresis mobility shift assay coupled with site-directed mutagenesis (SDM) analysis showed that the intact dyad symmetry sequence was essential for HrpS binding, full activation of T3SS gene expression and virulence. In addition, the role of the GAYTGA motif (RpoN binding site) of HrpS in the regulation of T3SS gene expression in E. amylovora was characterized by complementation of the hrpS mutant using mutant variants generated by SDM. Results showed that a Y100F substitution of HrpS complemented the hrpS mutant, whereas Y100A and Y101A substitutions did not. These results suggest that tyrosine (Y) and phenylalanine (F) function interchangeably in the conserved GAYTGA motif of HrpS in E. amylovora. © 2015 BSPP AND JOHN WILEY & SONS LTD.
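
    The dyad symmetry element described above (TGCAA-N4-TTGCA) can be checked mechanically: the right half-site is the reverse complement of the left half-site, and the full element is 14 nucleotides long, matching the span from -138 to -125. A minimal sketch, assuming plain-string DNA input; the helper names and the demo sequence are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# TGCAA, any four bases, then TTGCA.
DYAD = re.compile(r"TGCAA[ACGT]{4}TTGCA")


def find_dyads(seq):
    """Start positions (0-based) of every non-overlapping TGCAA-N4-TTGCA match."""
    return [m.start() for m in DYAD.finditer(seq.upper())]


# The two half-sites are reverse complements of each other.
print(reverse_complement("TGCAA"))

# Hypothetical demo sequence, not the hrpL promoter.
print(find_dyads("GGTGCAAACGTTTGCACC"))
```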

  17. Genome-wide identification and characterization of Notch transcription complex-binding sequence-paired sites in leukemia cells.

    PubMed

    Severson, Eric; Arnett, Kelly L; Wang, Hongfang; Zang, Chongzhi; Taing, Len; Liu, Hudan; Pear, Warren S; Shirley Liu, X; Blacklow, Stephen C; Aster, Jon C

    2017-05-02

    Notch transcription complexes (NTCs) drive target gene expression by binding to two distinct types of genomic response elements, NTC monomer-binding sites and sequence-paired sites (SPSs) that bind NTC dimers. SPSs are conserved and have been linked to the Notch responsiveness of a few genes. To assess the overall contribution of SPSs to Notch-dependent gene regulation, we determined the DNA sequence requirements for NTC dimerization using a fluorescence resonance energy transfer (FRET) assay and applied insights from these in vitro studies to Notch-"addicted" T cell acute lymphoblastic leukemia (T-ALL) cells. We found that SPSs contributed to the regulation of about a third of direct Notch target genes. Although originally described in promoters, SPSs are present mainly in long-range enhancers, including an enhancer containing a newly described SPS that regulates HES5 expression. Our work provides a general method for identifying SPSs in genome-wide data sets and highlights the widespread role of NTC dimerization in Notch-transformed leukemia cells. Copyright © 2017, American Association for the Advancement of Science.

  18. footprintDB: a database of transcription factors with annotated cis elements and binding interfaces.

    PubMed

    Sebastian, Alvaro; Contreras-Moreira, Bruno

    2014-01-15

    Traditional and high-throughput techniques for determining transcription factor (TF) binding specificities are generating large volumes of data of uneven quality, which are scattered across individual databases. FootprintDB integrates some of the most comprehensive freely available libraries of curated DNA binding sites and systematically annotates the binding interfaces of the corresponding TFs. The first release contains 2422 unique TF sequences, 10 112 DNA binding sites and 3662 DNA motifs. A survey of the included data sources, organisms and TF families was performed together with proprietary database TRANSFAC, finding that footprintDB has a similar coverage of multicellular organisms, while also containing bacterial regulatory data. A search engine has been designed that drives the prediction of DNA motifs for input TFs, or conversely of TF sequences that might recognize input regulatory sequences, by comparison with database entries. Such predictions can also be extended to a single proteome chosen by the user, and results are ranked in terms of interface similarity. Benchmark experiments with bacterial, plant and human data were performed to measure the predictive power of footprintDB searches, which were able to correctly recover 10, 55 and 90% of the tested sequences, respectively. Correctly predicted TFs had a higher interface similarity than the average, confirming its diagnostic value. Web site implemented in PHP, Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.

  19. Interaction of Zn(II)bleomycin-A2 and Zn(II)peplomycin with a DNA hairpin containing the 5'-GT-3' binding site in comparison with the 5'-GC-3' binding site studied by NMR spectroscopy.

    PubMed

    Follett, Shelby E; Ingersoll, Azure D; Murray, Sally A; Reilly, Teresa M; Lehmann, Teresa E

    2017-10-01

    Bleomycins are a group of glycopeptide antibiotics synthesized by Streptomyces verticillus that are widely used for the treatment of various neoplastic diseases. These antibiotics have the ability to chelate a metal center, mainly Fe(II), and cause site-specific DNA cleavage. Bleomycins are differentiated by their C-terminal regions. Although this antibiotic family is a successful course of treatment for some types of cancers, it is known to cause pulmonary fibrosis. Previous studies have identified that bleomycin-related pulmonary toxicity is linked to the C-terminal region of these drugs. This region has been shown to closely interact with DNA. We examined the binding of Zn(II)peplomycin and Zn(II)bleomycin-A2 to a DNA hairpin of sequence 5'-CCAGTATTTTTACTGG-3', containing the binding site 5'-GT-3', and compared the results with those obtained from our studies of the same MBLMs bound to a DNA hairpin containing the binding site 5'-GC-3'. We provide evidence that the DNA base sequence has a strong impact on the final structure of the drug-target complex.
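
    The hairpin sequence quoted above can be split into stem and loop by pairing bases from the two ends inward. The sketch below is an illustration based only on Watson-Crick complementarity; the abstract does not state the stem and loop boundaries, so the split it prints is an inference, and the function name is hypothetical.

```python
# Watson-Crick partners for DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def split_hairpin(seq):
    """Split seq into (5' stem arm, loop, 3' stem arm) by pairing the ends inward."""
    seq = seq.upper()
    k = 0
    # Extend the stem while base k (from the 5' end) pairs with base k from the 3' end.
    while 2 * (k + 1) <= len(seq) and PAIR.get(seq[-1 - k]) == seq[k]:
        k += 1
    return seq[:k], seq[k:len(seq) - k], seq[len(seq) - k:]


left, loop, right = split_hairpin("CCAGTATTTTTACTGG")
print(left, loop, right)
# The 5'-GT-3' step named in the abstract falls inside the 5' arm of the stem.
print(left.find("GT"))
```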

  20. Characterization of the ligand-binding site of the transferrin receptor in Trypanosoma brucei demonstrates a structural relationship with the N-terminal domain of the variant surface glycoprotein.

    PubMed

    Salmon, D; Hanocq-Quertier, J; Paturiaux-Hanocq, F; Pays, A; Tebabi, P; Nolan, D P; Michel, A; Pays, E

    1997-12-15

    The Trypanosoma brucei transferrin (Tf) receptor is a heterodimer encoded by ESAG7 and ESAG6, two genes contained in the different polycistronic transcription units of the variant surface glycoprotein (VSG) gene. The sequence of ESAG7/6 differs slightly between different units, so that receptors with different affinities for Tf are expressed alternatively following transcriptional switching of VSG expression sites during antigenic variation of the parasite. Based on the sequence homology between pESAG7/6 and the N-terminal domain of VSGs, it can be predicted that the four blocks containing the major sequence differences between pESAG7 and pESAG6 form surface-exposed loops and generate the ligand-binding site. The exchange of a few amino acids in this region between pESAG6s encoded by different VSG units greatly increased the affinity for bovine Tf. Similar changes in other regions were ineffective, while mutations predicted to alter the VSG-like structure abolished the binding. Chimeric proteins containing the N-terminal dimerization domain of VSG and the C-terminal half of either pESAG7 or pESAG6, which contains the ligand-binding domain, can form heterodimers that bind Tf. Taken together, these data provided evidence that the T. brucei Tf receptor is structurally related to the N-terminal domain of the VSG and that the ligand-binding site corresponds to the exposed surface loops of the protein.

  1. Expression of simian virus 40 T antigen in Escherichia coli: localization of T-antigen origin DNA-binding domain to within 129 amino acids.

    PubMed Central

    Arthur, A K; Höss, A; Fanning, E

    1988-01-01

    The genomic coding sequence of the large T antigen of simian virus 40 (SV40) was cloned into an Escherichia coli expression vector by joining new restriction sites, BglII and BamHI, introduced at the intron boundaries of the gene. Full-length large T antigen, as well as deletion and amino acid substitution mutants, were inducibly expressed from the lac promoter of pUC9, albeit with different efficiencies and protein stabilities. Specific interaction with SV40 origin DNA was detected for full-length T antigen and certain mutants. Deletion mutants lacking T-antigen residues 1 to 130 and 260 to 708 retained specific origin-binding activity, demonstrating that the region between residues 131 and 259 must carry the essential binding domain for DNA-binding sites I and II. A sequence between residues 302 and 320 homologous to a metal-binding "finger" motif is therefore not required for origin-specific binding. However, substitution of serine for either of two cysteine residues in this motif caused a dramatic decrease in origin DNA-binding activity. This region, as well as other regions of the full-length protein, may thus be involved in stabilizing the DNA-binding domain and altering its preference for binding to site I or site II DNA. PMID:2835505

  2. Exploiting three kinds of interface propensities to identify protein binding sites.

    PubMed

    Liu, Bin; Wang, Xiaolong; Lin, Lei; Dong, Qiwen; Wang, Xuan

    2009-08-01

    Predicting the binding sites between two interacting proteins provides important clues to the function of a protein. In this study, we present a building block of proteins called order profiles to use the evolutionary information of the protein sequence frequency profiles and apply this building block to produce a class of propensities called order profile interface propensities. For comparison, we revisit the usage of residue interface propensities and binary profile interface propensities for protein binding site prediction. Each kind of propensity, combined with sequence profiles and accessible surface areas, is used as input to an SVM. When tested on four types of complexes (hetero-permanent complexes, hetero-transient complexes, homo-permanent complexes and homo-transient complexes), experimental results show that the order profile interface propensities are better than residue interface propensities and binary profile interface propensities. Therefore, the order profile is a suitable profile-level building block of protein sequences and can be widely used in many tasks of computational biology, such as sequence alignment, the prediction of domain boundaries, the design of knowledge-based potentials and protein remote homology detection.

  3. Xenopus origin recognition complex (ORC) initiates DNA replication preferentially at sequences targeted by Schizosaccharomyces pombe ORC

    PubMed Central

    Kong, Daochun; Coleman, Thomas R.; DePamphilis, Melvin L.

    2003-01-01

    Budding yeast (Saccharomyces cerevisiae) origin recognition complex (ORC) requires ATP to bind specific DNA sequences, whereas fission yeast (Schizosaccharomyces pombe) ORC binds to specific, asymmetric A:T-rich sites within replication origins, independently of ATP, and frog (Xenopus laevis) ORC seems to bind DNA non-specifically. Here we show that despite these differences, ORCs are functionally conserved. Firstly, SpOrc1, SpOrc4 and SpOrc5, like those from other eukaryotes, bound ATP and exhibited ATPase activity, suggesting that ATP is required for pre-replication complex (pre-RC) assembly rather than origin specificity. Secondly, SpOrc4, which is solely responsible for binding SpORC to DNA, inhibited up to 70% of XlORC-dependent DNA replication in Xenopus egg extract by preventing XlORC from binding to chromatin and assembling pre-RCs. Chromatin-bound SpOrc4 was located at AT-rich sequences. XlORC in egg extract bound preferentially to asymmetric A:T-sequences in either bare DNA or in sperm chromatin, and it recruited XlCdc6 and XlMcm proteins to these sequences. These results reveal that XlORC initiates DNA replication preferentially at the same or similar sites to those targeted in S. pombe. PMID:12840006

  4. Regulated expression of a repressor protein: FadR activates iclR.

    PubMed Central

    Gui, L; Sunnarborg, A; LaPorte, D C

    1996-01-01

    The control of the glyoxylate bypass operon (aceBAK) of Escherichia coli is mediated by two regulatory proteins, IclR and FadR. IclR is a repressor protein which has previously been shown to bind to a site which overlaps the aceBAK promoter. FadR is a repressor/activator protein which participates in control of the genes of fatty acid metabolism. A sequence just upstream of the iclR promoter bears a striking resemblance to FadR binding sites found in the fatty acid metabolic genes. The in vitro binding specificity of FadR, determined by oligonucleotide selection, was in good agreement with the sequences of these sites. The ability of FadR to bind to the site associated with iclR was demonstrated by gel shift and DNase I footprint analyses. Disruption of FadR or inactivation of the FadR binding site of iclR decreased the expression of an iclR::lacZ operon fusion, indicating that FadR activates the expression of iclR. It has been reported that disruption of fadR increases the expression of aceBAK. We observed a similar increase when we inactivated the FadR binding site of an iclR+ allele. This result suggests that FadR regulates aceBAK indirectly by altering the expression of IclR. PMID:8755903

  5. Major versus minor groove DNA binding of a bisarginylporphyrin hybrid molecule: A molecular mechanics investigation

    NASA Astrophysics Data System (ADS)

    Gresh, Nohad; Perrée-Fauvet, Martine

    1999-03-01

    On the basis of theoretical computations, we have recently synthesised [Perrée-Fauvet, M. and Gresh, N., Tetrahedron Lett., 36 (1995) 4227] a bisarginyl conjugate of a tricationic porphyrin (BAP), designed to target, in the major groove of DNA, the d(GGC GCC)2 sequence which is part of the primary binding site of the HIV-1 retrovirus site [Wain-Hobson, S. et al., Cell, 40 (1985) 9]. In the theoretical model, the chromophore intercalates at the central d(CpG)2 step and each of the arginyl arms targets O6/N7 belonging to guanine bases flanking the intercalation site. Recent IR and UV-visible spectroscopic studies have confirmed the essential features of these theoretical predictions [Mohammadi, S. et al., Biochemistry, 37 (1998) 6165]. In the present study, we compare the energies of competing intercalation modes of BAP to several double-stranded oligonucleotides, according to whether one, two or three N-methylpyridinium rings project into the major groove. Correspondingly, three minor groove binding modes were considered, the arginyl arms now targeting N3, O2 sites belonging to the purine or pyrimidine bases flanking the intercalation site. This investigation has shown that: (i) in both the major and minor grooves, the best-bound complexes have the three N-methylpyridinium rings in the groove opposite to that of the phenyl group bearing the arginyl arms; (ii) major groove binding is preferred over minor groove binding by a significant energy (29 kcal/mol); and (iii) the best-bound sequence in the major groove is d(GGC GCC)2 with two successive guanines upstream from the intercalation. On the other hand, due to the flexibility of the arginyl arms, other GC-rich sequences have close binding energies, two of them being less stable than it by less than 8 kcal/mol. These results serve as the basis for the design of derivatives of BAP with enhanced sequence selectivities in the major groove.

  6. The 1.3 A resolution structure of the RNA tridecamer r(GCGUUUGAAACGC): metal ion binding correlates with base unstacking and groove contraction.

    PubMed

    Timsit, Youri; Bombard, Sophie

    2007-12-01

    Metal ions play a key role in RNA folding and activity. Elucidating the rules that govern the binding of metal ions is therefore an essential step for better understanding RNA functions. High-resolution data are a prerequisite for a detailed structural analysis of ion binding on RNA and, in particular, the observation of monovalent cations. Here, the high-resolution crystal structures of the tridecamer duplex r(GCGUUUGAAACGC) crystallized under different conditions provide new structural insights into ion binding on GAAA/UUU sequences that exhibit both unusual structural and functional properties in RNA. The present study extends the repertory of RNA ion binding sites in showing that the first two bases of UUU triplets constitute a specific site for sodium ions. A striking asymmetric pattern of metal ion binding in the two equivalent halves of the palindromic sequence demonstrates that sequence and its environment act together to bind metal ions. A highly ionophilic half that binds six metal ions allows, for the first time, the observation of a disodium cluster in RNA. The comparison of the equivalent halves of the duplex provides experimental evidence that ion binding correlates with structural alterations and groove contraction.

  7. Degenerate Pax2 and Senseless binding motifs improve detection of low-affinity sites required for enhancer specificity

    PubMed Central

    Zandvakili, Arya; Campbell, Ian; Weirauch, Matthew T.

    2018-01-01

    Cells use thousands of regulatory sequences to recruit transcription factors (TFs) and produce specific transcriptional outcomes. Since TFs bind degenerate DNA sequences, discriminating functional TF binding sites (TFBSs) from background sequences represents a significant challenge. Here, we show that a Drosophila regulatory element that activates Epidermal Growth Factor signaling requires overlapping, low-affinity TFBSs for competing TFs (Pax2 and Senseless) to ensure cell- and segment-specific activity. Testing available TF binding models for Pax2 and Senseless, however, revealed variable accuracy in predicting such low-affinity TFBSs. To better define parameters that increase accuracy, we developed a method that systematically selects subsets of TFBSs based on predicted affinity to generate hundreds of position-weight matrices (PWMs). Counterintuitively, we found that degenerate PWMs produced from datasets depleted of high-affinity sequences were more accurate in identifying both low- and high-affinity TFBSs for the Pax2 and Senseless TFs. Taken together, these findings reveal how TFBS arrangement can be constrained by competition rather than cooperativity and that degenerate models of TF binding preferences can improve identification of biologically relevant low affinity TFBSs. PMID:29617378
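
    The idea of building position-weight matrices from affinity-selected subsets of sites can be illustrated with a toy log-odds PWM. This is a minimal sketch, not the authors' method: the pseudocount, the uniform background, the helper names, and the example sites and affinity values are all assumptions made for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds PWM from equal-length aligned sites (one dict per position)."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [s[i] for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def score(pwm, site):
    """Sum of per-position log-odds for a candidate site."""
    return sum(col[b] for col, b in zip(pwm, site))


def pwm_from_weakest(sites, affinities, keep):
    """Build a PWM from the `keep` lowest-affinity sites (a 'depleted' training set)."""
    ranked = sorted(zip(affinities, sites))
    return build_pwm([s for _, s in ranked[:keep]])


# Hypothetical aligned sites and predicted affinities, for illustration only.
sites = ["ACGT", "ACGT", "ACGA", "TCGT"]
affinities = [0.9, 0.8, 0.3, 0.2]
full = build_pwm(sites)
degenerate = pwm_from_weakest(sites, affinities, keep=2)
print(round(score(full, "ACGT"), 2), round(score(degenerate, "ACGT"), 2))
```

    Sweeping `keep` (or an affinity threshold) over many values would yield a family of PWMs, which is roughly the kind of systematic comparison the abstract describes.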

  8. RNA from the 5' end of the R2 retrotransposon controls R2 protein binding to and cleavage of its DNA target site.

    PubMed

    Christensen, Shawn M; Ye, Junqiang; Eickbush, Thomas H

    2006-11-21

    Non-LTR retrotransposons insert into eukaryotic genomes by target-primed reverse transcription (TPRT), a process in which cleaved DNA targets are used to prime reverse transcription of the element's RNA transcript. Many of the steps in the integration pathway of these elements can be characterized in vitro for the R2 element because of the rigid sequence specificity of R2 for both its DNA target and its RNA template. R2 retrotransposition involves identical subunits of the R2 protein bound to different DNA sequences upstream and downstream of the insertion site. The key determinant regulating which DNA-binding conformation the protein adopts was found to be a 320-nt RNA sequence from near the 5' end of the R2 element. In the absence of this 5' RNA the R2 protein binds DNA sequences upstream of the insertion site, cleaves the first DNA strand, and conducts TPRT when RNA containing the 3' untranslated region of the R2 transcript is present. In the presence of the 320-nt 5' RNA, the R2 protein binds DNA sequences downstream of the insertion site. Cleavage of the second DNA strand by the downstream subunit does not appear to occur until after the 5' RNA is removed from this subunit. We postulate that the removal of the 5' RNA normally occurs during reverse transcription, and thus provides a critical temporal link to first- and second-strand DNA cleavage in the R2 retrotransposition reaction.

  9. Sequence-Based Prediction of RNA-Binding Residues in Proteins.

    PubMed

    Walia, Rasna R; El-Manzalawy, Yasser; Honavar, Vasant G; Dobbs, Drena

    2017-01-01

    Identifying individual residues in the interfaces of protein-RNA complexes is important for understanding the molecular determinants of protein-RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein-RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein-RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner.

  10. Sequence-Based Prediction of RNA-Binding Residues in Proteins

    PubMed Central

    Walia, Rasna R.; EL-Manzalawy, Yasser; Honavar, Vasant G.; Dobbs, Drena

    2017-01-01

    Identifying individual residues in the interfaces of protein–RNA complexes is important for understanding the molecular determinants of protein–RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein–RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein–RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner. PMID:27787829

  11. The pig CYP2E1 promoter is activated by COUP-TF1 and HNF-1 and is inhibited by androstenone.

    PubMed

    Tambyrajah, Winston S; Doran, Elena; Wood, Jeffrey D; McGivan, John D

    2004-11-15

    Functional analysis of the pig cytochrome P4502E1 (CYP2E1) promoter identified two major activating elements. One corresponded to the hepatic nuclear factor 1 (HNF-1) consensus binding sequence at nucleotides -128/-98 and the other was located in the region -292/-266. The binding of proteins in pig liver nuclear extracts to a synthetic double-stranded oligonucleotide corresponding to this more distal activating sequence was studied by electrophoretic mobility shift assay. The minimum protein binding sequence was identified as TGTTCTGACCTCTGGG. Gel super-shift assays identified the protein binding to this site as chick ovalbumin upstream promoter transcription factor 1 (COUP-TF1). Androstenone inhibited promoter activity in transfection experiments only with constructs which included the COUP-TF1 binding site. Androstenone inhibited COUP-TF1 binding to synthetic oligonucleotides but did not affect HNF-1 binding. The results offer an explanation for the inhibition of CYP2E1 protein expression by androstenone in isolated pig hepatocytes and may be relevant to the low expression of hepatic CYP2E1 in those pigs which accumulate high levels of androstenone in vivo.

  12. Epigallocatechin-3-gallate preferentially induces aggregation of amyloidogenic immunoglobulin light chains

    PubMed Central

    Hora, Manuel; Carballo-Pacheco, Martin; Weber, Benedikt; Morris, Vanessa K.; Wittkopf, Antje; Buchner, Johannes; Strodel, Birgit; Reif, Bernd

    2017-01-01

    Antibody light chain amyloidosis is a rare disease caused by fibril formation of secreted immunoglobulin light chains (LCs). The huge variety of antibody sequences poses a serious challenge to drug discovery. The green tea polyphenol epigallocatechin-3-gallate (EGCG) is known to interfere with fibril formation in general. Here we present solution- and solid-state NMR studies as well as MD simulations to characterise the interaction of EGCG with LC variable domains. We identified two distinct EGCG binding sites, both of which include a proline as an important recognition element. The binding sites were confirmed by site-directed mutagenesis and solid-state NMR analysis. The EGCG-induced protein complexes are unstructured. We propose a general mechanistic model for EGCG binding to a conserved site in LCs. We find that EGCG reacts selectively with amyloidogenic mutants. This makes this compound a promising lead structure that can handle the immense sequence variability of antibody LCs. PMID:28128355

  13. A novel paired domain DNA recognition motif can mediate Pax2 repression of gene transcription.

    PubMed

    Håvik, B; Ragnhildstveit, E; Lorens, J B; Saelemyr, K; Fauske, O; Knudsen, L K; Fjose, A

    1999-12-20

    The paired domain (PD) is an evolutionarily conserved DNA-binding domain encoded by the Pax gene family of developmental regulators. The Pax proteins are transcription factors and are involved in a variety of processes such as brain development, patterning of the central nervous system (CNS), and B-cell development. In this report we demonstrate that the zebrafish Pax2 PD can interact with a novel type of DNA sequences in vitro, the triple-A motif, consisting of a heptameric nucleotide sequence G/CAAACA/TC with an invariant core of three adjacent adenosines. This recognition sequence was found to be conserved in known natural Pax5 repressor elements involved in controlling the expression of the p53 and J-chain genes. By identifying similar high affinity binding sites in potential target genes of the Pax2 protein, including the pax2 gene itself, we obtained further evidence that the triple-A sites are biologically significant. The putative natural target sites also provide a basis for defining an extended consensus recognition sequence. In addition, we observed in transformation assays a direct correlation between Pax2 repressor activity and the presence of triple-A sites. The results suggest that a transcriptional regulatory function of Pax proteins can be modulated by PD binding to different categories of target sequences. Copyright 1999 Academic Press.
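
    The heptameric triple-A motif given above, G/CAAACA/TC, can be written as the pattern [GC]AAAC[AT]C and searched on both strands. A minimal sketch, assuming plain-string DNA input; the function names and the demo sequence are hypothetical, and minus-strand hits are reported by their start coordinate on the plus strand.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")
# G/C, three invariant adenosines, C, A/T, C (seven positions).
TRIPLE_A = re.compile(r"[GC]AAAC[AT]C")
MOTIF_LEN = 7


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_triple_a(seq):
    """Return (strand, plus-strand start) for each triple-A motif match."""
    seq = seq.upper()
    hits = [("+", m.start()) for m in TRIPLE_A.finditer(seq)]
    rc = reverse_complement(seq)
    # Convert reverse-complement coordinates back to plus-strand coordinates.
    hits += [("-", len(seq) - m.start() - MOTIF_LEN) for m in TRIPLE_A.finditer(rc)]
    return hits


# Hypothetical demo sequence with one site on each strand.
print(find_triple_a("TTGAAACACGGGTGTTTCAA"))
```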

  14. Nuclear factors that bind to the enhancer region of nondefective Friend murine leukemia virus.

    PubMed Central

    Manley, N R; O'Connell, M A; Sharp, P A; Hopkins, N

    1989-01-01

    Nondefective Friend murine leukemia virus (MuLV) causes erythroleukemia when injected into newborn NFS mice, while Moloney MuLV causes T-cell lymphoma. Exchange of the Friend virus enhancer region, a sequence of about 180 nucleotides including the direct repeat and a short 3'-adjacent segment, for the corresponding region in Moloney MuLV confers the ability to cause erythroid disease on Moloney MuLV. We have used the electrophoretic mobility shift assay and methylation interference analysis to identify cellular factors which bind to the Friend virus enhancer region and compared these with factors, previously identified, that bind to the Moloney virus direct repeat (N. A. Speck and D. Baltimore, Mol. Cell. Biol. 7:1101-1110, 1987). We identified five binding sites for sequence-specific DNA-binding proteins in the Friend virus enhancer region. While some binding sites are present in both the Moloney and Friend virus enhancers, both viruses contain unique sites not present in the other. Although none of the factors identified in this report which bind to these unique sites are present exclusively in T cells or erythroid cells, they bind to three regions of the enhancer shown by genetic analysis to encode disease specificity and thus are candidates to mediate the tissue-specific expression and distinct disease specificities encoded by these virus enhancer elements. PMID:2778872

  15. Conservation of CD44 exon v3 functional elements in mammals

    PubMed Central

    Vela, Elena; Hilari, Josep M; Delclaux, María; Fernández-Bellon, Hugo; Isamat, Marcos

    2008-01-01

    Background: The human CD44 gene contains 10 variable exons (v1 to v10) that can be alternatively spliced to generate hundreds of different CD44 protein isoforms. Human CD44 variable exon v3 inclusion in the final mRNA depends on a multisite bipartite splicing enhancer located within the exon itself, which we have recently described, and provides the protein domain responsible for growth factor binding to CD44. Findings: We have analyzed the sequence of CD44v3 in 95 mammalian species to report high conservation levels for both its splicing regulatory elements (the 3' splice site and the exonic splicing enhancer), and the functional glycosaminoglycan binding site coded by v3. We also report the functional expression of CD44v3 isoforms in peripheral blood cells of different mammalian taxa with both consensus and variant v3 sequences. Conclusion: CD44v3 mammalian sequences maintain all functional splicing regulatory elements as well as the GAG binding site with the same relative positions and sequence identity previously described during alternative splicing of human CD44. The sequence within the GAG attachment site, which in turn contains the Y motif of the exonic splicing enhancer, is more conserved than the rest of the exon. Amplification of CD44v3 sequence from mammalian species, but not from birds, fish or reptiles, suggests that CD44v3 may be classified as an exclusively mammalian gene trait. PMID:18710510

  16. DNA Recognition by a σ 54 Transcriptional Activator from Aquifex aeolicus

    DOE PAGES

    Vidangos, Natasha K.; Heideker, Johanna; Lyubimov, Artem; ...

    2014-08-23

    Transcription initiation by bacterial σ 54-polymerase requires the action of a transcriptional activator protein. Activators bind sequence-specifically upstream of the transcription initiation site via a DNA-binding domain. The structurally characterized DNA-binding domains from activators all belong to the Factor for Inversion Stimulation (Fis) family of helix-turn-helix DNA-binding proteins. We report here structures of the free and DNA-bound forms of the DNA-binding domain of NtrC4 (4DBD) from Aquifex aeolicus, a member of the NtrC family of σ 54 activators. Two NtrC4 binding sites were identified upstream (-145 and -85 base pairs) from the start of the lpxC gene, which is responsible for the first committed step in Lipid A biosynthesis. This is the first experimental evidence for σ 54 regulation in lpxC expression. 4DBD was crystallized both without DNA and in complex with the -145 binding site. The structures, together with biochemical data, indicate that NtrC4 binds to DNA in a manner that is similar to that of its close homologue, Fis. Ultimately, the greater sequence specificity for the binding of 4DBD relative to Fis seems to arise from a larger number of base specific contacts contributing to affinity than for Fis.

  17. Inhibition of HMGA2 binding to DNA by netropsin

    PubMed Central

    Miao, Yi; Cui, Tengjiao; Leng, Fenfei; Wilson, W. David

    2008-01-01

    The design of small synthetic molecules that can be used to affect gene expression is an area of active interest for development of agents in therapeutic and biotechnology applications. Many compounds that target the minor groove in AT sequences in DNA are well characterized and are promising reagents for use as modulators of protein-DNA complexes. The mammalian high mobility group transcriptional factor, HMGA2, also targets the DNA minor groove and plays critical roles in disease processes from cancer to obesity. Biosensor-surface plasmon resonance methods were used to monitor HMGA2 binding to target sites on immobilized DNA and a competition assay for inhibition of the HMGA2-DNA complex was designed. HMGA2 binds strongly to the DNA through AT hook domains with KD values of 20 - 30 nM depending on the DNA sequence. The well-characterized minor groove binder, netropsin, was used to develop and test the assay. The compound has two binding sites in the protein-DNA interaction sequence and this provides an advantage for inhibition. An equation for analysis of results when the inhibitor has two binding sites in the biopolymer recognition surface is presented with the results. The assay provides a platform for discovery of HMGA2 inhibitors. PMID:18023407

  18. ChIP-seq analysis of the σ E regulon of Salmonella enterica serovar typhimurium reveals new genes implicated in heat shock and oxidative stress response

    DOE PAGES

    Li, Jie; Overall, Christopher C.; Johnson, Rudd C.; ...

    2015-09-21

    The alternative sigma factor σ E functions to maintain bacterial homeostasis and membrane integrity in response to extracytoplasmic stress by regulating thousands of genes both directly and indirectly. The transcriptional regulatory network governed by σ E in Salmonella and E. coli has been examined using microarray, however a genome-wide analysis of σ E–binding sites in Salmonella has not yet been reported. We infected macrophages with Salmonella Typhimurium over a select time course. Using chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq), 31 σ E–binding sites were identified. Seventeen sites were new, which included outer membrane proteins, a quorum-sensing protein, a cell division factor, and a signal transduction modulator. The consensus sequence identified for σ E in vivo binding was similar to the one previously reported, except for a conserved G and A between the -35 and -10 regions. One third of the σ E–binding sites did not contain the consensus sequence, suggesting there may be alternative mechanisms by which σ E modulates transcription. By dissecting direct and indirect modes of σ E-mediated regulation, we found that σ E activates gene expression through recognition of both canonical and reversed consensus sequence. Lastly, new σ E regulated genes (greA, luxS, ompA and ompX) are shown to be involved in heat shock and oxidative stress responses.

  19. ChIP-seq analysis of the σ E regulon of Salmonella enterica serovar typhimurium reveals new genes implicated in heat shock and oxidative stress response

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Jie; Overall, Christopher C.; Johnson, Rudd C.

    The alternative sigma factor σ E functions to maintain bacterial homeostasis and membrane integrity in response to extracytoplasmic stress by regulating thousands of genes both directly and indirectly. The transcriptional regulatory network governed by σ E in Salmonella and E. coli has been examined using microarray, however a genome-wide analysis of σ E–binding sites in Salmonella has not yet been reported. We infected macrophages with Salmonella Typhimurium over a select time course. Using chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq), 31 σ E–binding sites were identified. Seventeen sites were new, which included outer membrane proteins, a quorum-sensing protein, a cell division factor, and a signal transduction modulator. The consensus sequence identified for σ E in vivo binding was similar to the one previously reported, except for a conserved G and A between the -35 and -10 regions. One third of the σ E–binding sites did not contain the consensus sequence, suggesting there may be alternative mechanisms by which σ E modulates transcription. By dissecting direct and indirect modes of σ E-mediated regulation, we found that σ E activates gene expression through recognition of both canonical and reversed consensus sequence. Lastly, new σ E regulated genes (greA, luxS, ompA and ompX) are shown to be involved in heat shock and oxidative stress responses.

  20. A novel site contributing to growth-arrest-specific gene 6 binding to its receptors as revealed by a human monoclonal antibody

    PubMed Central

    2004-01-01

    Gas6 (growth-arrest-specific gene 6) is a vitamin K-dependent protein known to activate the Axl family of receptor tyrosine kinases. It is an important regulator of thrombosis and many other biological functions. The C-terminus of Gas6 binds to receptors and consists of two laminin-like globular domains LG1 and LG2. It has been reported that a Ca2+-binding site at the junction of LG1 and LG2 domains and a hydrophobic patch at the LG2 domain are important for receptor binding [Sasaki, Knyazev, Cheburkin, Gohring, Tisi, Ullrich, Timpl and Hohenester (2002) J. Biol. Chem. 277, 44164–44170]. In the present study, we developed a neutralizing human monoclonal antibody, named CNTO300, for Gas6. The antibody was generated by immunization of human IgG-expressing transgenic mice with recombinant human Gas6 protein and the anti-Gas6 IgG sequences were rescued from an unstable hybridoma clone. Binding of Gas6 to its receptors was partially inhibited by the CNTO300 antibody in a dose-dependent manner. To characterize further the interaction between Gas6 and this antibody, the binding kinetics of CNTO300 for recombinant Gas6 were compared with independently expressed LG1 and LG2. The CNTO300 antibody showed comparable binding affinity, yet different dependence on Ca2+, to Gas6 and LG1. No binding to LG2 was detected. In the presence of EDTA, binding of the antibody to Gas6 was disrupted, but no significant effect of EDTA on LG1 binding was evident. Further epitope mapping identified a Gas6 peptide sequence recognized by the CNTO300 antibody. This peptide sequence was found to be located at the LG1 domain distant from the Ca2+-binding site and the hydrophobic patch. Co-interaction of Gas6 with its receptor and CNTO300 antibody was detected by BIAcore analysis, suggesting a second receptor-binding site on the LG1 domain. This hypothesis was further supported by direct binding of Gas6 receptors to an independently expressed LG1 domain. Our results revealed, for the first time, a second binding site for Gas6–receptor interaction. PMID:15579134

  1. A single amino-acid substitution in the Ets domain alters core DNA binding specificity of Ets1 to that of the related transcription factors Elf1 and E74.

    PubMed

    Bosselut, R; Levin, J; Adjadj, E; Ghysdael, J

    1993-11-11

    Ets proteins form a family of sequence-specific DNA-binding proteins which bind DNA through a conserved domain of 85 amino acids, the Ets domain, whose sequence is unrelated to any other characterized DNA binding domain. Unlike all other known Ets proteins, which bind specific DNA sequences centered over either GGAA or GGAT core motifs, E74 and Elf1 selectively bind to GGAA core-containing sites. Elf1 and E74 differ from other Ets proteins in three residues located in an otherwise highly conserved region of the Ets domain, referred to as conserved region III (CRIII). We show that a restricted selectivity for GGAA core-containing sites could be conferred on Ets1 upon changing a single lysine residue within CRIII to the threonine found in Elf1 and E74 at this position. Conversely, the reciprocal mutation in Elf1 confers on this protein the ability to bind to GGAT core-containing Ets binding sites (EBS). This, together with the fact that mutation of two invariant arginine residues in CRIII abolishes DNA binding, indicates that CRIII plays a key role in Ets domain recognition of the GGAA/T core motif and leads us to discuss a model of Ets proteins--core motif interaction.

  2. Concerted formation of macromolecular Suppressor–mutator transposition complexes

    PubMed Central

    Raina, Ramesh; Schläppi, Michael; Karunanandaa, Balasulojini; Elhofy, Adam; Fedoroff, Nina

    1998-01-01

    Transposition of the maize Suppressor–mutator (Spm) transposon requires two element-encoded proteins, TnpA and TnpD. Although there are multiple TnpA binding sites near each element end, binding of TnpA to DNA is not cooperative, and the binding affinity is not markedly affected by the number of binding sites per DNA fragment. However, intermolecular complexes form cooperatively between DNA fragments with three or more TnpA binding sites. TnpD, itself not a sequence-specific DNA-binding protein, binds to TnpA and stabilizes the TnpA–DNA complex. The high redundancy of TnpA binding sites at both element ends and the protein–protein interactions between DNA-bound TnpA complexes and between these and TnpD imply a concerted transition of the element from a linear to a protein crosslinked transposition complex within a very narrow protein concentration range. PMID:9671711

  3. Study of DNA binding sites using the Rényi parametric entropy measure.

    PubMed

    Krishnamachari, A; moy Mandal, Vijnan; Karmeshu

    2004-04-07

    Shannon's definition of uncertainty or surprisal has been applied extensively to measure the information content of aligned DNA sequences and to characterize DNA binding sites. In contrast to Shannon's uncertainty, this study investigates the applicability and suitability of a parametric uncertainty measure due to Rényi. It is observed that this measure also provides results in agreement with Shannon's measure, pointing to its utility in analysing DNA binding site regions. For facilitating the comparison between these uncertainty measures, a dimensionless quantity called "redundancy" has been employed. It is found that Rényi's measure at low parameter values possesses a better delineating feature of binding sites (of binding regions) than Shannon's measure. The critical value of the parameter is chosen with an outlier criterion.
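
    To make the comparison in this abstract concrete, the sketch below computes the Rényi entropy of order alpha for one column of an alignment and falls back to Shannon entropy at alpha = 1. The redundancy used here, 1 - H / log2(4), is an assumed standard definition and may differ from the one in the paper; the column frequencies are invented.

```python
import math


def renyi_entropy(probs, alpha):
    """Rényi entropy in bits; alpha == 1 uses the Shannon limit."""
    probs = [p for p in probs if p > 0]
    if abs(alpha - 1.0) < 1e-12:
        return -sum(p * math.log2(p) for p in probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)


def redundancy(probs, alpha, alphabet_size=4):
    """Assumed definition: 1 - H_alpha / H_max, with H_max = log2(alphabet size)."""
    return 1.0 - renyi_entropy(probs, alpha) / math.log2(alphabet_size)


# Hypothetical base frequencies (A, C, G, T) at one aligned position.
column = [0.7, 0.1, 0.1, 0.1]
for alpha in (0.5, 1.0, 2.0):
    h = renyi_entropy(column, alpha)
    print(alpha, round(h, 3), round(redundancy(column, alpha), 3))
```

    For a fixed distribution the Rényi entropy does not increase as alpha grows, so low alpha values weight rare bases more heavily than Shannon's measure does.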

  4. Structure-affinity relationships for the binding of actinomycin D to DNA

    NASA Astrophysics Data System (ADS)

    Gallego, José; Ortiz, Angel R.; de Pascual-Teresa, Beatriz; Gago, Federico

    1997-03-01

    Molecular models of the complexes between actinomycin D and 14 different DNA hexamers were built based on the X-ray crystal structure of the actinomycin-d(GAAGCTTC)2 complex. The DNA sequences included the canonical GpC binding step flanked by different base pairs, nonclassical binding sites such as GpG and GpT, and sites containing 2,6-diaminopurine. A good correlation was found between the intermolecular interaction energies calculated for the refined complexes and the relative preferences of actinomycin binding to standard and modified DNA. A detailed energy decomposition into van der Waals and electrostatic components for the interactions between the DNA base pairs and either the chromophore or the peptidic part of the antibiotic was performed for each complex. The resulting energy matrix was then subjected to principal component analysis, which showed that actinomycin D discriminates among different DNA sequences by an interplay of hydrogen bonding and stacking interactions. The structure-affinity relationships for this important antitumor drug are thus rationalized and may be used to advantage in the design of novel sequence-specific DNA-binding agents.

  5. Direct activation of a notochord cis-regulatory module by Brachyury and FoxA in the ascidian Ciona intestinalis.

    PubMed

    Passamaneck, Yale J; Katikala, Lavanya; Perrone, Lorena; Dunn, Matthew P; Oda-Ishii, Izumi; Di Gregorio, Anna

    2009-11-01

    The notochord is a defining feature of the chordate body plan. Experiments in ascidian, frog and mouse embryos have shown that co-expression of Brachyury and FoxA class transcription factors is required for notochord development. However, studies on the cis-regulatory sequences mediating the synergistic effects of these transcription factors are complicated by the limited knowledge of notochord genes and cis-regulatory modules (CRMs) that are directly targeted by both. We have identified an easily testable model for such investigations in a 155-bp notochord-specific CRM from the ascidian Ciona intestinalis. This CRM contains functional binding sites for both Ciona Brachyury (Ci-Bra) and FoxA (Ci-FoxA-a). By combining point mutation analysis and misexpression experiments, we demonstrate that binding of both transcription factors to this CRM is necessary and sufficient to activate transcription. To gain insights into the cis-regulatory criteria controlling its activity, we investigated the organization of the transcription factor binding sites within the 155-bp CRM. The 155-bp sequence contains two Ci-Bra binding sites with identical core sequences but opposite orientations, only one of which is required for enhancer activity. Changes in both orientation and spacing of these sites substantially affect the activity of the CRM, as clusters of identical sites found in the Ciona genome with different arrangements are unable to activate transcription in notochord cells. This work presents the first evidence of a synergistic interaction between Brachyury and FoxA in the activation of an individual notochord CRM, and highlights the importance of transcription factor binding site arrangement for its function.

  6. STAT1:DNA sequence-dependent binding modulation by phosphorylation, protein:protein interactions and small-molecule inhibition

    PubMed Central

    Bonham, Andrew J.; Wenta, Nikola; Osslund, Leah M.; Prussin, Aaron J.; Vinkemeier, Uwe; Reich, Norbert O.

    2013-01-01

    The DNA-binding specificity and affinity of the dimeric human transcription factor (TF) STAT1, were assessed by total internal reflectance fluorescence protein-binding microarrays (TIRF-PBM) to evaluate the effects of protein phosphorylation, higher-order polymerization and small-molecule inhibition. Active, phosphorylated STAT1 showed binding preferences consistent with prior characterization, whereas unphosphorylated STAT1 showed a weak-binding preference for one-half of the GAS consensus site, consistent with recent models of STAT1 structure and function in response to phosphorylation. This altered-binding preference was further tested by use of the inhibitor LLL3, which we show to disrupt STAT1 binding in a sequence-dependent fashion. To determine if this sequence-dependence is specific to STAT1 and not a general feature of human TF biology, the TF Myc/Max was analysed and tested with the inhibitor Mycro3. Myc/Max inhibition by Mycro3 is sequence independent, suggesting that the sequence-dependent inhibition of STAT1 may be specific to this system and a useful target for future inhibitor design. PMID:23180800

  7. Identification of metal ion binding sites based on amino acid sequences

    PubMed Central

    Cao, Xiaoyong; Zhang, Xiaojin; Gao, Sujuan; Ding, Changjiang; Feng, Yonge; Bao, Weihua

    2017-01-01

    The identification of metal ion binding sites is important for protein function annotation and the design of new drug molecules. This study presents an effective method of analyzing and identifying the binding residues of metal ions based solely on sequence information. Ten metal ions were extracted from the BioLip database: Zn2+, Cu2+, Fe2+, Fe3+, Ca2+, Mg2+, Mn2+, Na+, K+ and Co2+. The analysis showed that Zn2+, Cu2+, Fe2+, Fe3+, and Co2+ were sensitive to the conservation of amino acids at binding sites, and promising results can be achieved using the Position Weight Scoring Matrix algorithm, with an accuracy of over 79.9% and a Matthews correlation coefficient of over 0.6. The binding sites of other metals can also be accurately identified using the Support Vector Machine algorithm with multifeature parameters as input. In addition, we found that Ca2+ was insensitive to hydrophobicity and hydrophilicity information and Mn2+ was insensitive to polarization charge information. An online server was constructed based on the framework of the proposed method and is freely available at http://60.31.198.140:8081/metal/HomePage/HomePage.html. PMID:28854211
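
    The two evaluation figures quoted in this abstract, accuracy and the Matthews correlation coefficient (MCC), are both computed from a binary confusion matrix. The following is a minimal sketch; the counts are invented for illustration and are not results from the study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of residues classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC in [-1, 1]; returns 0.0 when any marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts of binding (positive) and non-binding (negative) residues.
tp, tn, fp, fn = 40, 45, 5, 10
print(accuracy(tp, tn, fp, fn), matthews_corrcoef(tp, tn, fp, fn))
```

    MCC is often reported alongside accuracy because binding residues are usually far rarer than non-binding ones, and accuracy alone can look high on such imbalanced data.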

  8. Identification of metal ion binding sites based on amino acid sequences.

    PubMed

    Cao, Xiaoyong; Hu, Xiuzhen; Zhang, Xiaojin; Gao, Sujuan; Ding, Changjiang; Feng, Yonge; Bao, Weihua

    2017-01-01

    The identification of metal ion binding sites is important for protein function annotation and the design of new drug molecules. This study presents an effective method of analyzing and identifying the binding residues of metal ions based solely on sequence information. Ten metal ions were extracted from the BioLip database: Zn2+, Cu2+, Fe2+, Fe3+, Ca2+, Mg2+, Mn2+, Na+, K+ and Co2+. The analysis showed that Zn2+, Cu2+, Fe2+, Fe3+, and Co2+ were sensitive to the conservation of amino acids at binding sites, and promising results can be achieved using the Position Weight Scoring Matrix algorithm, with an accuracy of over 79.9% and a Matthews correlation coefficient of over 0.6. The binding sites of other metals can also be accurately identified using the Support Vector Machine algorithm with multifeature parameters as input. In addition, we found that Ca2+ was insensitive to hydrophobicity and hydrophilicity information and Mn2+ was insensitive to polarization charge information. An online server was constructed based on the framework of the proposed method and is freely available at http://60.31.198.140:8081/metal/HomePage/HomePage.html.

  9. Sequence- and Interactome-Based Prediction of Viral Protein Hotspots Targeting Host Proteins: A Case Study for HIV Nef

    PubMed Central

    Sarmady, Mahdi; Dampier, William; Tozeren, Aydin

    2011-01-01

    Virus proteins alter protein pathways of the host toward the synthesis of viral particles by breaking and making edges via binding to host proteins. In this study, we developed a computational approach to predict viral sequence hotspots for binding to host proteins based on sequences of viral and host proteins and literature-curated virus-host protein interactome data. We use a motif discovery algorithm repeatedly on collections of sequences of viral proteins and immediate binding partners of their host targets and choose only those motifs that are conserved on viral sequences and highly statistically enriched among binding partners of virus-protein-targeted host proteins. Our results match experimental data on binding sites of Nef to host proteins such as MAPK1, VAV1, LCK, HCK, HLA-A, CD4, FYN, and GNB2L1 with high statistical significance, but the approach is a poor predictor of Nef binding sites on highly flexible, hoop-like regions. Predicted hotspots recapture CD8 cell epitopes of HIV Nef, highlighting their importance in modulating virus-host interactions. Host proteins potentially targeted or outcompeted by Nef appear to crowd the T cell receptor, natural killer cell mediated cytotoxicity, and neurotrophin signaling pathways. Scanning of HIV Nef motifs on multiple alignments of hepatitis C protein NS5A produces results consistent with literature, indicating the potential value of the hotspot discovery in advancing our understanding of virus-host crosstalk. PMID:21738584

  10. A murine host cell factor required for nicking of the dimer bridge of MVM recognizes two CG nucleotides displaced by 10 basepairs.

    PubMed

    Liu, Q; Astell, C R

    1996-10-01

    During replication of the minute virus of mice (MVM) genome, a dimer replicative form (RF) intermediate is resolved into two monomer RF molecules in such a way as to retain a unique sequence within the left hand hairpin terminus of the viral genome. Although the proposed mechanism for resolution of the dimer RF remains uncertain, it likely involves site-specific nicking of the dimer bridge. The RF contains two double-stranded copies of the viral genome joined by the extended 3' hairpin. Minor sequence asymmetries within the 3' hairpin allow the two halves of the dimer bridge to be distinguished. The A half contains the sequence [sequence: see text], whereas the B half contains the sequence [sequence: see text]. Using an in vitro assay, we show that only the B half of the MVM dimer bridge is nicked site-specifically when incubated with crude NS-1 protein (expressed in insect cells) and mouse LA9 cellular extract. When highly purified NS-1, the major nonstructural protein of MVM, is used in this nicking reaction, there is an absolute requirement for the LA9 cellular extract, suggesting a cellular factor (or factors) is (are) required. A series of mutations were created in the putative host factor binding region (HFBR) on the B half of the MVM dimer bridge adjacent to the NS-1 binding site. Nicking assays of these B half mutants showed that two CG motifs displaced by 10 nucleotides are important for nicking. Gel mobility shift assays demonstrated that a host factor(s) can bind to the HFBR of the B half of the dimer bridge and efficient binding depends on the presence of both CG motifs. Competitor DNA containing the wild-type HFBR sequence is able to specifically inhibit nicking of the B half, indicating that the host factor(s) bound to the HFBR is(are) essential for site-specific nicking to occur.

  11. Mojo Hand, a TALEN design tool for genome editing applications.

    PubMed

    Neff, Kevin L; Argue, David P; Ma, Alvin C; Lee, Han B; Clark, Karl J; Ekker, Stephen C

    2013-01-16

    Recent studies of transcription activator-like (TAL) effector domains fused to nucleases (TALENs) demonstrate enormous potential for genome editing. Effective design of TALENs requires a combination of selecting appropriate genetic features, finding pairs of binding sites based on a consensus sequence, and, in some cases, identifying endogenous restriction sites for downstream molecular genetic applications. We present the web-based program Mojo Hand for designing TAL and TALEN constructs for genome editing applications (http://www.talendesign.org). We describe the algorithm and its implementation. The features of Mojo Hand include (1) automatic download of genomic data from the National Center for Biotechnology Information, (2) analysis of any DNA sequence to reveal pairs of binding sites based on a user-defined template, (3) selection of restriction-enzyme recognition sites in the spacer between the TAL monomer binding sites including options for the selection of restriction enzyme suppliers, and (4) output files designed for subsequent TALEN construction using the Golden Gate assembly method. Mojo Hand enables the rapid identification of TAL binding sites for use in TALEN design. The assembly of TALEN constructs is also simplified by using the TAL-site prediction program in conjunction with a spreadsheet management aid of reagent concentrations and TALEN formulation. Mojo Hand enables scientists to more rapidly deploy TALENs for genome editing applications.

  12. Diethylpyrocarbonate and permanganate provide evidence for an unusual DNA conformation induced by binding of the antitumour antibiotics bleomycin and phleomycin.

    PubMed Central

    Fox, K R; Grigg, G W

    1988-01-01

    DNA structural changes induced by bleomycin have been investigated using diethylpyrocarbonate and permanganate as probes under conditions in which the antibiotic binds to, but does not cut the DNA. Diethylpyrocarbonate shows an enhanced reaction with adenines in the presence of the antibiotic in the sequences GTA > GCA > GAA, on the 3' side of the drug cutting site (GPy). Permanganate ions display an enhanced reactivity at the second pyrimidine of the sequence GPyPy. The results are consistent with a model in which bleomycin distorts the structure of the base pair on the 3' side of its binding site. PMID:2451809

  13. Understanding Transcription Factor Regulation by Integrating Gene Expression and DNase I Hypersensitive Sites.

    PubMed

    Wang, Guohua; Wang, Fang; Huang, Qian; Li, Yu; Liu, Yunlong; Wang, Yadong

    2015-01-01

    Transcription factors are proteins that bind to DNA sequences to regulate gene transcription. The transcription factor binding sites are short DNA sequences (5-20 bp long) specifically bound by one or more transcription factors. The identification of transcription factor binding sites and prediction of their function continue to be challenging problems in computational biology. In this study, by integrating the DNase I hypersensitive sites with known position weight matrices in the TRANSFAC database, the transcription factor binding sites in gene regulatory regions are identified. Based on the global gene expression patterns in cervical cancer HeLaS3 cells and HelaS3-ifnα4h cells (interferon treatment of HeLaS3 cells for 4 hours), we present a model-based computational approach to predict a set of transcription factors that potentially cause such differential gene expression. Significantly, 6 out of 10 predicted functional factors, namely IRF, IRF-2, IRF-9, IRF-1, IRF-3 and ICSBP, belong to the interferon regulatory factor family and upregulate gene expression levels in response to the interferon treatment. Another factor, ISGF-3, is also a transcriptional activator induced by interferon alpha. The predictions of our model remain consistent across different criteria for selecting transcription factor binding sites. Our model demonstrated the potential to computationally identify the functional transcription factors in gene regulation.
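
    The position weight matrix step mentioned in this abstract can be illustrated with a log-odds scan. The sketch below is not the authors' pipeline: the 4-position matrix, the uniform background, and the threshold are all hypothetical, and the restriction to DNase I hypersensitive regions is left out.

```python
import math

# Hypothetical PWM (per-position base probabilities); not taken from TRANSFAC.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform background frequency for every base


def log_odds(window):
    """Sum of log2(p_motif / p_background) over the positions of the window."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PWM, window))


def scan(seq, threshold):
    """Return (start, site, score) for windows scoring at or above threshold."""
    seq = seq.upper()
    width = len(PWM)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = log_odds(window)
        if score >= threshold:
            hits.append((i, window, score))
    return hits


print(scan("TTAGTCAA", threshold=4.0))
```

    In practice the scan would be limited to open-chromatin intervals and the threshold chosen from a score distribution rather than fixed by hand.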

  14. HMG I(Y) interferes with the DNA binding of NF-AT factors and the induction of the interleukin 4 promoter in T cells

    PubMed Central

    Klein-Hessling, Stefan; Schneider, Günter; Heinfling, Annette; Chuvpilo, Sergei; Serfling, Edgar

    1996-01-01

    HMG I(Y) proteins bind to double-stranded A+T oligonucleotides longer than three base pairs. Such motifs form part of numerous NF-AT-binding sites of lymphokine promoters, including the interleukin 4 (IL-4) promoter. NF-AT factors share short homologous peptide sequences in their DNA-binding domain with NF-κB factors and bind to certain NF-κB sites. It has been shown that HMG I(Y) proteins enhance NF-κB binding to the interferon β promoter and virus-mediated interferon β promoter induction. We show that HMG I(Y) proteins exert an opposite effect on the DNA binding of NF-AT factors and the induction of the IL-4 promoter in T lymphocytes. Introduction of mutations into a high-affinity HMG I(Y)-binding site of the IL-4 promoter, which decreased HMG I(Y)-binding to an NF-AT-binding sequence, the Pu-bB (or P) site, distinctly increased the induction of the IL-4 promoter in Jurkat T leukemia cells. High concentrations of HMG I(Y) proteins are able to displace NF-ATp from its binding to the Pu-bB site. High HMG I(Y) concentrations are typical for Jurkat cells and peripheral blood T lymphocytes, whereas El4 T lymphoma cells and certain T helper type 2 cell clones contain relatively low HMG I(Y) concentrations. Our results indicate that HMG I(Y) proteins do not cooperate, but instead compete with NF-AT factors for the binding to DNA even though NF-AT factors share some DNA-binding properties with NF-κB factors. This competition between HMG I(Y) and NF-AT proteins for DNA binding might be due to common contacts with minor groove nucleotides of DNA and may be one mechanism contributing to the selective IL-4 expression in certain T lymphocyte populations, such as T helper type 2 cells. PMID:8986808

  15. msCentipede: Modeling Heterogeneity across Genomic Sites and Replicates Improves Accuracy in the Inference of Transcription Factor Binding

    PubMed Central

    Gilad, Yoav; Pritchard, Jonathan K.; Stephens, Matthew

    2015-01-01

    Understanding global gene regulation depends critically on accurate annotation of regulatory elements that are functional in a given cell type. CENTIPEDE, a powerful, probabilistic framework for identifying transcription factor binding sites from tissue-specific DNase I cleavage patterns and genomic sequence content, leverages the hypersensitivity of factor-bound chromatin and the information in the DNase I spatial cleavage profile characteristic of each DNA binding protein to accurately infer functional factor binding sites. However, the model for the spatial profile in this framework fails to account for the substantial variation in the DNase I cleavage profiles across different binding sites. Neither does it account for variation in the profiles at the same binding site across multiple replicate DNase I experiments, which are increasingly available. In this work, we introduce new methods, based on multi-scale models for inhomogeneous Poisson processes, to account for such variation in DNase I cleavage patterns both within and across binding sites. These models account for the spatial structure in the heterogeneity in DNase I cleavage patterns for each factor. Using DNase-seq measurements assayed in a lymphoblastoid cell line, we demonstrate the improved performance of this model for several transcription factors by comparing against the ChIP-seq peaks for those factors. Finally, we explore the effects of DNase I sequence bias on inference of factor binding using a simple extension to our framework that allows for a more flexible background model. The proposed model can also be easily applied to paired-end ATAC-seq and DNase-seq data. msCentipede, a Python implementation of our algorithm, is available at http://rajanil.github.io/msCentipede. PMID:26406244

  16. msCentipede: Modeling Heterogeneity across Genomic Sites and Replicates Improves Accuracy in the Inference of Transcription Factor Binding.

    PubMed

    Raj, Anil; Shim, Heejung; Gilad, Yoav; Pritchard, Jonathan K; Stephens, Matthew

    2015-01-01

    Understanding global gene regulation depends critically on accurate annotation of regulatory elements that are functional in a given cell type. CENTIPEDE, a powerful, probabilistic framework for identifying transcription factor binding sites from tissue-specific DNase I cleavage patterns and genomic sequence content, leverages the hypersensitivity of factor-bound chromatin and the information in the DNase I spatial cleavage profile characteristic of each DNA binding protein to accurately infer functional factor binding sites. However, the model for the spatial profile in this framework fails to account for the substantial variation in the DNase I cleavage profiles across different binding sites. Neither does it account for variation in the profiles at the same binding site across multiple replicate DNase I experiments, which are increasingly available. In this work, we introduce new methods, based on multi-scale models for inhomogeneous Poisson processes, to account for such variation in DNase I cleavage patterns both within and across binding sites. These models account for the spatial structure in the heterogeneity in DNase I cleavage patterns for each factor. Using DNase-seq measurements assayed in a lymphoblastoid cell line, we demonstrate the improved performance of this model for several transcription factors by comparing against the ChIP-seq peaks for those factors. Finally, we explore the effects of DNase I sequence bias on inference of factor binding using a simple extension to our framework that allows for a more flexible background model. The proposed model can also be easily applied to paired-end ATAC-seq and DNase-seq data. msCentipede, a Python implementation of our algorithm, is available at http://rajanil.github.io/msCentipede.

  17. DNA binding site characterization by means of Rényi entropy measures on nucleotide transitions.

    PubMed

    Perera, A; Vallverdu, M; Claria, F; Soria, J M; Caminal, P

    2008-06-01

    In this work, parametric information-theory measures for the characterization of binding sites in DNA are extended with the use of transitional probabilities on the sequence. We propose the use of parametric uncertainty measures such as Rényi entropies obtained from the transition probabilities for the study of the binding sites, in addition to nucleotide frequency-based Rényi measures. Results are reported in this work comparing transition frequencies (i.e., dinucleotides) and base frequencies for Shannon and parametric Rényi entropies for a number of binding sites found in E. coli, lambda and T7 organisms. We observe that the information provided by both approaches is not redundant. Furthermore, in the presence of noise in the binding site matrix, we observe overall improved robustness of nucleotide transition-based algorithms when compared with nucleotide frequency-based methods.
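
    The comparison described in this abstract can be illustrated with a minimal sketch. It assumes the standard definition of the Rényi entropy of order alpha, H_alpha = log2(sum_i p_i^alpha) / (1 - alpha), with the Shannon entropy as the limit alpha -> 1. The function names, the toy aligned sites, and the use of adjacent-column dinucleotide frequencies as the "transition" distribution are illustrative assumptions, not the authors' published procedure.

```python
import math
from collections import Counter


def renyi_entropy(probs, alpha):
    """Rényi entropy (in bits) of a discrete distribution.

    For alpha == 1 the Shannon entropy is returned, which is the
    limit of the Rényi entropy as alpha approaches 1.
    """
    probs = [p for p in probs if p > 0]
    if abs(alpha - 1.0) < 1e-12:
        return -sum(p * math.log2(p) for p in probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)


def kmer_probs(sites, pos, k):
    """Empirical frequencies of the k-mers starting at column `pos`.

    k=1 gives base frequencies; k=2 gives dinucleotide frequencies,
    used here as a stand-in for nucleotide transitions (an assumption).
    """
    counts = Counter(site[pos:pos + k] for site in sites)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


# Hypothetical aligned binding sites, for illustration only.
sites = ["TTGACA", "TTGACT", "TTTACA", "TAGACA"]

for pos in range(len(sites[0]) - 1):
    base_h = renyi_entropy(kmer_probs(sites, pos, 1), alpha=2)
    dinuc_h = renyi_entropy(kmer_probs(sites, pos, 2), alpha=2)
    print(pos, round(base_h, 3), round(dinuc_h, 3))
```

    A fully conserved column has zero entropy at every order, and a uniform distribution over the four bases has 2 bits at every order, which makes these two cases convenient sanity checks.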

  18. Modeling the Embrace of a Mutator: APOBEC Selection of Nucleic Acid Ligands.

    PubMed

    Salter, Jason D; Smith, Harold C

    2018-05-23

    The 11-member APOBEC (apolipoprotein B mRNA editing catalytic polypeptide-like) family of zinc-dependent cytidine deaminases bind to RNA and single-stranded DNA (ssDNA) and, in specific contexts, modify select (deoxy)cytidines to (deoxy)uridines. In this review, we describe advances made through high-resolution co-crystal structures of APOBECs bound to mono- or oligonucleotides that reveal potential substrate-specific binding sites at the active site and non-sequence-specific nucleic acid binding sites distal to the active site. We also discuss the effect of APOBEC oligomerization on functionality. Future structural studies will need to address how ssDNA binding away from the active site may enhance catalysis and the mechanism by which RNA binding may modulate catalytic activity on ssDNA. Copyright © 2018 The Author(s). Published by Elsevier Ltd. All rights reserved.

  19. Isolation of a thyroid hormone-responsive gene by immunoprecipitation of thyroid hormone receptor-DNA complexes.

    PubMed Central

    Bigler, J; Eisenman, R N

    1994-01-01

    Thyroid hormone (T3) receptor (TR) is a ligand-dependent transcription factor that acts through specific binding sites in the promoter region of target genes. In order to identify new genes that are regulated by T3, we used anti-TR antiserum to immunoprecipitate TR-DNA complexes from GH4 cell nuclei that had previously been treated with a restriction enzyme. Screening of the immunopurified, cloned DNA for TR binding sites by electrophoretic mobility shift assay yielded 53 positive clones. A subset of these clones was specifically immunoprecipitated with anti-TR antiserum and may therefore represent biologically significant binding sites. One of these clones, clone 122, was characterized in detail. It includes sequences highly related to the NICER long terminal repeat-like element and contains three TR binding sites as determined by DNase I footprinting. Two of the clone 122 TR binding sites are located upstream of the TATA box, and one is located downstream. The TR binding site downstream from the promoter was necessary and sufficient to confer T3-dependent regulation in transient transfection experiments. Expression of a reporter construct under the control of the clone 122 promoter region was activated by TR in the absence of ligand and returned to basal levels after T3 addition. Clone 122 sequences hybridize to at least two different mRNAs of approximately 6 and 10 kb from GH4 cells. The levels of both of these mRNAs increased upon removal of T3. Our studies suggest that specific immunoprecipitation of chromatin allows identification of binding sites and target genes for transcription factors. PMID:7935476

  20. Arabidopsis Polycomb Repressive Complex 2 binding sites contain putative GAGA factor binding motifs within coding regions of genes

    PubMed Central

    2013-01-01

    Background: Polycomb Repressive Complex 2 (PRC2) is an essential regulator of gene expression that maintains genes in a repressed state by marking chromatin with trimethylated Histone H3 lysine 27 (H3K27me3). In Arabidopsis, loss of PRC2 function leads to pleiotropic effects on growth and development thought to be due to ectopic expression of seed and embryo-specific genes. While there is some understanding of the mechanisms by which specific genes are targeted by PRC2 in animal systems, it is still not clear how PRC2 is recruited to specific regions of plant genomes. Results: We used ChIP-seq to determine the genome-wide distribution of hemagglutinin (HA)-tagged FERTILIZATION INDEPENDENT ENDOSPERM (FIE-HA), the Extra Sex Combs homolog protein present in all Arabidopsis PRC2 complexes. We found that the FIE-HA binding sites co-locate with a subset of the H3K27me3 sites in the genome and that the associated genes were more likely to be de-repressed in mutants of PRC2 components. The FIE-HA binding sites are enriched for three sequence motifs including a putative GAGA factor binding site that is also found in Drosophila Polycomb Response Elements (PREs). Conclusions: Our results suggest that PRC2 binding sites in plant genomes share some sequence features with Drosophila PREs. However, unlike Drosophila PREs which are located in promoters and devoid of H3K27me3, Arabidopsis FIE binding sites tend to be in gene coding regions and co-localize with H3K27me3. PMID:24001316

  1. Structural basis of DNA bending and oriented heterodimer binding by the basic leucine zipper domains of Fos and Jun.

    PubMed

    Leonard, D A; Rajaram, N; Kerppola, T K

    1997-05-13

    Interactions among transcription factors that bind to separate sequence elements require bending of the intervening DNA and juxtaposition of interacting molecular surfaces in an appropriate orientation. Here, we examine the effects of single amino acid substitutions adjacent to the basic regions of Fos and Jun as well as changes in sequences flanking the AP-1 site on DNA bending. Substitution of charged amino acid residues at positions adjacent to the basic DNA-binding domains of Fos and Jun altered DNA bending. The change in DNA bending was directly proportional to the change in net charge for all heterodimeric combinations between these proteins. Fos and Jun induced distinct DNA bends at different binding sites. Exchange of a single base pair outside of the region contacted in the x-ray crystal structure altered DNA bending. Substitution of base pairs flanking the AP-1 site had converse effects on the opposite directions of DNA bending induced by homodimers and heterodimers. These results suggest that Fos and Jun induce DNA bending in part through electrostatic interactions between amino acid residues adjacent to the basic region and base pairs flanking the AP-1 site. DNA bending by Fos and Jun at inverted binding sites indicated that heterodimers bind to the AP-1 site in a preferred orientation. Mutation of a conserved arginine within the basic regions of Fos and transversion of the central C:G base pair in the AP-1 site to G:C had complementary effects on the orientation of heterodimer binding and DNA bending. The conformational variability of the Fos-Jun-AP-1 complex may contribute to its functional versatility at different promoters.

  2. The NS1 polypeptide of the murine parvovirus minute virus of mice binds to DNA sequences containing the motif [ACCA]2-3.

    PubMed Central

    Cotmore, S F; Christensen, J; Nüesch, J P; Tattersall, P

    1995-01-01

    A DNA fragment containing the minute virus of mice 3' replication origin was specifically coprecipitated in immune complexes containing the virally coded NS1, but not the NS2, polypeptide. Antibodies directed against the amino- or carboxy-terminal regions of NS1 precipitated the NS1-origin complexes, but antibodies directed against NS1 amino acids 284 to 459 blocked complex formation. Using affinity-purified histidine-tagged NS1 preparations, we have shown that the specific protein-DNA interaction is of moderate affinity, being stable in 0.1 M salt but rapidly lost at higher salt concentrations. In contrast, generalized (or nonspecific) DNA binding by NS1 could be demonstrated only in low salt. Addition of ATP or gamma S-ATP enhanced specific DNA binding by wild-type NS1 severalfold, but binding was lost under conditions which favored ATP hydrolysis. NS1 molecules with mutations in a critical lysine residue (amino acid 405) in the consensus ATP-binding site bound to the origin, but this binding could not be enhanced by ATP addition. DNase I protection assays carried out with wild-type NS1 in the presence of gamma S-ATP gave footprints which extended over 43 nucleotides on both DNA strands, from the middle of the origin bubble sequence to a position some 14 bp beyond the nick site. The DNA-binding site for NS1 was mapped to a 22-bp fragment from the middle of the 3' replication origin which contains the sequence ACCAACCA. This conforms to a reiterated motif (ACCA)2-3, which occurs, in more or less degenerate form, at many sites throughout the minute virus of mice genome (J. W. Bodner, Virus Genes 2:167-182, 1989). Insertion of a single copy of the sequence (ACCA)3 was shown to be sufficient to confer NS1 binding on an otherwise unrecognized plasmid fragment. The functions of NS1 in the viral life cycle are reevaluated in the light of this result. PMID:7853501
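
    The reiterated motif (ACCA)2-3 named in this abstract maps directly onto a regular expression with a bounded repeat. The following is a minimal sketch: the function name and the example sequence are hypothetical, and the pattern matches only exact copies on the given strand, whereas the abstract notes that the motif also occurs in more or less degenerate form.

```python
import re

# Two or three exact tandem copies of ACCA, i.e. the motif (ACCA)2-3.
ACCA_REPEAT = re.compile(r"(?:ACCA){2,3}")


def find_acca_repeats(seq):
    """Return (start, matched_text) for each exact (ACCA)2-3 match.

    Start positions are 0-based. Only the given strand is scanned and
    degenerate copies are not detected (both are simplifications).
    """
    return [(m.start(), m.group()) for m in ACCA_REPEAT.finditer(seq.upper())]


# Hypothetical fragment containing the ACCAACCA sequence from the abstract.
print(find_acca_repeats("GGACCAACCATT"))
```

    Because the quantifier is greedy, a run of three copies is reported as a single 12-nucleotide match rather than as overlapping two-copy matches.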

  3. The NS1 polypeptide of the murine parvovirus minute virus of mice binds to DNA sequences containing the motif [ACCA]2-3.

    PubMed

    Cotmore, S F; Christensen, J; Nüesch, J P; Tattersall, P

    1995-03-01

    A DNA fragment containing the minute virus of mice 3' replication origin was specifically coprecipitated in immune complexes containing the virally coded NS1, but not the NS2, polypeptide. Antibodies directed against the amino- or carboxy-terminal regions of NS1 precipitated the NS1-origin complexes, but antibodies directed against NS1 amino acids 284 to 459 blocked complex formation. Using affinity-purified histidine-tagged NS1 preparations, we have shown that the specific protein-DNA interaction is of moderate affinity, being stable in 0.1 M salt but rapidly lost at higher salt concentrations. In contrast, generalized (or nonspecific) DNA binding by NS1 could be demonstrated only in low salt. Addition of ATP or gamma S-ATP enhanced specific DNA binding by wild-type NS1 severalfold, but binding was lost under conditions which favored ATP hydrolysis. NS1 molecules with mutations in a critical lysine residue (amino acid 405) in the consensus ATP-binding site bound to the origin, but this binding could not be enhanced by ATP addition. DNase I protection assays carried out with wild-type NS1 in the presence of gamma S-ATP gave footprints which extended over 43 nucleotides on both DNA strands, from the middle of the origin bubble sequence to a position some 14 bp beyond the nick site. The DNA-binding site for NS1 was mapped to a 22-bp fragment from the middle of the 3' replication origin which contains the sequence ACCAACCA. This conforms to a reiterated motif (ACCA)2-3, which occurs, in more or less degenerate form, at many sites throughout the minute virus of mice genome (J. W. Bodner, Virus Genes 2:167-182, 1989). Insertion of a single copy of the sequence (ACCA)3 was shown to be sufficient to confer NS1 binding on an otherwise unrecognized plasmid fragment. The functions of NS1 in the viral life cycle are reevaluated in the light of this result.

  4. Diversity of Functionally Permissive Sequences in the Receptor-Binding Site of Influenza Hemagglutinin.

    PubMed

    Wu, Nicholas C; Xie, Jia; Zheng, Tianqing; Nycholat, Corwin M; Grande, Geramie; Paulson, James C; Lerner, Richard A; Wilson, Ian A

    2017-06-14

    Influenza A virus hemagglutinin (HA) initiates viral entry by engaging host receptor sialylated glycans via its receptor-binding site (RBS). The amino acid sequence of the RBS naturally varies across avian and human influenza virus subtypes and is also evolvable. However, functional sequence diversity in the RBS has not been fully explored. Here, we performed a large-scale mutational analysis of the RBS of A/WSN/33 (H1N1) and A/Hong Kong/1/1968 (H3N2) HAs. Many replication-competent mutants not yet observed in nature were identified, including some that could escape from an RBS-targeted broadly neutralizing antibody. This functional sequence diversity is made possible by pervasive epistasis in the RBS 220-loop and can be buffered by avidity in viral receptor binding. Overall, our study reveals that the HA RBS can accommodate a much greater range of sequence diversity than previously thought, which has significant implications for the complex evolutionary interrelationships between receptor specificity and immune escape. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Regulation of CYBB Gene Expression in Human Phagocytes by a Distant Upstream NF-κB Binding Site.

    PubMed

    Frazão, Josias B; Thain, Alison; Zhu, Zhiqing; Luengo, Marcos; Condino-Neto, Antonio; Newburger, Peter E

    2015-09-01

    The human CYBB gene encodes the gp91-phox component of the phagocyte oxidase enzyme complex, which is responsible for generating superoxide and other downstream reactive oxygen species essential to microbial killing. In the present study, we have identified by sequence analysis a putative NF-κB binding site in a DNase I hypersensitive site, termed HS-II, located in the distant 5' flanking region of the CYBB gene. Electrophoretic mobility assays showed binding of the sequence element by recombinant NF-κB protein p50 and by proteins in nuclear extract from the HL-60 myeloid leukemia cell line corresponding to p50 and to p50/p65 heterodimers. Chromatin immunoprecipitation demonstrated NF-κB binding to the site in intact HL-60 cells. Chromosome conformation capture (3C) assays demonstrated physical interaction between the NF-κB binding site and the CYBB promoter region. Inhibition of NF-κB activity by salicylate reduced CYBB expression in peripheral blood neutrophils and differentiated U937 monocytic leukemia cells. U937 cells transfected with a mutant inhibitor of κB "super-repressor" showed markedly diminished CYBB expression. Luciferase reporter analysis of the NF-κB site linked to the CYBB 5' flanking promoter region revealed enhanced expression, augmented by treatment with interferon-γ. These studies indicate a role for this distant, 15 kb upstream, binding site in NF-κB regulation of the CYBB gene, an essential component of phagocyte-mediated host defense. © 2015 Wiley Periodicals, Inc.

  6. Structure and specificity of the RNA-guided endonuclease Cas9 during DNA interrogation, target binding and cleavage

    PubMed Central

    Josephs, Eric A.; Kocak, D. Dewran; Fitzgibbon, Christopher J.; McMenemy, Joshua; Gersbach, Charles A.; Marszalek, Piotr E.

    2015-01-01

    CRISPR-associated endonuclease Cas9 cuts DNA at variable target sites designated by a Cas9-bound RNA molecule. Cas9's ability to be directed by single ‘guide RNA’ molecules to target nearly any sequence has been recently exploited for a number of emerging biological and medical applications. Therefore, understanding the nature of Cas9's off-target activity is of paramount importance for its practical use. Using atomic force microscopy (AFM), we directly resolve individual Cas9 and nuclease-inactive dCas9 proteins as they bind along engineered DNA substrates. High-resolution imaging allows us to determine their relative propensities to bind with different guide RNA variants to targeted or off-target sequences. Mapping the structural properties of Cas9 and dCas9 to their respective binding sites reveals a progressive conformational transformation at DNA sites with increasing sequence similarity to its target. With kinetic Monte Carlo (KMC) simulations, these results provide evidence of a ‘conformational gating’ mechanism driven by the interactions between the guide RNA and the 14th–17th nucleotide region of the targeted DNA, the stabilities of which we find correlate significantly with reported off-target cleavage rates. KMC simulations also reveal potential methodologies to engineer guide RNA sequences with improved specificity by considering the invasion of guide RNAs into targeted DNA duplex. PMID:26384421

  7. A Feature-Based Approach to Modeling Protein–DNA Interactions

    PubMed Central

    Segal, Eran

    2008-01-01

    Transcription factor (TF) binding to its DNA target site is a fundamental regulatory interaction. The most common model used to represent TF binding specificities is a position specific scoring matrix (PSSM), which assumes independence between binding positions. However, in many cases, this simplifying assumption does not hold. Here, we present feature motif models (FMMs), a novel probabilistic method for modeling TF–DNA interactions, based on log-linear models. Our approach uses sequence features to represent TF binding specificities, where each feature may span multiple positions. We develop the mathematical formulation of our model and devise an algorithm for learning its structural features from binding site data. We also developed a discriminative motif finder, which discovers de novo FMMs that are enriched in target sets of sequences compared to background sets. We evaluate our approach on synthetic data and on the widely used TF chromatin immunoprecipitation (ChIP) dataset of Harbison et al. We then apply our algorithm to high-throughput TF ChIP data from mouse and human, reveal sequence features that are present in the binding specificities of mouse and human TFs, and show that FMMs explain TF binding significantly better than PSSMs. Our FMM learning and motif finder software are available at http://genie.weizmann.ac.il/. PMID:18725950
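
    The independence assumption that this abstract attributes to PSSMs can be made concrete with a short sketch: the score of a candidate site is simply a sum of per-position terms, so no term can depend on more than one position. The function names, the pseudocount, the uniform background, and the toy sites below are illustrative assumptions; this is not the FMM method or the authors' software.

```python
import math

BASES = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=None):
    """Build a log-odds PSSM (one dict per position) from aligned sites.

    Each entry is log2(p(base at position) / p(base in background)).
    The uniform background and pseudocount of 1 are assumptions.
    """
    background = background or {b: 0.25 for b in BASES}
    pssm = []
    for pos in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pssm.append(
            {b: math.log2((counts[b] / total) / background[b]) for b in BASES}
        )
    return pssm


def score(pssm, seq):
    """Score a sequence as the sum of independent per-position terms."""
    return sum(column[base] for column, base in zip(pssm, seq))


# Hypothetical aligned sites, for illustration only.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]
pssm = build_pssm(sites)
print(score(pssm, "TGACTCA"), score(pssm, "ACGTACG"))
```

    A model with features spanning multiple positions, as the abstract describes for FMMs, would add terms that depend on two or more bases jointly, which this additive form cannot express.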

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steiner, B.; Cousot, D.; Trzeciak, A.

    The platelet glycoprotein IIb-IIIa complex (GP IIb-IIIa) is a member of the integrin receptor family that recognizes adhesive proteins containing the Arg-Gly-Asp (RGD) sequence. In the present study the binding characteristics of the synthetic hexapeptide Tyr-Asn-Arg-Gly-Asp-Ser (YNRGDS, a sequence present in the fibrinogen alpha-chain at position 570-575) to purified GP IIb-IIIa were determined by equilibrium dialysis. The binding of 125I-YNRGDS to GP IIb-IIIa was specific, saturable, and reversible. The apparent dissociation constant was 1.0 +/- 0.2 microM, and the maximal binding capacity was 0.92 +/- 0.02 mol of 125I-YNRGDS/mol of GP IIb-IIIa, indicating that GP IIb-IIIa contains a single binding site for RGD peptides. The binding of 125I-YNRGDS to purified GP IIb-IIIa showed many of the characteristics of fibrinogen binding to activated platelets: the binding was inhibited by fibrinogen, by the monoclonal antibody A2A9, and by the dodecapeptide from the C terminus of the fibrinogen gamma-chain. In addition, the binding of 125I-YNRGDS to GP IIb-IIIa was divalent cation-dependent. Our data suggest that two divalent cation binding sites must be occupied for YNRGDS to bind: one site is specific for calcium and is saturated at 1 microM free Ca2+, whereas the other site is less specific and reaches saturation at millimolar concentrations of either Ca2+ or Mg2+. The results of the present study support the hypothesis that the RGD domains within the adhesive proteins are responsible for their binding to GP IIb-IIIa.
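
    The reported dissociation constant and maximal binding capacity can be read through a simple single-site saturation model, B = Bmax * [L] / (Kd + [L]). The sketch below assumes that model; the abstract does not state which fitting procedure the authors used, and the function name and default arguments are hypothetical, taken from the reported point estimates only.

```python
def bound_per_receptor(free_ligand_uM, kd_uM=1.0, bmax=0.92):
    """Moles of ligand bound per mole of receptor at equilibrium.

    Assumes a single class of independent sites:
        B = Bmax * [L] / (Kd + [L])
    Defaults are the point estimates quoted in the abstract
    (Kd = 1.0 microM, Bmax = 0.92 mol/mol); uncertainties are ignored.
    """
    if free_ligand_uM < 0:
        raise ValueError("free ligand concentration must be non-negative")
    return bmax * free_ligand_uM / (kd_uM + free_ligand_uM)


# Occupancy at a few free-ligand concentrations (microM).
for conc in (0.1, 1.0, 10.0):
    print(conc, round(bound_per_receptor(conc), 3))
```

    Under this model, half of the maximal binding is reached when the free ligand concentration equals Kd, and a Bmax close to 1 mol/mol is what one would expect for a single site per receptor.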

  9. A tool for calculating binding-site residues on proteins from PDB structures.

    PubMed

    Hu, Jing; Yan, Changhui

    2009-08-03

    In the research on protein functional sites, researchers often need to identify binding-site residues on a protein. A commonly used strategy is to find a complex structure from the Protein Data Bank (PDB) that consists of the protein of interest and its interacting partner(s) and calculate binding-site residues based on the complex structure. However, since a protein may participate in multiple interactions, the binding-site residues calculated based on one complex structure usually do not reveal all binding sites on a protein. Thus, this requires researchers to find all PDB complexes that contain the protein of interest and combine the binding-site information gleaned from them. This process is very time-consuming. In particular, combining binding-site information obtained from different PDB structures requires tedious work to align protein sequences. The process becomes overwhelmingly difficult when researchers have a large set of proteins to analyze, which is usually the case in practice. In this study, we have developed a tool for calculating binding-site residues on proteins, TCBRP (http://yanbioinformatics.cs.usu.edu:8080/ppbindingsubmit). For an input protein, TCBRP can quickly find all binding-site residues on the protein by automatically combining the information obtained from all PDB structures that consist of the protein of interest. Additionally, TCBRP presents the binding-site residues in different categories according to the interaction type. TCBRP also allows researchers to set the definition of binding-site residues. The developed tool is very useful for the research on protein binding site analysis and prediction.

  10. Sequence Discrimination by Alternatively Spliced Isoforms of a DNA Binding Zinc Finger Domain

    NASA Astrophysics Data System (ADS)

    Gogos, Joseph A.; Hsu, Tien; Bolton, Jesse; Kafatos, Fotis C.

    1992-09-01

    Two major developmentally regulated isoforms of the Drosophila chorion transcription factor CF2 differ by an extra zinc finger within the DNA binding domain. The preferred DNA binding sites were determined and are distinguished by an internal duplication of TAT in the site recognized by the isoform with the extra finger. The results are consistent with modular interactions between zinc fingers and trinucleotides and also suggest rules for recognition of AT-rich DNA sites by zinc finger proteins. The results show how modular finger interactions with trinucleotides can be used, in conjunction with alternative splicing, to alter the binding specificity and increase the spectrum of sites recognized by a DNA binding domain. Thus, CF2 may potentially regulate distinct sets of target genes during development.

  11. Functional specificity of a Hox protein mediated by the recognition of minor groove structure.

    PubMed

    Joshi, Rohit; Passner, Jonathan M; Rohs, Remo; Jain, Rinku; Sosinsky, Alona; Crickmore, Michael A; Jacob, Vinitha; Aggarwal, Aneel K; Honig, Barry; Mann, Richard S

    2007-11-02

    The recognition of specific DNA-binding sites by transcription factors is a critical yet poorly understood step in the control of gene expression. Members of the Hox family of transcription factors bind DNA by making nearly identical major groove contacts via the recognition helices of their homeodomains. In vivo specificity, however, often depends on extended and unstructured regions that link Hox homeodomains to a DNA-bound cofactor, Extradenticle (Exd). Using a combination of structure determination, computational analysis, and in vitro and in vivo assays, we show that Hox proteins recognize specific Hox-Exd binding sites via residues located in these extended regions that insert into the minor groove but only when presented with the correct DNA sequence. Our results suggest that these residues, which are conserved in a paralog-specific manner, confer specificity by recognizing a sequence-dependent DNA structure instead of directly reading a specific DNA sequence.

  12. Structural basis of UGUA recognition by the Nudix protein CFIm25 and implications for a regulatory role in mRNA 3′ processing

    PubMed Central

    Yang, Qin; Gilmartin, Gregory M.; Doublié, Sylvie

    2010-01-01

    Human Cleavage Factor Im (CFIm) is an essential component of the pre-mRNA 3′ processing complex that functions in the regulation of poly(A) site selection through the recognition of UGUA sequences upstream of the poly(A) site. Although the highly conserved 25 kDa subunit (CFIm25) of the CFIm complex possesses a characteristic α/β/α Nudix fold, CFIm25 has no detectable hydrolase activity. Here we report the crystal structures of the human CFIm25 homodimer in complex with UGUAAA and UUGUAU RNA sequences. CFIm25 is the first Nudix protein to be reported to bind RNA in a sequence-specific manner. The UGUA sequence contributes to binding specificity through an intramolecular G:A Watson–Crick/sugar-edge base interaction, an unusual pairing previously found to be involved in the binding specificity of the SAM-III riboswitch. The structures, together with mutational data, suggest a novel mechanism for the simultaneous sequence-specific recognition of two UGUA elements within the pre-mRNA. Furthermore, the mutually exclusive binding of RNA and the signaling molecule Ap4A (diadenosine tetraphosphate) by CFIm25 suggests a potential role for small molecules in the regulation of mRNA 3′ processing. PMID:20479262

  13. Structural basis of UGUA recognition by the Nudix protein CFI(m)25 and implications for a regulatory role in mRNA 3' processing.

    PubMed

    Yang, Qin; Gilmartin, Gregory M; Doublié, Sylvie

    2010-06-01

    Human Cleavage Factor Im (CFI(m)) is an essential component of the pre-mRNA 3' processing complex that functions in the regulation of poly(A) site selection through the recognition of UGUA sequences upstream of the poly(A) site. Although the highly conserved 25 kDa subunit (CFI(m)25) of the CFI(m) complex possesses a characteristic alpha/beta/alpha Nudix fold, CFI(m)25 has no detectable hydrolase activity. Here we report the crystal structures of the human CFI(m)25 homodimer in complex with UGUAAA and UUGUAU RNA sequences. CFI(m)25 is the first Nudix protein to be reported to bind RNA in a sequence-specific manner. The UGUA sequence contributes to binding specificity through an intramolecular G:A Watson-Crick/sugar-edge base interaction, an unusual pairing previously found to be involved in the binding specificity of the SAM-III riboswitch. The structures, together with mutational data, suggest a novel mechanism for the simultaneous sequence-specific recognition of two UGUA elements within the pre-mRNA. Furthermore, the mutually exclusive binding of RNA and the signaling molecule Ap(4)A (diadenosine tetraphosphate) by CFI(m)25 suggests a potential role for small molecules in the regulation of mRNA 3' processing.

  14. Glucocorticoids suppress tumor necrosis factor-alpha expression by human monocytic THP-1 cells by suppressing transactivation through adjacent NF-kappa B and c-Jun-activating transcription factor-2 binding sites in the promoter.

    PubMed

    Steer, J H; Kroeger, K M; Abraham, L J; Joyce, D A

    2000-06-16

    Glucocorticoid drugs suppress tumor necrosis factor-alpha (TNF-alpha) synthesis by activated monocyte/macrophages, contributing to an anti-inflammatory action in vivo. In lipopolysaccharide (LPS)-activated human monocytic THP-1 cells, glucocorticoids acted primarily on the TNF-alpha promoter to suppress a burst of transcriptional activity that occurred between 90 min and 3 h after LPS exposure. LPS increased nuclear c-Jun/ATF-2, NF-kappaB(1)/Rel-A, and Rel-A/C-Rel transcription factor complexes, which bound specifically to oligonucleotide sequences from the -106 to -88 base pair (bp) region of the promoter. The glucocorticoid, dexamethasone, suppressed nuclear binding activity of these complexes prior to and during the critical phase of TNF-alpha transcription. Site-directed mutagenesis in TNF-alpha promoter-luciferase reporter constructs showed that the adjacent c-Jun/ATF-2 (-106 to -99 bp) and NF-kappaB (-97 to -88 bp) binding sites each contributed to the LPS-stimulated expression. Mutating both sites largely prevented dexamethasone from suppressing TNF-alpha promoter-luciferase reporters. LPS exposure also increased nuclear Egr-1 and PU.1 abundance. The Egr-1/Sp1 (-172 to -161 bp) binding sites and the PU.1-binding Ets site (-116 to -110 bp) each contributed to the LPS-stimulated expression but not to glucocorticoid response. Dexamethasone suppressed the abundance of the c-Fos/c-Jun complex in THP-1 cell nuclei, but there was no direct evidence for c-Fos/c-Jun transactivation through sites in the -172 to -52 bp region. Small contributions to glucocorticoid response were attributable to promoter sequences outside the -172 to -88 bp region and to sequences in the TNF-alpha 3'-untranslated region. We conclude that glucocorticoids suppress LPS-stimulated secretion of TNF-alpha from human monocytic cells largely through antagonizing transactivation by c-Jun/ATF-2 and NF-kappaB complexes at binding sites in the -106 to -88 bp region of the TNF-alpha promoter.

  15. Intercalation of XR5944 with the estrogen response element is modulated by the tri-nucleotide spacer sequence between half-sites

    PubMed Central

    Sidell, Neil; Mathad, Raveendra I.; Shu, Feng-jue; Zhang, Zhenjiang; Kallen, Caleb B.; Yang, Danzhou

    2011-01-01

    DNA-intercalating molecules can impair DNA replication, DNA repair, and gene transcription. We previously demonstrated that XR5944, a DNA bis-intercalator, specifically blocks binding of estrogen receptor-α (ERα) to the consensus estrogen response element (ERE). The consensus ERE sequence is AGGTCAnnnTGACCT, where nnn is known as the tri-nucleotide spacer. Recent work has shown that the tri-nucleotide spacer can modulate ERα-ERE binding affinity and ligand-mediated transcriptional responses. To further understand the mechanism by which XR5944 inhibits ERα-ERE binding, we tested its ability to interact with consensus EREs with variable tri-nucleotide spacer sequences and with natural but non-consensus ERE sequences using one dimensional nuclear magnetic resonance (1D 1H NMR) titration studies. We found that the tri-nucleotide spacer sequence significantly modulates the binding of XR5944 to EREs. Of the sequences that were tested, EREs with CGG and AGG spacers showed the best binding specificity with XR5944, while those spaced with TTT demonstrated the least specific binding. The binding stoichiometry of XR5944 with EREs was 2:1, which can explain why the spacer influences the drug-DNA interaction; each XR5944 spans four nucleotides (including portions of the spacer) when intercalating with DNA. To validate our NMR results, we conducted functional studies using reporter constructs containing consensus EREs with tri-nucleotide spacers CGG, CTG, and TTT. Results of reporter assays in MCF-7 cells indicated that XR5944 was significantly more potent in inhibiting the activity of CGG- than TTT-spaced EREs, consistent with our NMR results. Taken together, these findings predict that the anti-estrogenic effects of XR5944 will depend not only on ERE half-site composition but also on the tri-nucleotide spacer sequence of EREs located in the promoters of estrogen-responsive genes. PMID:21333738
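
    The consensus ERE given in this abstract, AGGTCAnnnTGACCT, can be located and its tri-nucleotide spacer extracted with a short regular expression. This is a minimal sketch: the function name and the example sequence are hypothetical, and only exact consensus half-sites are matched, so the natural non-consensus EREs mentioned in the abstract would not be found.

```python
import re

# Consensus half-sites AGGTCA and TGACCT separated by any three bases.
ERE = re.compile(r"AGGTCA([ACGT]{3})TGACCT")


def find_consensus_eres(seq):
    """Return (start, spacer) for each exact consensus ERE in `seq`.

    Start positions are 0-based; the spacer is the captured `nnn`.
    """
    return [(m.start(), m.group(1)) for m in ERE.finditer(seq.upper())]


# Hypothetical sequence carrying a CGG-spaced consensus ERE.
print(find_consensus_eres("ttAGGTCACGGTGACCTaa"))
```

    Because the two half-sites are reverse complements of each other, a consensus ERE on the opposite strand also appears as a match on the scanned strand; the spacer is reported as it reads on the scanned strand.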

  16. H-2RIIBP, a member of the nuclear hormone receptor superfamily that binds to both the regulatory element of major histocompatibility class I genes and the estrogen response element.

    PubMed

    Hamada, K; Gleason, S L; Levi, B Z; Hirschfeld, S; Appella, E; Ozato, K

    1989-11-01

    Transcription of major histocompatibility complex (MHC) class I genes is regulated by the conserved MHC class I regulatory element (CRE). The CRE has two factor-binding sites, region I and region II, both of which elicit enhancer function. By screening a mouse lambda gt 11 library with the CRE as a probe, we isolated a cDNA clone that encodes a protein capable of binding to region II of the CRE. This protein, H-2RIIBP (H-2 region II binding protein), bound to the native region II sequence, but not to other MHC cis-acting sequences or to mutant region II sequences, similar to the naturally occurring region II factor in mouse cells. The deduced amino acid sequence of H-2RIIBP revealed two putative zinc fingers homologous to the DNA-binding domain of steroid/thyroid hormone receptors. Although sequence similarity in other regions was minimal, H-2RIIBP has apparent modular domains characteristic of the nuclear hormone receptors. Further analyses showed that both H-2RIIBP and the natural region II factor bind to the estrogen response element (ERE) of the vitellogenin A2 gene. The ERE is composed of a palindrome, and half of this palindrome resembles the region II binding site of the MHC CRE. These results indicate that H-2RIIBP (i) is a member of the superfamily of nuclear hormone receptors and (ii) may regulate not only MHC class I genes but also genes containing the ERE and related sequences. Sequences homologous to the H-2RIIBP gene are widely conserved in the animal kingdom. H-2RIIBP mRNA is expressed in many mouse tissues, in agreement with the distribution of the natural region II factor.

  17. sc-PDB: a database for identifying variations and multiplicity of 'druggable' binding sites in proteins.

    PubMed

    Meslamani, Jamel; Rognan, Didier; Kellenberger, Esther

    2011-05-01

    The sc-PDB database is an annotated archive of druggable binding sites extracted from the Protein Data Bank. It contains all-atom coordinates for 8166 protein-ligand complexes, chosen for their geometrical and physico-chemical properties. The sc-PDB provides a functional annotation for proteins, a chemical description for ligands and the detailed intermolecular interactions for complexes. The sc-PDB now includes a hierarchical classification of all the binding sites within a functional class. The sc-PDB entries were first clustered according to the protein name, irrespective of species. For each cluster, we identified dissimilar sites (e.g. catalytic and allosteric sites of an enzyme). SCOPE AND APPLICATIONS: The classification of sc-PDB targets by binding site diversity was intended to facilitate chemogenomics approaches to drug design. In ligand-based approaches, it avoids comparing ligands that do not share the same binding site. In structure-based approaches, it permits quantitative evaluation of the diversity of the binding site definition (variations in size, sequence and/or structure). The sc-PDB database is freely available at: http://bioinfo-pharma.u-strasbg.fr/scPDB.

  18. Computational analysis of protein-protein interfaces involving an alpha helix: insights for terphenyl-like molecules binding.

    PubMed

    Isvoran, Adriana; Craciun, Dana; Martiny, Virginie; Sperandio, Olivier; Miteva, Maria A

    2013-06-14

    Protein-Protein Interactions (PPIs) are key for many cellular processes. The characterization of PPI interfaces and the prediction of putative ligand binding sites and hot spot residues are essential to design efficient small-molecule modulators of PPI. Terphenyl and its derivatives are small organic molecules known to mimic one face of protein-binding alpha-helical peptides. In this work we focus on several PPIs mediated by alpha-helical peptides. We performed computational sequence- and structure-based analyses in order to evaluate several key physicochemical and surface properties of proteins known to interact with alpha-helical peptides and/or terphenyl and its derivatives. Sequence-based analysis revealed low sequence identity between some of the analyzed proteins binding alpha-helical peptides. Structure-based analysis was performed to calculate the volume, the fractal dimension roughness and the hydrophobicity of the binding regions. Besides the overall hydrophobic character of the binding pockets, some specificities were detected. We showed that the hydrophobicity is not uniformly distributed in different alpha-helix binding pockets, which can help to identify key hydrophobic hot spots. The presence of hydrophobic cavities at the protein surface with a more complex shape than the entire protein surface seems to be an important property related to the ability of proteins to bind alpha-helical peptides and low molecular weight mimetics. Characterization of similarities and specificities of PPI binding sites can be helpful for further development of small molecules targeting alpha-helix binding proteins.

  19. Global Analysis of Transcription Factor-Binding Sites in Yeast Using ChIP-Seq

    PubMed Central

    Lefrançois, Philippe; Gallagher, Jennifer E. G.; Snyder, Michael

    2016-01-01

    Transcription factors influence gene expression through their ability to bind DNA at specific regulatory elements. Specific DNA-protein interactions can be isolated through the chromatin immunoprecipitation (ChIP) procedure, in which DNA fragments bound by the protein of interest are recovered. ChIP is followed by high-throughput DNA sequencing (Seq) to determine the genomic provenance of ChIP DNA fragments and their relative abundance in the sample. This chapter describes a ChIP-Seq strategy adapted for budding yeast to enable the genome-wide characterization of binding sites of transcription factors (TFs) and other DNA-binding proteins in an efficient and cost-effective way. Yeast strains with epitope-tagged TFs are most commonly used for ChIP-Seq, along with their matching untagged control strains. The initial step of ChIP involves the cross-linking of DNA and proteins. Next, yeast cells are lysed and sonicated to shear chromatin into smaller fragments. An antibody against an epitope-tagged TF is used to pull down chromatin complexes containing DNA and the TF of interest. DNA is then purified and proteins degraded. Specific barcoded adapters for multiplex DNA sequencing are ligated to ChIP DNA. Short DNA sequence reads (28–36 base pairs) are parsed according to the barcode and aligned against the yeast reference genome, thus generating a nucleotide-resolution map of transcription factor-binding sites and their occupancy. PMID:25213249
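
    The step in which multiplexed reads are "parsed according to the barcode" can be illustrated with a small demultiplexing routine. This is a sketch under stated assumptions, not the chapter's protocol: it assumes the barcode sits at the 5' end of each read and must match exactly, and the barcode sequences and sample names in the usage example are invented for illustration.

```python
def demultiplex(reads, barcodes):
    """Group reads by leading barcode.

    reads    -- iterable of read sequences (strings)
    barcodes -- dict mapping barcode sequence -> sample name (hypothetical)

    Returns a dict mapping each sample name to the list of reads with the
    barcode trimmed off; reads without a known barcode go to 'unassigned'.
    Assumption: exact 5' barcode match, no sequencing-error tolerance.
    """
    bins = {sample: [] for sample in barcodes.values()}
    bins["unassigned"] = []
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            bins["unassigned"].append(read)
    return bins


if __name__ == "__main__":
    example_barcodes = {"ACGT": "tagged_TF", "TGCA": "untagged_control"}
    example_reads = ["ACGTGGGG", "TGCAAAAA", "NNNNCCCC"]
    print(demultiplex(example_reads, example_barcodes))
```

    After this step the trimmed reads from each sample would be passed to a read aligner; that stage is outside the scope of the sketch.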

  20. Structural and Thermodynamic Signatures of DNA Recognition by Mycobacterium tuberculosis DnaA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tsodikov, Oleg V.; Biswas, Tapan

    An essential protein, DnaA, binds to 9-bp DNA sites within the origin of replication oriC. These binding events are prerequisite to forming an enigmatic nucleoprotein scaffold that initiates replication. The number, sequences, positions, and orientations of these short DNA sites, or DnaA boxes, within the oriCs of different bacteria vary considerably. To investigate features of DnaA boxes that are important for binding Mycobacterium tuberculosis DnaA (MtDnaA), we have determined the crystal structures of the DNA binding domain (DBD) of MtDnaA bound to a cognate MtDnaA-box (at 2.0 Å resolution) and to a consensus Escherichia coli DnaA-box (at 2.3 Å). These structures, complemented by calorimetric equilibrium binding studies of MtDnaA DBD in a series of DnaA-box variants, reveal the main determinants of DNA recognition and establish the [T/C][T/A][G/A]TCCACA sequence as a high-affinity MtDnaA-box. Bioinformatic and calorimetric analyses indicate that DnaA-box sequences in mycobacterial oriCs generally differ from the optimal binding sequence. This sequence variation occurs commonly at the first 2 bp, making an in vivo mycobacterial DnaA-box effectively a 7-mer and not a 9-mer. We demonstrate that the decrease in the affinity of these MtDnaA-box variants for MtDnaA DBD relative to that of the highest-affinity box TTGTCCACA is less than 10-fold. The understanding of DnaA-box recognition by MtDnaA and E. coli DnaA enables one to map DnaA-box sequences in the genomes of M. tuberculosis and other eubacteria.
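
    The abstract's closing remark about mapping DnaA-box sequences in genomes can be sketched by turning the bracketed motif notation into a pattern and scanning a sequence. The helper names below are hypothetical, the scan covers only the strand that is passed in, and exact matching is assumed; the authors' actual mapping procedure is not described in the abstract.

```python
import re


def bracket_motif_to_regex(motif):
    """Convert notation such as '[T/C][T/A][G/A]TCCACA' to '[TC][TA][GA]TCCACA'."""
    return re.sub(
        r"\[([ACGT](?:/[ACGT])*)\]",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )


def find_boxes(sequence, motif):
    """Return (start_index, matched_site) for each motif hit on the given strand.

    Assumption: exact matches only; to cover both strands, call again on
    the reverse complement.
    """
    pattern = re.compile("(?=(" + bracket_motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    high_affinity_box = "[T/C][T/A][G/A]TCCACA"  # motif quoted in the abstract
    print(find_boxes("GGTTGTCCACAGG", high_affinity_box))
```

    Because the abstract notes that in vivo boxes often vary at the first 2 bp, a looser scan could pass only the 7-mer portion of the motif; how permissive to be is a design choice left to the user.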

  1. Does TATA matter? A structural exploration of the selectivity determinants in its complexes with TATA box-binding protein.

    PubMed Central

    Pastor, N; Pardo, L; Weinstein, H

    1997-01-01

    The binding of the TATA box-binding protein (TBP) to a TATA sequence in DNA is essential for eukaryotic basal transcription. TBP binds in the minor groove of DNA, causing a large distortion of the DNA helix. Given the apparent stereochemical equivalence of AT and TA basepairs in the minor groove, DNA deformability must play a significant role in binding site selection, because not all AT-rich sequences are bound effectively by TBP. To gain insight into the precise role that the properties of the TATA sequence have in determining the specificity of the DNA substrates of TBP, the solution structure and dynamics of seven DNA dodecamers have been studied by using molecular dynamics simulations. The analysis of the structural properties of basepair steps in these TATA sequences suggests a reason for the preference for alternating pyrimidine-purine (YR) sequences, but indicates that these properties cannot be the sole determinant of the sequence specificity of TBP. Rather, recognition depends on the interplay between the inherent deformability of the DNA and steric complementarity at the molecular interface. PMID:9251783

  2. Recognizing metal and acid radical ion-binding sites by integrating ab initio modeling with template-based transferals.

    PubMed

    Hu, Xiuzhen; Dong, Qiwen; Yang, Jianyi; Zhang, Yang

    2016-11-01

    More than half of proteins require binding of metal and acid radical ions for their structure and function. Identification of the ion-binding locations is important for understanding the biological functions of proteins. Due to the small size and high versatility of the metal and acid radical ions, however, computational prediction of their binding sites remains difficult. We proposed a new ligand-specific approach devoted to the binding site prediction of the 13 ion ligands that are most frequently seen in protein databases: nine metal ions (Zn²⁺, Cu²⁺, Fe²⁺, Fe³⁺, Ca²⁺, Mg²⁺, Mn²⁺, Na⁺, K⁺) and four acid radical ions (CO₃²⁻, NO₂⁻, SO₄²⁻, PO₄³⁻). A sequence-based ab initio model is first trained on sequence profiles, where a modified AdaBoost algorithm is extended to balance binding and non-binding residue samples. A composite method, IonCom, is then developed to combine the ab initio model with multiple threading alignments to further improve the robustness of the binding site predictions. The pipeline was tested using 5-fold cross validations on a comprehensive set of 2,100 non-redundant proteins bound with 3,075 small ion ligands. Significant advantage was demonstrated compared with state-of-the-art ligand-binding methods including COACH and TargetS for high-accuracy ion-binding site identification. Detailed data analyses show that the major advantage of IonCom lies in the integration of complementary ab initio and template-based components. Ion-specific feature design and binding library selection also contribute to the improvement of small ion ligand binding predictions. http://zhanglab.ccmb.med.umich.edu/IonCom

  3. A Monte Carlo-based framework enhances the discovery and interpretation of regulatory sequence motifs

    PubMed Central

    2012-01-01

    Background: Discovery of functionally significant short, statistically overrepresented subsequence patterns (motifs) in a set of sequences is a challenging problem in bioinformatics. Oftentimes, not all sequences in the set contain a motif. These non-motif-containing sequences complicate the algorithmic discovery of motifs. Filtering the non-motif-containing sequences from the larger set of sequences while simultaneously determining the identity of the motif is, therefore, desirable and a non-trivial problem in motif discovery research. Results: We describe MotifCatcher, a framework that extends the sensitivity of existing motif-finding tools by employing random sampling to effectively remove non-motif-containing sequences from the motif search. We developed two implementations of our algorithm, each built around a commonly used motif-finding tool, and applied our algorithm to three diverse chromatin immunoprecipitation (ChIP) data sets. In each case, the motif finder with the MotifCatcher extension demonstrated improved sensitivity over the motif finder alone. Our approach organizes candidate functionally significant discovered motifs into a tree, which allowed us to make additional insights. In all cases, we were able to support our findings with experimental work from the literature. Conclusions: Our framework demonstrates that additional processing at the sequence entry level can significantly improve the performance of existing motif-finding tools. For each biological data set tested, we were able to propose novel biological hypotheses supported by experimental work from the literature. Specifically, in Escherichia coli, we suggested binding site motifs for 6 non-traditional LexA protein binding sites; in Saccharomyces cerevisiae, we hypothesize 2 disparate mechanisms for novel binding sites of the Cse4p protein; and in Halobacterium sp. NRC-1, we discovered subtle differences in a general transcription factor (GTF) binding site motif across several data sets. We suggest that small differences in our discovered motif could confer specificity for one or more homologous GTF proteins. We offer a free implementation of the MotifCatcher software package at http://www.bme.ucdavis.edu/facciotti/resources_data/software/. PMID:23181585

  4. Alignment-independent comparison of binding sites based on DrugScore potential fields encoded by 3D Zernike descriptors.

    PubMed

    Nisius, Britta; Gohlke, Holger

    2012-09-24

    Analyzing protein binding sites provides detailed insights into the biological processes proteins are involved in, e.g., into drug-target interactions, and so is of crucial importance in drug discovery. Herein, we present novel alignment-independent binding site descriptors based on DrugScore potential fields. The potential fields are transformed to a set of information-rich descriptors using a series expansion in 3D Zernike polynomials. The resulting Zernike descriptors show a promising performance in detecting similarities among proteins with low pairwise sequence identities that bind identical ligands, as well as within subfamilies of one target class. Furthermore, the Zernike descriptors are robust against structural variations among protein binding sites. Finally, the Zernike descriptors show a high data compression power, and computing similarities between binding sites based on these descriptors is highly efficient. Consequently, the Zernike descriptors are a useful tool for computational binding site analysis, e.g., to predict the function of novel proteins, off-targets for drug candidates, or novel targets for known drugs.

  5. LTRs of endogenous retroviruses as a source of Tbx6 binding sites

    NASA Astrophysics Data System (ADS)

    Yasuhiko, Yukuto; Hirabayashi, Yoko; Ono, Ryuichi

    2017-06-01

    Retrotransposons are abundant in mammalian genomes and can modulate the gene expression of surrounding genes by disrupting endogenous binding sites for transcription factors (TFs) or providing novel TFs binding sites within retrotransposon sequences. Here, we show that a (C/T)CACACCT sequence motif in ORR1A, ORR1B, ORR1C and ORR1D, Long Terminal Repeats (LTRs) of MaLR endogenous retrovirus (ERV), is the direct target of Tbx6, an evolutionary conserved family of T-box transcription factors. Moreover, by comparing gene expression between control mice (Tbx6 +/-) and Tbx6-deficient mice (Tbx6 -/-), we demonstrate that at least four genes, Twist2, Pitx2, Oscp1, and Nfxl1, are down-regulated with Tbx6 deficiency. These results suggest that ORR1A, ORR1B, ORR1C and ORR1D may contribute to the evolution of mammalian embryogenesis.

  6. LTRs of Endogenous Retroviruses as a Source of Tbx6 Binding Sites

    PubMed Central

    Yasuhiko, Yukuto; Hirabayashi, Yoko; Ono, Ryuichi

    2017-01-01

    Retrotransposons are abundant in mammalian genomes and can modulate the gene expression of surrounding genes by disrupting endogenous binding sites for transcription factors (TFs) or providing novel TFs binding sites within retrotransposon sequences. Here, we show that a (C/T)CACACCT sequence motif in ORR1A, ORR1B, ORR1C, and ORR1D, Long Terminal Repeats (LTRs) of MaLR endogenous retrovirus (ERV), is the direct target of Tbx6, an evolutionary conserved family of T-box TFs. Moreover, by comparing gene expression between control mice (Tbx6 +/−) and Tbx6-deficient mice (Tbx6 −/−), we demonstrate that at least four genes, Twist2, Pitx2, Oscp1, and Nfxl1, are down-regulated with Tbx6 deficiency. These results suggest that ORR1A, ORR1B, ORR1C and ORR1D may contribute to the evolution of mammalian embryogenesis. PMID:28664156

  7. LTRs of Endogenous Retroviruses as a Source of Tbx6 Binding Sites.

    PubMed

    Yasuhiko, Yukuto; Hirabayashi, Yoko; Ono, Ryuichi

    2017-01-01

    Retrotransposons are abundant in mammalian genomes and can modulate the gene expression of surrounding genes by disrupting endogenous binding sites for transcription factors (TFs) or providing novel TFs binding sites within retrotransposon sequences. Here, we show that a (C/T)CACACCT sequence motif in ORR1A, ORR1B, ORR1C, and ORR1D, Long Terminal Repeats (LTRs) of MaLR endogenous retrovirus (ERV), is the direct target of Tbx6, an evolutionary conserved family of T-box TFs. Moreover, by comparing gene expression between control mice (Tbx6 +/-) and Tbx6-deficient mice (Tbx6 -/-), we demonstrate that at least four genes, Twist2, Pitx2, Oscp1, and Nfxl1, are down-regulated with Tbx6 deficiency. These results suggest that ORR1A, ORR1B, ORR1C and ORR1D may contribute to the evolution of mammalian embryogenesis.

  8. Mutation of mapped TIA-1/TIAR binding sites in the 3' terminal stem-loop of West Nile virus minus-strand RNA in an infectious clone negatively affects genomic RNA amplification.

    PubMed

    Emara, Mohamed M; Liu, Hsuan; Davis, William G; Brinton, Margo A

    2008-11-01

    Previous data showed that the cellular proteins TIA-1 and TIAR bound specifically to the West Nile virus 3' minus-strand stem-loop [WNV3'(-)SL] RNA (37) and colocalized with flavivirus replication complexes in WNV- and dengue virus-infected cells (21). In the present study, the sites on the WNV3'(-)SL RNA required for efficient in vitro T-cell intracellular antigen-related (TIAR) and T-cell intracellular antigen-1 (TIA-1) protein binding were mapped to short AU sequences (UAAUU) located in two internal loops of the WNV3'(-)SL RNA structure. Infectious clone RNAs with all or most of the binding site nucleotides in one of the 3' (-)SL loops deleted or substituted did not produce detectable virus after transfection or subsequent passage. With one exception, deletion/mutation of a single terminal nucleotide in one of the binding sequences had little effect on the efficiency of protein binding or virus production, but mutation of a nucleotide in the middle of a binding sequence reduced both the in vitro protein binding efficiency and virus production. Plaque size, intracellular genomic RNA levels, and virus production progressively decreased with decreasing in vitro TIAR/TIA-1 binding activity, but the translation efficiency of the various mutant RNAs was similar to that of the parental RNA. Several of the mutant RNAs that inefficiently interacted with TIAR/TIA-1 in vitro rapidly reverted in vivo, indicating that they could replicate at a low level and suggesting that an interaction between TIAR/TIA-1 and the viral 3'(-)SL RNA is not required for initial low-level symmetric RNA replication but instead facilitates the subsequent asymmetric amplification of genome RNA from the minus-strand template.

  9. DNA binding sites characterization by means of Rényi entropy measures on nucleotide transitions.

    PubMed

    Perera, Alexandre; Vallverdu, Montserrat; Claria, Francesc; Soria, José Manuel; Caminal, Pere

    2006-01-01

    In this work, parametric information-theory measures for the characterization of binding sites in DNA are extended with the use of transitional probabilities on the sequence. We propose the use of parametric uncertainty measures such as Rényi entropies obtained from the transition probabilities for the study of the binding sites, in addition to nucleotide frequency based Rényi measures. Results are reported in this manuscript comparing transition frequencies (i.e. dinucleotides) and base frequencies for Shannon and parametric Rényi measures for a number of binding sites found in E. coli, lambda and T7 organisms. We observe that, for the evaluated datasets, the information provided by both approaches is not redundant, as they evolve differently under increasing Rényi orders.
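
    For reference, the Rényi entropy of order alpha for a discrete distribution p is H_alpha(p) = log2(sum_i p_i^alpha) / (1 - alpha), and it reduces to the Shannon entropy as alpha approaches 1. The sketch below computes it from base frequencies and from dinucleotide frequencies of a single sequence. It is an illustration only: the function names are hypothetical, and estimating the distributions from plain counts on one sequence is an assumption, not necessarily the estimator used in the paper.

```python
import math
from collections import Counter


def renyi_entropy(probs, alpha):
    """Rényi entropy (in bits) of order alpha; alpha == 1 gives Shannon entropy."""
    probs = [p for p in probs if p > 0]
    if abs(alpha - 1.0) < 1e-12:
        return -sum(p * math.log2(p) for p in probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)


def base_frequencies(seq):
    """Relative frequencies of the single nucleotides observed in seq."""
    counts = Counter(seq)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def dinucleotide_frequencies(seq):
    """Relative frequencies of overlapping dinucleotides (transitions) in seq."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return [c / total for c in counts.values()]


if __name__ == "__main__":
    site = "TTGACAATTAATCATCGGCTCGTATAATG"  # arbitrary example sequence
    for alpha in (0.5, 1.0, 2.0):
        print(alpha,
              renyi_entropy(base_frequencies(site), alpha),
              renyi_entropy(dinucleotide_frequencies(site), alpha))
```

    Evaluating both distributions over a range of orders is one way to see the behaviour the abstract describes, where the base-frequency and transition-based measures change differently as the order increases.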

  10. Single-molecule DNA unzipping reveals asymmetric modulation of a transcription factor by its binding site sequence and context

    PubMed Central

    Rudnizky, Sergei; Khamis, Hadeel; Malik, Omri; Squires, Allison H; Meller, Amit; Melamed, Philippa

    2018-01-01

    Most functional transcription factor (TF) binding sites deviate from their ‘consensus’ recognition motif, although their sites and flanking sequences are often conserved across species. Here, we used single-molecule DNA unzipping with optical tweezers to study how Egr-1, a TF harboring three zinc fingers (ZF1, ZF2 and ZF3), is modulated by the sequence and context of its functional sites in the Lhb gene promoter. We find that both the core 9 bp bound to Egr-1 in each of the sites, and the base pairs flanking them, modulate the affinity and structure of the protein–DNA complex. The effect of the flanking sequences is asymmetric, with a stronger effect for the sequence flanking ZF3. Characterization of the dissociation time of Egr-1 revealed that a local, mechanical perturbation of the interactions of ZF3 destabilizes the complex more effectively than a perturbation of the ZF1 interactions. Our results reveal a novel role for ZF3 in the interaction of Egr-1 with other proteins and the DNA, providing insight on the regulation of Lhb and other genes by Egr-1. Moreover, our findings reveal the potential of small changes in DNA sequence to alter transcriptional regulation, and may shed light on the organization of regulatory elements at promoters. PMID:29253225

  11. Twin hydroxymethyluracil-A base pair steps define the binding site for the DNA-binding protein TF1.

    PubMed

    Grove, A; Figueiredo, M L; Galeone, A; Mayol, L; Geiduschek, E P

    1997-05-16

    The DNA-bending protein TF1 is the Bacillus subtilis bacteriophage SPO1-encoded homolog of the bacterial HU proteins and the Escherichia coli integration host factor. We recently proposed that TF1, which binds with high affinity (Kd of approximately 3 nM) to preferred sites within the hydroxymethyluracil (hmU)-containing phage genome, identifies its binding sites based on sequence-dependent DNA flexibility. Here, we show that two hmU-A base pair steps coinciding with two previously proposed sites of DNA distortion are critical for complex formation. The affinity of TF1 is reduced 10-fold when both of these hmU-A base pair steps are replaced with A-hmU, G-C, or C-G steps; only modest changes in affinity result when substitutions are made at other base pairs of the TF1 binding site. Replacement of all hmU residues with thymine decreases the affinity of TF1 greatly; remarkably, the high affinity is restored when the two hmU-A base pair steps corresponding to previously suggested sites of distortion are reintroduced into otherwise T-containing DNA. T-DNA constructs with 3-base bulges spaced apart by 9 base pairs of duplex also generate nM affinity of TF1. We suggest that twin hmU-A base pair steps located at the proposed sites of distortion are key to target site selection by TF1 and that recognition is based largely, if not entirely, on sequence-dependent DNA flexibility.

  12. Global Mapping of Transcription Factor Binding Sites by Sequencing Chromatin Surrogates: a Perspective on Experimental Design, Data Analysis, and Open Problems.

    PubMed

    Wei, Yingying; Wu, George; Ji, Hongkai

    2013-05-01

    Mapping genome-wide binding sites of all transcription factors (TFs) in all biological contexts is a critical step toward understanding gene regulation. The state-of-the-art technologies for mapping transcription factor binding sites (TFBSs) couple chromatin immunoprecipitation (ChIP) with high-throughput sequencing (ChIP-seq) or tiling array hybridization (ChIP-chip). These technologies have limitations: they are low-throughput with respect to surveying many TFs. Recent advances in genome-wide chromatin profiling, including development of technologies such as DNase-seq, FAIRE-seq and ChIP-seq for histone modifications, make it possible to predict in vivo TFBSs by analyzing chromatin features at computationally determined DNA motif sites. This promising new approach may allow researchers to monitor the genome-wide binding sites of many TFs simultaneously. In this article, we discuss various experimental design and data analysis issues that arise when applying this approach. Through a systematic analysis of the data from the Encyclopedia Of DNA Elements (ENCODE) project, we compare the predictive power of individual and combinations of chromatin marks using supervised and unsupervised learning methods, and evaluate the value of integrating information from public ChIP and gene expression data. We also highlight the challenges and opportunities for developing novel analytical methods, such as resolving the one-motif-multiple-TF ambiguity and distinguishing functional and non-functional TF binding targets from the predicted binding sites. The online version of this article (doi:10.1007/s12561-012-9066-5) contains supplementary material, which is available to authorized users.

  13. Identification and removal of low-complexity sites in allele-specific analysis of ChIP-seq data.

    PubMed

    Waszak, Sebastian M; Kilpinen, Helena; Gschwind, Andreas R; Orioli, Andrea; Raghav, Sunil K; Witwicki, Robert M; Migliavacca, Eugenia; Yurovsky, Alisa; Lappalainen, Tuuli; Hernandez, Nouria; Reymond, Alexandre; Dermitzakis, Emmanouil T; Deplancke, Bart

    2014-01-15

    High-throughput sequencing technologies enable the genome-wide analysis of the impact of genetic variation on molecular phenotypes at unprecedented resolution. However, although powerful, these technologies can also introduce unexpected artifacts. We investigated the impact of library amplification bias on the identification of allele-specific (AS) molecular events from high-throughput sequencing data derived from chromatin immunoprecipitation assays (ChIP-seq). Putative AS DNA binding activity for RNA polymerase II was determined using ChIP-seq data derived from lymphoblastoid cell lines of two parent-daughter trios. We found that, at high sequencing depth, many significant AS binding sites suffered from an amplification bias, as evidenced by a larger number of clonal reads representing one of the two alleles. To alleviate this bias, we devised an amplification bias detection strategy, which filters out sites with low read complexity and sites featuring a significant excess of clonal reads. This method will be useful for AS analyses involving ChIP-seq and other functional sequencing assays. The R package absfilter for library clonality simulations and detection of amplification-biased sites is available from http://updepla1srv1.epfl.ch/waszaks/absfilter
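
    The idea of filtering allele-specific sites by read complexity can be illustrated with a toy calculation. The sketch below is not the absfilter algorithm, whose details are not given in the abstract: it assumes that reads sharing a start position are clonal, defines complexity as the fraction of distinct start positions, and uses an arbitrary threshold; all names and values are hypothetical.

```python
def read_complexity(read_starts):
    """Fraction of reads with a distinct start position (1.0 means no duplicates).

    Assumption: reads sharing a start position are treated as clonal copies.
    """
    if not read_starts:
        return 0.0
    return len(set(read_starts)) / len(read_starts)


def passes_clonality_filter(ref_starts, alt_starts, min_complexity=0.5):
    """Keep a site only if reads from both alleles show enough complexity.

    min_complexity is an arbitrary illustrative threshold, not a value
    taken from the paper.
    """
    return (read_complexity(ref_starts) >= min_complexity
            and read_complexity(alt_starts) >= min_complexity)


if __name__ == "__main__":
    # Reference allele reads start at varied positions; alternate allele
    # reads all start at the same position, suggesting amplification bias.
    print(passes_clonality_filter([101, 105, 110, 118], [107, 107, 107, 107]))
```

    A production filter would also need a statistical test for an excess of clonal reads on one allele, which the abstract mentions but this sketch does not attempt.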

  14. Analysis of LexA binding sites and transcriptomics in response to genotoxic stress in Leptospira interrogans.

    PubMed

    Schons-Fonseca, Luciane; da Silva, Josefa B; Milanez, Juliana S; Domingos, Renan H; Smith, Janet L; Nakaya, Helder I; Grossman, Alan D; Ho, Paulo L; da Costa, Renata M A

    2016-02-18

    We determined the effects of DNA damage caused by ultraviolet radiation on gene expression in Leptospira interrogans using DNA microarrays. These data were integrated with DNA binding in vivo of LexA1, a regulator of the DNA damage response, assessed by chromatin immunoprecipitation and massively parallel DNA sequencing (ChIP-seq). In response to DNA damage, Leptospira induced expression of genes involved in DNA metabolism, in mobile genetic elements and defective prophages. The DNA repair genes involved in removal of photo-damage (e.g. nucleotide excision repair uvrABC, recombinases recBCD and resolvases ruvABC) were not induced. Genes involved in various metabolic pathways were down regulated, including genes involved in cell growth, RNA metabolism and the tricarboxylic acid cycle. From ChIP-seq data, we observed 24 LexA1 binding sites located throughout chromosome 1 and one binding site in chromosome 2. Expression of many, but not all, genes near those sites was increased following DNA damage. Binding sites were found as far as 550 bp upstream from the start codon, or 1 kb into the coding sequence. Our findings indicate that there is a shift in gene expression following DNA damage that represses genes involved in cell growth and virulence, and induces genes involved in mutagenesis and recombination.

  15. Sequence information gain based motif analysis.

    PubMed

    Maynou, Joan; Pairó, Erola; Marco, Santiago; Perera, Alexandre

    2015-11-09

    The detection of regulatory regions in candidate sequences is essential for the understanding of the regulation of a particular gene and the mechanisms involved. This paper proposes a novel methodology based on information theoretic metrics for finding regulatory sequences in promoter regions. This methodology (SIGMA) has been tested on genomic sequence data for Homo sapiens and Mus musculus. SIGMA has been compared with different publicly available alternatives for motif detection, such as MEME/MAST, Biostrings (Bioconductor package), MotifRegressor, and previous work such as Qresiduals projections or information theoretic based detectors. Comparative results, in the form of Receiver Operating Characteristic curves, show how, in 70% of the studied Transcription Factor Binding Sites, the SIGMA detector has a better performance and behaves more robustly than the methods compared, while having a similar computational time. The performance of SIGMA can be explained by its parametric simplicity in the modelling of the non-linear co-variability in the binding motif positions. Sequence Information Gain based Motif Analysis is a generalisation of a non-linear model of the cis-regulatory sequences detection based on Information Theory. This generalisation allows us to detect transcription factor binding sites with maximum performance disregarding the covariability observed in the positions of the training set of sequences. SIGMA is freely available to the public at http://b2slab.upc.edu.

  16. Characterization of the rat RALDH1 promoter. A functional CCAAT and octamer motif are critical for basal promoter activity.

    PubMed

    Guimond, Julie; Devost, Dominic; Brodeur, Helene; Mader, Sylvie; Bhat, Pangala V

    2002-12-12

    Retinal dehydrogenase type 1 (RALDH1) catalyzes the oxidation of retinal to retinoic acid (RA), a metabolite of vitamin A important for embryogenesis and tissue differentiation. Rat RALDH1 is expressed at high levels in developing kidney and in stomach and intestine epithelia. To understand the mechanisms of the transcriptional regulation of rat RALDH1, we cloned a 1360-base pair (bp) 5'-flanking region of the RALDH1 gene. Using luciferase reporter constructs transfected into HEK 293 and LLCPK (kidney-derived) cells, basal promoter activity was associated with sequences between -80 and +43. In this minimal promoter region, TATA and CCAAT cis-acting elements as well as SP1, AP1 and octamer (Oct)-binding sites were present. The CCAAT box and Oct-binding site, located between positions -72 and -68 and -56 and -49, respectively, were shown by deletion analysis and site-directed mutation to be critical for promoter activity. Nuclear extracts from kidney cells contain proteins specifically binding the Oct and CCAAT sequences, resulting in the formation of six complexes, while different patterns of complexes were observed with non-kidney cell extracts. Gel shift assays using either single or double mutations of the Oct and CCAAT sequences as well as super shift assays demonstrated single and double occupancy of these two sites by Oct-1 and CBF-A. In addition, unidentified proteins also bound the Oct motif specifically in the absence of CBF-A binding. These results demonstrate specific involvement of Oct and CCAAT-binding proteins in the regulation of the RALDH1 gene.

  17. Nucleosome regulatory dynamics in response to TGFβ

    PubMed Central

    Enroth, Stefan; Andersson, Robin; Bysani, Madhusudhan; Wallerman, Ola; Termén, Stefan; Tuch, Brian B.; De La Vega, Francisco M.; Heldin, Carl-Henrik; Moustakas, Aristidis; Komorowski, Jan; Wadelius, Claes

    2014-01-01

    Nucleosomes play important roles in a cell beyond their basal functionality in chromatin compaction. Their placement affects all steps in transcriptional regulation, from transcription factor (TF) binding to messenger ribonucleic acid (mRNA) synthesis. Careful profiling of their locations and dynamics in response to stimuli is important to further our understanding of transcriptional regulation by the state of chromatin. We measured nucleosome occupancy in human hepatic cells before and after treatment with transforming growth factor beta 1 (TGFβ1), using massively parallel sequencing. With a newly developed method, SuMMIt, for precise positioning of nucleosomes we inferred dynamics of the nucleosomal landscape. Distinct nucleosome positioning has previously been described at transcription start site and flanking TF binding sites. We found that the average pattern is present at very few sites and, in case of TF binding, the double peak surrounding the sites is just an artifact of averaging over many loci. We systematically searched for depleted nucleosomes in stimulated cells compared to unstimulated cells and identified 24 318 loci. Depending on genomic annotation, 44–78% of them were over-represented in binding motifs for TFs. Changes in binding affinity were verified for HNF4α by qPCR. Strikingly many of these loci were associated with expression changes, as measured by RNA sequencing. PMID:24771338

  18. Structural Analysis of HMGD-DNA Complexes Reveal Influence of Intercalation on Sequence Selectivity and DNA Bending

    PubMed Central

    Churchill, Mair E.A.; Klass, Janet; Zoetewey, David L.

    2010-01-01

    The ubiquitous eukaryotic High-Mobility-Group-Box (HMGB) chromosomal proteins promote many chromatin-mediated cellular activities through their non-sequence-specific binding and bending of DNA. Minor groove DNA binding by the HMG box results in substantial DNA bending toward the major groove owing to electrostatic interactions, shape complementarity and DNA intercalation that occurs at two sites. Here, the structures of the complexes formed with DNA by a partially DNA intercalation-deficient mutant of Drosophila melanogaster HMGD have been determined by X-ray crystallography at a resolution of 2.85 Å. The six proteins and fifty base pairs of DNA in the crystal structure revealed a variety of bound conformations. All of the proteins bound in the minor groove, bridging DNA molecules, presumably because these DNA regions are easily deformed. The loss of the primary site of DNA intercalation decreased overall DNA bending and shape complementarity. However, DNA bending at the secondary site of intercalation was retained and most protein-DNA contacts were preserved. The mode of binding resembles the HMGB1-boxA-cisplatin-DNA complex, which also lacks a primary intercalating residue. This study provides new insights into the binding mechanisms used by HMG boxes to recognize varied DNA structures and sequences as well as modulate DNA structure and DNA bending. PMID:20800069

  19. Human La binds mRNAs through contacts to the poly(A) tail.

    PubMed

    Vinayak, Jyotsna; Marrella, Stefano A; Hussain, Rawaa H; Rozenfeld, Leonid; Solomon, Karine; Bayfield, Mark A

    2018-05-04

    In addition to a role in the processing of nascent RNA polymerase III transcripts, La proteins are also associated with promoting cap-independent translation from the internal ribosome entry sites of numerous cellular and viral coding RNAs. La binding to RNA polymerase III transcripts via their common UUU-3'OH motif is well characterized, but the mechanism of La binding to coding RNAs is poorly understood. Using electromobility shift assays and cross-linking immunoprecipitation, we show that in addition to a sequence specific UUU-3'OH binding mode, human La exhibits a sequence specific and length dependent poly(A) binding mode. We demonstrate that this poly(A) binding mode uses the canonical nucleic acid interaction winged helix face of the eponymous La motif, previously shown to be vacant during uridylate binding. We also show that cytoplasmic, but not nuclear La, engages poly(A) RNA in human cells, that La entry into polysomes utilizes the poly(A) binding mode, and that La promotion of translation from the cyclin D1 internal ribosome entry site occurs in competition with cytoplasmic poly(A) binding protein (PABP). Our data are consistent with human La functioning in translation through contacts to the poly(A) tail.

  20. A statistical model for investigating binding probabilities of DNA nucleotide sequences using microarrays.

    PubMed

    Lee, Mei-Ling Ting; Bulyk, Martha L; Whitmore, G A; Church, George M

    2002-12-01

    There is considerable scientific interest in knowing the probability that a site-specific transcription factor will bind to a given DNA sequence. Microarray methods provide an effective means for assessing the binding affinities of a large number of DNA sequences as demonstrated by Bulyk et al. (2001, Proceedings of the National Academy of Sciences, USA 98, 7158-7163) in their study of the DNA-binding specificities of Zif268 zinc fingers using microarray technology. In a follow-up investigation, Bulyk, Johnson, and Church (2002, Nucleic Acids Research 30, 1255-1261) studied the interdependence of nucleotides on the binding affinities of transcription proteins. Our article is motivated by this pair of studies. We present a general statistical methodology for analyzing microarray intensity measurements reflecting DNA-protein interactions. The log probability of a protein binding to a DNA sequence on an array is modeled using a linear ANOVA model. This model is convenient because it employs familiar statistical concepts and procedures and also because it is effective for investigating the probability structure of the binding mechanism.

  1. Definition of IgG- and albumin-binding regions of streptococcal protein G.

    PubMed

    Akerström, B; Nielsen, E; Björck, L

    1987-10-05

    Protein G, the immunoglobulin G-binding surface protein of group C and G streptococci, also binds serum albumin. The albumin-binding site on protein G is distinct from the immunoglobulin G-binding site. By mild acid hydrolysis of the papain-liberated protein G fragment (35 kDa), a 28-kDa fragment was produced which retained full immunoglobulin G-binding activity (determined by Scatchard plotting) but had lost all albumin-binding capacity. A protein G (65 kDa), isolated after cloning and expression of the protein G gene in Escherichia coli, had comparable affinity to immunoglobulin G (5-10 × 10^10 M^-1), but much higher affinity to albumin than the 35- and 28-kDa protein G fragments (31, 2.6, and 0 × 10^9 M^-1, respectively). The amino-terminal amino acid sequences of the 65-, 35-, and 28-kDa fragments allowed us to exactly locate the three fragments in an overall sequence map of protein G, based on the partial gene sequences published by Guss et al. (Guss, B., Eliasson, M., Olsson, A., Uhlen, M., Frej, A.-K., Jörnvall, H., Flock, J.-I., and Lindberg, M. (1986) EMBO J. 5, 1567-1575) and Fahnestock et al. (Fahnestock, S. R., Alexander, P., Nagle, J., and Filpula, D. (1986) J. Bacteriol. 167, 870-880). From this map, the locations of three homologous albumin-binding regions and three homologous immunoglobulin G-binding regions could then be deduced.

  2. Detecting Coevolution in and among Protein Domains

    PubMed Central

    Yeang, Chen-Hsiang; Haussler, David

    2007-01-01

    Correlated changes of nucleic or amino acids have provided strong information about the structures and interactions of molecules. Despite the rich literature in coevolutionary sequence analysis, previous methods often have to trade off between generality, simplicity, phylogenetic information, and specific knowledge about interactions. Furthermore, despite the evidence of coevolution in selected protein families, a comprehensive screening of coevolution among all protein domains is still lacking. We propose an augmented continuous-time Markov process model for sequence coevolution. The model can handle different types of interactions, incorporate phylogenetic information and sequence substitution, has only one extra free parameter, and requires no knowledge about interaction rules. We employ this model to large-scale screenings on the entire protein domain database (Pfam). Strikingly, with 0.1 trillion tests executed, the majority of the inferred coevolving protein domains are functionally related, and the coevolving amino acid residues are spatially coupled. Moreover, many of the coevolving positions are located at functionally important sites of proteins/protein complexes, such as the subunit linkers of superoxide dismutase, the tRNA binding sites of ribosomes, the DNA binding region of RNA polymerase, and the active and ligand binding sites of various enzymes. The results suggest sequence coevolution manifests structural and functional constraints of proteins. The intricate relations between sequence coevolution and various selective constraints are worth pursuing at a deeper level. PMID:17983264

  3. Unusually weak oxygen binding, physical properties, partial sequence, autoxidation rate and a potential phosphorylation site of beluga whale (Delphinapterus leucas) myoglobin.

    PubMed

    Stewart, J M; Blakely, J A; Karpowicz, P A; Kalanxhi, E; Thatcher, B J; Martin, B M

    2004-03-01

    We purified myoglobin from beluga whale (Delphinapterus leucas) muscle (longissimus dorsi) with size exclusion and cation exchange chromatographies. The molecular mass was determined by mass spectrometry (17,081 Da) and the isoelectric pH (9.4) by capillary isoelectric focusing. The near-complete amino acid sequence was determined and a phylogeny indicated that beluga was in the same clade as Dall's and harbor porpoises. There were consensus motifs for a phosphorylation site on the protein surface with the most likely site at serine-117. This motif was common to all cetacean myoglobins examined. Two oxygen-binding studies at 37 °C indicated dissociation constants (20.5 and 23.6 µM) 5.7-6.6 times larger than horse myoglobin (3.6 µM). The autoxidation rate of beluga myoglobin at 37 °C, pH 7.2 was 0.218 ± 0.028 h^-1, one-third larger than reported for myoglobin of terrestrial mammals. There was no clear sequence change to explain the difference in oxygen binding or autoxidation although substitutions (N66 and T67) in an invariant rich sequence (HGNTV) distal to the heme may play a role. Structural models based on the protein sequence and constructed on topologies of known templates (horse and sperm whale crystal structures) were not adequate to assess perturbation of the heme pocket.
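
    The fold difference quoted in the abstract can be checked with simple arithmetic, and the practical meaning of a larger dissociation constant can be seen from the single-site binding relation Y = [L] / (Kd + [L]). The sketch below uses only the three dissociation constants given in the abstract; the function names and the choice of 3.6 µM as an example ligand concentration are illustrative assumptions.

```python
def fold_difference(kd_um, kd_reference_um):
    """Ratio of a dissociation constant to a reference value."""
    return kd_um / kd_reference_um


def fractional_saturation(ligand_um, kd_um):
    """Single-site (hyperbolic) binding: Y = [L] / (Kd + [L])."""
    return ligand_um / (kd_um + ligand_um)


HORSE_KD_UM = 3.6            # horse myoglobin, from the abstract
BELUGA_KD_UM = (20.5, 23.6)  # two beluga measurements, from the abstract

for kd in BELUGA_KD_UM:
    print(round(fold_difference(kd, HORSE_KD_UM), 1))  # 5.7, then 6.6

# At an (assumed) free oxygen concentration equal to the horse Kd,
# horse myoglobin is half saturated while beluga myoglobin is far less so.
print(fractional_saturation(3.6, HORSE_KD_UM))
print(fractional_saturation(3.6, BELUGA_KD_UM[0]))
```

    The ratios reproduce the 5.7-6.6 range reported in the abstract; the saturation values are only as good as the single-site assumption.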

  4. Genomic structure of the human D-site binding protein (DBP) gene

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shutler, G.; Glassco, T.; Kang, Xiaolin

    1996-06-15

    The human gene for the D-Site Binding Protein (DBP) has been sequenced and characterized. This gene is a member of the b/ZIP family of transcription factors and is one of three genes forming the PAR sub-family. DBP has been implicated in the diurnal regulation of a variety of liver-specific genes. Examination of the genomic structure of DBP reveals that the gene is divided into four exons and is contained within a relatively compact region of approximately 6 kb. These exons appear to correspond to functional divisions of the DBP protein. Exon 1 contains a long 5' UTR, and conservation between the rat and the human genes of the presence of small open reading frames within this region suggests that it may play a role in translational control. Exon 2 contains a limited region of similarity to the other PAR domain genes, which may be part of a potential activation domain. Exon 3 contains the PAR domain and differs by only 1 of 71 amino acids between rat and human. Exon 4, containing both the basic and the leucine zipper domains, is likewise highly conserved. The overall degree of homology between the rat and the human cDNA sequences is 82% for the nucleic acid sequence and 92% for the protein sequence. Comparison of the rat and human proximal promoters reveals extensive sequence conservation, with two previously characterized DNA binding sites being conserved at the functional and sequence levels. 31 refs., 4 figs.

  5. Copper and the oxidation of hemoglobin: a comparison of horse and human hemoglobins.

    PubMed

    Rifkind, J M; Lauer, L D; Chiang, S C; Li, N C

    1976-11-30

    Oxidation studies of hemoglobin by Cu(II) indicate that for horse hemoglobin, up to a Cu(II)/heme molar ratio of 0.5, all of the Cu(II) added is used to rapidly oxidize the heme. On the other hand, most of the Cu(II) added to human hemoglobin at low Cu(II)/heme molar ratios is unable to oxidize the heme. Only at Cu(II)/heme molar ratios greater than 0.5 does the amount of oxidation per added Cu(II) approach that of horse hemoglobin. At the same time, binding studies indicate that human hemoglobin has an additional binding site involving one copper for every two hemes, which has a higher copper affinity than the single horse hemoglobin binding site. The Cu(II) oxidation of human hemoglobin is explained utilizing this additional binding site by a mechanism where a transfer of electrons cannot occur between the heme and the Cu(II) bound to the high affinity human binding site. The electron transfer must involve the Cu(II) bound to the lower affinity human hemoglobin binding site, which is similar to the only horse hemoglobin site. The involvement of beta-2 histidine in the binding of this additional copper is indicated by a comparison of the amino acid sequences of various hemoglobins which possess the additional site, with the amino acid sequences of hemoglobins which do not possess the additional site. Zn(II), Hg(II), and N-ethylmaleimide (NEM) are found to decrease the Cu(II) oxidation of hemoglobin. The sulfhydryl reagents, Hg(II) and NEM, produce a very dramatic decrease in the rate of oxidation, which can only be explained by an effect on the rate for the actual transfer of electrons between the Cu(II) and the Fe(II). The effect of Zn(II) is much smaller and can, for the most part, be explained by the increased oxygen affinity, which affects the ligand dissociation process that must precede the electron transfer process.

  6. Programmable RNA recognition and cleavage by CRISPR/Cas9.

    PubMed

    O'Connell, Mitchell R; Oakes, Benjamin L; Sternberg, Samuel H; East-Seletsky, Alexandra; Kaplan, Matias; Doudna, Jennifer A

    2014-12-11

    The CRISPR-associated protein Cas9 is an RNA-guided DNA endonuclease that uses RNA-DNA complementarity to identify target sites for sequence-specific double-stranded DNA (dsDNA) cleavage. In its native context, Cas9 acts on DNA substrates exclusively because both binding and catalysis require recognition of a short DNA sequence, known as the protospacer adjacent motif (PAM), next to and on the strand opposite the twenty-nucleotide target site in dsDNA. Cas9 has proven to be a versatile tool for genome engineering and gene regulation in a large range of prokaryotic and eukaryotic cell types, and in whole organisms, but it has been thought to be incapable of targeting RNA. Here we show that Cas9 binds with high affinity to single-stranded RNA (ssRNA) targets matching the Cas9-associated guide RNA sequence when the PAM is presented in trans as a separate DNA oligonucleotide. Furthermore, PAM-presenting oligonucleotides (PAMmers) stimulate site-specific endonucleolytic cleavage of ssRNA targets, similar to PAM-mediated stimulation of Cas9-catalysed DNA cleavage. Using specially designed PAMmers, Cas9 can be specifically directed to bind or cut RNA targets while avoiding corresponding DNA sequences, and we demonstrate that this strategy enables the isolation of a specific endogenous messenger RNA from cells. These results reveal a fundamental connection between PAM binding and substrate selection by Cas9, and highlight the utility of Cas9 for programmable transcript recognition without the need for tags.

  7. LexA Binds to Transcription Regulatory Site of Cell Division Gene ftsZ in Toxic Cyanobacterium Microcystis aeruginosa.

    PubMed

    Honda, Takashi; Morimoto, Daichi; Sako, Yoshihiko; Yoshida, Takashi

    2018-05-17

    Previously, we showed that DNA replication and cell division in toxic cyanobacterium Microcystis aeruginosa are coordinated by transcriptional regulation of cell division gene ftsZ and that an unknown protein specifically bound upstream of ftsZ (BpFz; DNA-binding protein to an upstream site of ftsZ) during successful DNA replication and cell division. Here, we purified BpFz from M. aeruginosa strain NIES-298 using DNA-affinity chromatography and gel-slicing combined with gel electrophoresis mobility shift assay (EMSA). The N-terminal amino acid sequence of BpFz was identified as TNLESLTQ, which was identical to that of transcription repressor LexA from NIES-843. EMSA analysis using mutant probes showed that the sequence GTACTA-N3-GTGTTC was important in LexA binding. Comparison of the upstream regions of lexA in the genomes of closely related cyanobacteria suggested that the sequence TASTRNNNNTGTWC could be a putative LexA recognition sequence (LexA box). Searches for TASTRNNNNTGTWC as a transcriptional regulatory site (TRS) in the genome of M. aeruginosa NIES-843 showed that it was present in genes involved in cell division, photosynthesis, and extracellular polysaccharide biosynthesis. Considering that BpFz binds to the TRS of ftsZ during normal cell division, LexA may function as a transcriptional activator of genes related to cell reproduction in M. aeruginosa, including ftsZ. This may be an example of informality in the control of bacterial cell division.
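
    A genome search for a degenerate consensus such as TASTRNNNNTGTWC can be sketched by expanding the IUPAC ambiguity codes (S = C/G, R = A/G, W = A/T, N = any base) into a regular expression. The code below is a minimal illustration, not the authors' search procedure: it scans one strand only, and the toy sequence is hypothetical. The helper names iupac_to_regex and find_motif are illustrative.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[symbol] for symbol in motif.upper())


def find_motif(sequence, motif):
    """Return (start, matched_text) for every match, including overlapping ones.

    Only the given strand is scanned; searching the reverse complement
    as well would be needed for a strand-agnostic search.
    """
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


LEXA_BOX = "TASTRNNNNTGTWC"        # consensus quoted in the abstract
toy_seq = "CCGTACTAAAAGTGTTCGG"    # hypothetical sequence for illustration
print(find_motif(toy_seq, LEXA_BOX))  # [(3, 'TACTAAAAGTGTTC')]
```

    The zero-width lookahead is used so that overlapping occurrences are not skipped, which matters for short degenerate motifs in repetitive regions.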

  8. Programmable RNA recognition and cleavage by CRISPR/Cas9

    PubMed Central

    O’Connell, Mitchell R.; Oakes, Benjamin L.; Sternberg, Samuel H.; East-Seletsky, Alexandra; Kaplan, Matias; Doudna, Jennifer A.

    2014-01-01

    The CRISPR-associated protein Cas9 is an RNA-guided DNA endonuclease that uses RNA:DNA complementarity to identify target sites for sequence-specific double-stranded DNA (dsDNA) cleavage. In its native context, Cas9 acts on DNA substrates exclusively because both binding and catalysis require recognition of a short DNA sequence, the protospacer adjacent motif (PAM), next to and on the strand opposite the 20-nucleotide target site in dsDNA. Cas9 has proven to be a versatile tool for genome engineering and gene regulation in many cell types and organisms, but it has been thought to be incapable of targeting RNA. Here we show that Cas9 binds with high affinity to single-stranded RNA (ssRNA) targets matching the Cas9-associated guide RNA sequence when the PAM is presented in trans as a separate DNA oligonucleotide. Furthermore, PAM-presenting oligonucleotides (PAMmers) stimulate site-specific endonucleolytic cleavage of ssRNA targets, similar to PAM-mediated stimulation of Cas9-catalyzed DNA cleavage. Using specially designed PAMmers, Cas9 can be specifically directed to bind or cut RNA targets while avoiding corresponding DNA sequences, and we demonstrate that this strategy enables the isolation of a specific endogenous mRNA from cells. These results reveal a fundamental connection between PAM binding and substrate selection by Cas9, and highlight the utility of Cas9 for programmable and tagless transcript recognition. PMID:25274302

  9. Composite Structural Motifs of Binding Sites for Delineating Biological Functions of Proteins

    PubMed Central

    Kinjo, Akira R.; Nakamura, Haruki

    2012-01-01

    Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures. PMID:22347478

  10. Identification and functional characterization of an Src homology domain 3 domain-binding site on Cbl.

    PubMed

    Sanjay, Archana; Miyazaki, Tsuyoshi; Itzstein, Cecile; Purev, Enkhtsetseg; Horne, William C; Baron, Roland

    2006-12-01

    Cbl is an adaptor protein and ubiquitin ligase that binds and is phosphorylated by the nonreceptor tyrosine kinase Src. We previously showed that the primary interaction between Src and Cbl is mediated by the Src homology domain 3 (SH3) of Src binding to proline-rich sequences of Cbl. The peptide Cbl RDLPPPPPPDRP(540-551), which corresponds to residues 540-551 of Cbl, inhibited the binding of a GST-Src SH3 fusion protein to Cbl, whereas RDLAPPAPPPDR(540-551) did not, suggesting that Src binds to this site on Cbl in a class I orientation. Mutating prolines 543-548 reduced Src binding to the Cbl 479-636 fragment significantly more than mutating the prolines in the PPVPPR(494-499) motif, which was previously reported to bind Src SH3. Mutating Cbl prolines 543-548 to alanines substantially reduced Src binding to Cbl, Src-induced phosphorylation of Cbl, and the inhibition of Src kinase activity by Cbl. Expressing the mutated Cbl in osteoclasts induced a moderate reduction in bone-resorbing activity and increased amounts of Src protein. In contrast, disabling the tyrosine kinase-binding domain of full-length Cbl by mutating glycine 306 to glutamic acid, and thereby preventing the previously described binding of the tyrosine kinase-binding domain to the Src phosphotyrosine 416, had no effect on Cbl phosphorylation, the inhibition of Src activity by full-length Cbl, or bone resorption. These data indicate that the Cbl RDLPPPP(540-546) sequence is a functionally important binding site for Src.

  11. Binding of pixantrone to DNA at CpA dinucleotide sequences and bulge structures.

    PubMed

    Konda, Shyam K; Wang, Haiqiang; Cutts, Suzanne M; Phillips, Don R; Collins, J Grant

    2015-06-07

    The binding of the anti-cancer drug pixantrone to three oligonucleotide sequences, d(TCATATGA)2, d(CCGAGAATTCCGG)2 {double bulge = DB} and the non-self complementary d(TACGATGAGTA) : d(TACCATCGTA) {single bulge = SB}, has been studied by NMR spectroscopy and molecular modelling. The upfield shifts observed for the aromatic resonances of pixantrone upon addition of the drug to each oligonucleotide confirmed the drug bound by intercalation. For the duplex sequence d(TCATATGA)2, NOEs were observed from the pixantrone aromatic H7/8 and aliphatic Ha/Hb protons to the H6/H8 and H1' protons of the C2, A3, T6 and G7 nucleotides, demonstrating that pixantrone preferentially binds at the symmetric CpA sites. However, weaker NOEs observed to various protons from the T4 and A5 residues indicated alternative minor binding sites. NOEs from the H7/H8 and Ha/Hb protons to both major (H6/H8) and minor groove (H1') protons indicated that approximately equal proportions of intercalation occurred from the major and minor grooves at the CpA sites. Intermolecular NOEs were observed between the H7/H8 and H4 protons of pixantrone and the A4H1' and G3H1' protons of the oligonucleotide that contains two symmetrically related bulge sites (DB), indicative of binding at the adenine bulge sites. For the oligonucleotide that only contains a single bulge site (SB), NOEs were observed from pixantrone protons to the SB G7H1', A8H1' and G9H1' protons, confirming that the drug bound selectively at the adenine bulge site. A molecular model of pixantrone-bound SB could be constructed with the drug bound from the minor groove at the A8pG9 site that was consistent with the observed NMR data. The results demonstrate that pixantrone preferentially intercalates at adenine bulge sites, compared to duplex DNA, and predominantly from the minor groove.

  12. Human germline and pan-cancer variomes and their distinct functional profiles

    PubMed Central

    Pan, Yang; Karagiannis, Konstantinos; Zhang, Haichen; Dingerdissen, Hayley; Shamsaddini, Amirhossein; Wan, Quan; Simonyan, Vahan; Mazumder, Raja

    2014-01-01

    Identification of non-synonymous single nucleotide variations (nsSNVs) has exponentially increased due to advances in Next-Generation Sequencing technologies. The functional impacts of these variations have been difficult to ascertain because the corresponding knowledge about sequence functional sites is quite fragmented. It is clear that mapping of variations to sequence functional features can help us better understand the pathophysiological role of variations. In this study, we investigated the effect of nsSNVs on more than 17 common types of post-translational modification (PTM) sites, active sites and binding sites. Out of 1 705 285 distinct nsSNVs on 259 216 functional sites we identified 38 549 variations that significantly affect 10 major functional sites. Furthermore, we found distinct patterns of site disruptions due to germline and somatic nsSNVs. Pan-cancer analysis across 12 different cancer types led to the identification of 51 genes with 106 nsSNV affected functional sites found in 3 or more cancer types. 13 of the 51 genes overlap with previously identified Significantly Mutated Genes (Nature. 2013 Oct 17;502(7471)). 62 mutations in these 13 genes affecting functional sites such as DNA, ATP binding and various PTM sites occur across several cancers and can be prioritized for additional validation and investigations. PMID:25232094

  13. Identification of high-confidence RNA regulatory elements by combinatorial classification of RNA-protein binding sites.

    PubMed

    Li, Yang Eric; Xiao, Mu; Shi, Binbin; Yang, Yu-Cheng T; Wang, Dong; Wang, Fei; Marcia, Marco; Lu, Zhi John

    2017-09-08

    Crosslinking immunoprecipitation sequencing (CLIP-seq) technologies have enabled researchers to characterize transcriptome-wide binding sites of RNA-binding protein (RBP) with high resolution. We apply a soft-clustering method, RBPgroup, to various CLIP-seq datasets to group together RBPs that specifically bind the same RNA sites. Such combinatorial clustering of RBPs helps interpret CLIP-seq data and suggests functional RNA regulatory elements. Furthermore, we validate two RBP-RBP interactions in cell lines. Our approach links proteins and RNA motifs known to possess similar biochemical and cellular properties and can, when used in conjunction with additional experimental data, identify high-confidence RBP groups and their associated RNA regulatory elements.

  14. Sequences required for induction of neurotensin receptor gene expression during neuronal differentiation of N1E-115 neuroblastoma cells.

    PubMed

    Tavares, D; Tully, K; Dobner, P R

    1999-10-15

    The promoter region of the mouse high affinity neurotensin receptor (Ntr-1) gene was characterized, and sequences required for expression in neuroblastoma cell lines that express high affinity NT-binding sites were characterized. Me2SO-induced neuronal differentiation of N1E-115 neuroblastoma cells increased both the expression of the endogenous Ntr-1 gene and reporter genes driven by NTR-1 promoter sequences by 3-4-fold. Deletion analysis revealed that an 83-base pair promoter region containing the transcriptional start site is required for Me2SO activation. Detailed mutational analysis of this region revealed that a CACCC box and the central region of a large GC-rich palindrome are the crucial cis-regulatory elements required for Me2SO induction. The CACCC box is bound by at least one factor that is induced upon Me2SO treatment of N1E-115 cells. The Me2SO effect was found to be both selective and cell type-restricted. Basal expression in the neuroblastoma cell lines required a distinct set of sequences, including an Sp1-like sequence, and a sequence resembling an NGFI-A-binding site; however, a more distal 5' sequence was found to repress basal activity in N1E-115 cells. These results provide evidence that Ntr-1 gene regulation involves both positive and negative regulatory elements located in the 5'-flanking region and that Ntr-1 gene activation involves the coordinate activation or induction of several factors, including a CACCC box binding complex.

  15. Pattern similarity study of functional sites in protein sequences: lysozymes and cystatins

    PubMed Central

    Nakai, Shuryo; Li-Chan, Eunice CY; Dou, Jinglie

    2005-01-01

    Background: Although it is generally agreed that topography is more conserved than sequences, proteins sharing the same fold can have different functions, while there are protein families with low sequence similarity. An alternative method for profile analysis of characteristic conserved positions of the motifs within the 3D structures may be needed for functional annotation of protein sequences. Using the approach of quantitative structure-activity relationships (QSAR), we have proposed a new algorithm for postulating functional mechanisms on the basis of pattern similarity and average of property values of side-chains in segments within sequences. This approach was used to search for functional sites of proteins belonging to the lysozyme and cystatin families. Results: Hydrophobicity and β-turn propensity of reference segments with 3–7 residues were used for the homology similarity search (HSS) for active sites. Hydrogen bonding was used as the side-chain property for searching the binding sites of lysozymes. The profiles of similarity constants and average values of these parameters as functions of their positions in the sequences could identify both active and substrate binding sites of the lysozyme of Streptomyces coelicolor, which has been reported as a new fold enzyme (Cellosyl). The same approach was successfully applied to cystatins, especially for postulating the mechanisms of amyloidosis of human cystatin C as well as human lysozyme. Conclusion: Pattern similarity and average index values of structure-related properties of side chains in short segments of three residues or longer were, for the first time, successfully applied for predicting functional sites in sequences. This new approach may be applicable to studying functional sites in un-annotated proteins, for which complete 3D structures are not yet available. PMID:15904486
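
    The "average of property values of side-chains in segments" part of this approach can be illustrated with a sliding-window average over a protein sequence. The sketch below is an assumption-laden stand-in, not the authors' HSS algorithm: it uses the Kyte-Doolittle hydropathy scale as the hydrophobicity index (the abstract does not say which scale was used), a hypothetical toy peptide, and it does not compute the pattern-similarity constants.

```python
# Kyte-Doolittle hydropathy values; using this particular scale is an assumption.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_averages(sequence, width=5, scale=KYTE_DOOLITTLE):
    """Average a side-chain property over every window of `width` residues.

    Element i of the result is the mean property value of residues
    i .. i + width - 1 (0-based).
    """
    values = [scale[residue] for residue in sequence.upper()]
    return [
        sum(values[i:i + width]) / width
        for i in range(len(values) - width + 1)
    ]


# Hypothetical toy peptide, not a lysozyme or cystatin segment.
toy_peptide = "AILKDE"
for start, avg in enumerate(window_averages(toy_peptide, width=3)):
    print(start, toy_peptide[start:start + 3], round(avg, 2))
```

    Plotting such a profile against sequence position gives the kind of curve in which the abstract's 3–7 residue reference segments would be compared with candidate segments.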

  16. Non-B-Form DNA Is Enriched at Centromeres

    PubMed Central

    Henikoff, Steven

    2018-01-01

    Animal and plant centromeres are embedded in repetitive “satellite” DNA, but are thought to be epigenetically specified. To define genetic characteristics of centromeres, we surveyed satellite DNA from diverse eukaryotes and identified variation in <10-bp dyad symmetries predicted to adopt non-B-form conformations. Organisms lacking centromeric dyad symmetries had binding sites for sequence-specific DNA-binding proteins with DNA-bending activity. For example, human and mouse centromeres are depleted for dyad symmetries, but are enriched for non-B-form DNA and are associated with binding sites for the conserved DNA-binding protein CENP-B, which is required for artificial centromere function but is paradoxically nonessential. We also detected dyad symmetries and predicted non-B-form DNA structures at neocentromeres, which form at ectopic loci. We propose that centromeres form at non-B-form DNA because of dyad symmetries or are strengthened by sequence-specific DNA binding proteins. This may resolve the CENP-B paradox and provide a general basis for centromere specification. PMID:29365169

  17. Survey of protein–DNA interactions in Aspergillus oryzae on a genomic scale

    PubMed Central

    Wang, Chao; Lv, Yangyong; Wang, Bin; Yin, Chao; Lin, Ying; Pan, Li

    2015-01-01

    The genome-scale delineation of in vivo protein–DNA interactions is key to understanding genome function. Only ∼5% of transcription factors (TFs) in the Aspergillus genus have been identified using traditional methods. Although the Aspergillus oryzae genome contains >600 TFs, knowledge of the in vivo genome-wide TF-binding sites (TFBSs) in aspergilli remains limited because of the lack of high-quality antibodies. We investigated the landscape of in vivo protein–DNA interactions across the A. oryzae genome through coupling the DNase I digestion of intact nuclei with massively parallel sequencing and the analysis of cleavage patterns in protein–DNA interactions at single-nucleotide resolution. The resulting map identified overrepresented de novo TF-binding motifs from genomic footprints, and provided the detailed chromatin remodeling patterns and the distribution of digital footprints near transcription start sites. The TFBSs of 19 known Aspergillus TFs were also identified based on DNase I digestion data surrounding potential binding sites in conjunction with TF binding specificity information. We observed that the cleavage patterns of TFBSs were dependent on the orientation of TF motifs and independent of strand orientation, consistent with the DNA shape features of binding motifs with flanking sequences. PMID:25883143

  18. Characterization of molecular interactions between Zika virus protease and peptides derived from the C-terminus of NS2B.

    PubMed

    Li, Yan; Loh, Ying Ru; Hung, Alvin W; Kang, CongBao

    2018-06-21

    Zika virus (ZIKV) protease is a two-component complex in which NS3 contains the catalytic triad and the NS2B cofactor region is important for protease folding and activity. A protease construct, eZiPro, without the transmembrane domains of NS2B was designed. Structural study on eZiPro reveals that the Thr-Gly-Lys-Arg (TGKR) sequence at the C-terminus of NS2B binds to the active site after cleavage. The bZiPro construct only contains the NS2B cofactor region and the N-terminus of NS3 without any artificial linker or protease cleavage site, giving rise to an empty pocket accessible to substrate and inhibitor binding. Herein, we demonstrate that the TGKR sequence of NS2B in eZiPro is dynamic. Peptides from NS2B with various lengths exhibit different binding affinities to bZiPro. TGKR binding to the active site in eZiPro does not affect protease binding to small-molecule compounds. Our results suggest that eZiPro will also be useful for evaluating small-molecule protease inhibitors. Copyright © 2018 Elsevier Inc. All rights reserved.

  19. Computational design of enzyme-ligand binding using a combined energy function and deterministic sequence optimization algorithm.

    PubMed

    Tian, Ye; Huang, Xiaoqiang; Zhu, Yushan

    2015-08-01

    Enzyme amino-acid sequences at ligand-binding interfaces are evolutionarily optimized for reactions, and the natural conformation of an enzyme-ligand complex must have a low free energy relative to alternative conformations in native-like or non-native sequences. Based on this assumption, a combined energy function was developed for enzyme design and then evaluated by recapitulating native enzyme sequences at ligand-binding interfaces for 10 enzyme-ligand complexes. In this energy function, the electrostatic interaction between polar or charged atoms at buried interfaces is described by an explicitly orientation-dependent hydrogen-bonding potential and a pairwise-decomposable generalized Born model based on the general side chain in the protein design framework. The energy function is augmented with a pairwise surface-area based hydrophobic contribution for nonpolar atom burial. Using this function, on average, 78% of the amino acids at ligand-binding sites were predicted correctly in the minimum-energy sequences, whereas 84% were predicted correctly in the most-similar sequences, which were selected from the top 20 sequences for each enzyme-ligand complex. Hydrogen bonds at the enzyme-ligand binding interfaces in the 10 complexes were usually recovered with the correct geometries. The binding energies calculated using the combined energy function helped to discriminate the active sequences from a pool of alternative sequences that were generated by repeatedly solving a series of mixed-integer linear programming problems for sequence selection with increasing integer cuts.

  20. Using Carbohydrate Interaction Assays to Reveal Novel Binding Sites in Carbohydrate Active Enzymes.

    PubMed

    Cockburn, Darrell; Wilkens, Casper; Dilokpimol, Adiphol; Nakai, Hiroyuki; Lewińska, Anna; Abou Hachem, Maher; Svensson, Birte

    2016-01-01

    Carbohydrate active enzymes often contain auxiliary binding sites located either on independent domains termed carbohydrate binding modules (CBMs) or as so-called surface binding sites (SBSs) on the catalytic module at a certain distance from the active site. The SBSs are usually critical for the activity of their cognate enzyme, though they are not readily detected in the sequence of a protein, but normally require a crystal structure of a complex for their identification. A variety of methods, including affinity electrophoresis (AE), insoluble polysaccharide pulldown (IPP) and surface plasmon resonance (SPR) have been used to study auxiliary binding sites. These techniques are complementary as AE allows monitoring of binding to soluble polysaccharides, IPP to insoluble polysaccharides and SPR to oligosaccharides. Here we show that these methods are useful not only for analyzing known binding sites, but also for identifying new ones, even without structural data available. We further verify the chosen assays discriminate between known SBS/CBM containing enzymes and negative controls. Altogether 35 enzymes are screened for the presence of SBSs or CBMs and several novel binding sites are identified, including the first SBS ever reported in a cellulase. This work demonstrates that combinations of these methods can be used as a part of routine enzyme characterization to identify new binding sites and advance the study of SBSs and CBMs, allowing them to be detected in the absence of structural data.

  1. Using Carbohydrate Interaction Assays to Reveal Novel Binding Sites in Carbohydrate Active Enzymes

    PubMed Central

    Wilkens, Casper; Dilokpimol, Adiphol; Nakai, Hiroyuki; Lewińska, Anna; Abou Hachem, Maher; Svensson, Birte

    2016-01-01

    Carbohydrate active enzymes often contain auxiliary binding sites located either on independent domains termed carbohydrate binding modules (CBMs) or as so-called surface binding sites (SBSs) on the catalytic module at a certain distance from the active site. The SBSs are usually critical for the activity of their cognate enzyme, though they are not readily detected in the sequence of a protein, but normally require a crystal structure of a complex for their identification. A variety of methods, including affinity electrophoresis (AE), insoluble polysaccharide pulldown (IPP) and surface plasmon resonance (SPR) have been used to study auxiliary binding sites. These techniques are complementary as AE allows monitoring of binding to soluble polysaccharides, IPP to insoluble polysaccharides and SPR to oligosaccharides. Here we show that these methods are useful not only for analyzing known binding sites, but also for identifying new ones, even without structural data available. We further verify the chosen assays discriminate between known SBS/CBM containing enzymes and negative controls. Altogether 35 enzymes are screened for the presence of SBSs or CBMs and several novel binding sites are identified, including the first SBS ever reported in a cellulase. This work demonstrates that combinations of these methods can be used as a part of routine enzyme characterization to identify new binding sites and advance the study of SBSs and CBMs, allowing them to be detected in the absence of structural data. PMID:27504624

  2. Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP

    PubMed Central

    Hafner, Markus; Landthaler, Markus; Burger, Lukas; Khorshid, Mohsen; Hausser, Jean; Berninger, Philipp; Rothballer, Andrea; Ascano, Manuel; Jungkamp, Anna-Carina; Munschauer, Mathias; Ulrich, Alexander; Wardle, Greg S.; Dewell, Scott; Zavolan, Mihaela; Tuschl, Thomas

    2010-01-01

    RNA transcripts are subject to post-transcriptional gene regulation involving hundreds of RNA-binding proteins (RBPs) and microRNA-containing ribonucleoprotein complexes (miRNPs) expressed in a cell-type dependent fashion. We developed a cell-based crosslinking approach to determine at high resolution and transcriptome-wide the binding sites of cellular RBPs and miRNPs. The crosslinked sites are revealed by thymidine to cytidine transitions in the cDNAs prepared from immunopurified RNPs of 4-thiouridine-treated cells. We determined the binding sites and regulatory consequences for several intensely studied RBPs and miRNPs, including PUM2, QKI, IGF2BP1-3, AGO/EIF2C1-4 and TNRC6A-C. Our study revealed that these factors bind thousands of sites containing defined sequence motifs and have distinct preferences for exonic versus intronic or coding versus untranslated transcript regions. The precise mapping of binding sites across the transcriptome will be critical to the interpretation of the rapidly emerging data on genetic variation between individuals and how these variations contribute to complex genetic diseases. PMID:20371350
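
    The readout described in this abstract, crosslinked sites appearing as thymidine to cytidine transitions in the cDNA, can be illustrated with a small sketch. It is not the authors' pipeline: the function name `t_to_c_positions`, the toy reference, and the pre-aligned reads (given as hypothetical `(start, sequence)` pairs with no gaps) are all assumptions made for illustration.

```python
from collections import Counter

def t_to_c_positions(reference, reads):
    """Count, per reference position, the reads that show C where the reference has T.

    `reads` is a list of (start, sequence) pairs that are assumed to be already
    aligned to `reference` without insertions or deletions (a simplification).
    """
    counts = Counter()
    for start, read in reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if pos < len(reference) and reference[pos] == "T" and base == "C":
                counts[pos] += 1
    return counts

# Hypothetical toy data, not taken from the study.
reference = "GATTACATTGCA"
reads = [(0, "GATCACA"), (2, "TCACATT"), (5, "CATCGCA")]
print(t_to_c_positions(reference, reads))
```

    Positions supported by several independent reads are the more plausible crosslink candidates in this toy setting; a real analysis would also have to separate such transitions from sequencing errors and polymorphisms.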

  3. Structural Basis of the Interaction of a Trypanosoma cruzi Surface Molecule Implicated in Oral Infection with Host Cells and Gastric Mucin

    PubMed Central

    Cortez, Cristian; Yoshida, Nobuko; Bahia, Diana; Sobreira, Tiago J.P.

    2012-01-01

    Host cell invasion and dissemination within the host are hallmarks of virulence for many pathogenic microorganisms. As concerns Trypanosoma cruzi, which causes Chagas disease, the insect vector-derived metacyclic trypomastigotes (MT) initiate infection by invading host cells, and later blood trypomastigotes disseminate to diverse organs and tissues. Studies with MT generated in vitro and tissue culture-derived trypomastigotes (TCT), as counterparts of insect-borne and bloodstream parasites, have implicated members of the gp85/trans-sialidase superfamily, MT gp82 and TCT Tc85-11, in cell invasion and interaction with host factors. Here we analyzed the gp82 structure/function characteristics and compared them with those previously reported for Tc85-11. One of the gp82 sequences identified as a cell binding site consisted of an α-helix, which connects the N-terminal β-propeller domain to the C-terminal β-sandwich domain where the second binding site is nested. In the gp82 structure model, both sites were exposed at the surface. Unlike gp82, the Tc85-11 cell adhesion sites are located in the N-terminal β-propeller region. The gp82 sequence corresponding to the epitope for a monoclonal antibody that inhibits MT entry into target cells was exposed on the surface, upstream and contiguous to the α-helix. Located downstream and close to the α-helix was the gp82 gastric mucin binding site, which plays a central role in oral T. cruzi infection. The sequences equivalent to Tc85-11 laminin-binding sites, which have been associated with the parasite's ability to overcome extracellular matrices and basal laminae, were poorly conserved in gp82, compatible with its reduced capacity to bind laminin. Our study indicates that gp82 is structurally suited for MT to initiate infection by the oral route, whereas Tc85-11, with its affinity for laminin, would facilitate parasite dissemination through diverse organs and tissues. PMID:22860068

  4. Identification of an inducible regulator of c-myb expression during T-cell activation.

    PubMed Central

    Phan, S C; Feeley, B; Withers, D; Boxer, L M

    1996-01-01

    Resting T cells express very low levels of c-Myb protein. During T-cell activation, c-myb expression is induced and much of the increase in expression occurs at the transcriptional level. We identified a region of the c-myb 5' flanking sequence that increased c-myb expression during T-cell activation. In vivo footprinting by ligation-mediated PCR was performed to correlate in vivo protein binding with functional activity. A protein footprint was visible over this region of the c-myb 5' flanking sequence in activated T cells but not in unactivated T cells. An electrophoretic mobility shift assay (EMSA) with nuclear extract from activated T cells and an oligonucleotide of this binding site demonstrated a new protein-DNA complex, referred to as CMAT for c-myb in activated T cells; this complex was not present in unactivated T cells. Because the binding site showed some sequence similarity with the nuclear factor of activated T cells (NFAT) binding site, we compared the kinetics of induction of the two binding complexes and the molecular masses of the two proteins. Studies of the kinetics of induction showed that the NFAT EMSA binding complex appeared earlier than the CMAT complex. The NFAT protein migrated more slowly in a sodium dodecyl sulfate-polyacrylamide gel than the CMAT protein did. In addition, an antibody against NFAT did not cross-react with the CMAT protein. The appearance of the CMAT binding complex was inhibited by both cyclosporin A and rapamycin. The CMAT protein appears to be a novel inducible protein involved in the regulation of c-myb expression during T-cell activation. PMID:8628306

  5. A Novel WRKY transcription factor is required for induction of PR-1a gene expression by salicylic acid and bacterial elicitors.

    PubMed

    van Verk, Marcel C; Pappaioannou, Dimitri; Neeleman, Lyda; Bol, John F; Linthorst, Huub J M

    2008-04-01

    PR-1a is a salicylic acid-inducible defense gene of tobacco (Nicotiana tabacum). One-hybrid screens identified a novel tobacco WRKY transcription factor (NtWRKY12) with specific binding sites in the PR-1a promoter at positions -564 (box WK(1)) and -859 (box WK(2)). NtWRKY12 belongs to the class of transcription factors in which the WRKY sequence is followed by a GKK rather than a GQK sequence. The binding sequence of NtWRKY12 (WK box TTTTCCAC) deviated significantly from the consensus sequence (W box TTGAC[C/T]) shown to be recognized by WRKY factors with the GQK sequence. Mutation of the GKK sequence in NtWRKY12 into GQK or GEK abolished binding to the WK box. The WK(1) box is in close proximity to binding sites in the PR-1a promoter for transcription factors TGA1a (as-1 box) and Myb1 (MBSII box). Expression studies with PR-1a promoter::beta-glucuronidase (GUS) genes in stably and transiently transformed tobacco indicated that NtWRKY12 and TGA1a act synergistically in PR-1a expression induced by salicylic acid and bacterial elicitors. Cotransfection of Arabidopsis thaliana protoplasts with 35S::NtWRKY12 and PR-1a::GUS promoter fusions showed that overexpression of NtWRKY12 resulted in a strong increase in GUS expression, which required functional WK boxes in the PR-1a promoter.
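
    The two motifs quoted in this abstract, the W box TTGAC[C/T] and the WK box TTTTCCAC, are simple enough to scan for with a regular expression. The sketch below is only an illustration: the helper names `reverse_complement` and `find_boxes` and the toy `promoter` string are hypothetical, and real promoter analysis would normally use the actual PR-1a sequence and a dedicated motif tool.

```python
import re

W_BOX = "TTGAC[CT]"    # consensus W box as quoted in the abstract
WK_BOX = "TTTTCCAC"    # WK box reported for NtWRKY12

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_boxes(seq, pattern):
    """Return sorted (start, strand) hits; starts are 0-based on the forward strand."""
    seq = seq.upper()
    n = len(seq)
    # A lookahead is used so that overlapping matches are also reported.
    scanner = re.compile(f"(?=({pattern}))")
    hits = [(m.start(), "+") for m in scanner.finditer(seq)]
    for m in scanner.finditer(reverse_complement(seq)):
        # Convert the reverse-strand coordinate back to the forward strand.
        hits.append((n - m.start() - len(m.group(1)), "-"))
    return sorted(hits)

# Hypothetical toy sequence, not the real PR-1a promoter.
promoter = "AATTTTCCACGGTTGACCAAGTCAATT"
print(find_boxes(promoter, W_BOX))
print(find_boxes(promoter, WK_BOX))
```

    Scanning both strands matters here because a box on the reverse strand appears on the forward strand only as its reverse complement.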

  6. Distinct p53 genomic binding patterns in normal and cancer-derived human cells

    PubMed Central

    McCorkle, Sean R; McCombie, WR; Dunn, John J

    2011-01-01

    Here, we report genome-wide analysis of the tumor suppressor p53 binding sites in normal human cells. 743 high-confidence ChIP-seq peaks representing putative genomic binding sites were identified in normal IMR90 fibroblasts using a reference chromatin sample. More than 40% were located within 2 kb of a transcription start site (TSS), a distribution similar to that documented for individually studied, functional p53 binding sites and, to date, not observed by previous p53 genome-wide studies. Nearly half of the high-confidence binding sites in the IMR90 cells reside in CpG islands in marked contrast to sites reported in cancer-derived cells. The distinct genomic features of the IMR90 binding sites do not reflect a distinct preference for specific sequences, since the de novo developed p53 motif based on our study is similar to those reported by genome-wide studies of cancer cells. More likely, the different chromatin landscape in normal, compared with cancer-derived cells, influences p53 binding via modulating availability of the sites. We compared the IMR90 ChIP-seq peaks to the recently published IMR90 methylome and demonstrated that they are enriched at hypomethylated DNA. Our study represents the first genome-wide, de novo mapping of p53 binding sites in normal human cells and reveals that p53 binding sites reside in distinct genomic landscapes in normal and cancer-derived human cells. PMID:22127205

  7. Recent sequence variation in probe binding site affected detection of respiratory syncytial virus group B by real-time RT-PCR.

    PubMed

    Kamau, Everlyn; Agoti, Charles N; Lewa, Clement S; Oketch, John; Owor, Betty E; Otieno, Grieven P; Bett, Anne; Cane, Patricia A; Nokes, D James

    2017-03-01

    Direct immuno-fluorescence test (IFAT) and multiplex real-time RT-PCR have been central to RSV diagnosis in Kilifi, Kenya. Recently, these two methods showed discrepancies with an increasing number of PCR undetectable RSV-B viruses. The aim was to establish whether mismatches in the primer and probe binding sites could have reduced real-time RT-PCR sensitivity. Nucleoprotein (N) and glycoprotein (G) genes were sequenced for real-time RT-PCR positive and negative samples. Primer and probe binding regions in the N gene were checked for mismatches, and phylogenetic analyses were done to determine the molecular epidemiology of these viruses. New primers and probe were designed and tested on the previously real-time RT-PCR negative samples. N gene sequences revealed 3 different mismatches in the probe target site of PCR negative, IFAT positive viruses. The primer target sites had no mismatches. Phylogenetic analysis of N and G genes showed that real-time RT-PCR positive and negative samples fell into distinct clades. The newly designed primers-probe pair improved detection and recovered previously PCR undetectable viruses. An emerging RSV-B variant is undetectable by a quite widely used real-time RT-PCR assay due to polymorphisms that impair probe hybridization and thereby affect PCR accuracy. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
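
    Checking a probe binding region for mismatches, as this abstract describes, amounts to a position-by-position comparison once the sequences are aligned. The sketch below is a hedged illustration only: `mismatch_positions` is a hypothetical helper, both sequences are invented, and it assumes the probe and the target region are written in the same sense orientation and have equal length.

```python
def mismatch_positions(probe, target_site):
    """Return the 0-based positions at which two equal-length sequences differ."""
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    pairs = zip(probe.upper(), target_site.upper())
    return [i for i, (p, t) in enumerate(pairs) if p != t]

# Invented sequences for illustration; not the RSV-B probe or N gene.
probe = "ACGTTGCAAGTC"
variant_site = "ACATTGCGAGTT"
print(mismatch_positions(probe, variant_site))
```

    A count of mismatches alone does not predict assay failure; their position within the probe and the assay chemistry also matter, which is why the study tested redesigned primers and probe experimentally.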

  8. Escherichia coli ArgR mutants defective in cer/Xer recombination, but not in DNA binding.

    PubMed

    Sénéchal, Hélène; Delesques, Jérémy; Szatmari, George

    2010-04-01

    The Escherichia coli arginine repressor (ArgR) is an L-arginine-dependent DNA-binding protein that controls the expression of the arginine biosynthetic genes and is required as an accessory factor for Xer site-specific recombination at cer and related recombination sites in plasmids. We used the technique of pentapeptide scanning mutagenesis to isolate a series of ArgR mutants that were considerably reduced in cer recombination, but were still able to repress an argA::lacZ fusion. DNA sequence analysis showed that all of the mutants mapped to the same nucleotide, resulting in a five amino acid insertion between residues 149 and 150 of ArgR, corresponding to the end of the alpha6 helix. A truncated ArgR containing a stop codon at residue 150 displayed the same phenotype as the protein with the five amino acid insertion, and both mutants displayed sequence-specific DNA-binding activity that was L-arginine dependent. These results show that the C-terminus of ArgR is more important in cer/Xer site-specific recombination than in DNA binding.

  9. The HIP1 binding site is required for growth regulation of the dihydrofolate reductase gene promoter.

    PubMed

    Means, A L; Slansky, J E; McMahon, S L; Knuth, M W; Farnham, P J

    1992-03-01

    The transcription rate of the dihydrofolate reductase (DHFR) gene increases at the G1/S boundary of the proliferative cell cycle. Through analysis of transiently and stably transfected NIH 3T3 cells, we have now demonstrated that DHFR promoter sequences extending from -270 to +20 are sufficient to confer similar regulation on a reporter gene. Mutation of a protein binding site that spans sequences from -16 to +11 in the DHFR promoter resulted in loss of the transcriptional increase at the G1/S boundary. Purification of an activity from HeLa nuclear extract that binds to this region enriched for a 180-kDa polypeptide (HIP1). Using this HIP1 preparation, we have identified specific positions within the binding site that are critical for efficient protein-DNA interactions. An analysis of association and dissociation rates suggests that bound HIP1 protein can exchange rapidly with free protein. This rapid exchange may facilitate the burst of transcriptional activity from the DHFR promoter at the G1/S boundary.

  10. The HIP1 binding site is required for growth regulation of the dihydrofolate reductase gene promoter.

    PubMed Central

    Means, A L; Slansky, J E; McMahon, S L; Knuth, M W; Farnham, P J

    1992-01-01

    The transcription rate of the dihydrofolate reductase (DHFR) gene increases at the G1/S boundary of the proliferative cell cycle. Through analysis of transiently and stably transfected NIH 3T3 cells, we have now demonstrated that DHFR promoter sequences extending from -270 to +20 are sufficient to confer similar regulation on a reporter gene. Mutation of a protein binding site that spans sequences from -16 to +11 in the DHFR promoter resulted in loss of the transcriptional increase at the G1/S boundary. Purification of an activity from HeLa nuclear extract that binds to this region enriched for a 180-kDa polypeptide (HIP1). Using this HIP1 preparation, we have identified specific positions within the binding site that are critical for efficient protein-DNA interactions. An analysis of association and dissociation rates suggests that bound HIP1 protein can exchange rapidly with free protein. This rapid exchange may facilitate the burst of transcriptional activity from the DHFR promoter at the G1/S boundary. PMID:1545788

  11. UBF-binding site arrays form pseudo-NORs and sequester the RNA polymerase I transcription machinery

    PubMed Central

    Mais, Christine; Wright, Jane E.; Prieto, José-Luis; Raggett, Samantha L.; McStay, Brian

    2005-01-01

    Human ribosomal genes (rDNA) are located in nucleolar organizer regions (NORs) on the short arms of acrocentric chromosomes. Metaphase NORs that were transcriptionally active in the previous cell cycle appear as prominent chromosomal features termed secondary constrictions that are achromatic in chromosome banding and positive in silver staining. The architectural RNA polymerase I (pol I) transcription factor UBF binds extensively across rDNA throughout the cell cycle. To determine if UBF binding underpins NOR structure, we integrated large arrays of heterologous UBF-binding sequences at ectopic sites on human chromosomes. These arrays efficiently recruit UBF even to sites outside the nucleolus and, during metaphase, form novel silver stainable secondary constrictions, termed pseudo-NORs, morphologically similar to NORs. We demonstrate for the first time that in addition to UBF the other components of the pol I machinery are found associated with sequences across the entire human rDNA repeat. Remarkably, a significant fraction of these same pol I factors are sequestered by pseudo-NORs independent of both transcription and nucleoli. Because of the heterologous nature of the sequence employed, we infer that sequestration is mediated primarily by protein–protein interactions with UBF. These results suggest that extensive binding of UBF is responsible for formation and maintenance of the secondary constriction at active NORs. Furthermore, we propose that UBF mediates recruitment of the pol I machinery to nucleoli independently of promoter elements. PMID:15598984

  12. Structure of homeodomain-leucine zipper/DNA complexes studied using hydroxyl radical cleavage of DNA and methylation interference.

    PubMed

    Tron, Adriana E; Comelli, Raúl N; Gonzalez, Daniel H

    2005-12-27

    Homeodomain-leucine zipper (HD-Zip) proteins, unlike most homeodomain proteins, bind a pseudopalindromic DNA sequence as dimers. We have investigated the structure of the DNA complexes formed by two HD-Zip proteins with different nucleotide preferences at the central position of the binding site using footprinting and interference methods. The results indicate that the respective complexes are not symmetric, with the strand bearing a central purine (top strand) showing higher protection around the central region and the bottom strand protected toward the 3' end. Binding to a sequence with a nonpreferred central base pair produces a decrease in protection in either the top or the bottom strand, depending upon the protein. Modeling studies derived from the complex formed by the monomeric Antennapedia homeodomain with DNA indicate that in the HD-Zip/DNA complex the recognition helix of one of the monomers is displaced within the major groove respective to the other one. This monomer seems to lose contacts with a part of the recognition sequence upon binding to the nonpreferred site. The results show that the structure of the complex formed by HD-Zip proteins with DNA is dependent upon both protein intrinsic characteristics and the nucleotides present at the central position of the recognition sequence.

  13. An NMR-Based Structural Rationale for Contrasting Stoichiometry and Ligand Binding Site(s) in Fatty Acid-binding Proteins

    PubMed Central

    He, Yan; Estephan, Rima; Yang, Xiaomin; Vela, Adriana; Wang, Hsin; Bernard, Cédric; Stark, Ruth E.

    2011-01-01

    Liver fatty acid-binding protein (LFABP) is a 14-kDa cytosolic polypeptide, differing from other family members in number of ligand binding sites, diversity of bound ligands, and transfer of fatty acid(s) to membranes primarily via aqueous diffusion rather than direct collisional interactions. Distinct two-dimensional 1H-15N NMR signals indicative of slowly exchanging LFABP assemblies formed during stepwise ligand titration were exploited, without solving the protein-ligand complex structures, to yield the stoichiometries for the bound ligands, their locations within the protein binding cavity, the sequence of ligand occupation, and the corresponding protein structural accommodations. Chemical shifts were monitored for wild-type LFABP and a R122L/S124A mutant in which electrostatic interactions viewed as essential to fatty acid binding were removed. For wild-type LFABP the results compared favorably with previous tertiary structures of oleate-bound wild-type LFABP in crystals and in solution: there are two oleates, one U-shaped ligand that positions the long hydrophobic chain deep within the cavity and another extended structure with the hydrophobic chain facing the cavity and the carboxylate group lying close to the protein surface. The NMR titration validated a prior hypothesis that the first oleate to enter the cavity occupies the internal protein site. In contrast, 1H/15N chemical shift changes supported only one liganded oleate for R122L/S124A LFABP, at an intermediate location within the protein cavity. A rationale based on protein sequence and electrostatics was developed to explain the stoichiometry and binding site trends for LFABPs and to put these findings into context within the larger protein family. PMID:21226535

  14. High level activity of the mouse CCAAT/enhancer binding protein (C/EBP alpha) gene promoter involves autoregulation and several ubiquitous transcription factors.

    PubMed Central

    Legraverend, C; Antonson, P; Flodby, P; Xanthopoulos, K G

    1993-01-01

    The promoter region of the mouse CCAAT-Enhancer Binding Protein (C/EBP alpha) gene is capable of directing high levels of expression of reporter constructs in various cell lines, even in cells that do not express their endogenous C/EBP alpha gene. To understand the molecular mechanisms underlying this ubiquitous expression, we have characterized the promoter region of the mouse C/EBP alpha gene by a variety of in vitro and in vivo methods. We show that three sites related in sequence to USF, BTE and C/EBP binding sites and present in promoter region -350/+3, are recognized by proteins from rat liver nuclear extracts. The sequence of the C/EBP alpha promoter that includes the USF binding site is also capable of forming stable complexes with purified Myc+Max heterodimers and mutation of this site drastically reduces transcription of C/EBP alpha promoter luciferase constructs both in liver and non liver cell lines. In addition, we identify three novel protein-binding sites, two of which display similarity to NF-1 and NF kappa B binding sites. The region located between nucleotides -197 and -178 forms several heat-stable complexes with liver nuclear proteins in vitro which are recognized mainly by antibodies specific for C/EBP alpha. Furthermore, transient expression of C/EBP alpha and to a lesser extent C/EBP beta expression vectors, results in transactivation of a cotransfected C/EBP alpha promoter-luciferase reporter construct. These experiments support the notion that the C/EBP alpha gene is regulated by C/EBP alpha but other C/EBP-related proteins may also be involved. PMID:8493090

  15. Insulation and wiring specificity of BceR-like response regulators and their target promoters in Bacillus subtilis.

    PubMed

    Fang, Chong; Nagy-Staroń, Anna; Grafe, Martin; Heermann, Ralf; Jung, Kirsten; Gebhard, Susanne; Mascher, Thorsten

    2017-04-01

    BceRS and PsdRS are paralogous two-component systems in Bacillus subtilis controlling the response to antimicrobial peptides. In the presence of extracellular bacitracin and nisin, respectively, the two response regulators (RRs) bind their target promoters, PbceA or PpsdA, resulting in a strong up-regulation of target gene expression and ultimately antibiotic resistance. Despite high sequence similarity between the RRs BceR and PsdR and their known binding sites, no cross-regulation has been observed between them. We therefore investigated the specificity determinants of PbceA and PpsdA that ensure the insulation of these two paralogous pathways at the RR-promoter interface. In vivo and in vitro analyses demonstrate that the regulatory regions within these two promoters contain three important elements: in addition to the known (main) binding site, we identified a linker region and a secondary binding site that are crucial for functionality. Initial binding to the high-affinity, low-specificity main binding site is a prerequisite for the subsequent highly specific binding of a second RR dimer to the low-affinity secondary binding site. In addition to this hierarchical cooperative binding, discrimination requires a competition of the two RRs for their respective binding site mediated by only slight differences in binding affinities. © 2016 John Wiley & Sons Ltd.

  16. A novel substance P binding site in bovine adrenal medulla.

    PubMed

    Geraghty, D P; Livett, B G; Rogerson, F M; Burcher, E

    1990-05-04

    Radioligand binding techniques were used to characterize the substance P (SP) binding site on membranes prepared from bovine adrenal medullae. 125I-labelled Bolton-Hunter substance P (BHSP), which recognises the C-terminally directed, SP-preferring NK1 receptor, showed no specific binding. In contrast, binding of [3H]SP was saturable (at 6 nM) and reversible, with an equilibrium dissociation constant (Kd) 1.46 +/- 0.73 nM, Bmax 0.73 +/- 0.06 pmol/g wet weight and Hill coefficient 0.98 +/- 0.01. Specific binding of [3H]SP was displaced by SP greater than neurokinin A (NKA) greater than SP(3-11) approximately SP(1-9) greater than SP(1-7) approximately SP(1-4) approximately SP(1-6), with neurokinin B (NKB) and SP(1-3) very weak competitors and SP(5-11), SP(7-11) and SP(9-11) causing negligible inhibition (up to 10 microM). This potency order is quite distinct from that seen with binding to an NK1 site, a conclusion confirmed by the lack of BHSP binding. It appears that Lys3 and/or Pro4 are critical for binding, suggesting an anionic binding site. These data suggest the existence of an unusual binding site which may represent a novel SP receptor. This site appears to require the entire sequence of the SP molecule for full recognition.
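
    The binding parameters quoted in this abstract can be turned into a saturation curve. The sketch below is illustrative only: it assumes the common Hill form B = Bmax * L^n / (Kd^n + L^n), which the abstract does not spell out, and the function name and default arguments are made up for this example. The defaults are the reported point estimates (Kd 1.46 nM, Bmax 0.73 pmol/g wet weight, Hill coefficient 0.98) with their uncertainties ignored.

```python
def specific_binding(ligand_nM, bmax=0.73, kd_nM=1.46, hill=0.98):
    """Hill-type saturation binding: B = Bmax * L^n / (Kd^n + L^n).

    Defaults are the point estimates quoted in the abstract; the functional
    form itself is an assumption made for this sketch.
    """
    if ligand_nM < 0:
        raise ValueError("ligand concentration must be non-negative")
    ligand_term = ligand_nM ** hill
    return bmax * ligand_term / (kd_nM ** hill + ligand_term)


if __name__ == "__main__":
    # Print bound ligand (pmol/g wet weight) at a few free concentrations.
    for conc in (0.1, 1.46, 6.0, 60.0):
        bound = specific_binding(conc)
        print(f"[3H]SP = {conc:6.2f} nM -> bound = {bound:.3f} pmol/g")
```

    With a Hill coefficient this close to 1, the curve is nearly indistinguishable from a simple one-site model, which is consistent with the abstract reporting a single saturable site.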

  17. Identification of Human Lineage-Specific Transcriptional Coregulators Enabled by a Glossary of Binding Modules and Tunable Genomic Backgrounds.

    PubMed

    Mariani, Luca; Weinand, Kathryn; Vedenko, Anastasia; Barrera, Luis A; Bulyk, Martha L

    2017-09-27

    Transcription factors (TFs) control cellular processes by binding specific DNA motifs to modulate gene expression. Motif enrichment analysis of regulatory regions can identify direct and indirect TF binding sites. Here, we created a glossary of 108 non-redundant TF-8mer "modules" of shared specificity for 671 metazoan TFs from publicly available and new universal protein binding microarray data. Analysis of 239 ENCODE TF chromatin immunoprecipitation sequencing datasets and associated RNA sequencing profiles suggest the 8mer modules are more precise than position weight matrices in identifying indirect binding motifs and their associated tethering TFs. We also developed GENRE (genomically equivalent negative regions), a tunable tool for construction of matched genomic background sequences for analysis of regulatory regions. GENRE outperformed four state-of-the-art approaches to background sequence construction. We used our TF-8mer glossary and GENRE in the analysis of the indirect binding motifs for the co-occurrence of tethering factors, suggesting novel TF-TF interactions. We anticipate that these tools will aid in elucidating tissue-specific gene-regulatory programs. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Campylobacter jejuni chromosomal sequences that hybridize to Vibrio cholerae and Escherichia coli LT enterotoxin genes.

    PubMed

    Calva, E; Torres, J; Vázquez, M; Angeles, V; de la Vega, H; Ruíz-Palacios, G M

    1989-02-20

    Campylobacter jejuni is one of the main etiologic agents of gastrointestinal illness in developing and developed areas throughout the world. Isolation of enterotoxin-producing C. jejuni has been associated with clinical symptoms of a watery-secretory type of diarrhea. Although physiological and immunological relatedness has been demonstrated between the C. jejuni enterotoxin (CJT), the Vibrio cholerae enterotoxin (CT), and the heat-labile cholera-like Escherichia coli enterotoxin (LT), nucleotide sequence similarity between C. jejuni DNA and either the toxA, toxB, eltA or eltB genes remained to be shown. We found that binding to ganglioside GM1 prevented recognition of CJT by monoclonal antibodies directed to either CT or LT. This indicates antigenic similarity between the three enterotoxins in the ganglioside GM1-binding site. Therefore we searched for corresponding similarities at the DNA level and found, by oligodeoxynucleotide hybridization, C. jejuni chromosomal nucleotide sequences similar to the coding region for a postulated ganglioside GM1-binding site on toxB and eltB.

  19. Homologous kappa-neurotoxins exhibit residue-specific interactions with the alpha 3 subunit of the nicotinic acetylcholine receptor: a comparison of the structural requirements for kappa-bungarotoxin and kappa-flavitoxin binding.

    PubMed

    McLane, K E; Weaver, W R; Lei, S; Chiappinelli, V A; Conti-Tronconi, B M

    1993-07-13

    kappa-Flavitoxin (kappa-FTX), a snake neurotoxin that is a selective antagonist of certain neuronal nicotinic acetylcholine receptors (AChRs), has recently been isolated and characterized [Grant, G. A., Frazier, M. W., & Chiappinelli, V. A. (1988) Biochemistry 27, 1532-1537]. Like the related snake toxin kappa-bungarotoxin (kappa-BTX), kappa-FTX binds with high affinity to alpha 3 subtypes of neuronal AChRs, even though there are distinct sequence differences between the two toxins. To further characterize the sequence regions of the neuronal AChR alpha 3 subunit involved in formation of the binding site for this family of kappa-neurotoxins, we investigated kappa-FTX binding to overlapping synthetic peptides screening the alpha 3 subunit sequence. A sequence region forming a "prototope" for kappa-FTX was identified within residues alpha 3 (51-70), confirming the suggestions of previous studies on the binding of kappa-BTX to the alpha 3 subunit [McLane, K. E., Tang, F., & Conti-Tronconi, B. M. (1990) J. Biol. Chem. 265, 1537-1544] and alpha-bungarotoxin to the Torpedo AChR alpha subunit [Conti-Tronconi, B. M., Tang, F., Diethelm, B. M., Spencer, S. R., Reinhardt-Maelicke, S., & Maelicke, A. (1990) Biochemistry 29, 6221-6230] that this sequence region is involved in formation of a cholinergic site. Single residue substituted analogues, where each residue of the sequence alpha 3 (51-70) was sequentially replaced by a glycine, were used to identify the amino acid side chains involved in the interaction of this prototope with kappa-FTX.(ABSTRACT TRUNCATED AT 250 WORDS)
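
    The single-residue substitution design described above (each position replaced in turn by glycine) is easy to enumerate. The following is a minimal sketch, not the authors' procedure: the function name is hypothetical, the input peptide is a made-up placeholder rather than the real alpha 3 (51-70) sequence (which the abstract does not give), and skipping positions that are already glycine is a choice made here.

```python
def glycine_scan(peptide, first_residue_number=1):
    """Yield (position, original_residue, analogue) for a glycine scan.

    Each analogue differs from the parent peptide at exactly one position.
    Positions that are already glycine are skipped, since substituting
    them would reproduce the parent peptide.
    """
    for offset, residue in enumerate(peptide):
        if residue == "G":
            continue
        analogue = peptide[:offset] + "G" + peptide[offset + 1:]
        yield first_residue_number + offset, residue, analogue


if __name__ == "__main__":
    # Hypothetical 20-residue placeholder, NOT the real alpha 3 (51-70) sequence.
    placeholder = "ACDEFHIKLMNPQRSTVWYA"
    for pos, res, analogue in glycine_scan(placeholder, first_residue_number=51):
        print(f"{res}{pos}G  {analogue}")
```

    Numbering from 51 mirrors the residue range named in the abstract, so each printed label (for example `A51G`) reads like a conventional substitution name.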

  20. In vitro selection using a dual RNA library that allows primerless selection

    PubMed Central

    Jarosch, Florian; Buchner, Klaus; Klussmann, Sven

    2006-01-01

    High affinity target-binding aptamers are identified from random oligonucleotide libraries by an in vitro selection process called Systematic Evolution of Ligands by EXponential enrichment (SELEX). Since the SELEX process includes a PCR amplification step the randomized region of the oligonucleotide libraries need to be flanked by two fixed primer binding sequences. These primer binding sites are often difficult to truncate because they may be necessary to maintain the structure of the aptamer or may even be part of the target binding motif. We designed a novel type of RNA library that carries fixed sequences which constrain the oligonucleotides into a partly double-stranded structure, thereby minimizing the risk that the primer binding sequences become part of the target-binding motif. Moreover, the specific design of the library including the use of tandem RNA Polymerase promoters allows the selection of oligonucleotides without any primer binding sequences. The library was used to select aptamers to the mirror-image peptide of ghrelin. Ghrelin is a potent stimulator of growth-hormone release and food intake. After selection, the identified aptamer sequences were directly synthesized in their mirror-image configuration. The final 44 nt-Spiegelmer, named NOX-B11-3, blocks ghrelin action in a cell culture assay displaying an IC50 of 4.5 nM at 37°C. PMID:16855281

  1. Saccharomyces cerevisiae SSB1 protein and its relationship to nucleolar RNA-binding proteins.

    PubMed

    Jong, A Y; Clark, M W; Gilbert, M; Oehm, A; Campbell, J L

    1987-08-01

    To better define the function of Saccharomyces cerevisiae SSB1, an abundant single-stranded nucleic acid-binding protein, we determined the nucleotide sequence of the SSB1 gene and compared it with those of other proteins of known function. The amino acid sequence contains 293 amino acid residues and has an Mr of 32,853. There are several stretches of sequence characteristic of other eucaryotic single-stranded nucleic acid-binding proteins. At the amino terminus, residues 39 to 54 are highly homologous to a peptide in calf thymus UP1 and UP2 and a human heterogeneous nuclear ribonucleoprotein. Residues 125 to 162 constitute a fivefold tandem repeat of the sequence RGGFRG, the composition of which suggests a nucleic acid-binding site. Near the C terminus, residues 233 to 245 are homologous to several RNA-binding proteins. Of 18 C-terminal residues, 10 are acidic, a characteristic of the procaryotic single-stranded DNA-binding proteins and eucaryotic DNA- and RNA-binding proteins. In addition, examination of the subcellular distribution of SSB1 by immunofluorescence microscopy indicated that SSB1 is a nuclear protein, predominantly located in the nucleolus. Sequence homologies and the nucleolar localization make it likely that SSB1 functions in RNA metabolism in vivo, although an additional role in DNA metabolism cannot be excluded.

  2. Widespread Site-Dependent Buffering of Human Regulatory Polymorphism

    PubMed Central

    Kutyavin, Tanya; Stamatoyannopoulos, John A.

    2012-01-01

    The average individual is expected to harbor thousands of variants within non-coding genomic regions involved in gene regulation. However, it is currently not possible to interpret reliably the functional consequences of genetic variation within any given transcription factor recognition sequence. To address this, we comprehensively analyzed heritable genome-wide binding patterns of a major sequence-specific regulator (CTCF) in relation to genetic variability in binding site sequences across a multi-generational pedigree. We localized and quantified CTCF occupancy by ChIP-seq in 12 related and unrelated individuals spanning three generations, followed by comprehensive targeted resequencing of the entire CTCF–binding landscape across all individuals. We identified hundreds of variants with reproducible quantitative effects on CTCF occupancy (both positive and negative). While these effects paralleled protein–DNA recognition energetics when averaged, they were extensively buffered by striking local context dependencies. In the significant majority of cases buffering was complete, resulting in silent variants spanning every position within the DNA recognition interface irrespective of level of binding energy or evolutionary constraint. The prevalence of complex partial or complete buffering effects severely constrained the ability to predict reliably the impact of variation within any given binding site instance. Surprisingly, 40% of variants that increased CTCF occupancy occurred at positions of human–chimp divergence, challenging the expectation that the vast majority of functional regulatory variants should be deleterious. Our results suggest that, even in the presence of “perfect” genetic information afforded by resequencing and parallel studies in multiple related individuals, genomic site-specific prediction of the consequences of individual variation in regulatory DNA will require systematic coupling with empirical functional genomic measurements. 
    PMID:22457641

  3. The molecular mechanism for interaction of ceruloplasmin and myeloperoxidase

    NASA Astrophysics Data System (ADS)

    Bakhautdin, Bakytzhan; Bakhautdin, Esen Göksöy

    2016-04-01

    Ceruloplasmin (Cp) is a copper-containing ferroxidase with potent antioxidant activity. Cp is expressed by hepatocytes and activated macrophages and has been known as physiologic inhibitor of myeloperoxidase (MPO). Enzymatic activity of MPO produces anti-microbial agents and strong prooxidants such as hypochlorous acid and has a potential to damage host tissue at the sites of inflammation and infection. Thus Cp-MPO interaction and inhibition of MPO has previously been suggested as an important control mechanism of excessive MPO activity. Our aim in this study was to identify minimal Cp domain or peptide that interacts with MPO. We first confirmed Cp-MPO interaction by ELISA and surface plasmon resonance (SPR). SPR analysis of the interaction yielded 30 nM affinity between Cp and MPO. We then designed and synthesized 87 overlapping peptides spanning the entire amino acid sequence of Cp. Each of the peptides was tested whether it binds to MPO by direct binding ELISA. Two of the 87 peptides, P18 and P76 strongly interacted with MPO. Amino acid sequence analysis of identified peptides revealed high sequence and structural homology between them. Further structural analysis of Cp's crystal structure by PyMOL software unfolded that both peptides represent surface-exposed sites of Cp and face nearly the same direction. To confirm our finding we raised anti-P18 antisera in rabbit and demonstrated that this antisera disrupts Cp-MPO binding and rescues MPO activity. Collectively, our results confirm Cp-MPO interaction and identify two nearly identical sites on Cp that specifically bind MPO. We propose that inhibition of MPO by Cp requires two nearly identical sites on Cp to bind homodimeric MPO simultaneously and at an angle of at least 120 degrees, which, in turn, exerts tension on MPO and results in conformational change.

  4. ChIPpeakAnno: a Bioconductor package to annotate ChIP-seq and ChIP-chip data

    PubMed Central

    2010-01-01

    Background: Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-seq) or ChIP followed by genome tiling array analysis (ChIP-chip) have become standard technologies for genome-wide identification of DNA-binding protein target sites. A number of algorithms have been developed in parallel that allow identification of binding sites from ChIP-seq or ChIP-chip datasets and subsequent visualization in the University of California Santa Cruz (UCSC) Genome Browser as custom annotation tracks. However, summarizing these tracks can be a daunting task, particularly if there are a large number of binding sites or the binding sites are distributed widely across the genome. Results: We have developed ChIPpeakAnno as a Bioconductor package within the statistical programming environment R to facilitate batch annotation of enriched peaks identified from ChIP-seq, ChIP-chip, cap analysis of gene expression (CAGE) or any experiments resulting in a large number of enriched genomic regions. The binding sites annotated with ChIPpeakAnno can be viewed easily as a table, a pie chart or plotted in histogram form, i.e., the distribution of distances to the nearest genes for each set of peaks. In addition, we have implemented functionalities for determining the significance of overlap between replicates or binding sites among transcription factors within a complex, and for drawing Venn diagrams to visualize the extent of the overlap between replicates. Furthermore, the package includes functionalities to retrieve sequences flanking putative binding sites for PCR amplification, cloning, or motif discovery, and to identify Gene Ontology (GO) terms associated with adjacent genes. Conclusions: ChIPpeakAnno enables batch annotation of the binding sites identified from ChIP-seq, ChIP-chip, CAGE or any technology that results in a large number of enriched genomic regions within the statistical programming environment R. Allowing users to pass their own annotation data such as a different Chromatin immunoprecipitation (ChIP) preparation and a dataset from literature, or existing annotation packages, such as GenomicFeatures and BSgenome, provides flexibility. Tight integration to the biomaRt package enables up-to-date annotation retrieval from the BioMart database. PMID:20459804
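
    The "distance to the nearest gene" summary mentioned in this abstract can be illustrated independently of the package. The sketch below is plain Python and does not use or reproduce the ChIPpeakAnno API (which is an R/Bioconductor package). The function name, the single-chromosome input, and the strand-agnostic distance are all simplifying assumptions made for this example.

```python
from bisect import bisect_left


def nearest_gene(peak_center, gene_starts):
    """Return (gene_name, signed_distance) for the gene start closest to a peak.

    gene_starts: list of (position, name) tuples sorted by position, all on
    one chromosome. The distance is peak_center - gene_start, so a negative
    value means the peak lies at a smaller coordinate than the gene start.
    Strand is ignored in this sketch.
    """
    if not gene_starts:
        raise ValueError("no genes supplied")
    positions = [pos for pos, _ in gene_starts]
    i = bisect_left(positions, peak_center)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = gene_starts[max(i - 1, 0): i + 1]
    pos, name = min(candidates, key=lambda gene: abs(peak_center - gene[0]))
    return name, peak_center - pos


if __name__ == "__main__":
    # Hypothetical gene start coordinates and peak centres.
    genes = [(100, "geneA"), (500, "geneB"), (900, "geneC")]
    for peak in (50, 450, 1000):
        print(peak, nearest_gene(peak, genes))
```

    Binary search keeps each lookup logarithmic in the number of genes, which matters when annotating many thousands of peaks; collecting the signed distances would give the kind of histogram the abstract describes.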

  5. The Bacterial Response Regulator ArcA Uses a Diverse Binding Site Architecture to Regulate Carbon Oxidation Globally

    PubMed Central

    Park, Dan M.; Akhtar, Md. Sohail; Ansari, Aseem Z.; Landick, Robert; Kiley, Patricia J.

    2013-01-01

    Despite the importance of maintaining redox homeostasis for cellular viability, how cells control redox balance globally is poorly understood. Here we provide new mechanistic insight into how the balance between reduced and oxidized electron carriers is regulated at the level of gene expression by mapping the regulon of the response regulator ArcA from Escherichia coli, which responds to the quinone/quinol redox couple via its membrane-bound sensor kinase, ArcB. Our genome-wide analysis reveals that ArcA reprograms metabolism under anaerobic conditions such that carbon oxidation pathways that recycle redox carriers via respiration are transcriptionally repressed by ArcA. We propose that this strategy favors use of catabolic pathways that recycle redox carriers via fermentation akin to lactate production in mammalian cells. Unexpectedly, bioinformatic analysis of the sequences bound by ArcA in ChIP-seq revealed that most ArcA binding sites contain additional direct repeat elements beyond the two required for binding an ArcA dimer. DNase I footprinting assays suggest that non-canonical arrangements of cis-regulatory modules dictate both the length and concentration-sensitive occupancy of DNA sites. We propose that this plasticity in ArcA binding site architecture provides both an efficient means of encoding binding sites for ArcA, σ70-RNAP and perhaps other transcription factors within the same narrow sequence space and an effective mechanism for global control of carbon metabolism to maintain redox homeostasis. PMID:24146625

  6. Nanobiological studies on drug design using molecular mechanic method.

    PubMed

    Ghaheh, Hooria Seyedhosseini; Mousavi, Maryam; Araghi, Mahmood; Rasoolzadeh, Reza; Hosseini, Zahra

    2015-01-01

    Influenza H1N1 is very important worldwide and point mutations that occur in the virus gene are a threat for the World Health Organization (WHO) and druggists, since they could make this virus resistant to the existing antibiotics. Influenza epidemics cause severe respiratory illness in 30 to 50 million people and kill 250,000 to 500,000 people worldwide every year. Nowadays, drug design is not done through trial and error because of its cost and waste of time; therefore bioinformatics studies are essential for designing drugs. This paper presents a study of the binding site of the neuraminidase (NA) enzyme, which is very important in drug design, at a temperature of 310 K and in different dielectrics, for the best drug design. Information on the NA enzyme was extracted from the Protein Data Bank (PDB) and National Center for Biotechnology Information (NCBI) websites. The new sequences of N1 were downloaded from the NCBI influenza virus sequence database. Drug binding sites were assimilated and modeled by homology using Argus lab 4.0, HyperChem 6.0 and Chem. D3 software. Their stability was assessed in different dielectrics and temperatures. Measurements of potential energy (Kcal/mol) of binding sites of NA in different dielectrics and at 310 K revealed that at time step size = 0 pSec drug binding sites have maximum energy level and at time step size = 100 pSec have maximum stability and minimum energy. Drug binding sites are more dependent on dielectric constant than on temperature and the optimum dielectric constant is 39/78.

  7. Non-B-DNA structures on the interferon-beta promoter?

    PubMed

    Robbe, K; Bonnefoy, E

    1998-01-01

    The high mobility group (HMG) I protein intervenes as an essential factor during the virus-induced expression of the interferon-beta (IFN-beta) gene. It is a non-histone chromatin-associated protein that has the dual capacity of binding to a non-B-DNA structure such as cruciform-DNA as well as to AT-rich B-DNA sequences. In this work we compare the binding affinity of HMGI for a synthetic cruciform-DNA to its binding affinity for the HMGI-binding-site present in the positive regulatory domain II (PRDII) of the IFN-beta promoter. Using gel retardation experiments, we show that HMGI protein binds with at least ten times more affinity to the synthetic cruciform-DNA structure than to the PRDII B-DNA sequence. DNA hairpin sequences are present in both the human and the murine PRDII-DNAs. We discuss in this work the presence of, yet putative, non-B-DNA structures in the IFN-beta promoter.

  8. Binding of the cyclic AMP receptor protein of Escherichia coli and DNA bending at the P4 promoter of pBR322.

    PubMed

    Brierley, I; Hoggett, J G

    1992-07-01

    The binding of the Escherichia coli cyclic AMP receptor protein (CRP) to its specific site on the P4 promoter of pBR322 has been studied by gel electrophoresis. Binding to the P4 site was about 40-50-fold weaker than to the principal CRP site on the lactose promoter at both low (0.01 M) and high (0.1 M) ionic strengths. CRP-induced bending at the P4 site was investigated from the mobilities of CRP bound to circularly permuted P4 fragments. The estimated bending angle, based on comparison with Zinkel & Crothers [(1990) Biopolymers 29, 29-38] A-tract bending standards, was found to be approximately 96 degrees, similar to that found for binding to the lac site. These observations suggest that there is not a simple relationship between strength of CRP binding and the extent of induced bending for different CRP sites. The apparent centre of bending in P4 is displaced about 6-8 bp away from the conserved TGTGA sequence and the P4 transcription start site.
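
    The "40-50-fold weaker" binding reported above can be restated as a free-energy difference using ddG = RT ln(fold). The sketch below is a back-of-the-envelope conversion only. The temperature of 298.15 K is an assumption (the abstract does not give the assay temperature), and the function name is made up for this example.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)


def ddg_from_fold(fold_weaker, temperature_K=298.15):
    """Free-energy penalty (kcal/mol) for binding that is fold_weaker times weaker.

    Uses ddG = R * T * ln(fold). The default temperature is an assumed value,
    not one taken from the abstract.
    """
    if fold_weaker <= 0:
        raise ValueError("fold difference must be positive")
    return GAS_CONSTANT_KCAL * temperature_K * math.log(fold_weaker)


if __name__ == "__main__":
    for fold in (40, 50):
        print(f"{fold}-fold weaker -> ddG = {ddg_from_fold(fold):.2f} kcal/mol")
```

    Under these assumptions the P4 and lac sites differ by roughly 2 kcal/mol in binding free energy while showing similar bending angles, which is the contrast the abstract draws.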

  9. Cluster analysis of S. Cerevisiae nucleosome binding sites

    NASA Astrophysics Data System (ADS)

    Suvorova, Y.; Korotkov, E.

    2017-12-01

    It is well known that the major part of a eukaryotic genome is wrapped around histone proteins, forming nucleosomes. It has also been demonstrated that the DNA sequence itself plays an important role in the nucleosome positioning process. In this work, a cluster analysis of 67,517 nucleosome binding sites from the S. cerevisiae genome was carried out. The classification method is based on a self-adjusting dinucleotide position weight matrix. As a result, 135 significant clusters were discovered that contain 43,225 sequences (64% of the initial set). The meaning of the classes found is discussed, as well as the possibility of their further use.
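
    To make the idea of a dinucleotide position matrix concrete, the sketch below builds a plain position-specific dinucleotide frequency table from aligned, equal-length sequences. It is not the authors' self-adjusting classification method, whose details the abstract does not give. The function name and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product

DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def dinucleotide_frequency_matrix(sequences):
    """Return {position: {dinucleotide: frequency}} for equal-length sequences.

    Position p describes the dinucleotide spanning bases p and p + 1, so a
    set of length-L sequences yields L - 1 positions.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must share one length")
    matrix = {}
    for pos in range(length - 1):
        counts = Counter(seq[pos:pos + 2] for seq in sequences)
        matrix[pos] = {d: counts[d] / len(sequences) for d in DINUCLEOTIDES}
    return matrix


if __name__ == "__main__":
    # Toy aligned sequences, purely illustrative.
    toy = ["ACGT", "ACGA", "TCGA"]
    for pos, row in dinucleotide_frequency_matrix(toy).items():
        print(pos, {d: round(f, 2) for d, f in row.items() if f > 0})
```

    A clustering procedure could then compare each sequence against such matrices, but how the published method updates and scores them is not stated in the abstract.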

  10. Toward rules relating zinc finger protein sequences and DNA binding site preferences.

    PubMed

    Desjarlais, J R; Berg, J M

    1992-08-15

    Zinc finger proteins of the Cys2-His2 type consist of tandem arrays of domains, where each domain appears to contact three adjacent base pairs of DNA through three key residues. We have designed and prepared a series of variants of the central zinc finger within the DNA binding domain of Sp1 by using information from an analysis of a large data base of zinc finger protein sequences. Through systematic variations at two of the three contact positions (underlined), relatively specific recognition of sequences of the form 5'-GGGGN(G or T)GGG-3' has been achieved. These results provide the basis for rules that may develop into a code that will allow the design of zinc finger proteins with preselected DNA site specificity.
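
    The recognition pattern quoted above, 5'-GGGGN(G or T)GGG-3', translates directly into a regular expression. The sketch below is illustrative only. It scans a single strand as given, treats N as any of A, C, G or T, and the function name is made up for this example.

```python
import re

# 5'-GGGGN(G or T)GGG-3' as a regular expression; N = any base.
# The lookahead wrapper lets overlapping matches be reported as well.
SITE = re.compile(r"(?=(GGGG[ACGT][GT]GGG))")


def find_sites(dna):
    """Return (0-based start, matched 9-mer) for every match on the given strand."""
    return [(m.start(), m.group(1)) for m in SITE.finditer(dna.upper())]


if __name__ == "__main__":
    # Hypothetical test sequence.
    print(find_sites("ttGGGGCGGGGaa"))
```

    Scanning the reverse complement as well would be needed to cover both strands; that step is left out to keep the example short.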

  11. When core competence is not enough: functional interplay of the DEAD-box helicase core with ancillary domains and auxiliary factors in RNA binding and unwinding.

    PubMed

    Rudolph, Markus G; Klostermeier, Dagmar

    2015-08-01

    DEAD-box helicases catalyze RNA duplex unwinding in an ATP-dependent reaction. Members of the DEAD-box helicase family consist of a common helicase core formed by two RecA-like domains. According to the current mechanistic model for DEAD-box mediated RNA unwinding, binding of RNA and ATP triggers a conformational change of the helicase core, and leads to formation of a compact, closed state. In the closed conformation, the two parts of the active site for ATP hydrolysis and of the RNA binding site, residing on the two RecA domains, become aligned. Closing of the helicase core is coupled to a deformation of the RNA backbone and destabilization of the RNA duplex, allowing for dissociation of one of the strands. The second strand remains bound to the helicase core until ATP hydrolysis and product release lead to re-opening of the core. The concomitant disruption of the RNA binding site causes dissociation of the second strand. The activity of the helicase core can be modulated by interaction partners, and by flanking N- and C-terminal domains. A number of C-terminal flanking regions have been implicated in RNA binding: RNA recognition motifs (RRM) typically mediate sequence-specific RNA binding, whereas positively charged, unstructured regions provide binding sites for structured RNA, without sequence-specificity. Interaction partners modulate RNA binding to the core, or bind to RNA regions emanating from the core. The functional interplay of the helicase core and ancillary domains or interaction partners in RNA binding and unwinding is not entirely understood. This review summarizes our current knowledge on RNA binding to the DEAD-box helicase core and the roles of ancillary domains and interaction partners in RNA binding and unwinding by DEAD-box proteins.

  12. ZifBASE: a database of zinc finger proteins and associated resources.

    PubMed

    Jayakanthan, Mannu; Muthukumaran, Jayaraman; Chandrasekar, Sanniyasi; Chawla, Konika; Punetha, Ankita; Sundar, Durai

    2009-09-09

    Information on the occurrence of zinc finger protein motifs in genomes is crucial to the developing field of molecular genome engineering. The knowledge of their target DNA-binding sequences is vital to develop chimeric proteins for targeted genome engineering and site-specific gene correction. There is a need to develop a computational resource of zinc finger proteins (ZFP) to identify the potential binding sites and its location, which reduce the time of in vivo task, and overcome the difficulties in selecting the specific type of zinc finger protein and the target site in the DNA sequence. ZifBASE provides an extensive collection of various natural and engineered ZFP. It uses standard names and a genetic and structural classification scheme to present data retrieved from UniProtKB, GenBank, Protein Data Bank, ModBase, Protein Model Portal and the literature. It also incorporates specialized features of ZFP including finger sequences and positions, number of fingers, physiochemical properties, classes, framework, PubMed citations with links to experimental structures (PDB, if available) and modeled structures of natural zinc finger proteins. ZifBASE provides information on zinc finger proteins (both natural and engineered ones), the number of finger units in each of the zinc finger proteins (with multiple fingers), the synergy between the adjacent fingers and their positions. Additionally, it gives the individual finger sequence and their target DNA site to which it binds for better and clear understanding on the interactions of adjacent fingers. The current version of ZifBASE contains 139 entries of which 89 are engineered ZFPs, containing 3-7F totaling to 296 fingers. There are 50 natural zinc finger protein entries ranging from 2-13F, totaling to 307 fingers. It has sequences and structures from literature, Protein Data Bank, ModBase and Protein Model Portal. 
The interface is cross-linked to other public databases such as UniProtKB, PDB, ModBase, Protein Model Portal and PubMed to make it more informative. The database maintains information on sequence features, including the class, framework, number of fingers, residues, position, recognition site and physicochemical properties (molecular weight, isoelectric point) of both natural and engineered zinc finger proteins, and the dissociation constants of a few. ZifBASE can provide a more effective and efficient way of accessing zinc finger protein sequences and their target binding sites, with links to their three-dimensional structures. All the data and functions are available at the advanced web-based search interface http://web.iitd.ac.in/~sundar/zifbase.

  13. Tumour suppressor protein p53 regulates the stress activated bilirubin oxidase cytochrome P450 2A6

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hu, Hao, E-mail: hao.hu1@uqconnect.edu.au; Yu, Ting, E-mail: t.yu2@uq.edu.au; Arpiainen, Satu, E-mail: Satu.Juhila@orion.fi

    2015-11-15

    Human cytochrome P450 (CYP) 2A6 enzyme has been proposed to play a role in cellular defence against chemical-induced oxidative stress. The encoding gene is regulated by various stress activated transcription factors. This paper demonstrates that p53 is a novel transcriptional regulator of the gene. Sequence analysis of the CYP2A6 promoter revealed six putative p53 binding sites in a 3 kb proximate promoter region. The site closest to transcription start site (TSS) is highly homologous with the p53 consensus sequence. Transfection with various stepwise deletions of CYP2A6-5′-Luc constructs – down to − 160 bp from the TSS – showed p53 responsiveness in p53 overexpressed C3A cells. However, a further deletion from − 160 to − 74 bp, including the putative p53 binding site, totally abolished the p53 responsiveness. Electrophoretic mobility shift assay with a probe containing the putative binding site showed specific binding of p53. A point mutation at the binding site abolished both the binding and responsiveness of the recombinant gene to p53. Up-regulation of the endogenous p53 with benzo[α]pyrene – a well-known p53 activator – increased the expression of the p53 responsive positive control and the CYP2A6-5′-Luc construct containing the intact p53 binding site but not the mutated CYP2A6-5′-Luc construct. Finally, inducibility of the native CYP2A6 gene by benzo[α]pyrene was demonstrated by dose-dependent increases in CYP2A6 mRNA and protein levels along with increased p53 levels in the nucleus. Collectively, the results indicate that p53 protein is a regulator of the CYP2A6 gene in C3A cells and further support the putative cytoprotective role of CYP2A6. - Highlights: • CYP2A6 is an immediate target gene of p53. • Six putative p53REs located on 3 kb proximate CYP2A6 promoter region. • The region − 160 bp from TSS is highly homologous with the p53 consensus sequence. • P53 specifically bind to the p53RE on the − 160 bp region. • HNF4α may interact with p53 in regulating CYP2A6 expression.

  14. Intrasteric control of AMPK via the gamma1 subunit AMP allosteric regulatory site.

    PubMed

    Adams, Julian; Chen, Zhi-Ping; Van Denderen, Bryce J W; Morton, Craig J; Parker, Michael W; Witters, Lee A; Stapleton, David; Kemp, Bruce E

    2004-01-01

    AMP-activated protein kinase (AMPK) is an alpha-beta-gamma heterotrimer that is activated in response to both hormones and intracellular metabolic stress signals. AMPK is regulated by phosphorylation on the alpha subunit and by AMP allosteric control previously thought to be mediated by both alpha and gamma subunits. Here we present evidence that adjacent gamma subunit pairs of CBS repeat sequences (after Cystathionine Beta Synthase) form an AMP binding site related to, but distinct from the classical AMP binding site in phosphorylase, that can also bind ATP. The AMP binding site of the gamma(1) CBS1/CBS2 pair, modeled on the structures of the CBS sequences present in the inosine monophosphate dehydrogenase crystal structure, contains three arginine residues 70, 152, and 171 and His151. The yeast gamma homolog, snf4, contains a His151Gly substitution; when this is introduced into gamma(1), AMP allosteric control is substantially lost, which explains why the yeast snf1p/snf4p complex is insensitive to AMP. Arg70 in gamma(1) corresponds to the site of mutation in human gamma(2) and pig gamma(3) genes previously identified to cause an unusual cardiac phenotype and glycogen storage disease, respectively. Mutation of any of the AMP binding site Arg residues to Gln substantially abolishes AMP allosteric control in expressed AMPK holoenzyme. The Arg/Gln mutations also suppress the previously described inhibitory properties of ATP and render the enzyme constitutively active. We propose that ATP acts as an intrasteric inhibitor by bridging the alpha and gamma subunits and that AMP functions to derepress AMPK activity.

  15. Footprinting reveals that nogalamycin and actinomycin shuffle between DNA binding sites.

    PubMed Central

    Fox, K R; Waring, M J

    1986-01-01

    The hypothesis that sequence-selective DNA-binding antibiotics locate their preferred binding sites by a process involving migration from nonspecific sites has been tested by footprinting with DNAase I. Footprinting patterns on the tyrT DNA fragment produced by nogalamycin and actinomycin change with time after mixing the antibiotic with the DNA. Sites of protection as well as enhanced cleavage are seen to develop in a fashion which is both temperature and concentration-dependent. At certain sites cutting is transiently enhanced, then blocked. Limited evidence for slow reaction with echinomycin and mithramycin is presented, but the kinetics of footprinting with daunomycin and distamycin appear instantaneous. The feasibility of adducing direct evidence for shuffling by footprinting seems to be governed by slow dissociation of the antibiotic-DNA complex. It may also be dependent upon the mode of binding, be it intercalative or non-intercalative in character. PMID:2421246

  16. Mapping of RNA accessible sites by extension of random oligonucleotide libraries with reverse transcriptase.

    PubMed Central

    Allawi, H T; Dong, F; Ip, H S; Neri, B P; Lyamichev, V I

    2001-01-01

    A rapid and simple method for determining accessible sites in RNA that is independent of the length of target RNA and does not require RNA labeling is described. In this method, target RNA is allowed to hybridize with sequence-randomized libraries of DNA oligonucleotides linked to a common tag sequence at their 5'-end. Annealed oligonucleotides are extended with reverse transcriptase and the extended products are then amplified by using PCR with a primer corresponding to the tag sequence and a second primer specific to the target RNA sequence. We used the combination of both the lengths of the RT-PCR products and the location of the binding site of the RNA-specific primer to determine which regions of the RNA molecules were RNA extendible sites, that is, sites available for oligonucleotide binding and extension. We then employed this reverse transcription with the random oligonucleotide libraries (RT-ROL) method to determine the accessible sites on four mRNA targets, human activated ras (ha-ras), human intercellular adhesion molecule-1 (ICAM-1), rabbit beta-globin, and human interferon-gamma (IFN-gamma). Our results were concordant with those of other researchers who had used RNase H cleavage or hybridization with arrays of oligonucleotides to identify accessible sites on some of these targets. Further, we found good correlation between sites when we compared the location of extendible sites identified by RT-ROL with hybridization sites of effective antisense oligonucleotides on ICAM-1 mRNA in antisense inhibition studies. Finally, we discuss the relationship between RNA extendible sites and RNA accessibility. PMID:11233988

  17. CCCTC-Binding Factor Acts as a Heterochromatin Barrier on Herpes Simplex Viral Latent Chromatin and Contributes to Poised Latent Infection

    PubMed Central

    2018-01-01

    Herpes simplex virus 1 (HSV-1) establishes latent infection in neurons via a variety of epigenetic mechanisms that silence its genome. The cellular CCCTC-binding factor (CTCF) functions as a mediator of transcriptional control and chromatin organization and has binding sites in the HSV-1 genome. We constructed an HSV-1 deletion mutant that lacked a pair of CTCF-binding sites (CTRL2) within the latency-associated transcript (LAT) coding sequences and found that loss of these CTCF-binding sites did not alter lytic replication or levels of establishment of latent infection, but their deletion reduced the ability of the virus to reactivate from latent infection. We also observed increased heterochromatin modifications on viral chromatin over the LAT promoter and intron. We therefore propose that CTCF binding at the CTRL2 sites acts as a chromatin insulator to keep viral chromatin in a form that is poised for reactivation, a state which we call poised latency. PMID:29437926

  18. Development of a sugar-binding residue prediction system from protein sequences using support vector machine.

    PubMed

    Banno, Masaki; Komiyama, Yusuke; Cao, Wei; Oku, Yuya; Ueki, Kokoro; Sumikoshi, Kazuya; Nakamura, Shugo; Terada, Tohru; Shimizu, Kentaro

    2017-02-01

    Several methods have been proposed for protein-sugar binding site prediction using machine learning algorithms. However, they are not effective at learning the various properties of binding site residues that arise from the various interactions between proteins and sugars. In this study, we classified sugars into acidic and nonacidic sugars and showed that their binding sites have different amino acid occurrence frequencies. By using this result, we developed sugar-binding residue predictors dedicated to the two classes of sugars: an acidic sugar binding predictor and a nonacidic sugar binding predictor. We also developed a combination predictor which combines the results of the two predictors. We showed that when a sugar is known to be an acidic sugar, the acidic sugar binding predictor achieves the best performance, and showed that when a sugar is known to be a nonacidic sugar or is not known to be either of the two classes, the combination predictor achieves the best performance. Our method uses only amino acid sequences for prediction. Support vector machine was used as a machine learning algorithm and the position-specific scoring matrix created by the position-specific iterative basic local alignment search tool was used as the feature vector. We evaluated the performance of the predictors using five-fold cross-validation. We have launched our system, as an open source freeware tool on the GitHub repository (https://doi.org/10.5281/zenodo.61513). Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  19. RNABindRPlus: a predictor that combines machine learning and sequence homology-based methods to improve the reliability of predicted RNA-binding residues in proteins.

    PubMed

    Walia, Rasna R; Xue, Li C; Wilkins, Katherine; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant

    2014-01-01

    Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. 
    An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence-derived features of RNA-binding proteins. A webserver implementation of both methods is freely available at http://einstein.cs.iastate.edu/RNABindRPlus/.
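
    The MCC values quoted in this record are Matthews correlation coefficients computed from per-residue confusion counts. As a minimal sketch of that metric (the function name and the zero-denominator convention are assumptions for illustration, not taken from the paper):

```python
import math

def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/fp/tn/fn: counts of residues predicted binding or non-binding.
    Returns 0.0 when any marginal total is zero (a common convention,
    assumed here rather than taken from the source).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

    A perfect predictor scores 1, a perfectly inverted one scores -1, and a predictor no better than chance scores 0, which is why MCC is often preferred over accuracy when binding residues are a small minority.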

  20. Human La binds mRNAs through contacts to the poly(A) tail

    PubMed Central

    Vinayak, Jyotsna; Marrella, Stefano A; Hussain, Rawaa H; Rozenfeld, Leonid; Solomon, Karine; Bayfield, Mark A

    2018-01-01

    In addition to a role in the processing of nascent RNA polymerase III transcripts, La proteins are also associated with promoting cap-independent translation from the internal ribosome entry sites of numerous cellular and viral coding RNAs. La binding to RNA polymerase III transcripts via their common UUU-3’OH motif is well characterized, but the mechanism of La binding to coding RNAs is poorly understood. Using electromobility shift assays and cross-linking immunoprecipitation, we show that in addition to a sequence specific UUU-3’OH binding mode, human La exhibits a sequence specific and length dependent poly(A) binding mode. We demonstrate that this poly(A) binding mode uses the canonical nucleic acid interaction winged helix face of the eponymous La motif, previously shown to be vacant during uridylate binding. We also show that cytoplasmic, but not nuclear La, engages poly(A) RNA in human cells, that La entry into polysomes utilizes the poly(A) binding mode, and that La promotion of translation from the cyclin D1 internal ribosome entry site occurs in competition with cytoplasmic poly(A) binding protein (PABP). Our data are consistent with human La functioning in translation through contacts to the poly(A) tail. PMID:29447394

  1. Studies on DNA-binding selectivity of WRKY transcription factors lend structural clues into WRKY-domain function.

    PubMed

    Ciolkowski, Ingo; Wanke, Dierk; Birkenbihl, Rainer P; Somssich, Imre E

    2008-09-01

    WRKY transcription factors have been shown to play a major role in regulating, both positively and negatively, the plant defense transcriptome. Nearly all studied WRKY factors appear to have a stereotypic binding preference to one DNA element termed the W-box. How specificity for certain promoters is accomplished therefore remains completely unknown. In this study, we tested five distinct Arabidopsis WRKY transcription factor subfamily members for their DNA binding selectivity towards variants of the W-box embedded in neighboring DNA sequences. These studies revealed for the first time differences in their binding site preferences, which are partly dependent on additional adjacent DNA sequences outside of the TTGACY-core motif. A consensus WRKY binding site derived from these studies was used for in silico analysis to identify potential target genes within the Arabidopsis genome. Furthermore, we show that even subtle amino acid substitutions within the DNA binding region of AtWRKY11 strongly impinge on its binding activity. Additionally, all five factors were found localized exclusively to the plant cell nucleus and to be capable of trans-activating expression of a reporter gene construct in vivo.
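
    The in silico step in this record starts from the TTGACY core (Y = C or T). The sketch below shows only a bare core scan on both strands; it is an illustration, not the authors' consensus, which also depends on adjacent sequence. The helper names are hypothetical.

```python
import re

# W-box core from the abstract: TTGACY, where Y is C or T.
_WBOX_CORE = re.compile(r"(?=(TTGAC[CT]))")  # lookahead keeps overlapping hits
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_wbox_cores(seq):
    """Return sorted (start, strand, match) tuples for TTGACY cores.

    start is 0-based on the forward strand; for '-' hits it is the
    left-most forward-strand position covered by the 6-bp core.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in _WBOX_CORE.finditer(seq)]
    rc = reverse_complement(seq)
    for m in _WBOX_CORE.finditer(rc):
        hits.append((len(seq) - m.start() - 6, "-", m.group(1)))
    return sorted(hits)
```

    For the made-up sequence AATTGACCGGGGTCAAAT this reports one core on each strand, at forward positions 2 and 10.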

  2. The adenovirus L4-22K protein regulates transcription and RNA splicing via a sequence-specific single-stranded RNA binding.

    PubMed

    Lan, Susan; Kamel, Wael; Punga, Tanel; Akusjärvi, Göran

    2017-02-28

    The adenovirus L4-22K protein both activates and suppresses transcription from the adenovirus major late promoter (MLP) by binding to DNA elements located downstream of the MLP transcriptional start site: the so-called DE element (positive) and the R1 region (negative). Here we show that L4-22K preferentially binds to the RNA form of the R1 region, both to the double-stranded RNA and the single-stranded RNA of the same polarity as the nascent MLP transcript. Further, L4-22K binds to a 5΄-CAAA-3΄ motif in the single-stranded RNA, which is identical to the sequence motif characterized for L4-22K DNA binding. L4-22K binding to single-stranded RNA results in an enhancement of U1 snRNA recruitment to the major late first leader 5΄ splice site. This increase in U1 snRNA binding results in a suppression of MLP transcription and a concurrent stimulation of major late first intron splicing. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Propeptide cleavage conditions sortilin/neurotensin receptor-3 for ligand binding.

    PubMed

    Munck Petersen, C; Nielsen, M S; Jacobsen, C; Tauris, J; Jacobsen, L; Gliemann, J; Moestrup, S K; Madsen, P

    1999-02-01

    We recently reported the isolation and sequencing of sortilin, a new putative sorting receptor that binds receptor-associated protein (RAP). The luminal N-terminus of sortilin comprises a consensus sequence for cleavage by furin, R41WRR44, which precedes a truncation originally found in sortilin isolated from human brain. We now show that the truncation results from cellular processing. Sortilin is synthesized as a proform which, in late Golgi compartments, is converted to the mature receptor by furin-mediated cleavage of a 44 residue N-terminal propeptide. We further demonstrate that the propeptide exhibits pH-dependent high affinity binding to fully processed sortilin, that the binding is competed for by RAP and the newly discovered sortilin ligand neurotensin, and that prevention of propeptide cleavage essentially prevents binding of RAP and neurotensin. The findings evidence that the propeptide sterically hinders ligands from gaining access to overlapping binding sites in prosortilin, and that cleavage and release of the propeptide preconditions sortilin for full functional activity. Although proteolytic processing is involved in the maturation of several receptors, the described exposure of previously concealed ligand-binding sites after furin-mediated cleavage of propeptide represents a novel mechanism in receptor activation.

  4. Propeptide cleavage conditions sortilin/neurotensin receptor-3 for ligand binding.

    PubMed Central

    Munck Petersen, C; Nielsen, M S; Jacobsen, C; Tauris, J; Jacobsen, L; Gliemann, J; Moestrup, S K; Madsen, P

    1999-01-01

    We recently reported the isolation and sequencing of sortilin, a new putative sorting receptor that binds receptor-associated protein (RAP). The luminal N-terminus of sortilin comprises a consensus sequence for cleavage by furin, R41WRR44, which precedes a truncation originally found in sortilin isolated from human brain. We now show that the truncation results from cellular processing. Sortilin is synthesized as a proform which, in late Golgi compartments, is converted to the mature receptor by furin-mediated cleavage of a 44 residue N-terminal propeptide. We further demonstrate that the propeptide exhibits pH-dependent high affinity binding to fully processed sortilin, that the binding is competed for by RAP and the newly discovered sortilin ligand neurotensin, and that prevention of propeptide cleavage essentially prevents binding of RAP and neurotensin. The findings evidence that the propeptide sterically hinders ligands from gaining access to overlapping binding sites in prosortilin, and that cleavage and release of the propeptide preconditions sortilin for full functional activity. Although proteolytic processing is involved in the maturation of several receptors, the described exposure of previously concealed ligand-binding sites after furin-mediated cleavage of propeptide represents a novel mechanism in receptor activation. PMID:9927419

  5. Identification and positional distribution analysis of transcription factor binding sites for genes from the wheat fl-cDNA sequences.

    PubMed

    Chen, Zhen-Yong; Guo, Xiao-Jiang; Chen, Zhong-Xu; Chen, Wei-Ying; Wang, Ji-Rui

    2017-06-01

    The binding sites of transcription factors (TFs) in upstream DNA regions are called transcription factor binding sites (TFBSs). TFBSs are important elements for regulating gene expression. To date, there have been few studies on the profiles of TFBSs in plants. In total, 4873 sequences with 5' upstream regions from 8530 wheat fl-cDNA sequences were used to predict TFBSs. We found 4572 TFBSs for the MADS TF family, which was twice as many as for bHLH (1951), B3 (1951), HB superfamily (1914), ERF (1820), and AP2/ERF (1725) TFs, and was approximately four times higher than the remaining TFBS types. The percentage of TFBSs and TF members showed a distinct distribution in different tissues. Overall, the distribution of TFBSs in the upstream regions of wheat fl-cDNA sequences differed significantly. Meanwhile, high frequencies of some types of TFBSs were found in specific regions in the upstream sequences. Both TFs and fl-cDNA with TFBSs predicted in the same tissues exhibited specific distribution preferences for regulating gene expression. The tissue-specific analysis of TFs and fl-cDNA with TFBSs provides useful information for functional research, and can be used to identify relationships between tissue-specific TFs and fl-cDNA with TFBSs. Moreover, the positional distribution of TFBSs indicates that some types of wheat TFBS have different positional distribution preferences in the upstream regions of genes.

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Osipiuk, J.; Gornicki, P.; Maj, L.

    The structure of the YlxR protein of unknown function from Streptococcus pneumonia was determined to 1.35 Angstroms. YlxR is expressed from the nusA/infB operon in bacteria and belongs to a small protein family (COG2740) that shares a conserved sequence motif GRGA(Y/W). The family shows no significant amino-acid sequence similarity with other proteins. Three-wavelength diffraction MAD data were collected to 1.7 Angstroms from orthorhombic crystals using synchrotron radiation and the structure was determined using a semi-automated approach. The YlxR structure resembles a two-layer alpha/beta sandwich with the overall shape of a cylinder and shows no structural homology to proteins of known structure. Structural analysis revealed that the YlxR structure represents a new protein fold that belongs to the alpha-beta plait superfamily. The distribution of the electrostatic surface potential shows a large positively charged patch on one side of the protein, a feature often found in nucleic acid-binding proteins. Three sulfate ions bind to this positively charged surface. Analysis of potential binding sites uncovered several substantial clefts, with the largest spanning 3/4 of the protein. A similar distribution of binding sites and a large sharply bent cleft are observed in RNA-binding proteins that are unrelated in sequence and structure. It is proposed that YlxR is an RNA-binding protein.

  7. Streptococcus pneumonia YlxR at 1.35 A shows a putative new fold.

    PubMed

    Osipiuk, J; Górnicki, P; Maj, L; Dementieva, I; Laskowski, R; Joachimiak, A

    2001-11-01

    The structure of the YlxR protein of unknown function from Streptococcus pneumonia was determined to 1.35 A. YlxR is expressed from the nusA/infB operon in bacteria and belongs to a small protein family (COG2740) that shares a conserved sequence motif GRGA(Y/W). The family shows no significant amino-acid sequence similarity with other proteins. Three-wavelength diffraction MAD data were collected to 1.7 A from orthorhombic crystals using synchrotron radiation and the structure was determined using a semi-automated approach. The YlxR structure resembles a two-layer alpha/beta sandwich with the overall shape of a cylinder and shows no structural homology to proteins of known structure. Structural analysis revealed that the YlxR structure represents a new protein fold that belongs to the alpha-beta plait superfamily. The distribution of the electrostatic surface potential shows a large positively charged patch on one side of the protein, a feature often found in nucleic acid-binding proteins. Three sulfate ions bind to this positively charged surface. Analysis of potential binding sites uncovered several substantial clefts, with the largest spanning 3/4 of the protein. A similar distribution of binding sites and a large sharply bent cleft are observed in RNA-binding proteins that are unrelated in sequence and structure. It is proposed that YlxR is an RNA-binding protein.

  8. Changes in solvation during DNA binding and cleavage are critical to altered specificity of the EcoRI endonuclease

    PubMed Central

    Robinson, Clifford R.; Sligar, Stephen G.

    1998-01-01

    Restriction endonucleases such as EcoRI bind and cleave DNA with great specificity and represent a paradigm for protein–DNA interactions and molecular recognition. Using osmotic pressure to induce water release, we demonstrate the participation of bound waters in the sequence discrimination of substrate DNA by EcoRI. Changes in solvation can play a critical role in directing sequence-specific DNA binding by EcoRI and are also crucial in assisting site discrimination during catalysis. By measuring the volume change for complex formation, we show that at the cognate sequence (GAATTC) EcoRI binding releases about 70 fewer water molecules than binding at an alternate DNA sequence (TAATTC), which differs by a single base pair. EcoRI complexation with nonspecific DNA releases substantially less water than either of these specific complexes. In cognate substrates (GAATTC) kcat decreases as osmotic pressure is increased, indicating the binding of about 30 water molecules accompanies the cleavage reaction. For the alternate substrate (TAATTC), release of about 40 water molecules accompanies the reaction, indicated by a dramatic acceleration of the rate when osmotic pressure is raised. These large differences in solvation effects demonstrate that water molecules can be key players in the molecular recognition process during both association and catalytic phases of the EcoRI reaction, acting to change the specificity of the enzyme. For both the protein–DNA complex and the transition state, there may be substantial conformational differences between cognate and alternate sites, accompanied by significant alterations in hydration and solvent accessibility. PMID:9482860

  9. The spacing between adjacent binding sites in the family of repeats affects the functions of Epstein-Barr nuclear antigen 1 in transcription activation and stable plasmid maintenance.

    PubMed

    Hebner, Christy; Lasanen, Julie; Battle, Scott; Aiyar, Ashok

    2003-07-05

    Epstein-Barr virus (EBV) and the closely related Herpesvirus papio (HVP) are stably replicated as episomes in proliferating latently infected cells. Maintenance and partitioning of these viral plasmids requires a viral sequence in cis, termed the family of repeats (FR), that is bound by a viral protein, Epstein-Barr nuclear antigen 1 (EBNA1). Upon binding FR, EBNA1 maintains viral genomes in proliferating cells and activates transcription from viral promoters required for immortalization. FR from either virus encodes multiple binding sites for the viral maintenance protein, EBNA1, with the FR from the prototypic B95-8 strain of EBV containing 20 binding sites, and FR from HVP containing 8 binding sites. In addition to differences in the number of EBNA1-binding sites, adjacent binding sites in the EBV FR are typically separated by 14 base pairs (bp), but are separated by 10 bp in HVP. We tested whether the number of binding sites, as well as the distance between adjacent binding sites, affects the function of EBNA1 in transcription activation or plasmid maintenance. Our results indicate that EBNA1 activates transcription more efficiently when adjacent binding sites are separated by 10 bp, the spacing observed in HVP. In contrast, using two separate assays, we demonstrate that plasmid maintenance is greatly augmented when adjacent EBNA1-binding sites are separated by 14 bp, and therefore, presumably lie on the same face of the DNA double helix. These results provide indication that the functions of EBNA1 in transcription activation and plasmid maintenance are separable.

  10. Structure-based Analysis of HU-DNA Binding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Swinger, K.; Rice, P.

    2007-01-01

    HU and IHF are prokaryotic proteins that induce very large bends in DNA. They are present in high concentrations in the bacterial nucleoid and aid in chromosomal compaction. They also function as regulatory cofactors in many processes, such as site-specific recombination and the initiation of replication and transcription. HU and IHF have become paradigms for understanding DNA bending and indirect readout of sequence. While IHF shows significant sequence specificity, HU binds preferentially to certain damaged or distorted DNAs. However, none of the structurally diverse HU substrates previously studied in vitro is identical with the distorted substrates in the recently published Anabaena HU(AHU)-DNA cocrystal structures. Here, we report binding affinities for AHU and the DNA in the cocrystal structures. The binding free energies for formation of these AHU-DNA complexes range from 10-14.5 kcal/mol, representing K_d values in the nanomolar to low picomolar range, and a maximum stabilization of at least 6.3 kcal/mol relative to complexes with undistorted, non-specific DNA. We investigated IHF binding and found that appropriate structural distortions can greatly enhance its affinity. On the basis of the coupling of structural and relevant binding data, we estimate the amount of conformational strain in an IHF-mediated DNA kink that is relieved by a nick (at least 0.76 kcal/mol) and pinpoint the location of the strain. We show that AHU has a sequence preference for an A+T-rich region in the center of its DNA-binding site, correlating with an unusually narrow minor groove. This is similar to sequence preferences shown by the eukaryotic nucleosome.
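
    The link between the quoted binding free energies and dissociation constants follows from K_d = exp(-|ΔG|/RT). The sketch below is a rough check under two assumptions not stated in the abstract: a temperature of 298.15 K and a 1 M standard state. The function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def kd_from_binding_free_energy(dg_kcal_per_mol, temp_k=298.15):
    """Dissociation constant (molar) from the magnitude of a favorable
    binding free energy in kcal/mol.

    temp_k defaults to 298.15 K, an assumption for illustration only.
    """
    return math.exp(-abs(dg_kcal_per_mol) / (R_KCAL * temp_k))

weak = kd_from_binding_free_energy(10.0)    # lower end of the quoted range
tight = kd_from_binding_free_energy(14.5)   # upper end of the quoted range
```

    Under these assumptions the two ends of the range come out at roughly 5e-8 M and 2e-11 M, consistent with the "nanomolar to low picomolar" description in the record.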

  11. NMR and computational methods applied to the 3-dimensional structure determination of DNA and ligand-DNA complexes in solution

    NASA Astrophysics Data System (ADS)

    Smith, Jarrod Anson

    2D homonuclear 1H NMR methods and restrained molecular dynamics (rMD) calculations have been applied to determining the three-dimensional structures of DNA and minor groove-binding ligand-DNA complexes in solution. The structure of the DNA decamer sequence d(GCGTTAACGC)2 has been solved both with a distance-based rMD protocol and an NOE relaxation matrix backcalculation-based protocol in order to probe the relative merits of the different refinement methods. In addition, three minor groove binding ligand-DNA complexes have been examined. The solution structure of the oligosaccharide moiety of the antitumor DNA scission agent calicheamicin γ1I has been determined in complex with a decamer duplex containing its high affinity 5'-TCCT-3' binding sequence. The structure of the complex reinforces the belief that the oligosaccharide moiety is responsible for the sequence selective minor-groove binding activity of the agent, and critical intermolecular contacts are revealed. The solution structures of both the (+) and (-) enantiomers of the minor groove binding DNA alkylating agent duocarmycin SA have been determined in covalent complex with the undecamer DNA duplex d(GACTAATTGTC).d(GACAATTAGTC). The results support the proposal that the alkylation activity of the duocarmycin antitumor antibiotics is catalyzed by a binding-induced conformational change in the ligand which activates the cyclopropyl group for reaction with the DNA. Comparisons between the structures of the two enantiomers covalently bound to the same DNA sequence at the same 5'-AATTA-3' site have provided insight into the binding orientation and site selectivity, as well as the relative rates of reactivity of these two agents.

  12. Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.

    2003-06-01

    OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.

  13. The Sequence-specific Peptide-binding Activity of the Protein Sulfide Isomerase AGR2 Directs Its Stable Binding to the Oncogenic Receptor EpCAM.

    PubMed

    Mohtar, M Aiman; Hernychova, Lenka; O'Neill, J Robert; Lawrence, Melanie L; Murray, Euan; Vojtesek, Borek; Hupp, Ted R

    2018-04-01

    AGR2 is an oncogenic endoplasmic reticulum (ER)-resident protein disulfide isomerase. AGR2 protein has a relatively unique property for a chaperone in that it can bind sequence-specifically to a specific peptide motif (TTIYY). A synthetic TTIYY-containing peptide column was used to affinity-purify AGR2 from crude lysates highlighting peptide selectivity in complex mixtures. Hydrogen-deuterium exchange mass spectrometry localized the dominant region in AGR2 that interacts with the TTIYY peptide to within a structural loop from amino acids 131-135 (VDPSL). A peptide binding site consensus of Tx[IL][YF][YF] was developed for AGR2 by measuring its activity against a mutant peptide library. Screening the human proteome for proteins harboring this motif revealed an enrichment in transmembrane proteins and we focused on validating EpCAM as a potential AGR2-interacting protein. AGR2 and EpCAM proteins formed a dose-dependent protein-protein interaction in vitro. Proximity ligation assays demonstrated that endogenous AGR2 and EpCAM protein associate in cells. Introducing a single alanine mutation in EpCAM at Tyr251 attenuated its binding to AGR2 in vitro and in cells. Hydrogen-deuterium exchange mass spectrometry was used to identify a stable binding site for AGR2 on EpCAM, adjacent to the TLIYY motif and surrounding EpCAM's detergent binding site. These data define a dominant site on AGR2 that mediates its specific peptide-binding function. EpCAM forms a model client protein for AGR2 to study how an ER-resident chaperone can dock specifically to a peptide motif and regulate the trafficking of a protein destined for the secretory pathway. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.
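
    The consensus Tx[IL][YF][YF] quoted above maps directly onto a regular expression, with x read as any residue. The sketch below is an assumed illustration of a motif screen, not the authors' proteome pipeline; the function name and the toy sequence are made up.

```python
import re

# Tx[IL][YF][YF] from the abstract; "x" is taken to mean any amino acid.
AGR2_MOTIF = re.compile(r"T.[IL][YF][YF]")

def find_motif_sites(protein_seq):
    """Return (1-based position, matched pentapeptide) for each motif hit
    in a one-letter-code protein sequence."""
    seq = protein_seq.upper()
    return [(m.start() + 1, m.group()) for m in AGR2_MOTIF.finditer(seq)]

# Toy sequence (not a real protein) containing TTIYY and TLIYY.
sites = find_motif_sites("MKTTIYYAAATLIYYG")
```

    Both pentapeptides named in the record, TTIYY and TLIYY, satisfy the pattern, so the toy sequence yields hits at positions 3 and 11.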

  14. Characterization and modification of phage T7 DNA polymerase for use in DNA sequencing; Progress report, June 1, 1990--May 31, 1993

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Richardson, C.C.

    1993-12-31

    This project focuses on the DNA polymerase (gene 5 protein) of phage T7 for use in DNA sequence analysis. Gene 5 protein interacts with accessory proteins to acquire properties essential for DNA replication. One goal is to understand these interactions in order to modify the proteins for use in DNA sequencing. E. coli thioredoxin binds to gene 5 protein and clamps it to a primer-template. They have analyzed the binding of gene 5 protein-thioredoxin to primer-templates and have defined the optimal conditions to form an extremely stable complex with a dNTP in the polymerase catalytic site. The spatial proximity of these components has been determined using fluorescence emission anisotropy. The T7 DNA binding protein, the gene 2.5 protein, interacts with gene 5 protein and gene 4 protein to increase processivity and primer synthesis, respectively. Mutant gene 2.5 proteins have been isolated that do not interact with T7 DNA polymerase and cannot support T7 growth. The nucleotide binding site of the T7 helicase has been identified and mutations affecting the site provide information on how the hydrolysis of NTPs fuels its unidirectional translocation. The sequence GTC has been shown to be necessary and sufficient for recognition by the T7 primase. The T7 gene 5.5 protein interacts with the E. coli nucleoid protein, H-NS, and also overcomes the phage lambda rex restriction system.

  15. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements.

    PubMed

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-12-01

    The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O.sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z.mays. BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller. Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
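
    To illustrate what a "word over the IUPAC alphabet" means in practice, the sketch below expands a degenerate word into a regular expression and counts how many promoters contain it. This is not BLSSpeller itself (which is written in Java) and it does not compute the branch length score; the function names and toy promoter sequences are assumptions made purely for illustration.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

def iupac_to_regex(word):
    """Compile a degenerate IUPAC word into a regular expression."""
    return re.compile("".join(IUPAC[c] for c in word.upper()))

def count_species_with_motif(word, promoters):
    """Count species whose promoter contains at least one match of the word.

    A plain presence count stands in here for a real conservation score.
    """
    pattern = iupac_to_regex(word)
    return sum(1 for seq in promoters.values() if pattern.search(seq.upper()))

# Hypothetical toy promoters keyed by made-up species labels.
toy_promoters = {
    "species_1": "TTCACGTGAA",
    "species_2": "GGCACGTAAT",
    "species_3": "CCCCCCCC",
}
print(count_species_with_motif("CACGTR", toy_promoters))
```

    A phylogeny-aware score such as the branch length score would weight these hits by the evolutionary distance spanned by the species that share them, rather than simply counting them.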

  16. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements

    PubMed Central

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-01-01

    Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O.sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z.mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488

  17. The lytic origin of herpesvirus papio is highly homologous to Epstein-Barr virus ori-Lyt: evolutionary conservation of transcriptional activation and replication signals.

    PubMed Central

    Ryon, J J; Fixman, E D; Houchens, C; Zong, J; Lieberman, P M; Chang, Y N; Hayward, G S; Hayward, S D

    1993-01-01

    Herpesvirus papio (HVP) is a B-lymphotropic baboon virus with an estimated 40% homology to Epstein-Barr virus (EBV). We have cloned and sequenced ori-Lyt of herpesvirus papio and found a striking degree of nucleotide homology (89%) with ori-Lyt of EBV. Transcriptional elements form an integral part of EBV ori-Lyt. The promoter and enhancer domains of EBV ori-Lyt are conserved in herpesvirus papio. The EBV ori-Lyt promoter contains four binding sites for the EBV lytic cycle transactivator Zta, and the enhancer includes one Zta and two Rta response elements. All five of the Zta response elements and one of the Rta motifs are conserved in HVP ori-Lyt, and the HVP DS-L leftward promoter and the enhancer were activated in transient transfection assays by the EBV Zta and Rta transactivators. The EBV ori-Lyt enhancer contains a palindromic sequence, GGTCAGCTGACC, centered on a PvuII restriction site. This sequence, with a single base change, is also present in the HVP ori-Lyt enhancer. DNase I footprinting demonstrated that the PvuII sequence was bound by a protein present in a Raji nuclear extract. Mobility shift and competition assays using oligonucleotide probes identified this sequence as a binding site for the cellular transcription factor MLTF. Mutagenesis of the binding site indicated that MLTF contributes significantly to the constitutive activity of the ori-Lyt enhancer. The high degree of conservation of cis-acting signal sequences in HVP ori-Lyt was further emphasized by the finding that an HVP ori-Lyt-containing plasmid was replicated in Vero cells by a set of cotransfected EBV replication genes. The central domain of EBV ori-Lyt contains two related AT-rich palindromes, one of which is partially duplicated in the HVP sequence. The AT-rich palindromes are functionally important cis-acting motifs. Deletion of these palindromes severely diminished replication of an ori-Lyt target plasmid. PMID:8389916
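
    "Palindromic" here means that the sequence equals its own reverse complement. A small sketch, using only the sequence given in the abstract and the PvuII recognition site CAGCTG, shows how this can be checked; the helper names are illustrative.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(seq):
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)

enhancer_site = "GGTCAGCTGACC"  # sequence quoted in the abstract
pvuii_site = "CAGCTG"           # PvuII recognition sequence

start = enhancer_site.find(pvuii_site)
flank_left = start
flank_right = len(enhancer_site) - (start + len(pvuii_site))

print(is_palindromic(enhancer_site))  # True
print(flank_left, flank_right)        # 3 3: the PvuII site is centered
```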

  18. Finding the target sites of RNA-binding proteins

    PubMed Central

    Li, Xiao; Kazan, Hilal; Lipshitz, Howard D; Morris, Quaid D

    2014-01-01

    RNA–protein interactions differ from DNA–protein interactions because of the central role of RNA secondary structure. Some RNA-binding domains (RBDs) recognize their target sites mainly by their shape and geometry and others are sequence-specific but are sensitive to secondary structure context. A number of small- and large-scale experimental approaches have been developed to measure RNAs associated in vitro and in vivo with RNA-binding proteins (RBPs). Generalizing outside of the experimental conditions tested by these assays requires computational motif finding. Often RBP motif finding is done by adapting DNA motif finding methods; but modeling secondary structure context leads to better recovery of RBP-binding preferences. Genome-wide assessment of mRNA secondary structure has recently become possible, but these data must be combined with computational predictions of secondary structure before they add value in predicting in vivo binding. There are two main approaches to incorporating structural information into motif models: supplementing primary sequence motif models with preferred secondary structure contexts (e.g., MEMERIS and RNAcontext) and directly modeling secondary structure recognized by the RBP using stochastic context-free grammars (e.g., CMfinder and RNApromo). The former better reconstruct known binding preferences for sequence-specific RBPs but are not suitable for modeling RBPs that recognize shape and geometry of RNAs. Future work in RBP motif finding should incorporate interactions between multiple RBDs and multiple RBPs in binding to RNA. WIREs RNA 2014, 5:111–130. doi: 10.1002/wrna.1201 PMID:24217996

  19. Chorion gene activation and repression is dependent on BmC/EBP expression and binding to cognate cis-elements.

    PubMed

    Papantonis, Argyris; Sourmeli, Sissy; Lecanidou, Rena

    2008-05-09

    Among the different cis-elements clustered on silkmoth chorion gene promoters, C/EBP binding sites predominate. Their sequence composition and dispersal vary amongst promoters of diverse developmental specificity. Occupancy of these sites by BmC/EBP was examined through Southwestern and ChIP assays modified to suit ovarian follicular cells. For the genes studied, binding of BmC/EBP coincided with the respective stages of transcriptional activation. However, the factor was reloaded on promoter sequences long after individual gene repression. Furthermore, suppression of BmC/EBP transcription in developing follicles resulted in de-regulation of chorion gene expression. A biphasic function of BmC/EBP, according to which it may act as both an activator and a repressor during silkmoth choriogenesis, is considered in light of the presented data.

  20. Predicting conformational ensembles and genome-wide transcription factor binding sites from DNA sequences.

    PubMed

    Andrabi, Munazah; Hutchins, Andrew Paul; Miranda-Saavedra, Diego; Kono, Hidetoshi; Nussinov, Ruth; Mizuguchi, Kenji; Ahmad, Shandar

    2017-06-22

    DNA shape is emerging as an important determinant of transcription factor binding beyond just the DNA sequence. The only tool for large-scale DNA shape estimates, DNAshape, was derived from Monte-Carlo simulations and predicts four broad and static DNA shape features: Propeller twist, Helical twist, Minor groove width and Roll. The contributions of other shape features, e.g. Shift, Slide and Opening, cannot be evaluated using DNAshape. Here, we report a novel method, DynaSeq, which predicts molecular dynamics-derived ensembles of a more exhaustive set of DNA shape features. We compared the DNAshape and DynaSeq predictions for the common features and applied both to predict the genome-wide binding sites of 1312 TFs available from protein interaction quantification (PIQ) data. The results indicate a good agreement between the two methods for the common shape features and point to advantages in using DynaSeq. Predictive models employing ensembles from individual conformational parameters revealed that base-pair opening - known to be important in strand separation - was the best predictor of transcription factor-binding sites (TFBS) followed by features employed by DNAshape. Of note, TFBS could be predicted not only from the features at the target motif sites, but also from those as far as 200 nucleotides away from the motif.

  1. Computational Design of Ligand Binding Proteins with High Affinity and Selectivity

    PubMed Central

    Dou, Jiayi; Doyle, Lindsey; Nelson, Jorgen W.; Schena, Alberto; Jankowski, Wojciech; Kalodimos, Charalampos G.; Johnsson, Kai; Stoddard, Barry L.; Baker, David

    2014-01-01

    The ability to design proteins with high affinity and selectivity for any given small molecule would have numerous applications in biosensing, diagnostics, and therapeutics, and is a rigorous test of our understanding of the physicochemical principles that govern molecular recognition phenomena. Attempts to design ligand binding proteins have met with little success, however, and the computational design of precise molecular recognition between proteins and small molecules remains an “unsolved problem”. We describe a general method for the computational design of small molecule binding sites with pre-organized hydrogen bonding and hydrophobic interfaces and high overall shape complementarity to the ligand, and use it to design protein binding sites for the steroid digoxigenin (DIG). Of 17 designs that were experimentally characterized, two bind DIG; the highest affinity design has the lowest predicted interaction energy and the most pre-organized binding site in the set. A comprehensive binding-fitness landscape of this design generated by library selection and deep sequencing was used to guide optimization of binding affinity to a picomolar level, and two X-ray co-crystal structures of optimized complexes show atomic level agreement with the design models. The designed binder has a high selectivity for DIG over the related steroids digitoxigenin, progesterone, and β-estradiol, which can be reprogrammed through the designed hydrogen-bonding interactions. Taken together, the binding fitness landscape, co-crystal structures, and thermodynamic binding parameters illustrate how increases in binding affinity can result from distal sequence changes that limit the protein ensemble to conformers making the most energetically favorable interactions with the ligand. The computational design method presented here should enable the development of a new generation of biosensors, therapeutics, and diagnostics. PMID:24005320

  2. Opaque-2 is a transcriptional activator that recognizes a specific target site in 22-kD zein genes.

    PubMed Central

    Schmidt, R J; Ketudat, M; Aukerman, M J; Hoschek, G

    1992-01-01

    opaque-2 (o2) is a regulatory locus in maize that plays an essential role in controlling the expression of genes encoding the 22-kD zein proteins. Through DNase I footprinting and DNA binding analyses, we have identified the binding site for the O2 protein (O2) in the promoter of 22-kD zein genes. The sequence in the 22-kD zein gene promoter that is recognized by O2 is similar to the target site recognized by other "basic/leucine zipper" (bZIP) proteins in that it contains an ACGT core that is necessary for DNA binding. The site is located in the -300 region relative to the translation start and lies about 20 bp downstream of the highly conserved zein gene sequence motif known as the "prolamin box." Employing gel mobility shift assays, we used O2 antibodies and nuclear extracts from an o2 null mutant to demonstrate that the O2 protein in maize endosperm nuclei recognizes the target site in the zein gene promoter. Mobility shift assays using nuclear proteins from an o2 null mutant indicated that other endosperm proteins in addition to O2 can bind the O2 target site and that O2 may be associated with one of these proteins. We also demonstrated that in yeast cells the O2 protein can activate expression of a lacZ gene containing a multimer of the O2 target sequence as part of its promoter, thus confirming its role as a transcriptional activator. A computer-assisted search indicated that the O2 target site is not present in the promoters of zein genes other than those of the 22-kD class. These data suggest a likely explanation at the molecular level for the differential effect of o2 mutations on expression of certain members of the zein gene family. PMID:1392590

  3. Functional analysis of the EspR binding sites upstream of espR in Mycobacterium tuberculosis.

    PubMed

    Cao, Guangxiang; Howard, Susan T; Zhang, Peipei; Hou, Guihua; Pang, Xiuhua

    2013-11-01

    The ESX-1 secretion system exports substrate proteins into host cells and is crucial for the pathogenesis of Mycobacterium tuberculosis. EspR is one of the characterized transcriptional regulators that modulate the ESX-1 system by binding the conserved EspR binding sites in the promoter of espA, the gene encoding EspA, which is also a substrate protein of the ESX-1 system and is required for ESX-1 activity. EspR is autoregulatory and conserved EspR binding sites are present upstream of espR. In this study, we showed that these EspR sites had varying affinities for EspR, with site B being the strongest one. Point mutations of the DNA sequence at site B abolished binding of EspR to oligonucleotides containing site B alone or with other sites, further suggesting that site B is a major binding site for EspR. Complementation studies showed that constructs containing espR and the upstream intergenic region fully restored espR expression in a ΔespR mutant strain. Although recombinant strains with mutations at more than one EspR site showed minimal differences in espR expression, reduced expression of other EspR target genes was observed, suggesting that slight changes in EspR levels can have downstream regulatory effects. These findings contribute to our understanding of the regulation of the ESX-1 system.

  4. Copper attachment to a non-octarepeat site in prion protein

    NASA Astrophysics Data System (ADS)

    Hodak, Miroslav; Bernholc, Jerry

    2010-03-01

    Prion protein, PrP, plays a causative role in several neurodegenerative diseases, including mad cow disease in cattle and Creutzfeldt-Jakob disease in humans. PrP is known to efficiently bind copper ions and this ability has been linked to its function. PrP contains up to six binding sites, four of which are located in the so-called octarepeat region and are now well known. The binding sites outside this region are still largely undetermined, despite evidence of their relevance to prion diseases. Using a hybrid DFT/DFT method, which combines Kohn-Sham DFT with orbital-free DFT to achieve an accurate and efficient description of solvent effects in ab initio calculations, we have investigated copper attachment to the sequence GGGTH, which represents the copper binding site located at His96. We have considered both NNNN and NNNO types of copper coordination, as suggested by experiments. Our calculations have determined the geometry of the copper attachment site and its energetics. Comparison to the already known binding sites provides insight into the process of copper uptake in PrP.

  5. Crystal structures of botulinum neurotoxin DC in complex with its protein receptors synaptotagmin I and II.

    PubMed

    Berntsson, Ronnie Per-Arne; Peng, Lisheng; Svensson, Linda Marie; Dong, Min; Stenmark, Pål

    2013-09-03

    Botulinum neurotoxins (BoNTs) can cause paralysis at exceptionally low concentrations and include seven serotypes (BoNT/A-G). The chimeric BoNT/DC toxin has a receptor binding domain similar to the same region in BoNT/C. However, BoNT/DC does not share protein receptor with BoNT/C. Instead, it shares synaptotagmin (Syt) I and II as receptors with BoNT/B, despite their low sequence similarity. Here, we present the crystal structures of the binding domain of BoNT/DC in complex with the recognition domains of its protein receptors, Syt-I and Syt-II. The structures reveal that BoNT/DC possesses a Syt binding site, distinct from the established Syt-II binding site in BoNT/B. Structure-based mutagenesis further shows that hydrophobic interactions play a key role in Syt binding. The structures suggest that the BoNT/DC ganglioside binding sites are independent of the protein receptor binding site. Our results reveal the remarkable versatility in the receptor recognition of the BoNTs. Copyright © 2013 Elsevier Ltd. All rights reserved.

  6. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Le Coq, Johanne; Ghosh, Partho

    2012-06-19

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (∼16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented approximate 10^20 potential variability of TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation.

  7. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement

    PubMed Central

    Le Coq, Johanne; Ghosh, Partho

    2011-01-01

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (∼16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented approximate 10^20 potential variability of TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation. PMID:21873231

  8. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement.

    PubMed

    Le Coq, Johanne; Ghosh, Partho

    2011-08-30

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (∼16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented approximate 10^20 potential variability of TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation.

  9. Massive GGAAs in genomic repetitive sequences serve as a nuclear reservoir of NF-κB.

    PubMed

    Wu, Jian; Wang, Qiao; Dai, Wei; Wang, Wei; Yue, Ming; Wang, Jinke

    2018-04-13

    Nuclear factor κB (NF-κB) is a DNA-binding transcription factor. Characterizing its genomic binding sites is crucial for understanding its gene regulatory function and mechanism in cells. This study characterized the binding sites of NF-κB RelA/p65 in tumor necrosis factor-α (TNFα)-stimulated HeLa cells by precise chromatin immunoprecipitation-sequencing (ChIP-seq). The results revealed that NF-κB binds nontraditional motifs (nt-motifs) containing a conserved GGAA quadruplet. Moreover, nt-motifs are mainly distributed in the peaks near centromeres, which contain a larger number of repetitive elements such as satellites, simple repeats and short interspersed nuclear elements (SINEs). This intracellular binding pattern was then confirmed by in vitro detection, indicating that NF-κB dimers can bind the nontraditional κB (nt-κB) sites with low affinity. However, this binding hardly activates transcription. This study thus deduced that NF-κB binding to nt-motifs may serve functions other than the gene regulation performed by NF-κB binding to traditional motifs (t-motifs). To test this deduction, many ChIP-seq data sets from other cell lines were then analyzed. The results indicate that NF-κB binding to nt-motifs is also widely present in other cells. The ChIP-seq data analysis also revealed that nt-motifs are more widely distributed in the peaks with low-fold enrichment. Importantly, it was also found that NF-κB binding to nt-motifs is mainly present in resting cells, whereas NF-κB binding to t-motifs is mainly present in stimulated cells. Astonishingly, no known function was enriched by the gene annotation of nt-motif peaks. Based on these results, this study proposed that the nt-κB sites, which are extensively distributed in large numbers of repeat elements, function as a nuclear reservoir of NF-κB. The nuclear NF-κB proteins stored at nt-κB sites in resting cells may be recruited to the t-κB sites for regulating target genes upon stimulation. Copyright © 2018 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.

  10. Simultaneous fluorescence light-up and selective multicolor nucleobase recognition based on sequence-dependent strong binding of berberine to DNA abasic site.

    PubMed

    Wu, Fei; Shao, Yong; Ma, Kun; Cui, Qinghua; Liu, Guiying; Xu, Shujuan

    2012-04-28

    Label-free DNA nucleobase recognition by fluorescent small molecules has received much attention due to its simplicity in mutation identification and drug screening. However, sequence-dependent fluorescence light-up nucleobase recognition and multicolor emission with individual emission energy for individual nucleobases have seldom been realized. Herein, an abasic site (AP site) in a DNA duplex was employed as a binding field for berberine, one of the isoquinoline alkaloids. Unlike the weak binding of berberine to fully matched DNAs without the AP site, strong binding of berberine to the AP site occurs, and berberine's fluorescence light-up behaviors are highly dependent on the target nucleobases opposite the AP site: the targets thymine and cytosine produce dual emission bands, while the targets guanine and adenine give only a single emission band. Furthermore, more intense emissions are observed for the target pyrimidines than purines. The flanking bases of the AP site also produce some modifications of berberine's emission behavior. The binding selectivity of berberine at the AP site is also confirmed by measurements of fluorescence resonance energy transfer, excited-state lifetime, DNA melting and fluorescence quenching by ferrocyanide and sodium chloride. It is expected that the target pyrimidines cause berberine to be stacked well within DNA base pairs near the AP site, which results in a strong resonance coupling of the electronic transitions to the particular vibration mode to produce the dual emissions. The fluorescent signal-on and emission energy-modulated sensing for nucleobases based on this fluorophore is substantially advantageous over the previously used fluorophores. We expect that this approach will be developed as a practical device for differentiating pyrimidines from purines by positioning an AP site toward a target that is available for readout by this alkaloid probe. This journal is © The Royal Society of Chemistry 2012

  11. Effects of nucleoside analog incorporation on DNA binding to the DNA binding domain of the GATA-1 erythroid transcription factor.

    PubMed

    Foti, M; Omichinski, J G; Stahl, S; Maloney, D; West, J; Schweitzer, B I

    1999-02-05

    We investigate here the effects of the incorporation of the nucleoside analogs araC (1-beta-D-arabinofuranosylcytosine) and ganciclovir (9-[(1,3-dihydroxy-2-propoxy)methyl] guanine) into the DNA binding recognition sequence for the GATA-1 erythroid transcription factor. A 10-fold decrease in binding affinity was observed for the ganciclovir-substituted DNA complex in comparison to an unmodified DNA of the same sequence composition. AraC substitution did not result in any changes in binding affinity. 1H-15N HSQC and NOESY NMR experiments revealed a number of chemical shift changes in both DNA and protein in the ganciclovir-modified DNA-protein complex when compared to the unmodified DNA-protein complex. These changes in chemical shift and binding affinity suggest a change in the binding mode of the complex when ganciclovir is incorporated into the GATA DNA binding site.

  12. pUL34 binding near the human cytomegalovirus origin of lytic replication enhances DNA replication and viral growth.

    PubMed

    Slayton, Mark; Hossain, Tanvir; Biegalke, Bonita J

    2018-05-01

    The human cytomegalovirus (HCMV) UL34 gene encodes sequence-specific DNA-binding proteins (pUL34) which are required for viral replication. Interactions of pUL34 with DNA binding sites repress transcription of two viral immune evasion genes, US3 and US9. Twelve additional predicted pUL34-binding sites are present in the HCMV genome (strain AD169), with three binding sites concentrated near the HCMV origin of lytic replication (oriLyt). We used ChIP-seq analysis of pUL34-DNA interactions to confirm that pUL34 binds to the oriLyt region during infection. Mutagenesis of the UL34-binding sites in an oriLyt-containing plasmid significantly reduced viral-mediated oriLyt-dependent DNA replication. Mutagenesis of these sites in the HCMV genome reduced the replication efficiencies of the resulting viruses. Protein-protein interaction analyses demonstrated that pUL34 interacts with the viral proteins IE2, UL44, and UL84, which are essential for viral DNA replication, suggesting that pUL34-DNA interactions in the oriLyt region are involved in the DNA replication cascade. Copyright © 2018 Elsevier Inc. All rights reserved.

  13. PolyaPeak: Detecting Transcription Factor Binding Sites from ChIP-seq Using Peak Shape Information

    PubMed Central

    Wu, Hao; Ji, Hongkai

    2014-01-01

    ChIP-seq is a powerful technology for detecting genomic regions where a protein of interest interacts with DNA. ChIP-seq data for mapping transcription factor binding sites (TFBSs) have a characteristic pattern: around each binding site, sequence reads aligned to the forward and reverse strands of the reference genome form two separate peaks shifted away from each other, and the true binding site is located in between these two peaks. While it has been shown previously that the accuracy and resolution of binding site detection can be improved by modeling the pattern, efficient methods are unavailable to fully utilize that information in the TFBS detection procedure. We present PolyaPeak, a new method to improve TFBS detection by incorporating the peak shape information. PolyaPeak describes peak shapes using a flexible Pólya model. The shapes are automatically learnt from the data using a Minorization-Maximization (MM) algorithm, then integrated with the read count information via a hierarchical model to distinguish true binding sites from background noise. Extensive real data analyses show that PolyaPeak is capable of robustly improving TFBS detection compared with existing methods. An R package is freely available. PMID:24608116

  14. Effect of substrate RNA sequence on the cleavage reaction by a short ribozyme.

    PubMed Central

    Ohmichi, T; Okumoto, Y; Sugimoto, N

    1998-01-01

    Leadzyme is a ribozyme that requires Pb2+. The catalytic sequence, CUGGGAGUCC, binds to an RNA substrate, GGACC↓GAGCCAG, cleaving the RNA substrate at one site. We have investigated the effect of the substrate sequence on the cleavage activity of leadzyme using mutant substrates in order to structurally understand the RNA catalysis. The results showed that leadzyme acted as a catalyst for single site cleavage of a C5 deletion mutant substrate, GGAC↓GAGCCAG, as well as the wild-type substrate. However, a mutant substrate GGACCGACCAG, which had G8 deleted from the wild-type substrate, was not cleaved. Kinetic studies by surface plasmon resonance indicated that the difference between active and inactive structures reflected the slow association and dissociation rate constants of complex formation induced by Pb2+ rather than differences in complex stability. CD spectra showed that the active form of the substrate-leadzyme complex was rearranged by Pb2+ binding. The G8 of the wild-type substrate, which was absent in the inactive complex, is not near the cleavage site. Thus, these results show that the active substrate-leadzyme complex has a Pb2+ binding site at the junction between the unpaired region (asymmetric internal loop) and the stem region, which is distal to the cleavage site. Pb2+ may play a role in rearranging the bases in the asymmetric internal loop to the correct position for catalysis. PMID:9837996

  15. A ternary metal binding site in the C2 domain of phosphoinositide-specific phospholipase C-delta1.

    PubMed

    Essen, L O; Perisic, O; Lynch, D E; Katan, M; Williams, R L

    1997-03-11

    We have determined the crystal structures of complexes of phosphoinositide-specific phospholipase C-delta1 from rat with calcium, barium, and lanthanum at 2.5-2.6 Å resolution. Binding of these metal ions is observed in the active site of the catalytic TIM barrel and in the calcium binding region (CBR) of the C2 domain. The C2 domain of PLC-delta1 is a circularly permuted topological variant (P-variant) of the synaptotagmin I C2A domain (S-variant). On the basis of sequence analysis, we propose that both the S-variant and P-variant topologies are present among other C2 domains. Multiple adjacent binding sites in the C2 domain were observed for calcium and the other metal/enzyme complexes. The maximum number of binding sites observed was for the calcium analogue lanthanum. This complex shows an array-like binding of three lanthanum ions (sites I-III) in a crevice on one end of the C2 beta-sandwich. Residues involved in metal binding are contained in three loops, CBR1, CBR2, and CBR3. Sites I and II are maintained in the calcium and barium complexes, whereas sites II and III coincide with a binary calcium binding site in the C2A domain of synaptotagmin I. Several conformers for CBR1 are observed. The conformation of CBR1 does not appear to be strictly dependent on metal binding; however, metal binding may stabilize certain conformers. No significant structural changes are observed for CBR2 or CBR3. The surface of this ternary binding site provides a cluster of freely accessible liganding positions for putative phospholipid ligands of the C2 domain. It may be that the ternary metal binding site is also a feature of calcium-dependent phospholipid binding in solution. A ternary metal binding site might be a conserved feature among C2 domains that contain the critical calcium ligands in their CBRs. The high cooperativity of calcium-mediated lipid binding by C2 domains described previously is explained by this novel type of calcium binding site.

  16. DROMPA: easy-to-handle peak calling and visualization software for the computational analysis and validation of ChIP-seq data.

    PubMed

    Nakato, Ryuichiro; Itoh, Tahehiko; Shirahige, Katsuhiko

    2013-07-01

    Chromatin immunoprecipitation with high-throughput sequencing (ChIP-seq) can identify genomic regions that bind proteins involved in various chromosomal functions. Although the development of next-generation sequencers offers the technology needed to identify these protein-binding sites, the analysis can be computationally challenging because sequencing data sometimes consist of >100 million reads/sample. Herein, we describe a cost-effective and time-efficient protocol that is generally applicable to ChIP-seq analysis; this protocol uses a novel peak-calling program termed DROMPA to identify peaks and an additional program, parse2wig, to preprocess read-map files. This two-step procedure drastically reduces computational time and memory requirements compared with other programs. DROMPA enables the identification of protein localization sites in repetitive sequences and efficiently identifies both broad and sharp protein localization peaks. Specifically, DROMPA outputs a protein-binding profile map in pdf or png format, which can be easily manipulated by users who have a limited background in bioinformatics. © 2013 The Authors Genes to Cells © 2013 by the Molecular Biology Society of Japan and Wiley Publishing Asia Pty Ltd.

  17. Cloning and sequencing of a gene encoding a novel extracellular neutral proteinase from Streptomyces sp. strain C5 and expression of the gene in Streptomyces lividans 1326.

    PubMed Central

    Lampel, J S; Aphale, J S; Lampel, K A; Strohl, W R

    1992-01-01

    The gene encoding a novel milk protein-hydrolyzing proteinase was cloned on a 6.56-kb SstI fragment from Streptomyces sp. strain C5 genomic DNA into Streptomyces lividans 1326 by using the plasmid vector pIJ702. The gene encoding the small neutral proteinase (snpA) was located within a 2.6-kb BamHI-SstI restriction fragment that was partially sequenced. The molecular mass of the deduced amino acid sequence of the mature protein was determined to be 15,740, which corresponds very closely with the relative molecular mass of the purified protein (15,500) determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The N-terminal amino acid sequence of the purified neutral proteinase was determined, and the DNA encoding this sequence was found to be located within the sequenced DNA. The deduced amino acid sequence contains a conserved zinc binding site, although secondary ligand binding and active sites typical of thermolysinlike metalloproteinases are absent. The combination of its small size, deduced amino acid sequence, and substrate and inhibition profile indicates that snpA encodes a novel neutral proteinase. PMID:1569011

  18. Structural and immunologic characterization of bovine, horse, and rabbit serum albumins

    PubMed Central

    Majorek, Karolina A.; Porebski, Przemyslaw J.; Dayal, Arjun; Zimmerman, Matthew D.; Jablonska, Kamila; Stewart, Alan J.; Chruszcz, Maksymilian; Minor, Wladek

    2012-01-01

    Serum albumin (SA) is the most abundant plasma protein in mammals. SA is a multifunctional protein with extraordinary ligand binding capacity, making it a transporter molecule for a diverse range of metabolites, drugs, nutrients, metals and other molecules. Due to its ligand binding properties, albumins have wide clinical, pharmaceutical, and biochemical applications. Albumins are also allergenic, and exhibit a high degree of cross-reactivity due to significant sequence and structure similarity of SAs from different organisms. Here we present crystal structures of albumins from cattle (BSA), horse (ESA) and rabbit (RSA) serums. The structural data are correlated with the results of immunological studies of SAs. We also analyze the conservation or divergence of structures and sequences of SAs in the context of their potential allergenicity and cross-reactivity. In addition, we identified a previously uncharacterized ligand binding site in the structure of RSA, and calcium binding sites in the structure of BSA, which is the first serum albumin structure to contain metal ions. PMID:22677715

  19. Directing an artificial zinc finger protein to new targets by fusion to a non-DNA-binding domain.

    PubMed

    Lim, Wooi F; Burdach, Jon; Funnell, Alister P W; Pearson, Richard C M; Quinlan, Kate G R; Crossley, Merlin

    2016-04-20

    Transcription factors are often regarded as having two separable components: a DNA-binding domain (DBD) and a functional domain (FD), with the DBD thought to determine target gene recognition. While this holds true for DNA binding in vitro, it appears that in vivo FDs can also influence genomic targeting. We fused the FD from the well-characterized transcription factor Krüppel-like Factor 3 (KLF3) to an artificial zinc finger (AZF) protein originally designed to target the Vascular Endothelial Growth Factor-A (VEGF-A) gene promoter. We compared genome-wide occupancy of the KLF3FD-AZF fusion to that observed with AZF. AZF bound to the VEGF-A promoter as predicted, but was also found to occupy approximately 25,000 other sites, a large number of which contained the expected AZF recognition sequence, GCTGGGGGC. Interestingly, addition of the KLF3 FD re-distributes the fusion protein to new sites, with total DNA occupancy detected at around 50,000 sites. A portion of these sites correspond to known KLF3-bound regions, while others contained sequences similar but not identical to the expected AZF recognition sequence. These results show that FDs can influence and may be useful in directing AZF DNA-binding proteins to specific targets and provide insights into how natural transcription factors operate. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
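
    Checking whether an occupied region contains the expected recognition sequence can be sketched as an exact-match scan of both strands. The motif GCTGGGGGC comes from the abstract; the function names and the test sequence are hypothetical, and the study itself may well have used a different motif-matching procedure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(sequence, motif="GCTGGGGGC"):
    """List exact motif matches as (0-based start, strand) tuples.

    The '-' strand hits are found by searching for the motif's reverse
    complement; a palindromic motif would therefore be reported on both
    strands.
    """
    sequence = sequence.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = sequence.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = sequence.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical region with one site on each strand.
print(find_sites("AAGCTGGGGGCTTGCCCCCAGCAA"))
```

    Sites with sequences "similar but not identical" to the motif, as mentioned in the abstract, would need a mismatch-tolerant or matrix-based scan instead of exact matching.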

  20. Violation of an Evolutionarily Conserved Immunoglobulin Diversity Gene Sequence Preference Promotes Production of dsDNA-Specific IgG Antibodies

    PubMed Central

    Silva-Sanchez, Aaron; Liu, Cun Ren; Vale, Andre M.; Khass, Mohamed; Kapoor, Pratibha; Elgavish, Ada; Ivanov, Ivaylo I.; Ippolito, Gregory C.; Schelonka, Robert L.; Schoeb, Trenton R.; Burrows, Peter D.; Schroeder, Harry W.

    2015-01-01

    Variability in the developing antibody repertoire is focused on the third complementarity determining region of the H chain (CDR-H3), which lies at the center of the antigen binding site where it often plays a decisive role in antigen binding. The power of VDJ recombination and N nucleotide addition has led to the common conception that the sequence of CDR-H3 is unrestricted in its variability and random in its composition. Under this view, the immune response is solely controlled by somatic positive and negative clonal selection mechanisms that act on individual B cells to promote production of protective antibodies and prevent the production of self-reactive antibodies. This concept of a repertoire of random antigen binding sites is inconsistent with the observation that diversity (DH) gene segment sequence content by reading frame (RF) is evolutionarily conserved, creating biases in the prevalence and distribution of individual amino acids in CDR-H3. For example, arginine, which is often found in the CDR-H3 of dsDNA binding autoantibodies, is under-represented in the commonly used DH RFs rearranged by deletion, but is a frequent component of rarely used inverted RF1 (iRF1), which is rearranged by inversion. To determine the effect of altering this germline bias in DH gene segment sequence on autoantibody production, we generated mice that by genetic manipulation are forced to utilize an iRF1 sequence encoding two arginines. Over a one year period we collected serial serum samples from these unimmunized, specific pathogen-free mice and found that more than one-fifth of them contained elevated levels of dsDNA-binding IgG, but not IgM; whereas mice with a wild type DH sequence did not. Thus, germline bias against the use of arginine enriched DH sequence helps to reduce the likelihood of producing self-reactive antibodies. PMID:25706374

  1. An isoleucine to leucine mutation that switches the cofactor requirement of the EcoRV restriction endonuclease from magnesium to manganese.

    PubMed

    Vipond, I B; Moon, B J; Halford, S E

    1996-02-13

    The EcoRV restriction endonuclease cleaves DNA at its recognition sequence more readily with Mg2+ as the cofactor than with Mn2+ but, at noncognate sequences that differ from the EcoRV site by one base pair, Mn2+ gives higher rates than Mg2+. A mutant of EcoRV, in which an isoleucine near the active site was replaced by leucine, showed the opposite behavior. It had low activity with Mg2+, but, in the presence of Mn2+ ions, it cleaved the recognition site faster than wild-type EcoRV with either Mn2+ or Mg2+. The mutant was also more specific for the recognition sequence than the native enzyme: the noncognate DNA cleavages by wild-type EcoRV and Mn2+ were not detected with the mutant. Further mutagenesis showed that the protein required the same acidic residues at its active site as wild-type EcoRV. The Ile-->Leu mutation seems to perturb the configuration of the metal-binding ligands at the active site so that the protein has virtually no affinity for Mg2+ yet it can still bind Mn2+ ions, though the latter only occurs when the protein is at the recognition site. This contrasts with wild-type EcoRV, where Mn2+ ions bind readily to complexes with either cognate or noncognate DNA and only Mg2+ shows the discrimination between the complexes. The structural perturbation is a specific consequence of leucine in place of isoleucine, since mutants with valine or alanine were similar to wild-type EcoRV.

  2. Saccharomyces cerevisiae SSB1 protein and its relationship to nucleolar RNA-binding proteins.

    PubMed Central

    Jong, A Y; Clark, M W; Gilbert, M; Oehm, A; Campbell, J L

    1987-01-01

    To better define the function of Saccharomyces cerevisiae SSB1, an abundant single-stranded nucleic acid-binding protein, we determined the nucleotide sequence of the SSB1 gene and compared it with those of other proteins of known function. The amino acid sequence contains 293 amino acid residues and has an Mr of 32,853. There are several stretches of sequence characteristic of other eucaryotic single-stranded nucleic acid-binding proteins. At the amino terminus, residues 39 to 54 are highly homologous to a peptide in calf thymus UP1 and UP2 and a human heterogeneous nuclear ribonucleoprotein. Residues 125 to 162 constitute a fivefold tandem repeat of the sequence RGGFRG, the composition of which suggests a nucleic acid-binding site. Near the C terminus, residues 233 to 245 are homologous to several RNA-binding proteins. Of 18 C-terminal residues, 10 are acidic, a characteristic of the procaryotic single-stranded DNA-binding proteins and eucaryotic DNA- and RNA-binding proteins. In addition, examination of the subcellular distribution of SSB1 by immunofluorescence microscopy indicated that SSB1 is a nuclear protein, predominantly located in the nucleolus. Sequence homologies and the nucleolar localization make it likely that SSB1 functions in RNA metabolism in vivo, although an additional role in DNA metabolism cannot be excluded. PMID:2823109

  3. An SRY mutation causing human sex reversal resolves a general mechanism of structure-specific DNA recognition: application to the four-way DNA junction.

    PubMed

    Peters, R; King, C Y; Ukiyama, E; Falsafi, S; Donahoe, P K; Weiss, M A

    1995-04-11

    SRY, a genetic "master switch" for male development in mammals, exhibits two biochemical activities: sequence-specific recognition of duplex DNA and sequence-independent binding to the sharp angles of four-way DNA junctions. Here, we distinguish between these activities by analysis of a mutant SRY associated with human sex reversal (46, XY female with pure gonadal dysgenesis). The substitution (I68T in human SRY) alters a nonpolar side chain in the minor-groove DNA recognition alpha-helix of the HMG box [Haqq, C.M., King, C.-Y., Ukiyama, E., Haqq, T.N., Falsafi, S., Donahoe, P.K., & Weiss, M.A. (1994) Science 266, 1494-1500]. The native (but not mutant) side chain inserts between specific base pairs in duplex DNA, interrupting base stacking at a site of induced DNA bending. Isotope-aided 1H-NMR spectroscopy demonstrates that analogous side-chain insertion occurs on binding of SRY to a four-way junction, establishing a shared mechanism of sequence- and structure-specific DNA binding. Although the mutant DNA-binding domain exhibits > 50-fold reduction in sequence-specific DNA recognition, near wild-type affinity for four-way junctions is retained. Our results (i) identify a shared SRY-DNA contact at a site of either induced or intrinsic DNA bending, (ii) demonstrate that this contact is not required to bind an intrinsically bent DNA target, and (iii) rationalize patterns of sequence conservation or diversity among HMG boxes. Clinical association of the I68T mutation with human sex reversal supports the hypothesis that specific DNA recognition by SRY is required for male sex determination.

  4. The Minimal Replicator of Epstein-Barr Virus oriP

    PubMed Central

    Yates, John L.; Camiolo, Sarah M.; Bashaw, Jacqueline M.

    2000-01-01

    oriP is a 1.7-kb region of the Epstein-Barr virus (EBV) chromosome that supports the replication and stable maintenance of plasmids in human cells. oriP contains two essential components, called the DS and the FR, both of which contain multiple binding sites for the EBV-encoded protein, EBNA-1. The DS appears to function as the replicator of oriP, while the FR acts in conjunction with EBNA-1 to prevent the loss of plasmids from proliferating cells. Because of EBNA-1's role in stabilizing plasmids through the FR, it has not been entirely clear to what extent EBNA-1 might be required for replication from oriP per se, and a recent study has questioned whether EBNA-1 has any direct role in replication. In the present study we found that plasmids carrying oriP required EBNA-1 to replicate efficiently even when assayed only 2 days after plasmids were introduced into the cell lines 143B and 293. Significantly, using 293 cells it was demonstrated that the plasmid-retention function of EBNA-1 and the FR did not contribute significantly to the accumulation of replicated plasmids, and the DS supported efficient EBNA-1-dependent replication in the absence of the FR. The DS contains two pairs of closely spaced EBNA-1 binding sites, and a previous study had shown that both sites within either pair are required for activity. However, it was unclear from previous work what additional sequences within the DS might be required. We found that each “half” of the DS, including a pair of closely spaced EBNA-1 binding sites, had significant replicator activity when the other half had been deleted. The only significant DNA sequence that the two halves of the DS share, other than EBNA-1 binding sites, is a 9-bp sequence that is present twice in the “left half” and once in the “right half.” These nonamer repeats, while not essential for activity, contributed significantly to the activity of each half of the DS. Two thymines occur at unique positions within EBNA-1 binding sites 1 and 4 at the DS and become sensitive to oxidation by permanganate when EBNA-1 binds, but mutation of each to the consensus base, adenine, actually improved the activity of each half of the DS slightly. In conclusion, the DS of oriP is an EBNA-1-dependent replicator, and its minimal active core appears to be simply two properly spaced EBNA-1 binding sites. PMID:10775587

  5. Homology analyses of the protein sequences of fatty acid synthases from chicken liver, rat mammary gland, and yeast

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chang, Soo-Ik; Hammes, G.G.

    1989-11-01

    Homology analyses of the protein sequences of chicken liver and rat mammary gland fatty acid synthases were carried out. The amino acid sequences of the chicken and rat enzymes are 67% identical. If conservative substitutions are allowed, 78% of the amino acids are matched. A region of low homology exists between the functional domains, in particular around amino acid residues 1059-1264 of the chicken enzyme. Homologies between the active sites of chicken and rat and of chicken and yeast enzymes have been analyzed by an alignment method. A high degree of homology exists between the active sites of the chicken and rat enzymes. However, the chicken and yeast enzymes show a lower degree of homology. The NADPH-binding dinucleotide folds of the β-ketoacyl reductase and the enoyl reductase sites were identified by comparison with a known consensus sequence for the NADP- and FAD-binding dinucleotide folds. The active sites of all of the enzymes are primarily in hydrophobic regions of the protein. This study suggests that the genes for the functional domains of fatty acid synthase were originally separated, and these genes were connected to each other by using different connecting nucleotide sequences in different species. An alternative explanation for the differences in rat and chicken is a common ancestry and mutations in the joining regions during evolution.
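
    The two percentages quoted in this abstract (identity, and matches when conservative substitutions are allowed) can be computed from a pairwise alignment along the following lines. This is a sketch only: the grouping of amino acids treated as conservative is an assumption chosen for illustration, the abstract does not say which substitution scheme the authors used, and the short aligned peptides are made up.

```python
# Assumed similarity classes; the original study's scheme is not given.
CONSERVATIVE_GROUPS = [set("ILVM"), set("FWY"), set("KRH"),
                       set("DE"), set("ST"), set("NQ")]


def percent_match(aligned_a, aligned_b, allow_conservative=False):
    """Percentage of aligned, ungapped positions that match.

    Both inputs must be rows of the same alignment (equal length, '-' for
    gaps). Columns containing a gap in either row are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0

    def matches(x, y):
        if x == y:
            return True
        return allow_conservative and any(
            x in group and y in group for group in CONSERVATIVE_GROUPS)

    return 100.0 * sum(matches(x, y) for x, y in pairs) / len(pairs)


print(percent_match("MKVLD", "MRVLE"))                           # identity only
print(percent_match("MKVLD", "MRVLE", allow_conservative=True))  # with K/R and D/E
```

    How gapped columns are counted changes the denominator, so figures computed this way are only comparable when the same convention is used throughout.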

  6. The DNA-encoded nucleosome organization of a eukaryotic genome.

    PubMed

    Kaplan, Noam; Moore, Irene K; Fondufe-Mittendorf, Yvonne; Gossett, Andrea J; Tillo, Desiree; Field, Yair; LeProust, Emily M; Hughes, Timothy R; Lieb, Jason D; Widom, Jonathan; Segal, Eran

    2009-03-19

    Nucleosome organization is critical for gene regulation. In living cells this organization is determined by multiple factors, including the action of chromatin remodellers, competition with site-specific DNA-binding proteins, and the DNA sequence preferences of the nucleosomes themselves. However, it has been difficult to estimate the relative importance of each of these mechanisms in vivo, because in vivo nucleosome maps reflect the combined action of all influencing factors. Here we determine the importance of nucleosome DNA sequence preferences experimentally by measuring the genome-wide occupancy of nucleosomes assembled on purified yeast genomic DNA. The resulting map, in which nucleosome occupancy is governed only by the intrinsic sequence preferences of nucleosomes, is similar to in vivo nucleosome maps generated in three different growth conditions. In vitro, nucleosome depletion is evident at many transcription factor binding sites and around gene start and end sites, indicating that nucleosome depletion at these sites in vivo is partly encoded in the genome. We confirm these results with a micrococcal nuclease-independent experiment that measures the relative affinity of nucleosomes for approximately 40,000 double-stranded 150-base-pair oligonucleotides. Using our in vitro data, we devise a computational model of nucleosome sequence preferences that is significantly correlated with in vivo nucleosome occupancy in Caenorhabditis elegans. Our results indicate that the intrinsic DNA sequence preferences of nucleosomes have a central role in determining the organization of nucleosomes in vivo.

  7. DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.

    PubMed

    Ma, Wenxiu; Yang, Lin; Rohs, Remo; Noble, William Stafford

    2017-10-01

    Transcription factors (TFs) bind to specific DNA sequence motifs. Several lines of evidence suggest that TF-DNA binding is mediated in part by properties of the local DNA shape: the width of the minor groove, the relative orientations of adjacent base pairs, etc. Several methods have been developed to jointly account for DNA sequence and shape properties in predicting TF binding affinity. However, a limitation of these methods is that they typically require a training set of aligned TF binding sites. We describe a sequence + shape kernel that leverages DNA sequence and shape information to better understand protein-DNA binding preference and affinity. This kernel extends an existing class of k-mer based sequence kernels, based on the recently described di-mismatch kernel. Using three in vitro benchmark datasets, derived from universal protein binding microarrays (uPBMs), genomic context PBMs (gcPBMs) and SELEX-seq data, we demonstrate that incorporating DNA shape information improves our ability to predict protein-DNA binding affinity. In particular, we observe that (i) the k-spectrum + shape model performs better than the classical k-spectrum kernel, particularly for small k values; (ii) the di-mismatch kernel performs better than the k-mer kernel, for larger k; and (iii) the di-mismatch + shape kernel performs better than the di-mismatch kernel for intermediate k values. The software is available at https://bitbucket.org/wenxiu/sequence-shape.git. Contact: rohs@usc.edu or william-noble@uw.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
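
    For orientation, the classical k-spectrum kernel that this abstract uses as a baseline can be sketched in a few lines: each sequence is represented by its k-mer counts, and the kernel value is the inner product of the two count vectors. This sketch covers only that baseline; it does not implement the shape features or the di-mismatch kernel, and the function names are illustrative rather than taken from the authors' software.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(seq_a, seq_b, k):
    """k-spectrum kernel: inner product of the two k-mer count vectors."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Only k-mers present in both sequences contribute to the sum.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


print(spectrum_kernel("ACGT", "ACGT", 2))  # shares AC, CG and GT
print(spectrum_kernel("AAAA", "AAA", 2))   # AA occurs 3 times and 2 times
```

    Because the kernel depends only on k-mer content, it needs no alignment of the binding sites, which is the property the abstract highlights.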

  8. Genome-Wide Identification of Binding Sites Defines Distinct Functions for Caenorhabditis elegans PHA-4/FOXA in Development and Environmental Response

    PubMed Central

    Zhong, Mei; Niu, Wei; Lu, Zhi John; Sarov, Mihail; Murray, John I.; Janette, Judith; Raha, Debasish; Sheaffer, Karyn L.; Lam, Hugo Y. K.; Preston, Elicia; Slightham, Cindie; Hillier, LaDeana W.; Brock, Trisha; Agarwal, Ashish; Auerbach, Raymond; Hyman, Anthony A.; Gerstein, Mark; Mango, Susan E.; Kim, Stuart K.; Waterston, Robert H.; Reinke, Valerie; Snyder, Michael

    2010-01-01

    Transcription factors are key components of regulatory networks that control development, as well as the response to environmental stimuli. We have established an experimental pipeline in Caenorhabditis elegans that permits global identification of the binding sites for transcription factors using chromatin immunoprecipitation and deep sequencing. We describe and validate this strategy, and apply it to the transcription factor PHA-4, which plays critical roles in organ development and other cellular processes. We identified thousands of binding sites for PHA-4 during formation of the embryonic pharynx, and also found a role for this factor during the starvation response. Many binding sites were found to shift dramatically between embryos and starved larvae, from developmentally regulated genes to genes involved in metabolism. These results indicate distinct roles for this regulator in two different biological processes and demonstrate the versatility of transcription factors in mediating diverse biological roles. PMID:20174564

  9. A map of human PRDM9 binding provides evidence for novel behaviors of PRDM9 and other zinc-finger proteins in meiosis

    PubMed Central

    Noor, Nudrat; Bitoun, Emmanuelle; Tumian, Afidalina; Imbeault, Michael; Chapman, J Ross; Aricescu, A Radu

    2017-01-01

    PRDM9 binding localizes almost all meiotic recombination sites in humans and mice. However, most PRDM9-bound loci do not become recombination hotspots. To explore factors that affect binding and subsequent recombination outcomes, we mapped human PRDM9 binding sites in a transfected human cell line and measured PRDM9-induced histone modifications. These data reveal varied DNA-binding modalities of PRDM9. We also find that human PRDM9 frequently binds promoters, despite their low recombination rates, and it can activate expression of a small number of genes including CTCFL and VCX. Furthermore, we identify specific sequence motifs that predict consistent, localized meiotic recombination suppression around a subset of PRDM9 binding sites. These motifs strongly associate with KRAB-ZNF protein binding, TRIM28 recruitment, and specific histone modifications. Finally, we demonstrate that, in addition to binding DNA, PRDM9's zinc fingers also mediate its multimerization, and we show that a pair of highly diverged alleles preferentially form homo-multimers. PMID:29072575

  10. Detecting cis-regulatory binding sites for cooperatively binding proteins

    PubMed Central

    van Oeffelen, Liesbeth; Cornelis, Pierre; Van Delm, Wouter; De Ridder, Fedor; De Moor, Bart; Moreau, Yves

    2008-01-01

    Several methods are available to predict cis-regulatory modules in DNA based on position weight matrices. However, the performance of these methods generally depends on a number of additional parameters that cannot be derived from sequences and are difficult to estimate because they have no physical meaning. As the best way to detect cis-regulatory modules is the way in which the proteins recognize them, we developed a new scoring method that utilizes the underlying physical binding model. This method requires no additional parameter to account for multiple binding sites; and the only necessary parameters to model homotypic cooperative interactions are the distances between adjacent protein binding sites in basepairs, and the corresponding cooperative binding constants. The heterotypic cooperative binding model requires one more parameter per cooperatively binding protein, which is the concentration multiplied by the partition function of this protein. In a case study on the bacterial ferric uptake regulator, we show that our scoring method for homotypic cooperatively binding proteins significantly outperforms other PWM-based methods where biophysical cooperativity is not taken into account. PMID:18400778
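
    The idea of scoring sites with a physical binding model can be illustrated by a textbook two-site statistical-thermodynamic sketch. It is not the authors' scoring method: the parameter names (protein concentration, per-site association constants, and a cooperativity factor omega) are assumptions, and the example covers only two adjacent sites bound by the same protein.

```python
def two_site_occupancy(conc, k1, k2, omega):
    """Probabilities of the four binding states of two adjacent sites.

    conc   : free protein concentration (assumed units match 1/k1, 1/k2)
    k1, k2 : association constants of site 1 and site 2
    omega  : cooperativity factor (1.0 means the sites bind independently)
    """
    w1 = conc * k1               # statistical weight, site 1 bound only
    w2 = conc * k2               # statistical weight, site 2 bound only
    w12 = omega * w1 * w2        # both bound, scaled by cooperativity
    z = 1.0 + w1 + w2 + w12      # partition function over the four states
    return {"empty": 1.0 / z, "site1_only": w1 / z,
            "site2_only": w2 / z, "both": w12 / z}


independent = two_site_occupancy(1.0, 1.0, 1.0, omega=1.0)
cooperative = two_site_occupancy(1.0, 1.0, 1.0, omega=2.0)
print(independent["both"], cooperative["both"])
```

    With omega above 1 the doubly bound state gains probability at the expense of the others, which is the effect a cooperativity-aware score is meant to capture.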

  11. Binding of the cyclic AMP receptor protein of Escherichia coli and DNA bending at the P4 promoter of pBR322.

    PubMed Central

    Brierley, I; Hoggett, J G

    1992-01-01

    The binding of the Escherichia coli cyclic AMP receptor protein (CRP) to its specific site on the P4 promoter of pBR322 has been studied by gel electrophoresis. Binding to the P4 site was about 40-50-fold weaker than to the principal CRP site on the lactose promoter at both low (0.01 M) and high (0.1 M) ionic strengths. CRP-induced bending at the P4 site was investigated from the mobilities of CRP bound to circularly permuted P4 fragments. The estimated bending angle, based on comparison with Zinkel & Crothers [(1990) Biopolymers 29, 29-38] A-tract bending standards, was found to be approximately 96 degrees, similar to that found for binding to the lac site. These observations suggest that there is not a simple relationship between strength of CRP binding and the extent of induced bending for different CRP sites. The apparent centre of bending in P4 is displaced about 6-8 bp away from the conserved TGTGA sequence and the P4 transcription start site. PMID:1322129

  12. Kinetic, Thermodynamic, and Structural Characterizations of the Association between Nrf2-DLGex Degron and Keap1

    PubMed Central

    Fukutomi, Toshiaki; Takagi, Kenji; Mizushima, Tsunehiro; Ohuchi, Noriaki

    2014-01-01

    Transcription factor Nrf2 (NF-E2-related factor 2) coordinately regulates cytoprotective gene expression, but under unstressed conditions, Nrf2 is degraded rapidly through Keap1 (Kelch-like ECH-associated protein 1)-mediated ubiquitination. Nrf2 harbors two Keap1-binding motifs, DLG and ETGE. Interactions between these two motifs and Keap1 constitute a key regulatory nexus for cellular Nrf2 activity through the formation of a two-site binding hinge-and-latch mechanism. In this study, we determined the minimum Keap1-binding sequence of the DLG motif, the low-affinity latch site, and defined a new DLGex motif that covers a sequence much longer than that previously defined. We have determined the crystal structure of the Keap1-DC-DLGex complex at 1.6 Å. DLGex possesses a complicated helix structure, which accounts well for the human-cancer-derived loss-of-function mutations in DLGex. In thermodynamic analyses, Keap1-DLGex binding is characterized as enthalpy and entropy driven, while Keap1-ETGE binding is characterized as purely enthalpy driven. In kinetic analyses, Keap1-DLGex binding follows a fast-association and fast-dissociation model, while Keap1-ETGE binding contains a slow-reaction step that leads to a stable conformation. These results demonstrate that the mode of DLGex binding to Keap1 is distinct from that of ETGE structurally, thermodynamically, and kinetically and support our contention that the DLGex motif serves as a converter transmitting environmental stress to Nrf2 induction as the latch site. PMID:24366543

  13. Chemical probes of the conformation of DNA modified by cis-diamminedichloroplatinum(II)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marrot, L.; Leng, M.

    The purpose of this work was to analyze at the nucleotide level the distortions induced by the binding of cis-diamminedichloroplatinum(II) (cis-DDP) to DNA by means of chemical probes. In order to test the chemical probes, experiments were first carried out on two platinated oligonucleotides. It has been verified by circular dichroism and gel electrophoresis that the binding of cis-DDP to an AG or to a GTG site within a double-stranded oligonucleotide distorts the double helix. The reactivity of the oligonucleotide platinated at the GTG site with chloroacetaldehyde, diethyl pyrocarbonate, and osmium tetraoxide, respectively, suggests a local denaturation of the double helix. The 5'G residue and the T residue within the adduct are no longer paired, while the 3'G residue is paired. The double helix is more distorted (but not denatured) at the 5' side of the adduct than at the 3' side. The reactivities of the chemical probes with six platinated DNA restriction fragments show that even at a relatively high level of platination only a few base pairs are unpaired but the double helix is largely distorted. No local denaturation has been detected at the GG sites separated from the nearest GG or AG sites by at least three base pairs. The AG sites separated from the nearest AG or GG sites by at least three base pairs do not denature the double helix locally when they are in the sequences puAG/pyTC. It is suggested that the distortion within these sequences is induced by adducts located further away along the DNA fragments, these sequences not being the major sites for the binding of cis-DDP.

  14. Hb Taradale [beta82(EF6)Lys-->Arg]: a novel mutation at a 2,3-diphosphoglycerate binding site.

    PubMed

    Brennan, Stephen O; Sheen, Campbell; Chan, Tim; George, Peter M

    2005-01-01

    Hb Taradale [beta82(EF6)Lys-->Arg] was initially detected as a split Hb A0 peak on Hb A1c monitoring. Red cell parameters, hemoglobin (Hb) electrophoresis and stability tests were normal. Mass spectrometry (MS) clearly identified a variant beta chain with a mass increase of 28 Da and peptide mapping located the mutation site to peptide betaT-9. DNA sequencing confirmed the presence of a novel beta82(EF6)Lys-->Arg mutation. This conservative substitution at a 2,3-diphosphoglycerate (2,3-DPG) binding site did not, however, appear to affect the P50 for oxygen binding.
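
    The 28 Da mass increase reported here matches what a Lys-->Arg substitution should produce. A minimal check, assuming standard monoisotopic residue masses (these are textbook values, not figures from the abstract; the helper name is illustrative):

```python
# Monoisotopic residue masses in Da (standard reference values, assumed here).
RESIDUE_MASS = {
    "K": 128.09496,  # lysine
    "R": 156.10111,  # arginine
}


def substitution_mass_shift(old, new):
    """Mass change (Da) of a peptide when residue `old` is replaced by `new`."""
    return RESIDUE_MASS[new] - RESIDUE_MASS[old]


shift = substitution_mass_shift("K", "R")
print(f"Lys->Arg shift: {shift:.3f} Da")
```

    The computed shift of about 28.006 Da agrees with the observed increase in the variant beta chain.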

  15. An RNA motif that binds ATP

    NASA Technical Reports Server (NTRS)

    Sassanfar, M.; Szostak, J. W.

    1993-01-01

    RNAs that contain specific high-affinity binding sites for small molecule ligands immobilized on a solid support are present at a frequency of roughly one in 10(10)-10(11) in pools of random sequence RNA molecules. Here we describe a new in vitro selection procedure designed to ensure the isolation of RNAs that bind the ligand of interest in solution as well as on a solid support. We have used this method to isolate a remarkably small RNA motif that binds ATP, a substrate in numerous biological reactions and the universal biological high-energy intermediate. The selected ATP-binding RNAs contain a consensus sequence, embedded in a common secondary structure. The binding properties of ATP analogues and modified RNAs show that the binding interaction is characterized by a large number of close contacts between the ATP and RNA, and by a change in the conformation of the RNA.

  16. In silico modeling of epigenetic-induced changes in photoreceptor cis-regulatory elements.

    PubMed

    Hossain, Reafa A; Dunham, Nicholas R; Enke, Raymond A; Berndsen, Christopher E

    2018-01-01

    DNA methylation is a well-characterized epigenetic repressor of mRNA transcription in many plant and vertebrate systems. However, the mechanism of this repression is not fully understood. The process of transcription is controlled by proteins that regulate recruitment and activity of RNA polymerase by binding to specific cis-regulatory sequences. Cone-rod homeobox (CRX) is a well-characterized mammalian transcription factor that controls photoreceptor cell-specific gene expression. Although much is known about the functions and DNA binding specificity of CRX, little is known about how DNA methylation modulates CRX binding affinity to genomic cis-regulatory elements. We used bisulfite pyrosequencing of human ocular tissues to measure DNA methylation levels of the regulatory regions of RHO, PDE6B, PAX6, and LINE1 retrotransposon repeats. To describe the molecular mechanism of repression, we used molecular modeling to illustrate the effect of DNA methylation on human RHO regulatory sequences. In this study, we demonstrate an inverse correlation between DNA methylation in regulatory regions adjacent to the human RHO and PDE6B genes and their subsequent transcription in human ocular tissues. Docking of CRX to the DNA models shows that CRX interacts with the grooves of these sequences, suggesting changes in groove structure could regulate binding. Molecular dynamics simulations of the RHO promoter and enhancer regions show changes in the flexibility and groove width upon epigenetic modification. Models also demonstrate changes in the local dynamics of CRX binding sites within RHO regulatory sequences which may account for the repression of CRX-dependent transcription. Collectively, these data demonstrate epigenetic regulation of CRX binding sites in human retinal tissue and provide insight into the mechanism of this mode of epigenetic regulation to be tested in future experiments.

  17. Binding Site Turnover Produces Pervasive Quantitative Changes in Transcription Factor Binding between Closely Related Drosophila Species

    PubMed Central

    Trapnell, Cole; Davidson, Stuart; Pachter, Lior; Chu, Hou Cheng; Tonkin, Leath A.; Biggin, Mark D.; Eisen, Michael B.

    2010-01-01

    Changes in gene expression play an important role in evolution, yet the molecular mechanisms underlying regulatory evolution are poorly understood. Here we compare genome-wide binding of the six transcription factors that initiate segmentation along the anterior-posterior axis in embryos of two closely related species: Drosophila melanogaster and Drosophila yakuba. Where we observe binding by a factor in one species, we almost always observe binding by that factor to the orthologous sequence in the other species. Levels of binding, however, vary considerably. The magnitude and direction of the interspecies differences in binding levels of all six factors are strongly correlated, suggesting a role for chromatin or other factor-independent forces in mediating the divergence of transcription factor binding. Nonetheless, factor-specific quantitative variation in binding is common, and we show that it is driven to a large extent by the gain and loss of cognate recognition sequences for the given factor. We find only a weak correlation between binding variation and regulatory function. These data provide the first genome-wide picture of how modest levels of sequence divergence between highly morphologically similar species affect a system of coordinately acting transcription factors during animal development, and highlight the dominant role of quantitative variation in transcription factor binding over short evolutionary distances. PMID:20351773

  18. Biophysics and bioinformatics of transcription regulation in bacteria and bacteriophages

    NASA Astrophysics Data System (ADS)

    Djordjevic, Marko

    2005-11-01

    Due to rapid accumulation of biological data, bioinformatics has become a very important branch of biological research. In this thesis, we develop novel bioinformatic approaches and aid design of biological experiments by using ideas and methods from statistical physics. Identification of transcription factor binding sites within the regulatory segments of genomic DNA is an important step towards understanding of the regulatory circuits that control expression of genes. We propose a novel, biophysics based algorithm, for the supervised detection of transcription factor (TF) binding sites. The method classifies potential binding sites by explicitly estimating the sequence-specific binding energy and the chemical potential of a given TF. In contrast with the widely used information theory based weight matrix method, our approach correctly incorporates saturation in the transcription factor/DNA binding probability. This results in a significant reduction in the number of expected false positives, and in the explicit appearance---and determination---of a binding threshold. The new method was used to identify likely genomic binding sites for the Escherichia coli TFs, and to examine the relationship between TF binding specificity and degree of pleiotropy (number of regulatory targets). We next address how parameters of protein-DNA interactions can be obtained from data on protein binding to random oligos under controlled conditions (SELEX experiment data). We show that 'robust' generation of an appropriate data set is achieved by a suitable modification of the standard SELEX procedure, and propose a novel bioinformatic algorithm for analysis of such data. Finally, we use quantitative data analysis, bioinformatic methods and kinetic modeling to analyze gene expression strategies of bacterial viruses. We study bacteriophage Xp10 that infects rice pathogen Xanthomonas oryzae. 
Xp10 is an unusual bacteriophage, which has morphology and genome organization that most closely resembles temperate phages, such as lambda. It, however, encodes its own T7-like RNA polymerase (characteristic of virulent phages), whose role in gene expression was unclear. Our analysis resulted in quantitative understanding of the role of both host and phage RNA polymerase, and in the identification of the previously unknown promoter sequence for Xp10 RNA polymerase. More generally, an increasing number of phage genomes are being sequenced every year, and we expect that methods of quantitative data analysis that we introduced will provide an efficient way to study gene expression strategies of novel bacterial viruses.
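
    The contrast this abstract draws between weight-matrix scoring and a binding probability that saturates can be illustrated with a small sketch. It assumes an additive energy matrix and a logistic (Fermi-Dirac-like) occupancy with chemical potential `mu`; the matrix values, `mu`, and all function names are hypothetical and are not taken from the thesis.

```python
import math


def binding_energy(site, energy_matrix):
    """Sum per-position energy contributions (in units of kT) for a candidate site."""
    return sum(energy_matrix[i][base] for i, base in enumerate(site))


def binding_probability(energy, mu, kT=1.0):
    """Occupancy that saturates at 1 for strong sites and falls to 0 for weak ones."""
    return 1.0 / (1.0 + math.exp((energy - mu) / kT))


# Hypothetical 3-position energy matrix: 0 for the preferred base, 2 kT otherwise.
TOY_MATRIX = [
    {"A": 0.0, "C": 2.0, "G": 2.0, "T": 2.0},
    {"A": 2.0, "C": 0.0, "G": 2.0, "T": 2.0},
    {"A": 2.0, "C": 2.0, "G": 0.0, "T": 2.0},
]
MU = 1.0  # hypothetical chemical potential, in kT

for site in ("ACG", "ACT", "TTT"):
    e = binding_energy(site, TOY_MATRIX)
    print(site, e, round(binding_probability(e, MU), 3))
```

    In this toy setting the chemical potential acts as the binding threshold: sites with energy below `mu` are mostly occupied, and sites above it are mostly free.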

  19. DNA mimic proteins: functions, structures, and bioinformatic analysis.

    PubMed

    Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J

    2014-05-13

    DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.

  20. Vaccine-elicited receptor-binding site antibodies neutralize two New World hemorrhagic fever arenaviruses.

    PubMed

    Clark, Lars E; Mahmutovic, Selma; Raymond, Donald D; Dilanyan, Taleen; Koma, Takaaki; Manning, John T; Shankar, Sundaresh; Levis, Silvana C; Briggiler, Ana M; Enria, Delia A; Wucherpfennig, Kai W; Paessler, Slobodan; Abraham, Jonathan

    2018-05-14

    While five arenaviruses cause human hemorrhagic fevers in the Western Hemisphere, only Junin virus (JUNV) has a vaccine. The GP1 subunit of their envelope glycoprotein binds transferrin receptor 1 (TfR1) using a surface that substantially varies in sequence among the viruses. As such, receptor-mimicking antibodies described to date are type-specific and lack the usual breadth associated with this mode of neutralization. Here we isolate, from the blood of a recipient of the live attenuated JUNV vaccine, two antibodies that cross-neutralize Machupo virus with varying efficiency. Structures of GP1-Fab complexes explain the basis for efficient cross-neutralization, which involves avoiding receptor mimicry and targeting a conserved epitope within the receptor-binding site (RBS). The viral RBS, despite its extensive sequence diversity, is therefore a target for cross-reactive antibodies with activity against New World arenaviruses of public health concern.

  1. Array-Based Rational Design of Short Peptide Probe-Derived from an Anti-TNT Monoclonal Antibody.

    PubMed

    Okochi, Mina; Muto, Masaki; Yanai, Kentaro; Tanaka, Masayoshi; Onodera, Takeshi; Wang, Jin; Ueda, Hiroshi; Toko, Kiyoshi

    2017-10-09

    Complementarity-determining regions (CDRs) are sites on the variable chains of antibodies responsible for binding to specific antigens. In this study, a short peptide probe for recognition of 2,4,6-trinitrotoluene (TNT), was identified by testing sequences derived from the CDRs of an anti-TNT monoclonal antibody. The major TNT-binding site in this antibody was identified in the heavy chain CDR3 by antigen docking simulation and confirmed by an immunoassay using a spot-synthesis based peptide array comprising amino acid sequences of six CDRs in the variable region. A peptide derived from heavy chain CDR3 (RGYSSFIYWF) bound to TNT with a dissociation constant of 1.3 μM measured by surface plasmon resonance. Substitution of selected amino acids with basic residues increased TNT binding while substitution with acidic amino acids decreased affinity, an isoleucine to arginine change showed the greatest improvement of 1.8-fold. The ability to create simple peptide binders of volatile organic compounds from sequence information provided by the immune system in the creation of an immune response will be beneficial for sensor developments in the future.

  2. Identification of an Electrostatic Ruler Motif for Sequence-Specific Binding of Collagenase to Collagen.

    PubMed

    Subramanian, Sundar Raman; Singam, Ettayapuram Ramaprasad Azhagiya; Berinski, Michael; Subramanian, Venkatesan; Wade, Rebecca C

    2016-08-25

    Sequence-specific cleavage of collagen by mammalian collagenase plays a pivotal role in cell function. Collagenases are matrix metalloproteinases that cleave the peptide bond at a specific position on fibrillar collagen. The collagenase Hemopexin-like (HPX) domain has been proposed to be responsible for substrate recognition, but the mechanism by which collagenases identify the cleavage site on fibrillar collagen is not clearly understood. In this study, Brownian dynamics simulations coupled with atomic-detail and coarse-grained molecular dynamics simulations were performed to dock matrix metalloproteinase-1 (MMP-1) on a collagen IIIα1 triple helical peptide. We find that the HPX domain recognizes the collagen triple helix at a conserved R-X11-R motif C-terminal to the cleavage site to which the HPX domain of collagen is guided electrostatically. The binding of the HPX domain between the two arginine residues is energetically stabilized by hydrophobic contacts with collagen. From the simulations and analysis of the sequences and structural flexibility of collagen and collagenase, a mechanistic scheme by which MMP-1 can recognize and bind collagen for proteolysis is proposed.

  3. Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL): adapting the Partial Phylogenetic Profiling algorithm to scan sequences for signatures that predict protein function

    PubMed Central

    2010-01-01

    Background Comparative genomics methods such as phylogenetic profiling can mine powerful inferences from inherently noisy biological data sets. We introduce Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL), a method that applies the Partial Phylogenetic Profiling (PPP) approach locally within a protein sequence to discover short sequence signatures associated with functional sites. The approach is based on the basic scoring mechanism employed by PPP, namely the use of binomial distribution statistics to optimize sequence similarity cutoffs during searches of partitioned training sets. Results Here we illustrate and validate the ability of the SIMBAL method to find functionally relevant short sequence signatures by application to two well-characterized protein families. In the first example, we partitioned a family of ABC permeases using a metabolic background property (urea utilization). Thus, the TRUE set for this family comprised members whose genome of origin encoded a urea utilization system. By moving a sliding window across the sequence of a permease, and searching each subsequence in turn against the full set of partitioned proteins, the method found which local sequence signatures best correlated with the urea utilization trait. Mapping of SIMBAL "hot spots" onto crystal structures of homologous permeases reveals that the significant sites are gating determinants on the cytosolic face rather than, say, docking sites for the substrate-binding protein on the extracellular face. In the second example, we partitioned a protein methyltransferase family using gene proximity as a criterion. In this case, the TRUE set comprised those methyltransferases encoded near the gene for the substrate RF-1. SIMBAL identifies sequence regions that map onto the substrate-binding interface while ignoring regions involved in the methyltransferase reaction mechanism in general. 
Neither method for training set construction requires any prior experimental characterization. Conclusions SIMBAL shows that, in functionally divergent protein families, selected short sequences often significantly outperform their full-length parent sequence for making functional predictions by sequence similarity, suggesting avenues for improved functional classifiers. When combined with structural data, SIMBAL affords the ability to localize and model functional sites. PMID:20102603
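
    The sliding-window idea described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the published SIMBAL implementation: similarity is reduced to an ungapped identity count, ties are left in input order, and all names and data are hypothetical.

```python
import math


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def best_match(window, sequence):
    """Most identical positions between window and any ungapped offset of sequence."""
    w = len(window)
    return max(
        (sum(a == b for a, b in zip(window, sequence[i:i + w]))
         for i in range(len(sequence) - w + 1)),
        default=0,
    )


def window_score(window, labelled):
    """Smallest binomial tail over all rank cutoffs; lower means better TRUE enrichment.

    labelled is a list of (sequence, is_true) pairs. Ties in similarity keep
    their input order, which is a simplification of this sketch.
    """
    p_true = sum(flag for _, flag in labelled) / len(labelled)
    ranked = sorted(labelled, key=lambda item: best_match(window, item[0]), reverse=True)
    best, hits = 1.0, 0
    for depth, (_, flag) in enumerate(ranked, start=1):
        hits += flag
        best = min(best, binomial_tail(hits, depth, p_true))
    return best


def scan(query, labelled, width):
    """Score every window of `width` residues along the query sequence."""
    return [
        (start, window_score(query[start:start + width], labelled))
        for start in range(len(query) - width + 1)
    ]
```

    Windows whose closest matches are dominated by TRUE-set members receive the lowest scores, which is how "hot spots" along the query would show up in this simplified version.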

  4. Biophysical and structural considerations for protein sequence evolution

    PubMed Central

    2011-01-01

    Background Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolutions do not take this into account. Recent attempts to integrate the influence of structure and biophysics into phylogenetic models via statistical/informational approaches have not resulted in expected improvements in model performance. This suggests that further innovations are needed for progress in this field. Results Here we develop a coarse-grained physics-based model of protein folding and binding function, and compare it to a popular informational model. We find that both models violate the assumption of the native sequence being close to a thermodynamic optimum, causing directional selection away from the native state. Sampling and simulation show that the physics-based model is more specific for fold-defining interactions that vary less among residue type. The informational model diffuses further in sequence space with fewer barriers and tends to provide less support for an invariant sites model, although amino acid substitutions are generally conservative. Both approaches produce sequences with natural features like dN/dS < 1 and gamma-distributed rates across sites. Conclusions Simple coarse-grained models of protein folding can describe some natural features of evolving proteins but are currently not accurate enough to use in evolutionary inference. This is partly due to improper packing of the hydrophobic core. We suggest possible improvements on the representation of structure, folding energy, and binding function, as regards both native and non-native conformations, and describe a large number of possible applications for such a model. PMID:22171550

  5. Site-directed mutagenesis reveals transition-state stabilization as a general catalytic mechanism for aminoacyl-tRNA synthetases.

    PubMed

    Borgford, T J; Gray, T E; Brand, N J; Fersht, A R

    1987-11-17

    Some aminoacyl-tRNA synthetases of almost negligible homology do have a small region of similarity around a four-residue sequence His-Ile(or Leu or Met)-Gly-His(or Asn), the HIGH sequence. The first histidine in this sequence in the tyrosyl-tRNA synthetase, His-45, has been shown to form part of a binding site for the gamma-phosphate of ATP in the transition state for the reaction as does Thr-40. Residue His-56 in the valyl-tRNA synthetase begins a HIGH sequence, and there is a threonine at position 52, one position closer to the histidine than in the tyrosyl-tRNA synthetase. The mutants Thr-->Ala-52 and His-->Asn-56 have been made and their complete free energy profiles for the formation of valyl adenylate determined. Difference energy diagrams have been constructed by comparison with the reaction of wild-type enzyme. The difference energy profiles are very similar to those for the mutants Thr-->Ala-40 and His-->Asn-45 of the tyrosyl-tRNA synthetase. Thr-52 and His-56 of the valyl-tRNA synthetase contribute little binding energy to valine, ATP, and Val-AMP. Instead, the wild-type enzyme binds the transition state and pyrophosphate some 6 kcal/mol more tightly than do the mutants. Preferential transition-state stabilization is thus an important component of catalysis by the valyl-tRNA synthetase. Further, by analogy to the tyrosyl-tRNA synthetase, the valyl-tRNA synthetase has a binding site for the gamma-phosphate of ATP in the transition state, and this is likely to be a general feature of aminoacyl-tRNA synthetases that have a HIGH region.

  6. Analysis of functional importance of binding sites in the Drosophila gap gene network model.

    PubMed

    Kozlov, Konstantin; Gursky, Vitaly V; Kulakovskiy, Ivan V; Dymova, Arina; Samsonova, Maria

    2015-01-01

    The statistical thermodynamics based approach provides a promising framework for construction of the genotype-phenotype map in many biological systems. Among important aspects of a good model connecting the DNA sequence information with that of a molecular phenotype (gene expression) is the selection of regulatory interactions and relevant transcription factor bindings sites. As the model may predict different levels of the functional importance of specific binding sites in different genomic and regulatory contexts, it is essential to formulate and study such models under different modeling assumptions. We elaborate a two-layer model for the Drosophila gap gene network and include in the model a combined set of transcription factor binding sites and concentration dependent regulatory interaction between gap genes hunchback and Kruppel. We show that the new variants of the model are more consistent in terms of gene expression predictions for various genetic constructs in comparison to previous work. We quantify the functional importance of binding sites by calculating their impact on gene expression in the model and calculate how these impacts correlate across all sites under different modeling assumptions. The assumption about the dual interaction between hb and Kr leads to the most consistent modeling results, but, on the other hand, may obscure existence of indirect interactions between binding sites in regulatory regions of distinct genes. The analysis confirms the previously formulated regulation concept of many weak binding sites working in concert. The model predicts a more or less uniform distribution of functionally important binding sites over the sets of experimentally characterized regulatory modules and other open chromatin domains.

  7. Prediction of FAD binding sites in electron transport proteins according to efficient radial basis function networks and significant amino acid pairs.

    PubMed

    Le, Nguyen-Quoc-Khanh; Ou, Yu-Yen

    2016-07-30

    Cellular respiration is a catabolic pathway for producing adenosine triphosphate (ATP) and is the most efficient process through which cells harvest energy from consumed food. When cells undergo cellular respiration, they require a pathway to keep and transfer electrons (i.e., the electron transport chain). Due to oxidation-reduction reactions, the electron transport chain produces a transmembrane proton electrochemical gradient. In case protons flow back through this membrane, this mechanical energy is converted into chemical energy by ATP synthase. The conversion process is involved in producing ATP which provides energy in a lot of cellular processes. In the electron transport chain process, flavin adenine dinucleotide (FAD) is one of the most vital molecules for carrying and transferring electrons. Therefore, predicting FAD binding sites in the electron transport chain is vital for helping biologists understand the electron transport chain process and energy production in cells. We used an independent data set to evaluate the performance of the proposed method, which had an accuracy of 69.84 %. We compared the performance of the proposed method in analyzing two newly discovered electron transport protein sequences with that of the general FAD binding predictor presented by Mishra and Raghava and determined that the accuracy of the proposed method improved by 9-45 % and its Matthews correlation coefficient was 0.14-0.5. Furthermore, the proposed method enabled reducing the number of false positives significantly and can provide useful information for biologists. We developed a method that is based on PSSM profiles and SAAPs for identifying FAD binding sites in newly discovered electron transport protein sequences. This approach achieved a significant improvement after we added SAAPs to PSSM features to analyze FAD binding proteins in the electron transport chain. 
The proposed method can serve as an effective tool for predicting FAD binding sites in electron transport proteins and can help biologists understand the functions of the electron transport chain, particularly those of FAD binding sites. We also developed a web server which identifies FAD binding sites in electron transporters available for academics.

  8. Sequence specificity of mutagen-nucleic acid complexes in solution: intercalation and mutagen-base pair overlap geometries for proflavine binding to dC-dC-dG-dG and dG-dG-dC-dC self-complementary duplexes.

    PubMed

    Patel, D J; Canuel, L L

    1977-07-01

    The complex formed between the mutagen proflavine and the dC-dC-dG-dG and dG-dG-dC-dC self-complementary tetranucleotide duplexes has been monitored by proton high resolution nuclear magnetic resonance spectroscopy in 0.1 M phosphate solution at high nucleotide/drug ratios. The large upfield shifts (0.5 to 0.85 ppm) observed at all the proflavine ring nonexchangeable protons on complex formation are consistent with intercalation of the mutagen between base pairs of the tetranucleotide duplex. We have proposed an approximate overlap geometry between the proflavine ring and nearest neighbor base pairs at the intercalation site from a comparison between experimental shifts and those calculated for various stacking orientations. We have compared the binding of actinomycin D, propidium diiodide, and proflavine to self-complementary tetranucleotide sequences dC-dC-dG-dG and dG-dG-dC-dC by UV absorbance changes in the drug bands between 400 and 500 nm. Actinomycin D exhibits a pronounced specificity for sequences with dG-dC sites (dG-dG-dC-dC), while propidium diiodide and proflavine exhibit a specificity for sequences with dC-dG sites (dC-dC-dG-dG). Actinomycin D binds more strongly than propidium diiodide and proflavine to dC-dG-dC-dG (contains dC-dG and dG-dC binding sites), indicative of the additional stabilization from hydrogen bonding and hydrophobic interactions between the pentapeptide lactone rings of actinomycin D and the base pair edges and sugar-phosphate backbone of the tetranucleotide duplex.

  9. Sequence specificity of mutagen-nucleic acid complexes in solution: Intercalation and mutagen-base pair overlap geometries for proflavine binding to dC-dC-dG-dG and dG-dG-dC-dC self-complementary duplexes

    PubMed Central

    Patel, Dinshaw J.; Canuel, Lita L.

    1977-01-01

    The complex formed between the mutagen proflavine and the dC-dC-dG-dG and dG-dG-dC-dC self-complementary tetranucleotide duplexes has been monitored by proton high resolution nuclear magnetic resonance spectroscopy in 0.1 M phosphate solution at high nucleotide/drug ratios. The large upfield shifts (0.5 to 0.85 ppm) observed at all the proflavine ring nonexchangeable protons on complex formation are consistent with intercalation of the mutagen between base pairs of the tetranucleotide duplex. We have proposed an approximate overlap geometry between the proflavine ring and nearest neighbor base pairs at the intercalation site from a comparison between experimental shifts and those calculated for various stacking orientations. We have compared the binding of actinomycin D, propidium diiodide, and proflavine to self-complementary tetranucleotide sequences dC-dC-dG-dG and dG-dG-dC-dC by UV absorbance changes in the drug bands between 400 and 500 nm. Actinomycin D exhibits a pronounced specificity for sequences with dG-dC sites (dG-dG-dC-dC), while propidium diiodide and proflavine exhibit a specificity for sequences with dC-dG sites (dC-dC-dG-dG). Actinomycin D binds more strongly than propidium diiodide and proflavine to dC-dG-dC-dG (contains dC-dG and dG-dC binding sites), indicative of the additional stabilization from hydrogen bonding and hydrophobic interactions between the pentapeptide lactone rings of actinomycin D and the base pair edges and sugar-phosphate backbone of the tetranucleotide duplex. PMID:268613

  10. RBSDesigner: software for designing synthetic ribosome binding sites that yields a desired level of protein expression.

    PubMed

    Na, Dokyun; Lee, Doheon

    2010-10-15

    RBSDesigner predicts the translation efficiency of existing mRNA sequences and designs synthetic ribosome binding sites (RBSs) for a given coding sequence (CDS) to yield a desired level of protein expression. The program implements the mathematical model for translation initiation described in Na et al. (Mathematical modeling of translation initiation for the estimation of its efficiency to computationally design mRNA sequences with a desired expression level in prokaryotes. BMC Syst. Biol., 4, 71). The program additionally incorporates the effect on translation efficiency of the spacer length between a Shine-Dalgarno (SD) sequence and an AUG codon, which is crucial for the incorporation of fMet-tRNA into the ribosome. RBSDesigner provides a graphical user interface (GUI) for the convenient design of synthetic RBSs. RBSDesigner is written in Python and Microsoft Visual Basic 6.0 and is publicly available as precompiled stand-alone software on the web (http://rbs.kaist.ac.kr).

  11. Genome-Wide Identification of Chromatin Transitional Regions Reveals Diverse Mechanisms Defining the Boundary of Facultative Heterochromatin

    PubMed Central

    Li, Guangyao; Zhou, Lei

    2013-01-01

    Due to the self-propagating nature of the heterochromatic modification H3K27me3, chromatin barrier activities are required to demarcate the boundary and prevent it from encroaching into euchromatic regions. Studies in Drosophila and vertebrate systems have revealed several important chromatin barrier elements and their respective binding factors. However, epigenomic data indicate that the binding of these factors is not exclusive to chromatin boundaries. To gain a comprehensive understanding of facultative heterochromatin boundaries, we developed a two-tiered method to identify the Chromatin Transitional Region (CTR), i.e. the nucleosomal region that shows the greatest transition rate of the H3K27me3 modification as revealed by ChIP-Seq. This approach was applied to identify CTRs in Drosophila S2 cells and human HeLa cells. Although many insulator proteins have been characterized in Drosophila, less than half of the CTRs in S2 cells are associated with known insulator proteins, indicating that unknown mechanisms remain to be characterized. Our analysis also revealed that the peak binding of insulator proteins is usually 1–2 nucleosomes away from the CTR. Comparison of CTR-associated insulator protein binding sites vs. those in heterochromatic regions revealed that boundary-associated binding sites are distinctively flanked by nucleosome destabilizing sequences, which correlates with significantly decreased nucleosome density and increased binding intensities of co-factors. Interestingly, several subgroups of boundaries have enhanced H3.3 incorporation but reduced nucleosome turnover rate. Our genome-wide study reveals that diverse mechanisms are employed to define the boundaries of facultative heterochromatin. In both Drosophila and mammalian systems, only a small fraction of insulator protein binding sites co-localize with H3K27me3 boundaries. However, boundary-associated insulator binding sites are distinctively flanked by nucleosome destabilizing sequences, which correlates with significantly decreased nucleosome density and increased binding of co-factors. PMID:23840609

  12. Apocalmodulin and Ca2+ calmodulin bind to the same region on the skeletal muscle Ca2+ release channel

    NASA Technical Reports Server (NTRS)

    Moore, C. P.; Rodney, G.; Zhang, J. Z.; Santacruz-Toloza, L.; Strasburg, G.; Hamilton, S. L.

    1999-01-01

    The skeletal muscle Ca2+ release channel (RYR1) is regulated by calmodulin in both its Ca2+-free (apocalmodulin) and Ca2+-bound (Ca2+ calmodulin) states. Apocalmodulin is an activator of the channel, and Ca2+ calmodulin is an inhibitor of the channel. Both apocalmodulin and Ca2+ calmodulin binding sites on RYR1 are destroyed by a mild tryptic digestion of the sarcoplasmic reticulum membranes, but calmodulin (either form), bound to RYR1 prior to tryptic digestion, protects both the apocalmodulin and Ca2+ calmodulin sites from tryptic destruction. The protected sites are after arginines 3630 and 3637 on RYR1. These studies suggest that both Ca2+ calmodulin and apocalmodulin bind to the same or overlapping regions on RYR1 and block access of trypsin to sites at amino acids 3630 and 3637. This sequence is part of a predicted Ca2+ CaM binding site of amino acids 3614-3642 [Takeshima, H., et al. (1989) Nature 339, 439-445].

  13. Large-scale turnover of functional transcription factor binding sites in Drosophila

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moses, Alan M.; Pollard, Daniel A.; Nix, David A.

    2006-07-14

    The gain and loss of functional transcription-factor binding sites has been proposed as a major source of evolutionary change in cis-regulatory DNA and gene expression. We have developed an evolutionary model to study binding site turnover that uses multiple sequence alignments to assess the evolutionary constraint on individual binding sites, and to map gain and loss events along a phylogenetic tree. We apply this model to study the evolutionary dynamics of binding sites of the Drosophila melanogaster transcription factor Zeste, using genome-wide in vivo (ChIP-chip) binding data to identify functional Zeste binding sites, and the genome sequences of D. melanogaster, D. simulans, D. erecta and D. yakuba to study their evolution. We estimate that more than 5 percent of functional Zeste binding sites in D. melanogaster were gained along the D. melanogaster lineage or lost along one of the other lineages. We find that Zeste bound regions have a reduced rate of binding site loss and an increased rate of binding site gain relative to flanking sequences. Finally, we show that binding site gains and losses are asymmetrically distributed with respect to D. melanogaster, consistent with lineage-specific acquisition and loss of Zeste-responsive regulatory elements.

  14. DNA-binding regulates site-specific ubiquitination of IRF-1.

    PubMed

    Landré, Vivien; Pion, Emmanuelle; Narayan, Vikram; Xirodimas, Dimitris P; Ball, Kathryn L

    2013-02-01

    Understanding the determinants for site-specific ubiquitination by E3 ligase components of the ubiquitin machinery is proving to be a challenge. In the present study we investigate the role of an E3 ligase docking site (Mf2 domain) in an intrinsically disordered domain of IRF-1 [IFN (interferon) regulatory factor-1], a short-lived IFNγ-regulated transcription factor, in ubiquitination of the protein. Ubiquitin modification of full-length IRF-1 by E3 ligases such as CHIP [C-terminus of the Hsc (heat-shock cognate) 70-interacting protein] and MDM2 (murine double minute 2), which dock to the Mf2 domain, was specific for lysine residues found predominantly in loop structures that extend from the DNA-binding domain, whereas no modification was detected in the more conformationally flexible C-terminal half of the protein. The E3 docking site was not available when IRF-1 was in its DNA-bound conformation and cognate DNA-binding sequences strongly suppressed ubiquitination, highlighting a strict relationship between ligase binding and site-specific modification at residues in the DNA-binding domain. Hyperubiquitination of a non-DNA-binding mutant supports a mechanism where an active DNA-bound pool of IRF-1 is protected from polyubiquitination and degradation.

  15. Identification of natural and artificial DNA substrates for the light-activated LOV-HTH transcription factor EL222

    PubMed Central

    Rivera-Cancel, Giomar; Motta-Mena, Laura B.; Gardner, Kevin H.

    2012-01-01

    Light-oxygen-voltage (LOV) domains serve as the photosensory modules for a wide range of plant and bacterial proteins, conferring blue light dependent regulation to effector activities as diverse as enzymes and DNA binding. LOV domains can also be engineered into a variety of exogenous targets, enabling similar regulation for new protein-based reagents. Common to these proteins is the ability for LOV domains to reversibly form a photochemical adduct between an internal flavin chromophore and the surrounding protein, using this to trigger conformational changes that affect output activity. Using the Erythrobacter litoralis protein EL222 model system which links LOV regulation to a helix-turn-helix (HTH) DNA binding domain, we demonstrated that the LOV domain binds and inhibits the HTH domain in the dark, releasing these interactions upon illumination [Nash et al. (2011) Proc. Natl. Acad. Sci. USA 108, 9449–9454]. Here we combine genomic and in vitro selection approaches to identify optimal DNA binding sites for EL222. Within the bacterial host, we observe binding to several genomic sites using a 12 bp sequence consensus that is also found by in vitro selection methods. Sequence-specific alterations in the DNA consensus reduce EL222-binding affinity in a manner consistent with the expected binding mode: a protein dimer binding to two repeats. Finally, we demonstrate the light-dependent activation of transcription of two genes adjacent to an EL222 binding site. Taken together, these results shed light on the native function of EL222 and provide useful reagents for further basic and applied research on this versatile protein. PMID:23205774

  16. Proteome-wide Identification of Novel Ceramide-binding Proteins by Yeast Surface cDNA Display and Deep Sequencing.

    PubMed

    Bidlingmaier, Scott; Ha, Kevin; Lee, Nam-Kyung; Su, Yang; Liu, Bin

    2016-04-01

    Although the bioactive sphingolipid ceramide is an important cell signaling molecule, relatively few direct ceramide-interacting proteins are known. We used an approach combining yeast surface cDNA display and deep sequencing technology to identify novel proteins binding directly to ceramide. We identified 234 candidate ceramide-binding protein fragments and validated binding for 20. Most (17) bound selectively to ceramide, although a few (3) bound to other lipids as well. Several novel ceramide-binding domains were discovered, including the EF-hand calcium-binding motif, the heat shock chaperonin-binding motif STI1, the SCP2 sterol-binding domain, and the tetratricopeptide repeat region motif. Interestingly, four of the verified ceramide-binding proteins (HPCA, HPCAL1, NCS1, and VSNL1) and an additional three candidate ceramide-binding proteins (NCALD, HPCAL4, and KCNIP3) belong to the neuronal calcium sensor family of EF hand-containing proteins. We used mutagenesis to map the ceramide-binding site in HPCA and to create a mutant HPCA that does not bind to ceramide. We demonstrated selective binding to ceramide by mammalian cell-produced wild type but not mutant HPCA. Intriguingly, we also identified a fragment from prostaglandin D2 synthase that binds preferentially to ceramide 1-phosphate. The wide variety of proteins and domains capable of binding to ceramide suggests that many of the signaling functions of ceramide may be regulated by direct binding to these proteins. Based on the deep sequencing data, we estimate that our yeast surface cDNA display library covers ∼60% of the human proteome and our selection/deep sequencing protocol can identify target-interacting protein fragments that are present at extremely low frequency in the starting library. Thus, the yeast surface cDNA display/deep sequencing approach is a rapid, comprehensive, and flexible method for the analysis of protein-ligand interactions, particularly for the study of non-protein ligands.
© 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  17. An ensemble model of competitive multi-factor binding of the genome

    PubMed Central

    Wasson, Todd; Hartemink, Alexander J.

    2009-01-01

    Hundreds of different factors adorn the eukaryotic genome, binding to it in large number. These DNA binding factors (DBFs) include nucleosomes, transcription factors (TFs), and other proteins and protein complexes, such as the origin recognition complex (ORC). DBFs compete with one another for binding along the genome, yet many current models of genome binding do not consider different types of DBFs together simultaneously. Additionally, binding is a stochastic process that results in a continuum of binding probabilities at any position along the genome, but many current models tend to consider positions as being either binding sites or not. Here, we present a model that allows a multitude of DBFs, each at different concentrations, to compete with one another for binding sites along the genome. The result is an “occupancy profile,” a probabilistic description of the DNA occupancy of each factor at each position. We implement our model efficiently as the software package COMPETE. We demonstrate genome-wide and at specific loci how modeling nucleosome binding alters TF binding, and vice versa, and illustrate how factor concentration influences binding occupancy. Binding cooperativity between nearby TFs arises implicitly via mutual competition with nucleosomes. Our method applies not only to TFs, but also recapitulates known occupancy profiles of a well-studied replication origin with and without ORC binding. Importantly, the sequence preferences our model takes as input are derived from in vitro experiments. This ensures that the calculated occupancy profiles are the result of the forces of competition represented explicitly in our model and the inherent sequence affinities of the constituent DBFs. PMID:19720867

  18. Interference between Triplex and Protein Binding to Distal Sites on Supercoiled DNA.

    PubMed

    Noy, Agnes; Maxwell, Anthony; Harris, Sarah A

    2017-02-07

    We have explored the interdependence of the binding of a DNA triplex and a repressor protein to distal recognition sites on supercoiled DNA minicircles using MD simulations. We observe that the interaction between the two ligands through their influence on their DNA template is determined by a subtle interplay of DNA mechanics and electrostatics, that the changes in flexibility induced by ligand binding play an important role and that supercoiling can instigate additional ligand-DNA contacts that would not be possible in simple linear DNA sequences. Copyright © 2017. Published by Elsevier Inc.

  19. Protein sequences bound to mineral surfaces persist into deep time

    PubMed Central

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa; Freeman, Colin L; Woolley, Jos; Crisp, Molly K; Wilson, Julie; Fotakis, Anna; Fischer, Roman; Kessler, Benedikt M; Rakownikow Jersie-Christensen, Rosa; Olsen, Jesper V; Haile, James; Thomas, Jessica; Marean, Curtis W; Parkington, John; Presslee, Samantha; Lee-Thorp, Julia; Ditchfield, Peter; Hamilton, Jacqueline F; Ward, Martyn W; Wang, Chunting Michelle; Shaw, Marvin D; Harrison, Terry; Domínguez-Rodrigo, Manuel; MacPhee, Ross DE; Kwekason, Amandus; Ecker, Michaela; Kolska Horwitz, Liora; Chazan, Michael; Kröger, Roland; Thomas-Oates, Jane; Harding, John H; Cappellini, Enrico; Penkman, Kirsty; Collins, Matthew J

    2016-01-01

    Proteins persist longer in the fossil record than DNA, but the longevity, survival mechanisms and substrates remain contested. Here, we demonstrate the role of mineral binding in preserving the protein sequence in ostrich (Struthionidae) eggshell, including from the palaeontological sites of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Molecular dynamics simulations of struthiocalcin-1 and -2, the dominant proteins within the eggshell, reveal that distinct domains bind to the mineral surface. It is the domain with the strongest calculated binding energy to the calcite surface that is selectively preserved. Thermal age calculations demonstrate that the Laetoli and Olduvai peptides are 50 times older than any previously authenticated sequence (equivalent to ~16 Ma at a constant 10°C). DOI: http://dx.doi.org/10.7554/eLife.17092.001 PMID:27668515

  20. Cysteine-containing peptide tag for site-specific conjugation of proteins

    DOEpatents

    Backer, Marina V.; Backer, Joseph M.

    2008-04-08

    The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety bound to the targeting moiety; the biological conjugate having a covalent bond between the thiol group of SEQ ID NO:2 and a functional group in the binding moiety. The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety that comprises an adapter protein, the adapter protein having a thiol group; the biological conjugate having a disulfide bond between the thiol group of SEQ ID NO:2 and the thiol group of the adapter protein. The present invention is also directed to biological sequences employed in the above biological conjugates, as well as pharmaceutical preparations and methods using the above biological conjugates.

  1. Cysteine-containing peptide tag for site-specific conjugation of proteins

    DOEpatents

    Backer, Marina V.; Backer, Joseph M.

    2010-10-05

    The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety bound to the targeting moiety; the biological conjugate having a covalent bond between the thiol group of SEQ ID NO:2 and a functional group in the binding moiety. The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety that comprises an adapter protein, the adapter protein having a thiol group; the biological conjugate having a disulfide bond between the thiol group of SEQ ID NO:2 and the thiol group of the adapter protein. The present invention is also directed to biological sequences employed in the above biological conjugates, as well as pharmaceutical preparations and methods using the above biological conjugates.

  2. Computation of repetitions and regularities of biologically weighted sequences.

    PubMed

    Christodoulakis, M; Iliopoulos, C; Mouchard, L; Perdikuri, K; Tsakalidis, A; Tsichlas, K

    2006-01-01

    Biological weighted sequences are used extensively in molecular biology as profiles for protein families, in the representation of binding sites and often for the representation of sequences produced by a shotgun sequencing strategy. In this paper, we address three fundamental problems in the area of biologically weighted sequences: (i) computation of repetitions, (ii) pattern matching, and (iii) computation of regularities. Our algorithms can be used as basic building blocks for more sophisticated algorithms applied on weighted sequences.
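
    As a minimal sketch of problem (ii), pattern matching, the Python below assumes one common formulation of a weighted sequence: each position holds a probability for every symbol, and a pattern occurs at a position when the product of the per-position probabilities reaches a threshold. The names `occurrence_probability`, `match_positions`, and the toy data are hypothetical, and the naive scan shown here is not the algorithm of the paper.

```python
from typing import Dict, List

# A weighted sequence: one {symbol: probability} mapping per position.
WeightedSeq = List[Dict[str, float]]


def occurrence_probability(wseq: WeightedSeq, pattern: str, start: int) -> float:
    """Probability that `pattern` occurs at `start`, assuming independent positions."""
    prob = 1.0
    for offset, symbol in enumerate(pattern):
        # A symbol that is absent at a position has probability 0.
        prob *= wseq[start + offset].get(symbol, 0.0)
    return prob


def match_positions(wseq: WeightedSeq, pattern: str, threshold: float) -> List[int]:
    """Naive scan: 0-based starts where the occurrence probability >= threshold."""
    last_start = len(wseq) - len(pattern)
    return [
        i
        for i in range(last_start + 1)
        if occurrence_probability(wseq, pattern, i) >= threshold
    ]


# Toy weighted sequence (illustrative values only).
example: WeightedSeq = [
    {"A": 1.0},
    {"A": 0.5, "C": 0.5},
    {"G": 1.0},
    {"A": 0.25, "T": 0.75},
    {"A": 1.0},
]

print(match_positions(example, "AG", 0.5))
```

    The scan costs time proportional to the sequence length times the pattern length; it is meant only to make the definition concrete.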

  3. An interferon regulatory factor binding site in the U5 region of the bovine leukemia virus long terminal repeat stimulates Tax-independent gene expression.

    PubMed

    Kiermer, V; Van Lint, C; Briclet, D; Vanhulle, C; Kettmann, R; Verdin, E; Burny, A; Droogmans, L

    1998-07-01

    Bovine leukemia virus (BLV) replication is controlled by both cis- and trans-acting elements. The virus-encoded transactivator, Tax, is necessary for efficient transcription from the BLV promoter, although it is not present during the early stages of infection. Therefore, sequences that control Tax-independent transcription must play an important role in the initiation of viral gene expression. This study demonstrates that the R-U5 sequence of BLV stimulates Tax-independent reporter gene expression directed by the BLV promoter. R-U5 was also stimulatory when inserted immediately downstream from the transcription initiation site of a heterologous promoter. Progressive deletion analysis of this region revealed that a 46-bp element corresponding to the 5' half of U5 is principally responsible for the stimulation. This element exhibited enhancer activity when inserted upstream or downstream from the herpes simplex virus thymidine kinase promoter. This enhancer contains a binding site for the interferon regulatory factors IRF-1 and IRF-2. A 3-bp mutation that destroys the IRF recognition site caused a twofold decrease in Tax-independent BLV long terminal repeat-driven gene expression. These observations suggest that the IRF binding site in the U5 region of BLV plays a role in the initiation of virus replication.

  4. In Silico Detection of Sequence Variations Modifying Transcriptional Regulation

    PubMed Central

    Andersen, Malin C; Engström, Pär G; Lithwick, Stuart; Arenillas, David; Eriksson, Per; Lenhard, Boris; Wasserman, Wyeth W; Odeberg, Jacob

    2008-01-01

    Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers). The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation. PMID:18208319

  5. Mapping of a binding site for ATP within the extracellular region of the Torpedo nicotinic acetylcholine receptor beta-subunit.

    PubMed

    Schrattenholz, A; Roth, U; Godovac-Zimmermann, J; Maelicke, A

    1997-10-28

    Using 2,8,5'-[3H]ATP as a direct photoaffinity label for membrane-bound nicotinic acetylcholine receptor (nAChR) from Torpedo marmorata, we have identified a binding site for ATP in the extracellular region of the beta-subunit of the receptor. Photolabeling was completely inhibited in the presence of saturating concentrations of nonradioactive ATP, whereas neither the purinoreceptor antagonists suramin, theophylline, and caffeine nor the nAChR antagonists alpha-bungarotoxin and d-tubocurarine affected the labeling reaction. Competitive and noncompetitive nicotinic agonists and Ca2+ increased the yield of the photoreaction by up to 50%, suggesting that the respective binding sites are allosterically linked with the ATP site. The dissociation constant KD of binding of ATP to the identified site on the nAChR was of the order of 10^-4 M. Sites of labeling were found in the sequence regions Leu11-Pro17 and Asp152-His163 of the nAChR beta-subunit. These regions may represent parts of a single binding site for ATP, which is discontinuously distributed within the primary structure of the N-terminal extracellular domain. The existence of an extracellular binding site for ATP confirms, on the molecular level, that this nucleotide can directly act on nicotinic receptors, as has been suggested from previous electrophysiological and biochemical studies.

  6. Transcription Factor Map Alignment of Promoter Regions

    PubMed Central

    Blanco, Enrique; Messeguer, Xavier; Smith, Temple F; Guigó, Roderic

    2006-01-01

    We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments. PMID:16733547
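
    The idea of aligning sequences of factor labels rather than nucleotides can be sketched as follows. This is a deliberately simplified stand-in: a plain Needleman-Wunsch global alignment over labels only, whereas the published method adapts a restriction-map alignment algorithm. The function name `align_label_maps`, the scoring values, and the toy label lists are all assumptions made for illustration.

```python
from typing import List, Tuple


def align_label_maps(
    map_a: List[str],
    map_b: List[str],
    match: int = 2,
    mismatch: int = -1,
    gap: int = -1,
) -> Tuple[int, List[Tuple[int, int]]]:
    """Global alignment of two label sequences; returns (score, matched index pairs)."""
    n, m = len(map_a), len(map_b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if map_a[i - 1] == map_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two labels
                score[i - 1][j] + gap,      # skip a site in map_a
                score[i][j - 1] + gap,      # skip a site in map_b
            )

    # Trace back, keeping only positions where identical labels were aligned.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        same = map_a[i - 1] == map_b[j - 1]
        sub = match if same else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            if same:
                pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Toy predicted-site labels for two orthologous promoters (illustrative only).
human_map = ["SP1", "TBP", "CEBP", "NFKB"]
mouse_map = ["SP1", "CEBP", "NFKB"]
best_score, matched = align_label_maps(human_map, mouse_map)
print(best_score, matched)
```

    Because only labels are compared, two promoters with no detectable nucleotide conservation can still align well when their predicted sites appear in the same order; a fuller treatment would also weigh the positions of the sites along each promoter.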

  7. Characterization of the IGF-II Binding Site of the IGF-II/Man-6-P Receptor Extracellular Domain.

    NASA Astrophysics Data System (ADS)

    Garmroudi, Farideh

    1995-01-01

    In mammals, insulin-like growth factor II (IGF-II) and glycoproteins bearing the mannose 6-phosphate (Man-6-P) recognition marker bind with high affinity to the same receptor. The functional consequences of IGF-II binding to the receptor at the cell surface are not clear. In these studies, we sought to broaden our understanding of the functional regions of the receptor regarding its IGF-II binding site. The IGF-II binding/cross-linking domain of the IGF-II/Man-6-P receptor was mapped by sequencing receptor fragments covalently attached to IGF-II. Purified rat placental or bovine liver receptors were affinity-labeled with 125I-IGF-II and digested with endoproteinase Glu-C. Analysis of digests by gel electrophoresis revealed a major radiolabeled band of 18 kDa, which was purified by gel filtration chromatography followed by reverse-phase HPLC and electroblotting. Sequence analysis revealed that the peptide S(H)VNSXPMF, located within extracellular repeat 10 and beginning with serine 1488 of the bovine receptor, was the best candidate for the IGF-II cross-linked peptide. These data indicated that residues within repeats 10-11 were important for IGF-II binding. To define the location of the IGF-II binding site further, a nested set of six human receptor cDNA constructs was designed to produce epitope-tagged fusion proteins encompassing the region between repeats 8 and 11 of the human IGF-II/Man-6-P receptor extracellular domain. These truncated receptors were transiently expressed in COS-7 cells, immunoprecipitated and analyzed for their abilities to bind and cross-link to IGF-II. All of the constructs were capable of binding/cross-linking to IGF-II, except for the 9.0-11 construct. Displacement curve analysis indicated that the truncated receptors were approximately equivalent in IGF-II binding affinity, but were of 5- to 10-fold lower affinity than full-length receptors. Sequencing of the 9.0-11 construct indicated the presence of a point mutation substituting threonine for isoleucine at position 1621, which is located in the N-terminal half of repeat 11, and was found to abrogate IGF-II binding. Collectively, our work indicates that repeat 11 of the IGF-II/Man-6-P receptor's extracellular domain encompasses the elements both for binding and cross-linking to IGF-II.

  8. Cloning, sequencing, and expression of the gene encoding the high-molecular-weight cytochrome c from Desulfovibrio vulgaris Hildenborough.

    PubMed Central

    Pollock, W B; Loutfi, M; Bruschi, M; Rapp-Giles, B J; Wall, J D; Voordouw, G

    1991-01-01

    By using a synthetic deoxyoligonucleotide probe designed to recognize the structural gene for cytochrome cc3 from Desulfovibrio vulgaris Hildenborough, a 3.7-kb XhoI genomic DNA fragment containing the cc3 gene was isolated. The gene encodes a precursor polypeptide of 58.9 kDa, with an NH2-terminal signal sequence of 31 residues. The mature polypeptide (55.7 kDa) has 16 heme binding sites of the form C-X-X-C-H. Covalent binding of heme to these 16 sites gives a holoprotein of 65.5 kDa with properties similar to those of the high-molecular-weight cytochrome c (Hmc) isolated from the same strain by Higuchi et al. (Y. Higuchi, K. Inaka, N. Yasuoka, and T. Yagi, Biochim. Biophys. Acta 911:341-348, 1987). Since the data indicate that cytochrome cc3 and Hmc are the same protein, the gene has been named hmc. The Hmc polypeptide contains 31 histidinyl residues, 16 of which are integral to heme binding sites. Thus, only 15 of the 16 hemes can have bis-histidinyl coordination. A comparison of the arrangement of heme binding sites and coordinated histidines in the amino acid sequences of cytochrome c3 and Hmc from D. vulgaris Hildenborough suggests that the latter contains three cytochrome c3-like domains. Cloning of the D. vulgaris Hildenborough hmc gene into the broad-host-range vector pJRD215 and subsequent conjugational transfer of the recombinant plasmid into D. desulfuricans G200 led to expression of a periplasmic Hmc gene product with covalently bound hemes. PMID:1846136
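
    A heme binding site of the form C-X-X-C-H can be located in a one-letter protein sequence with a short pattern search. The sketch below is illustrative only: the function name `find_heme_sites` and the toy peptide are hypothetical and are not taken from the Hmc sequence. A lookahead is used so that overlapping candidate motifs would also be reported.

```python
import re
from typing import List, Tuple

# C, any two residues, C, H; wrapped in a lookahead to allow overlapping hits.
HEME_MOTIF = re.compile(r"(?=(C[A-Z]{2}CH))")


def find_heme_sites(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched motif) for each C-X-X-C-H site."""
    return [(m.start() + 1, m.group(1)) for m in HEME_MOTIF.finditer(protein)]


# Toy peptide with two candidate sites (not a real cytochrome sequence).
toy_peptide = "MKACAACHGGCLKCHTT"
sites = find_heme_sites(toy_peptide)
print(sites)

# Histidine bookkeeping from the abstract: each of the 16 motifs uses one
# histidine, leaving 31 - 16 = 15 to serve as a second axial ligand.
total_his, motif_his = 31, 16
print(total_his - motif_his)
```

    The subtraction at the end restates the abstract's reasoning for why at most 15 of the 16 hemes can have bis-histidinyl coordination.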

  9. Two new insulator proteins, Pita and ZIPIC, target CP190 to chromatin

    PubMed Central

    Maksimenko, Oksana; Bartkuhn, Marek; Stakhov, Viacheslav; Herold, Martin; Zolotarev, Nickolay; Jox, Theresa; Buxa, Melanie K.; Kirsch, Ramona; Bonchuk, Artem; Fedotova, Anna; Kyrchanova, Olga

    2015-01-01

    Insulators are multiprotein–DNA complexes that regulate the nuclear architecture. The Drosophila CP190 protein is a cofactor for the DNA-binding insulator proteins Su(Hw), CTCF, and BEAF-32. The fact that CP190 has been found at genomic sites devoid of either of the known insulator factors has until now been unexplained. We have identified two DNA-binding zinc-finger proteins, Pita, and a new factor named ZIPIC, that interact with CP190 in vivo and in vitro at specific interaction domains. Genomic binding sites for these proteins are clustered with CP190 as well as with CTCF and BEAF-32. Model binding sites for Pita or ZIPIC demonstrate a partial enhancer-blocking activity and protect gene expression from PRE-mediated silencing. The function of the CTCF-bound MCP insulator sequence requires binding of Pita. These results identify two new insulator proteins and emphasize the unifying function of CP190, which can be recruited by many DNA-binding insulator proteins. PMID:25342723

  10. A deep learning framework for modeling structural features of RNA-binding protein targets

    PubMed Central

    Zhang, Sai; Zhou, Jingtian; Hu, Hailin; Gong, Haipeng; Chen, Ligong; Cheng, Chao; Zeng, Jianyang

    2016-01-01

    RNA-binding proteins (RBPs) play important roles in the post-transcriptional control of RNAs. Identifying RBP binding sites and characterizing RBP binding preferences are key steps toward understanding the basic mechanisms of post-transcriptional gene regulation. Though numerous computational methods have been developed for modeling RBP binding preferences, discovering a complete structural representation of the RBP targets by integrating their available structural features in all three dimensions is still a challenging task. In this paper, we develop a general and flexible deep learning framework for modeling structural binding preferences and predicting binding sites of RBPs, which takes (predicted) RNA tertiary structural information into account for the first time. Our framework constructs a unified representation that characterizes the structural specificities of RBP targets in all three dimensions, which can be further used to predict novel candidate binding sites and discover potential binding motifs. Through testing on real CLIP-seq datasets, we have demonstrated that our deep learning framework can automatically extract effective hidden structural features from the encoded raw sequence and structural profiles, and predict accurate RBP binding sites. In addition, we have conducted the first study to show that integrating the additional RNA tertiary structural features can improve the model performance in predicting RBP binding sites, especially for the polypyrimidine tract-binding protein (PTB), which also provides new evidence to support the view that RBPs may have specific tertiary structural binding preferences. In particular, the tests on the internal ribosome entry site (IRES) segments yield satisfactory results with experimental support from the literature and further demonstrate the necessity of incorporating RNA tertiary structural information into the prediction model. The source code of our approach can be found in https://github.com/thucombio/deepnet-rbp. PMID:26467480

  11. The Binding Sites of miR-619-5p in the mRNAs of Human and Orthologous Genes.

    PubMed

    Atambayeva, Shara; Niyazova, Raigul; Ivashchenko, Anatoliy; Pyrkova, Anna; Pinsky, Ilya; Akimniyazova, Aigul; Labeit, Siegfried

    2017-06-01

    Normally, one miRNA interacts with the mRNA of one gene. However, there are miRNAs that can bind to many mRNAs, and one mRNA can be the target of many miRNAs. This significantly complicates the study of the properties of miRNAs and their diagnostic and medical applications. A search for the binding sites of 2,750 human microRNAs (miRNAs) in 12,175 mRNAs of human genes was carried out using the MirTarget program. For the binding sites of miR-619-5p, the hybridization free energy of the bonds was equal to 100% of the maximum potential free energy. The mRNAs of 201 human genes have completely complementary binding sites for miR-619-5p in the 3'UTR (214 sites), CDS (3 sites), and 5'UTR (4 sites). The mRNAs of the CATAD1, ICA1L, GK5, POLH, and PRR11 genes have six miR-619-5p binding sites, and the mRNAs of the OPA3 and CYP20A1 genes have eight and ten binding sites, respectively. All of these miR-619-5p binding sites are located in the 3'UTRs. The miR-619-5p binding site in the 5'UTR of the mRNA of the human USP29 gene is also found in the mRNAs of orthologous genes of primates. Binding sites of miR-619-5p in the coding regions of mRNAs of C8H8orf44, C8orf44, and ISY1 genes encode the WLMPVIP oligopeptide, which is present in the orthologous proteins. Binding sites of miR-619-5p in the mRNAs of transcription factor genes ZNF429 and ZNF429 encode the AHACNP oligopeptide in another reading frame. Binding sites of miR-619-5p in the 3'UTRs of all human target genes are also present in the 3'UTRs of orthologous genes of mammals. The completely complementary binding sites for miR-619-5p are conserved in the orthologous mammalian genes. The majority of miR-619-5p binding sites are located in the 3'UTRs, but some genes have miRNA binding sites in the 5'UTRs of mRNAs. Several genes have binding sites for miRNAs in the CDSs that are read in different open reading frames. Identical nucleotide sequences of binding sites encode different amino acids in different proteins. The binding sites of miR-619-5p in 3'UTRs, 5'UTRs and CDSs are conserved in the orthologous mammalian genes.
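
    The idea of a "completely complementary" binding site in the abstract above can be illustrated with a minimal sketch. It is not the MirTarget algorithm: it only looks for stretches of an mRNA that pair with the whole miRNA through Watson-Crick pairs, and the sequences are invented placeholders, not the real miR-619-5p or any real transcript.

```python
# Minimal sketch, assuming Watson-Crick pairing only (A-U, G-C).
# All sequences below are hypothetical, chosen purely for illustration.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def find_complementary_sites(mirna: str, mrna: str) -> list[int]:
    """Return 0-based start positions in `mrna` that pair with the full miRNA."""
    target = reverse_complement(mirna)
    sites = []
    start = mrna.find(target)
    while start != -1:
        sites.append(start)
        # Step by one so that overlapping sites are also reported.
        start = mrna.find(target, start + 1)
    return sites


mirna = "GCUGGGAUUACAGG"            # hypothetical miRNA
site = reverse_complement(mirna)     # sequence a fully paired site must have
mrna = "AAA" + site + "UUU" + site + "GG"  # hypothetical mRNA with two sites

print(find_complementary_sites(mirna, mrna))
```

    A real target search would also score imperfect duplexes by free energy; the exact-match search here corresponds only to the limiting case in which the site reaches 100% of the maximum free energy.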

  12. Shape-selective recognition of DNA abasic sites by metallohelices: inhibition of human AP endonuclease 1

    PubMed Central

    Malina, Jaroslav; Scott, Peter; Brabec, Viktor

    2015-01-01

    Loss of a base in DNA, leading to creation of an abasic (AP) site that leaves a deoxyribose residue in the strand, is a frequent lesion that may occur spontaneously or under the action of various physical and chemical agents. Progress in the understanding of the chemistry and enzymology of abasic DNA largely relies upon the study of AP sites in synthetic duplexes. We report here on interactions of diastereomerically pure metallo–helical ‘flexicate’ complexes, bimetallic triple-stranded ferro-helicates [Fe2(NN-NN)3]4+ incorporating the common NN–NN bis(bidentate) helicand, with short DNA duplexes containing AP sites in different sequence contexts. The results show that the flexicates bind to AP sites in DNA duplexes in a shape-selective manner. They preferentially bind to AP sites flanked by purines on both sides and their binding is enhanced when a pyrimidine is placed in opposite orientation to the lesion. Notably, the Λ-enantiomer binds to all tested AP sites with higher affinity than the Δ-enantiomer. In addition, the binding of the flexicates to AP sites inhibits the activity of human AP endonuclease 1, which is a valid anticancer drug target. Hence, this finding indicates the potential of utilizing well-defined metallo–helical complexes for cancer chemotherapy. PMID:25940617

  13. Kinetic and Spectroscopic Studies of Bicupin Oxalate Oxidase and Putative Active Site Mutants

    PubMed Central

    Moomaw, Ellen W.; Hoffer, Eric; Moussatche, Patricia; Salerno, John C.; Grant, Morgan; Immelman, Bridget; Uberto, Richard; Ozarowski, Andrew; Angerhofer, Alexander

    2013-01-01

    Ceriporiopsis subvermispora oxalate oxidase (CsOxOx) is the first bicupin enzyme identified that catalyzes manganese-dependent oxidation of oxalate. In previous work, we have shown that the dominant contribution to catalysis comes from the monoprotonated form of oxalate binding to a form of the enzyme in which an active site carboxylic acid residue must be unprotonated. CsOxOx shares greatest sequence homology with bicupin microbial oxalate decarboxylases (OxDC), and the 241-244DASN region of the N-terminal Mn binding domain of CsOxOx is analogous to the lid region of OxDC that has been shown to determine reaction specificity. We have prepared a series of CsOxOx mutants to probe this region and to identify the carboxylate residue implicated in catalysis. The pH profile of the D241A CsOxOx mutant suggests that the protonation state of aspartic acid 241 is mechanistically significant and that catalysis takes place at the N-terminal Mn binding site. The observation that the D241S CsOxOx mutation eliminates Mn binding to both the N- and C-terminal Mn binding sites suggests that both sites must be intact for Mn incorporation into either site. The introduction of a proton donor into the N-terminal Mn binding site (CsOxOx A242E mutant) does not affect reaction specificity. Mutation of conserved arginine residues further supports that catalysis takes place at the N-terminal Mn binding site and that both sites must be intact for Mn incorporation into either site. PMID:23469254

  14. Identity of a peptide domain of human C9 that is bound by the cell-surface complement inhibitor, CD59.

    PubMed

    Chang, C P; Hüsler, T; Zhao, J; Wiedmer, T; Sims, P J

    1994-10-21

    The CD59 antigen is a plasma membrane glycoprotein that serves as an inhibitor of the C5b-9 complex of complement. This inhibitory activity appears related to the capacity of CD59 to bind with high affinity to sites that are nascently exposed in the alpha-chain subunit of human C8, as well as within the C9b domain (amino acid residues 245-538) of human C9, during assembly of the C5b-9 complex on the target membrane (Ninomiya, H., and Sims, P. J. (1992) J. Biol. Chem. 267, 13675-13680). The CD59 binding site in C9 was first investigated by N-terminal sequencing of CD59-binding peptides generated by limited digest of the isolated C9b domain. These experiments revealed a 17-kDa fragment (starting at C9 residue Thr-320) that retained affinity for CD59, suggesting the possibility for localizing the CD59 binding site by mapping with small C9-derived peptides. Peptides spanning the entire C9b sequence were expressed in Escherichia coli and then probed with CD59. CD59 bound specifically to all peptides starting N-terminal to C9 residue 359 with C termini extending beyond residue 411. Little to no CD59 binding was observed for various C9-derived peptides that started C-terminal to residue 359 or that were truncated N-terminal to residue 411. Affinity-purified antibody against C9 residues 320-411 inhibited CD59 binding to C9 by > 50% and completely inhibited its binding to the isolated C9b domain. Little to no specific binding of CD59 was detected for peptides restricted to the putative hinge domain within C9b (residues 245-271). These results indicate that a CD59 binding site is located between residues 320 and 411 of the C9 polypeptide and suggest that the affinity of this site is principally determined by residues 359-411.

  15. Sequence characterization of S100A8 gene reveals structural differences of protein and transcriptional factor binding sites in water buffalo and yak.

    PubMed

    Kathiravan, P; Goyal, S; Kataria, R S; Mishra, B P; Jayakumar, S; Joshi, B K

    2011-01-01

    The present study was undertaken to characterize the structure of S100A8 gene and its promoter in water buffalo and yak. Sequence data of 2.067 kb, 2.071 kb, and 2.052 kb with respect to complete S100A8 gene including 5' flanking region was generated in river buffalo, swamp buffalo, and yak, respectively. BLAST analysis of coding DNA sequences (CDS) of S100A8 gene revealed 95% homology of buffalo sequence with cattle, 85% with pig and horse, 83% with dog, 72-73% with murines, and around 79% with primates and humans. Phylogenetic analysis of predicted CDS revealed distinct clustering of murines, primates, and domestic animals with bovines and bubalines forming a subcluster among farm animals. In silico translation of predicted CDS revealed a sequence of 89 amino acids with 7 amino acid changes between cattle and buffalo and 2 changes between cattle and yak. The search for Pfam family revealed the N-terminal calcium binding domain and the noncanonical EF hand domain in the carboxy terminus, with more variations being observed in the N-terminal domain among different species. Two amino acid changes observed in carboxy terminal EF hand domain resulted in altered secondary structure of yak S100A8 protein. Analysis of S100A8 gene promoter revealed 14 putative motifs for transcriptional factor binding sites. Two putative motifs viz. C/EBP and v-Myb were found to be absent in swamp buffalo as compared to river buffalo and cattle. Differences in the structure of S100A8 protein and the transcriptional factor binding sites identified in the present study need to be analyzed further for their functional significance in yak and swamp buffalo respectively.

  16. Comparative analysis and molecular characterization of a gene BANF1 encoded a DNA-binding protein during mitosis from the Giant Panda and Black Bear.

    PubMed

    Zeng, Yichun; Hou, Yi-Ling; Ding, Xiang; Hou, Wan-Ru; Li, Jian

    2014-01-01

    Barrier to autointegration factor 1 (BANF1) is a DNA-binding protein found in the nucleus and cytoplasm of eukaryotic cells that functions to establish nuclear architecture during mitosis. The cDNA and the genomic sequence of BANF1 were cloned from the Giant Panda (Ailuropoda melanoleuca) and Black Bear (Ursus thibetanus mupinensis) using RT-PCR technology and Touchdown-PCR, respectively. The cDNA of the BANF1 cloned from Giant Panda and Black Bear is 297 bp in size, containing an open reading frame of 270 bp encoding 89 amino acids. The genomic sequence is 521 bp in the Giant Panda and 536 bp in the Black Bear, and both were found to possess 2 exons. Alignment analysis indicated that the nucleotide sequence and the deduced amino acid sequence are highly conserved relative to some of the mammalian species studied. Topology prediction showed there is one Protein kinase C phosphorylation site, one Casein kinase II phosphorylation site, one Tyrosine kinase phosphorylation site, one N-myristoylation site, and one Amidation site in the BANF1 protein of the Giant Panda, and there is one Protein kinase C phosphorylation site, one Tyrosine kinase phosphorylation site, one N-myristoylation site, and one Amidation site in the BANF1 protein of the Black Bear. The BANF1 gene can be readily expressed in E. coli. Results showed that expression of BANF1 as an N-terminally His-tagged fusion gave rise to the accumulation of the expected 14 kD polypeptide, which formed inclusion bodies. The expression products obtained could be used to purify the proteins and study their function further.
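
    The figures in the abstract above (a 270 bp open reading frame encoding 89 amino acids) are consistent if the reported ORF length includes the stop codon. The short sketch below checks that arithmetic; the function name and the stop-codon assumption are illustrative and are not taken from the paper.

```python
def orf_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of `orf_bp` nucleotides.

    Assumption (not stated in the abstract): the reported ORF length
    counts the terminal stop codon, which encodes no amino acid.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 270 bp -> 90 codons -> 89 amino acids plus one stop codon
print(orf_protein_length(270))
```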

  17. Three RNA recognition motifs participate in RNA recognition and structural organization by the pro-apoptotic factor TIA-1

    PubMed Central

    Bauer, William J.; Heath, Jason; Jenkins, Jermaine L.; Kielkopf, Clara L.

    2012-01-01

    T-cell intracellular antigen-1 (TIA-1) regulates developmental and stress-responsive pathways through distinct activities at the levels of alternative pre-mRNA splicing and mRNA translation. The TIA-1 polypeptide contains three RNA recognition motifs (RRMs). The central RRM2 and C-terminal RRM3 associate with cellular mRNAs. The N-terminal RRM1 enhances interactions of a C-terminal Q-rich domain of TIA-1 with the U1-C splicing factor, despite linear separation of the domains in the TIA-1 sequence. Given the expanded functional repertoire of the RRM family, it was unknown whether TIA-1 RRM1 contributes to RNA binding as well as documented protein interactions. To address this question, we used isothermal titration calorimetry and small-angle X-ray scattering (SAXS) to dissect the roles of the TIA-1 RRMs in RNA recognition. Notably, the fas RNA exhibited two binding sites with indistinguishable affinities for TIA-1. Analyses of TIA-1 variants established that RRM1 was dispensable for binding AU-rich fas sites, yet all three RRMs were required to bind a polyU RNA with high affinity. SAXS analyses demonstrated a 'V' shape for a TIA-1 construct comprising the three RRMs, and revealed that its dimensions became more compact in the RNA-bound state. The sequence-selective involvement of TIA-1 RRM1 in RNA recognition suggests a possible role for RNA sequences in regulating the distinct functions of TIA-1. Further implications for U1-C recruitment by the adjacent TIA-1 binding sites of the fas pre-mRNA and the bent TIA-1 shape, which organizes the N- and C-termini on the same side of the protein, are discussed. PMID:22154808

  18. COUP-TF (chicken ovalbumin upstream promoter transcription factor)-interacting protein 1 (CTIP1) is a sequence-specific DNA binding protein.

    PubMed Central

    Avram, Dorina; Fields, Andrew; Senawong, Thanaset; Topark-Ngarm, Acharawan; Leid, Mark

    2002-01-01

    Chicken ovalbumin upstream promoter transcription factor (COUP-TF)-interacting proteins 1 and 2 [CTIP1/Evi9/B cell leukaemia (Bcl) 11a and CTIP2/Bcl11b respectively] are highly related C(2)H(2) zinc finger proteins that are abundantly expressed in brain and the immune system, and are associated with immune system malignancies. A selection procedure was employed to isolate high-affinity DNA binding sites for CTIP1. The core binding site on DNA identified in these studies, 5'-GGCCGG-3' (upper strand), is highly related to the canonical GC box and was bound by a CTIP1 oligomeric complex(es) in vitro. Furthermore, both CTIP1 and CTIP2 repressed transcription of a reporter gene harbouring a multimerized CTIP binding site, and this repression was neither reversed by trichostatin A (an inhibitor of known class I and II histone deacetylases) nor stimulated by co-transfection of a COUP-TF family member. These results demonstrate that CTIP1 is a sequence-specific DNA binding protein and a bona fide transcriptional repressor that is capable of functioning independently of COUP-TF family members. These findings may be relevant to the physiological and/or pathological action(s) of CTIPs in cells that do not express COUP-TF family members, such as cells of the haematopoietic and immune systems. PMID:12196208
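
    Because the core site 5'-GGCCGG-3' is quoted for the upper strand, a search for candidate sites has to consider both strands. The sketch below is one simple way to do that for an exact motif; the test sequence is invented, and exact string matching is a simplification that ignores the degenerate positions a real selection experiment would reveal.

```python
# Minimal sketch: exact-match scan of both DNA strands for a core motif.
# The example sequence is hypothetical.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def scan_both_strands(seq: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based position on the given strand, strand) for each hit.

    A '-' hit means the motif reads 5'->3' on the opposite strand, so the
    given strand shows the motif's reverse complement at that position.
    """
    hits = []
    patterns = [("+", motif)]
    rc = reverse_complement(motif)
    if rc != motif:  # avoid reporting palindromic motifs twice
        patterns.append(("-", rc))
    for strand, pattern in patterns:
        pos = seq.find(pattern)
        while pos != -1:
            hits.append((pos, strand))
            pos = seq.find(pattern, pos + 1)
    return sorted(hits)


promoter = "ATGGCCGGTTACCGGCCAT"  # hypothetical sequence
print(scan_both_strands(promoter, "GGCCGG"))
```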

  19. Evidence for glucocorticoid receptor binding to a site(s) in a remote region of the 5' flanking sequences of the human proopiomelanocortin gene

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tully, D.B.; Hillman, D.; Herbert, E.

    1986-05-01

    Glucocorticoids negatively regulate expression of the human proopiomelanocortin (POMC) gene. It has been postulated that this effect may be modulated by a direct interaction of the glucocorticoid receptor (GR) with DNA in the vicinity of the POMC promoter. In order to investigate interactions of GR with POMC DNA, DNA-cellulose competitive binding assays have been performed using isolated fragments of cloned POMC DNA to compete with calf thymus DNA-cellulose for binding of triamcinolone acetonide affinity-labelled GR prepared from HeLa S3 cells. In these assays, two fragments isolated from the 5' flanking sequences of POMC DNA (Fragment 3, -1765 to -677 and Fragment 4, -676 to +125 with respect to the mRNA cap site) have competed favorably, with Fragment 3 consistently competing more strongly than Fragment 4. Additional studies have been conducted utilizing a newly developed South-western Blot procedure in which specific 32P-labelled DNA fragments are allowed to bind to dexamethasone mesylate labelled GR immobilized on nitrocellulose filters. Results from these studies have also shown preferential binding by POMC DNA fragments 3 and 4. DNA footprinting and gene transfer experiments are now being conducted to further characterize the nature of GR interaction with POMC DNA.

  20. Peptidomimetic Escape Mechanisms Arise via Genetic Diversity in the Ligand-Binding Site of the Hepatitis C Virus NS3/4A Serine Protease

    PubMed Central

    Welsch, Christoph; Shimakami, Tetsuro; Hartmann, Christoph; Yang, Yan; Domingues, Francisco S.; Lengauer, Thomas; Zeuzem, Stefan; Lemon, Stanley M.

    2011-01-01

    Background & Aims: It is a challenge to develop direct-acting antiviral agents (DAAs) that target the NS3/4A protease of hepatitis C virus (HCV) because resistant variants develop. Ketoamide compounds, designed to mimic the natural protease substrate, have been developed as inhibitors. However, clinical trials have revealed rapid selection of resistant mutants, most of which are considered to be pre-existing variants. Methods: We identified residues near the ketoamide-binding site in X-ray structures of the genotype 1a protease, co-crystallized with boceprevir or a telaprevir-like ligand, and then identified variants at these positions in 219 genotype 1 sequences from a public database. We used side-chain modeling to assess the potential effects of these variants on the interaction between ketoamide and the protease, and compared these results with the phenotypic effects on ketoamide resistance, RNA replication capacity, and infectious virus yields in a cell culture model of infection. Results: Thirteen natural binding-site variants with potential for ketoamide resistance were identified at 10 residues in the protease, near the ketoamide binding site. Rotamer analysis of amino acid side-chain conformations indicated that 2 variants (R155K and D168G) could affect binding of telaprevir more than boceprevir. Measurements of antiviral susceptibility in cell culture studies were consistent with this observation. Four variants (Q41H, I132V, R155K, and D168G) caused low-to-moderate levels of ketoamide resistance; 3 of these were highly fit (Q41H, I132V, and R155K). Conclusions: Using a comprehensive sequence and structure-based analysis, we showed how natural variation in the HCV protease NS3/4A sequences might affect susceptibility to first-generation DAAs. These findings increase our understanding of the molecular basis of ketoamide resistance among naturally existing viral variants. PMID:22155364

  1. Structural studies of polypeptides: Mechanism of immunoglobin catalysis and helix propagation in hybrid sequence, disulfide containing peptides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Storrs, Richard Wood

    1992-08-01

    Catalytic immunoglobin fragments were studied by Nuclear Magnetic Resonance spectroscopy to identify amino acid residues responsible for the catalytic activity. Small, hybrid sequence peptides were analyzed for helix propagation following covalent initiation and for activity related to the protein from which the helical sequence was derived. Hydrolysis of p-nitrophenyl carbonates and esters by specific immunoglobins is thought to involve charge complementarity. The pK of the transition state analog p-nitrophenyl phosphate bound to the immunoglobin fragment was determined by 31P-NMR to verify the juxtaposition of a positively charged amino acid to the binding/catalytic site. Optical studies of immunoglobin mediated photoreversal of cis, syn cyclobutane thymine dimers implicated tryptophan as the photosensitizing chromophore. Research shows the chemical environment of a single tryptophan residue is altered upon binding of the thymine dimer. This tryptophan residue was localized to within 20 Å of the binding site through the use of a nitroxide paramagnetic species covalently attached to the thymine dimer. A hybrid sequence peptide was synthesized based on the bee venom peptide apamin in which the helical residues of apamin were replaced with those from the recognition helix of the bacteriophage 434 repressor protein. Oxidation of the disulfide bonds occurred uniformly in the proper 1-11, 3-15 orientation, stabilizing the 434 sequence in an α-helix. The glycine residue stopped helix propagation. Helix propagation in 2,2,2-trifluoroethanol mixtures was investigated in a second hybrid sequence peptide using the apamin-derived disulfide scaffold and the S-peptide sequence. The helix-stop signal previously observed was not observed in the NMR NOESY spectrum. Helical connectivities were seen throughout the S-peptide sequence. The apamin/S-peptide hybrid bound to the S-protein (residues 21-166 of ribonuclease A) and reconstituted enzymatic activity.

  2. Structural studies of polypeptides: Mechanism of immunoglobin catalysis and helix propagation in hybrid sequence, disulfide containing peptides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Storrs, R.W.

    1992-08-01

    Catalytic immunoglobin fragments were studied by Nuclear Magnetic Resonance spectroscopy to identify amino acid residues responsible for the catalytic activity. Small, hybrid sequence peptides were analyzed for helix propagation following covalent initiation and for activity related to the protein from which the helical sequence was derived. Hydrolysis of p-nitrophenyl carbonates and esters by specific immunoglobins is thought to involve charge complementarity. The pK of the transition state analog p-nitrophenyl phosphate bound to the immunoglobin fragment was determined by 31P-NMR to verify the juxtaposition of a positively charged amino acid to the binding/catalytic site. Optical studies of immunoglobin mediated photoreversal of cis, syn cyclobutane thymine dimers implicated tryptophan as the photosensitizing chromophore. Research shows the chemical environment of a single tryptophan residue is altered upon binding of the thymine dimer. This tryptophan residue was localized to within 20 Å of the binding site through the use of a nitroxide paramagnetic species covalently attached to the thymine dimer. A hybrid sequence peptide was synthesized based on the bee venom peptide apamin in which the helical residues of apamin were replaced with those from the recognition helix of the bacteriophage 434 repressor protein. Oxidation of the disulfide bonds occurred uniformly in the proper 1-11, 3-15 orientation, stabilizing the 434 sequence in an α-helix. The glycine residue stopped helix propagation. Helix propagation in 2,2,2-trifluoroethanol mixtures was investigated in a second hybrid sequence peptide using the apamin-derived disulfide scaffold and the S-peptide sequence. The helix-stop signal previously observed was not observed in the NMR NOESY spectrum. Helical connectivities were seen throughout the S-peptide sequence. The apamin/S-peptide hybrid bound to the S-protein (residues 21-166 of ribonuclease A) and reconstituted enzymatic activity.

  3. Structure and DNA-Binding Sites of the SWI1 AT-rich Interaction Domain (ARID) Suggest Determinants for Sequence-Specific DNA Recognition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Suhkmann; Zhang, Ziming; Upchurch, Sean

    2004-04-16

    ARID is a homologous family of DNA-binding domains that occur in DNA binding proteins from a wide variety of species, ranging from yeast to nematodes, insects, mammals and plants. SWI1, a member of the SWI/SNF protein complex that is involved in chromatin remodeling during transcription, contains the ARID motif. The ARID domain of human SWI1 (also known as p270) does not select for a specific DNA sequence from a random sequence pool. The lack of sequence specificity shown by the SWI1 ARID domain stands in contrast to the other characterized ARID domains, which recognize specific AT-rich sequences. We have solved the three-dimensional structure of human SWI1 ARID using solution NMR methods. In addition, we have characterized non-specific DNA-binding by the SWI1 ARID domain. Results from this study indicate that a flexible long internal loop in ARID motif is likely to be important for sequence specific DNA-recognition. The structure of human SWI1 ARID domain also represents a distinct structural subfamily. Studies of ARID indicate that boundary of the DNA binding structural and functional domains can extend beyond the sequence homologous region in a homologous family of proteins. Structural studies of homologous domains such as ARID family of DNA-binding domains should provide information to better predict the boundary of structural and functional domains in structural genomic studies. Key Words: ARID, SWI1, NMR, structural genomics, protein-DNA interaction.

  4. Binding of nucleotides by T4 DNA ligase and T4 RNA ligase: optical absorbance and fluorescence studies.

    PubMed Central

    Cherepanov, A V; de Vries, S

    2001-01-01

    The interaction of nucleotides with T4 DNA and RNA ligases has been characterized using ultraviolet visible (UV-VIS) absorbance and fluorescence spectroscopy. Both enzymes bind nucleotides with the K(d) between 0.1 and 20 microM. Nucleotide binding results in a decrease of absorbance at 260 nm due to pi-stacking with an aromatic residue, possibly phenylalanine, and causes red-shifting of the absorbance maximum due to hydrogen bonding with the exocyclic amino group. T4 DNA ligase is shown to have, besides the catalytic ATP binding site, another noncovalent nucleotide binding site. ATP bound there alters the pi-stacking of the nucleotide in the catalytic site, increasing its optical extinction. The K(d) for the noncovalent site is approximately 1000-fold higher than for the catalytic site. Nucleotides quench the protein fluorescence showing that a tryptophan residue is located in the active site of the ligase. The decrease of absorbance around 298 nm suggests that the hydrogen bonding interactions of this tryptophan residue are weakened in the ligase-nucleotide complex. The excitation/emission properties of T4 RNA ligase indicate that its ATP binding pocket is in contact with solvent, which is excluded upon binding of the nucleotide. Overall, the spectroscopic analysis reveals important similarities between T4 ligases and related nucleotidyltransferases, despite the low sequence similarity. PMID:11721015
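
    To make the roughly 1000-fold difference in K(d) between the two sites concrete, the sketch below computes fractional occupancy under a simple single-site binding model, occupancy = [L] / (K(d) + [L]). The model is a textbook simplification rather than the analysis used in the paper, and the K(d) values and ATP concentration are hypothetical, chosen only to respect the reported 1000-fold ratio.

```python
def fractional_occupancy(ligand_um: float, kd_um: float) -> float:
    """Fraction of sites occupied at free ligand concentration `ligand_um`.

    Assumes independent single-site binding: theta = [L] / (Kd + [L]).
    Both arguments are in micromolar.
    """
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_um / (kd_um + ligand_um)


# Hypothetical values: only the 1000-fold ratio comes from the abstract.
KD_CATALYTIC_UM = 1.0
KD_NONCOVALENT_UM = 1000.0 * KD_CATALYTIC_UM
ATP_UM = 10.0

print(round(fractional_occupancy(ATP_UM, KD_CATALYTIC_UM), 3))    # catalytic site
print(round(fractional_occupancy(ATP_UM, KD_NONCOVALENT_UM), 4))  # second site
```

    Under these assumed numbers the catalytic site is nearly saturated while the second site is almost empty, so the second site would only become populated at much higher nucleotide concentrations.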

  5. Self-catalyzed cyclization of the intervening sequence RNA of Tetrahymena: inhibition by methidiumpropyl.EDTA and localization of the major dye binding sites.

    PubMed Central

    Tanner, N K; Cech, T R

    1985-01-01

    The intervening sequence (IVS) excised from the rRNA precursor of Tetrahymena thermophila is converted to a covalently closed circular RNA in the absence of proteins in vitro. This self-catalyzed cyclization reaction is inhibited by the intercalating dye methidiumpropyl.EDTA (MPE; R.P. Hertzberg and P.B. Dervan (1982) J. Am. Chem. Soc. 104, 313-315). The MPE binding sites have been localized by mapping the sites of MPE.Fe(II) cleavage of the IVS RNA. There are three major binding sites within the 414 nucleotide IVS RNA. Two of these sites coincide with the A.B and 9L.2 pairings. These are structural elements that are conserved in all group I introns and are implicated as being functionally important for splicing. We propose that interaction of MPE with these sites is responsible for dye inhibition of cyclization. The reactions of MPE.Fe(II) with an RNA of known structure, tRNAPhe, and with the IVS RNA were studied as a function of temperature, ionic strength and ethidium concentration. Based on the comparison of the reaction with these two RNAs, we conclude that the dye is a very useful probe for structural regions of large RNAs, while it provides more limited structural information about the small, compact tRNA molecule. PMID:2415924

  6. The neuraminidases of MDCK grown human influenza A(H3N2) viruses isolated since 1994 can demonstrate receptor binding.

    PubMed

    Mohr, Peter G; Deng, Yi-Mo; McKimm-Breschkin, Jennifer L

    2015-04-22

    The neuraminidases (NAs) of MDCK passaged human influenza A(H3N2) strains isolated since 2005 are reported to have dual functions of cleavage of sialic acid and receptor binding. NA agglutination of red blood cells (RBCs) can be inhibited by neuraminidase inhibitors (NAIs), thus distinguishing it from haemagglutinin (HA) binding. We wanted to know if viruses prior to 2005 can demonstrate this property. Pairs of influenza A(H3N2) isolates ranging from 1993-2008 passaged in parallel only in eggs or in MDCK cells were tested for inhibition of haemagglutination by various NAIs. Only viruses isolated since 1994 and cultured in MDCK cells bound chicken RBCs solely through their NA. NAI inhibition of agglutination of turkey RBCs was seen for some, but not all of these same MDCK grown viruses. Efficacy of inhibition of enzyme activity and haemagglutination differed between NAIs. For many viruses lower concentrations of oseltamivir could inhibit agglutination compared to zanamivir, although they could both inhibit enzyme activity at comparable concentrations. An E119V mutation reduced sensitivity to oseltamivir and 4-aminoDANA for both the enzyme assay and inhibition of agglutination. Sequence analysis of the NAs and HAs of some paired viruses revealed mutations in the haemagglutinin of all egg passaged viruses. For many of the paired egg and MDCK cultured viruses we found no differences in their NA sequences by Sanger sequencing. However, deep sequencing of MDCK grown isolates revealed low levels of variant populations with mutations at either D151 or T148 in the NA, suggesting mutations at either site may be able to confer this property. The NA active site of MDCK cultured human influenza A(H3N2) viruses isolated since 1994 can express dual enzyme and receptor binding functions. Binding correlated with either D151 or T148 mutations. The catalytic and receptor binding sites do not appear to be structurally identical since relative concentrations of the NAIs to inhibit enzyme activity and agglutination differ.

  7. Human renin 5'-flanking DNA to nucleotide-2750.

    PubMed

    Smith, D L; Jeyapalan, S; Lang, J A; Guo, X H; Sigmund, C D; Morris, B J

    1995-01-01

    Renin is one of the most important factors in blood pressure and electrolyte regulation in mammals, and the renin locus has been implicated in hypertension. To assist studies of promoter control, we therefore determined the 5'-flanking sequence of the human gene (REN) to residue -2750 relative to the transcription start site (+1). Sites of homology to consensus sequences for binding of trans-acting factors involved in transcriptional control of other genes were identified, and functionality for two of these (a CRE and a Pit-1 site) has so far been demonstrated.

  8. c-Myb Binds to a Sequence in the Proximal Region of the RAG-2 Promoter and Is Essential for Promoter Activity in T-Lineage Cells

    PubMed Central

    Wang, Qian-Fei; Lauring, Josh; Schlissel, Mark S.

    2000-01-01

    The RAG-2 gene encodes a component of the V(D)J recombinase which is essential for the assembly of antigen receptor genes in B and T lymphocytes. Previously, we reported that the transcription factor BSAP (PAX-5) regulates the murine RAG-2 promoter in B-cell lines. A partially overlapping but distinct region of the proximal RAG-2 promoter was also identified as an important element for promoter activity in T cells; however, the responsible factor was unknown. In this report, we present data demonstrating that c-Myb binds to a Myb consensus site within the proximal promoter and is critical for its activity in T-lineage cells. We show that c-Myb can transactivate a RAG-2 promoter-reporter construct in cotransfection assays and that this transactivation depends on the proximal promoter Myb consensus site. By using a chromatin immunoprecipitation (ChIP) strategy, fractionation of chromatin with anti-c-Myb antibody specifically enriched endogenous RAG-2 promoter DNA sequences. DNase I genomic footprinting revealed that the c-Myb site is occupied in a tissue-specific fashion in vivo. Furthermore, an integrated RAG-2 promoter construct with mutations at the c-Myb site was not enriched in the ChIP assay, while a wild-type integrated promoter construct was enriched. Finally, this lack of binding of c-Myb to a chromosomally integrated mutant RAG-2 promoter construct in vivo was associated with a striking decrease in promoter activity. We conclude that c-Myb regulates the RAG-2 promoter in T cells by binding to this consensus c-Myb binding site. PMID:11094072

  9. Evolutionary Origin and Conserved Structural Building Blocks of Riboswitches and Ribosomal RNAs: Riboswitches as Probable Target Sites for Aminoglycosides Interaction.

    PubMed

    Mehdizadeh Aghdam, Elnaz; Barzegar, Abolfazl; Hejazi, Mohammad Saeid

    2014-01-01

    Riboswitches, as noncoding RNA sequences, control gene expression through direct ligand binding. Sporadic reports on the structural relation of riboswitches with ribosomal RNAs (rRNA) raise interest in a possible similarity between the evolutionary origins of riboswitches and rRNAs. Since aminoglycoside antibiotics affect microbial cells through binding to functional sites of the bacterial rRNA, finding any conformational and functional relation between riboswitches and rRNAs is of utmost importance in both medicinal and basic research. Analysis of the riboswitch structures was carried out using bioinformatics and computational tools. The possible functional similarity of riboswitches with rRNAs was evaluated based on the affinity of the antibiotic paromomycin (targeting the "A site" of 16S rRNA) for riboswitches via a docking method. There was high structural similarity between riboswitches and rRNAs, but no particular sequence-based similarity between them was found. Building blocks including "hairpin loop containing UUU", "peptidyl transferase center conserved hairpin A loop", "helix 45" and "S2 (G8) hairpin" were detected as highly identical rRNA motifs in all kinds of riboswitches. Surprisingly, binding energies of paromomycin with different riboswitches are considerably better than the binding energy of paromomycin with the "16S rRNA A site". Therefore, the high affinity of paromomycin for riboswitches in comparison with the rRNA "A site" suggests a new insight about riboswitches as possible targets for aminoglycoside antibiotics. These findings are considered possible supporting evidence for the evolutionary origin of riboswitches and rRNAs, and for a role of riboswitches in the effects of antibiotics, which could inform the design of new drugs that act concomitantly on rRNA and riboswitches.

  10. Specific labeling of the thyroxine binding site in thyroxine-binding globulin: determination of the amino acid composition of a labeled peptide fragment isolated from a proteolytic digest of the derivatized protein.

    PubMed

    Tabachnick, M; Perret, V

    1987-08-01

    [125I] Thyroxine has been covalently bound to the thyroxine binding site in thyroxine-binding globulin by reaction with the bifunctional reagent, 1,5-difluoro-2,4-dinitrobenzene. An average of 0.47 mol of [125I] thyroxine was incorporated per mol protein; nonspecific binding amounted to 8%. A labeled peptide fragment was isolated from a proteolytic digest of the derivatized protein by HPLC and its amino acid composition was determined. Comparison with the amino acid sequence of thyroxine-binding globulin indicated partial correspondence of the labeled peptide with two possible regions in the protein. These regions also coincide with part of the barrel structure present in the closely homologous protein, alpha 1-antitrypsin.

  11. A Flexible Binding Site Architecture Provides New Insights into CcpA Global Regulation in Gram-Positive Bacteria.

    PubMed

    Yang, Yunpeng; Zhang, Lu; Huang, He; Yang, Chen; Yang, Sheng; Gu, Yang; Jiang, Weihong

    2017-01-24

    Catabolite control protein A (CcpA) is the master regulator in Gram-positive bacteria that mediates carbon catabolite repression (CCR) and carbon catabolite activation (CCA), two fundamental regulatory mechanisms that enable competitive advantages in carbon catabolism. It is generally regarded that CcpA exerts its regulatory role by binding to a typical 14- to 16-nucleotide (nt) consensus site that is called a catabolite response element (cre) within the target regions. However, here we report a previously unknown noncanonical flexible architecture of the CcpA-binding site in solventogenic clostridia, providing new mechanistic insights into catabolite regulation. This novel CcpA-binding site, named cre var , has a unique architecture that consists of two inverted repeats and an intervening spacer, all of which are variable in nucleotide composition and length, except for a 6-bp core palindromic sequence (TGTAAA/TTTACA). It was found that the length of the intervening spacer of cre var can affect CcpA binding affinity, and moreover, the core palindromic sequence of cre var is the key structure for regulation. Such a variable architecture of cre var shows potential importance for CcpA's diverse and fine regulation. A total of 103 potential cre var sites were discovered in solventogenic Clostridium acetobutylicum, of which 42 sites were picked out for electrophoretic mobility shift assays (EMSAs), and 30 sites were confirmed to be bound by CcpA. These 30 cre var sites are associated with 27 genes involved in many important pathways. Also of significance, the cre var sites are found to be widespread and function in a great number of taxonomically different Gram-positive bacteria, including pathogens, suggesting their global role in Gram-positive bacteria. In Gram-positive bacteria, the global regulator CcpA controls a large number of important physiological and metabolic processes. 
Although a typical consensus CcpA-binding site, cre, has been identified, it remains poorly explored for the diversity of CcpA-mediated catabolite regulation. Here, we discovered a novel flexible CcpA-binding site architecture (cre var ) that is highly variable in both length and base composition but follows certain principles, providing new insights into how CcpA can differentially recognize a variety of target genes to form a complicated regulatory network. A comprehensive search further revealed the wide distribution of cre var sites in Gram-positive bacteria, indicating it may have a universal function. This finding is the first to characterize such a highly flexible transcription factor-binding site architecture, which would be valuable for deeper understanding of CcpA-mediated global catabolite regulation in bacteria. Copyright © 2017 Yang et al.
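
The architecture described in this record (two core half-sites, TGTAAA and TTTACA, separated by a spacer of variable length) can be illustrated with a small scanner. This is only a minimal sketch under stated assumptions, not the authors' search procedure: the function names and the spacer bounds (`min_spacer`, `max_spacer`) are hypothetical, and the variable flanking repeats are ignored.

```python
import re

# Core half-sites quoted in the abstract; TTTACA is the reverse
# complement of TGTAAA, which is what makes the core palindromic.
CORE_LEFT = "TGTAAA"
CORE_RIGHT = "TTTACA"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_core_pairs(seq, min_spacer=0, max_spacer=20):
    """Return (start, spacer_length) for each TGTAAA ... TTTACA pair.

    The spacer bounds are hypothetical defaults chosen for illustration;
    the abstract only states that the spacer length is variable.
    """
    seq = seq.upper()
    hits = []
    # A lookahead lets overlapping occurrences of the left half-site match.
    for match in re.finditer(f"(?={CORE_LEFT})", seq):
        start = match.start()
        left_end = start + len(CORE_LEFT)
        for spacer in range(min_spacer, max_spacer + 1):
            right_start = left_end + spacer
            if seq[right_start:right_start + len(CORE_RIGHT)] == CORE_RIGHT:
                hits.append((start, spacer))
    return hits


if __name__ == "__main__":
    example = "GGTGTAAACCGGTTTACAGG"  # made-up sequence, 4-nt spacer
    print(find_core_pairs(example))
```

A scan like this only reports candidate positions; per the abstract, candidate sites were confirmed experimentally by EMSA.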

  12. In-Silico Analysis of Binding Site Features and Substrate Selectivity in Plant Flavonoid-3-O Glycosyltransferases (F3GT) through Molecular Modeling, Docking and Dynamics Simulation Studies

    PubMed Central

    Sharma, Ranu; Panigrahi, Priyabrata; Suresh, C.G.

    2014-01-01

    Flavonoids are a class of plant secondary metabolites that act as storage molecules and chemical messengers and participate in homeostasis and defense processes. They possess pharmaceutical properties important for cancer treatment such as antioxidant and anti-tumor activities. The drug-related properties of flavonoids can be improved by glycosylation. The enzymes glycosyltransferases (GTs) glycosylate acceptor molecules in a regiospecific manner with the help of nucleotide sugar donor molecules. Several plant GTs have been characterized and their amino acid sequences determined. However, three-dimensional structures of only a few are reported. Here, phylogenetic analysis using amino acid sequences has identified a group of GTs with the same regiospecific activity. The structures of these closely related GTs were modeled using homologous GT structures. Their substrate binding sites were elaborated by docking flavonoid acceptor and UDP-sugar donor molecules in the modeled structures. Eight regions near the acceptor binding site in the N- and C-terminal domains of GTs have been identified that bind and specifically glycosylate the 3-OH group of acceptor flavonoids. Similarly, a conserved motif in the C-terminal domain is known to bind a sugar donor substrate. In certain GTs, the substitution of a specific glutamine by histidine in this domain changes the preference of sugar from glucose to galactose as a result of a changed pattern of interactions. The molecular modeling, docking, and molecular dynamics simulation studies have revealed the chemical and topological features of the binding site and thus provided insights into the basis of acceptor and donor recognition by GTs. PMID:24667893

  13. Metal Ion Binding at the Catalytic Site Induces Widely Distributed Changes in a Sequence Specific Protein–DNA Complex

    PubMed Central

    2016-01-01

    Metal ion cofactors can alter the energetics and specificity of sequence specific protein–DNA interactions, but it is unknown if the underlying effects on structure and dynamics are local or dispersed throughout the protein–DNA complex. This work uses EcoRV endonuclease as a model, and catalytically inactive lanthanide ions, which replace the Mg2+ cofactor. Nuclear magnetic resonance (NMR) titrations indicate that four Lu3+ or two La3+ cations bind, and two new crystal structures confirm that Lu3+ binding is confined to the active sites. NMR spectra show that the metal-free EcoRV complex with cognate (GATATC) DNA is structurally distinct from the nonspecific complex, and that metal ion binding sites are not assembled in the nonspecific complex. NMR chemical shift perturbations were determined for 1H–15N amide resonances, for 1H–13C Ile-δ-CH3 resonances, and for stereospecifically assigned Leu-δ-CH3 and Val-γ-CH3 resonances. Many chemical shifts throughout the cognate complex are unperturbed, so metal binding does not induce major conformational changes. However, some large perturbations of amide and side chain methyl resonances occur as far as 34 Å from the metal ions. Concerted changes in specific residues imply that local effects of metal binding are propagated via a β-sheet and an α-helix. Both amide and methyl resonance perturbations indicate changes in the interface between subunits of the EcoRV homodimer. Bound metal ions also affect amide hydrogen exchange rates for distant residues, including a distant subdomain that contacts DNA phosphates and promotes DNA bending, showing that metal ions in the active sites, which relieve electrostatic repulsion between protein and DNA, cause changes in slow dynamics throughout the complex. PMID:27786446

  14. Context influences on TALE–DNA binding revealed by quantitative profiling

    PubMed Central

    Rogers, Julia M.; Barrera, Luis A.; Reyon, Deepak; Sander, Jeffry D.; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L.

    2015-01-01

    Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE–DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000–20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE–DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design. PMID:26067805

  15. Context influences on TALE-DNA binding revealed by quantitative profiling.

    PubMed

    Rogers, Julia M; Barrera, Luis A; Reyon, Deepak; Sander, Jeffry D; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L

    2015-06-11

    Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE-DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000-20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE-DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design.

  16. The Human Splicing Factor ASF/SF2 can Specifically Recognize Pre-mRNA 5' Splice Sites

    NASA Astrophysics Data System (ADS)

    Zuo, Ping; Manley, James L.

    1994-04-01

    ASF/SF2 is a human protein previously shown to function in in vitro pre-mRNA splicing as an essential factor necessary for all splices and also as an alternative splicing factor, capable of switching selection of 5' splice sites. To begin to study the protein's mechanism of action, we have investigated the RNA binding properties of purified recombinant ASF/SF2. Using UV crosslinking and gel shift assays, we demonstrate that the RNA binding region of ASF/SF2 can interact with RNA in a sequence-specific manner, recognizing the 5' splice site in each of two different pre-mRNAs. Point mutations in the 5' splice site consensus can reduce binding by as much as a factor of 100, with the largest effects observed in competition assays. These findings support a model in which ASF/SF2 aids in the recognition of pre-mRNA 5' splice sites.

  17. Structural studies of bovine, equine, and leporine serum albumin complexes with naproxen.

    PubMed

    Bujacz, Anna; Zielinski, Kamil; Sekula, Bartosz

    2014-09-01

    Serum albumin, a protein naturally abundant in blood plasma, shows remarkable ligand binding properties of numerous endogenous and exogenous compounds. Most of serum albumin binding sites are able to interact with more than one class of ligands. Determining the protein-ligand interactions among mammalian serum albumins is essential for understanding the complexity of this transporter. We present three crystal structures of serum albumins in complexes with naproxen (NPS): bovine (BSA-NPS), equine (ESA-NPS), and leporine (LSA-NPS) determined to 2.58 Å (C2), 2.42 Å (P61), and 2.73 Å (P2₁2₁2₁) resolutions, respectively. A comparison of the structurally investigated complexes with the analogous complex of human serum albumin (HSA-NPS) revealed surprising differences in the number and distribution of naproxen binding sites. Bovine and leporine serum albumins possess three NPS binding sites, but ESA has only two. All three complexes of albumins studied here have two common naproxen locations, but BSA and LSA differ in the third NPS binding site. None of these binding sites coincides with the naproxen location in the HSA-NPS complex, which was obtained in the presence of other ligands besides naproxen. Even small differences in sequences of serum albumins from various species, especially in the area of the binding pockets, influence the affinity and the binding mode of naproxen to this transport protein. © 2014 Wiley Periodicals, Inc.

  18. Mapping specificity landscapes of RNA-protein interactions by high throughput sequencing.

    PubMed

    Jankowsky, Eckhard; Harris, Michael E

    2017-04-15

    To function in a biological setting, RNA binding proteins (RBPs) have to discriminate between alternative binding sites in RNAs. This discrimination can occur in the ground state of an RNA-protein binding reaction, in its transition state, or in both. The extent by which RBPs discriminate at these reaction states defines RBP specificity landscapes. Here, we describe the HiTS-Kin and HiTS-EQ techniques, which combine kinetic and equilibrium binding experiments with high throughput sequencing to quantitatively assess substrate discrimination for large numbers of substrate variants at ground and transition states of RNA-protein binding reactions. We discuss experimental design, practical considerations and data analysis and outline how a combination of HiTS-Kin and HiTS-EQ allows the mapping of RBP specificity landscapes. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. PepComposer: computational design of peptides binding to a given protein surface

    PubMed Central

    Obarska-Kosinska, Agnieszka; Iacoangeli, Alfredo; Lepore, Rosalba; Tramontano, Anna

    2016-01-01

    There is a wide interest in designing peptides able to bind to a specific region of a protein with the aim of interfering with a known interaction or as starting point for the design of inhibitors. Here we describe PepComposer, a new pipeline for the computational design of peptides binding to a given protein surface. PepComposer only requires the target protein structure and an approximate definition of the binding site as input. We first retrieve a set of peptide backbone scaffolds from monomeric proteins that harbor the same backbone arrangement as the binding site of the protein of interest. Next, we design optimal sequences for the identified peptide scaffolds. The method is fully automatic and available as a web server at http://biocomputing.it/pepcomposer/webserver. PMID:27131789

  20. Unusual sugar specificity of banana lectin from Musa paradisiaca and its probable evolutionary origin. Crystallographic and modelling studies.

    PubMed

    Singh, D D; Saikrishnan, K; Kumar, Prashant; Surolia, A; Sekar, K; Vijayan, M

    2005-10-01

    The crystal structure of a complex of methyl-alpha-D-mannoside with banana lectin from Musa paradisiaca reveals two primary binding sites in the lectin, unlike in other lectins with beta-prism I fold which essentially consists of three Greek key motifs. It has been suggested that the fold evolved through successive gene duplication and fusion of an ancestral Greek key motif. In other lectins, all from dicots, the primary binding site exists on one of the three motifs in the three-fold symmetric molecule. Banana is a monocot, and the three motifs have not diverged enough to obliterate sequence similarity among them. Two Greek key motifs in it carry one primary binding site each. A common secondary binding site exists on the third Greek key. Modelling shows that both the primary sites can support 1-2, 1-3, and 1-6 linked mannosides with the second residue interacting in each case primarily with the secondary binding site. Modelling also readily leads to a bound branched mannopentose with the nonreducing ends of the two branches anchored at the two primary binding sites, providing a structural explanation for the lectin's specificity for branched alpha-mannans. A comparison of the dimeric banana lectin with other beta-prism I fold lectins, provides interesting insights into the variability in their quaternary structure.

  1. A Dual-Specific Targeting Approach Based on the Simultaneous Recognition of Duplex and Quadruplex Motifs.

    PubMed

    Nguyen, Thi Quynh Ngoc; Lim, Kah Wai; Phan, Anh Tuân

    2017-09-20

    Small-molecule ligands targeting nucleic acids have been explored as potential therapeutic agents. Duplex groove-binding ligands have been shown to recognize DNA in a sequence-specific manner. On the other hand, quadruplex-binding ligands exhibit high selectivity between quadruplex and duplex, but show limited discrimination between different quadruplex structures. Here we propose a dual-specific approach through the simultaneous application of duplex- and quadruplex-binders. We demonstrated that a quadruplex-specific ligand and a duplex-specific ligand can simultaneously interact at two separate binding sites of a quadruplex-duplex hybrid harbouring both quadruplex and duplex structural elements. Such a dual-specific targeting strategy would combine the sequence specificity of duplex-binders and the strong binding affinity of quadruplex-binders, potentially allowing the specific targeting of unique quadruplex structures. Future research can be directed towards the development of conjugated compounds targeting specific genomic quadruplex-duplex sites, for which the linker would be highly context-dependent in terms of length and flexibility, as well as the attachment points onto both ligands.

  2. The role of differing probe and target strand lengths in DNA microarrays investigated via Monte Carlo molecular simulation

    NASA Astrophysics Data System (ADS)

    Rivard, Brea R.; Cooper, Sarah J.; Stubbs, John M.

    2018-02-01

    DNA duplexes consisting of a 25mer together with shorter complementary sequences were studied over a range of temperature and surface binding motifs using a coarse-grained two-site nucleotide model. Results were analyzed in terms of hydrogen bonding interactions and structural characteristics and indicate that hybridization is most stable when furthest from the surface binding site. Strand elongation and straightening near the bound end are found to be correlated to duplex destabilization.

  3. Mass spectrometry for identification of proteins that specifically bind to a distal enhancer of the Oct4 gene

    NASA Astrophysics Data System (ADS)

    Bakhmet, E. I.; Nazarov, I. B.; Artamonova, T. O.; Khodorkovsky, M. A.; Tomilin, A. N.

    2017-11-01

    Transcription factor Oct4 is a marker of pluripotent stem cells and has a significant role in their self-renewal. The Oct4 gene is controlled by three cis-regulatory elements: the proximal promoter, the proximal enhancer and the distal enhancer. All of these elements are targets for binding of regulatory proteins. The distal enhancer is our research focus because of its activity in early stages of embryonic development. The distal enhancer contains two main sequences, called site 2A and site 2B. At present, the proteins that bind to site 2A (CCCCTCCCCCC) remain unknown. Using a combination of the in vitro electrophoretic mobility shift assay (EMSA) and mass spectrometry, we identified several candidates that can regulate Oct4 gene expression through site 2A.

  4. PreCisIon: PREdiction of CIS-regulatory elements improved by gene's positION.

    PubMed

    Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François

    2013-02-01

    Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases.
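
The evaluation measures quoted in this record (area under the ROC curve, sensitivity, specificity) can be computed as in the sketch below. It is an illustration of the standard definitions only, not the PreCisIon code; the function names and the toy labels and scores are made up for the example.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions.

    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive scores higher
    than a randomly chosen negative, with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = true target gene, 0 = non-target.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.6, 0.1]
    predictions = [1 if s >= 0.5 else 0 for s in scores]
    print(sensitivity_specificity(labels, predictions))
    print(roc_auc(labels, scores))
```

Sensitivity and specificity depend on the chosen score threshold (0.5 here, an arbitrary choice), whereas the AUC summarizes performance over all thresholds.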

  5. Genome-Wide Screens for In Vivo Tinman Binding Sites Identify Cardiac Enhancers with Diverse Functional Architectures

    PubMed Central

    Jin, Hong; Stojnic, Robert; Adryan, Boris; Ozdemir, Anil; Stathopoulos, Angelike; Frasch, Manfred

    2013-01-01

    The NK homeodomain factor Tinman is a crucial regulator of early mesoderm patterning and, together with the GATA factor Pannier and the Dorsocross T-box factors, serves as one of the key cardiogenic factors during specification and differentiation of heart cells. Although the basic framework of regulatory interactions driving heart development has been worked out, only about a dozen genes involved in heart development have been designated as direct Tinman target genes to date, and detailed information about the functional architectures of their cardiac enhancers is lacking. We have used immunoprecipitation of chromatin (ChIP) from embryos at two different stages of early cardiogenesis to obtain a global overview of the sequences bound by Tinman in vivo and their linked genes. Our data from the analysis of ∼50 sequences with high Tinman occupancy show that the majority of such sequences act as enhancers in various mesodermal tissues in which Tinman is active. All of the dorsal mesodermal and cardiac enhancers, but not some of the others, require tinman function. The cardiac enhancers feature diverse arrangements of binding motifs for Tinman, Pannier, and Dorsocross. By employing these cardiac and non-cardiac enhancers in machine learning approaches, we identify a novel motif, termed CEE, as a classifier for cardiac enhancers. In vivo assays for the requirement of the binding motifs of Tinman, Pannier, and Dorsocross, as well as the CEE motifs in a set of cardiac enhancers, show that the Tinman sites are essential in all but one of the tested enhancers; although on occasion they can be functionally redundant with Dorsocross sites. The enhancers differ widely with respect to their requirement for Pannier, Dorsocross, and CEE sites, which we ascribe to their different position in the regulatory circuitry, their distinct temporal and spatial activities during cardiogenesis, and functional redundancies among different factor binding sites. PMID:23326246

  6. Structural and functional analysis of mouse Msx1 gene promoter: sequence conservation with human MSX1 promoter points at potential regulatory elements.

    PubMed

    Gonzalez, S M; Ferland, L H; Robert, B; Abdelhay, E

    1998-06-01

    Vertebrate Msx genes are related to one of the most divergent homeobox genes of Drosophila, the muscle segment homeobox (msh) gene, and are expressed in a well-defined pattern at sites of tissue interactions. This pattern of expression is conserved in vertebrates as diverse as quail, zebrafish, and mouse in a range of sites including neural crest, appendages, and craniofacial structures. In the present work, we performed structural and functional analyses in order to identify potential cis-acting elements that may be regulating Msx1 gene expression. To this end, a 4.9-kb segment of the 5'-flanking region was sequenced and analyzed for transcription-factor binding sites. Four regions showing a high concentration of these sites were identified. Transfection assays with fragments of regulatory sequences driving the expression of the bacterial lacZ reporter gene showed that a region of 4 kb upstream of the transcription start site contains positive and negative elements responsible for controlling gene expression. Interestingly, a fragment of 130 bp seems to contain the minimal elements necessary for gene expression, as its removal completely abolishes gene expression in cultured cells. These results are reinforced by comparison of this region with the human Msx1 gene promoter, which shows extensive conservation, including many consensus binding sites, suggesting a regulatory role for them.

  7. Cloning of Novel Isoforms of the Human Gli2 Oncogene and Their Activities To Enhance Tax-Dependent Transcription of the Human T-Cell Leukemia Virus Type 1 Genome

    PubMed Central

    Tanimura, Akira; Dan, Shingo; Yoshida, Mitsuaki

    1998-01-01

    The expression of human T-cell leukemia virus type 1 (HTLV-1) is activated by interaction of a viral transactivator protein, Tax, and cellular transcription factor, CREB (cyclic AMP response element binding protein), which bind to a 21-bp enhancer in the long terminal repeats (LTR). THP (Tax-helping protein) was previously determined to enhance the transactivation by Tax protein. Here we report novel forms of the human homolog of a member of the Gli oncogene family, Gli2 (also termed Gli2/THP), an extended form of a zinc finger protein, THP, which was described previously. Four possible isoforms (hGli2 α, β, γ, and δ) are formed by combinations of two independent alternative splicings, and all the isoforms could bind to a DNA motif, TRE2S, in the LTR. The longer isoforms, α and β, were abundantly expressed in various cell lines including HTLV-1-infected T-cell lines. Fusion proteins of the hGli2 isoforms with the DNA-binding domain of Gal4 activated transcription when the reporter contained a Gal4-binding site and one copy of the 21-bp sequence, to which CREB binds. This activation was observed only in the presence of Tax. The 21-bp sequence in the reporter was also essential for the activation. These results suggest that simultaneous binding of hGli2 and CREB to the respective sites in the reporter seems to be critical for Tax protein to activate transcription. Consequently, it is probable that the LTR can be regulated by two independent signals through hGli2 and CREB, since the LTR contains the 21-bp and TRE2S sequences in the vicinity. PMID:9557682

  8. Structure of an N276-Dependent HIV-1 Neutralizing Antibody Targeting a Rare V5 Glycan Hole Adjacent to the CD4 Binding Site.

    PubMed

    Wibmer, Constantinos Kurt; Gorman, Jason; Anthony, Colin S; Mkhize, Nonhlanhla N; Druz, Aliaksandr; York, Talita; Schmidt, Stephen D; Labuschagne, Phillip; Louder, Mark K; Bailer, Robert T; Abdool Karim, Salim S; Mascola, John R; Williamson, Carolyn; Moore, Penny L; Kwong, Peter D; Morris, Lynn

    2016-11-15

    All HIV-1-infected individuals develop strain-specific neutralizing antibodies to their infecting virus, which in some cases mature into broadly neutralizing antibodies. Defining the epitopes of strain-specific antibodies that overlap conserved sites of vulnerability might provide mechanistic insights into how broadly neutralizing antibodies arise. We previously described an HIV-1 clade C-infected donor, CAP257, who developed broadly neutralizing plasma antibodies targeting an N276 glycan-dependent epitope in the CD4 binding site. The initial CD4 binding site response potently neutralized the heterologous tier 2 clade B viral strain RHPA, which was used to design resurfaced gp120 antigens for single-B-cell sorting. Here we report the isolation and structural characterization of CAP257-RH1, an N276 glycan-dependent CD4 binding site antibody representative of the early CD4 binding site plasma response in donor CAP257. The cocrystal structure of CAP257-RH1 bound to RHPA gp120 revealed critical interactions with the N276 glycan, loop D, and V5, but not with aspartic acid 368, similarly to HJ16 and 179NC75. The CAP257-RH1 monoclonal antibody was derived from the immunoglobulin-variable IGHV3-33 and IGLV3-10 genes and neutralized RHPA but not the transmitted/founder virus from donor CAP257. Its narrow neutralization breadth was attributed to a binding angle that was incompatible with glycosylated V5 loops present in almost all HIV-1 strains, including the CAP257 transmitted/founder virus. Deep sequencing of autologous CAP257 viruses, however, revealed minority variants early in infection that lacked V5 glycans. These glycan-free V5 loops are unusual holes in the glycan shield that may have been necessary for initiating this N276 glycan-dependent CD4 binding site B-cell lineage. The conserved CD4 binding site on gp120 is a major target for HIV-1 vaccine design, but key events in the elicitation and maturation of different antibody lineages to this site remain elusive. 
Studies have shown that strain-specific antibodies can evolve into broadly neutralizing antibodies or in some cases act as helper lineages. Therefore, characterizing the epitopes of strain-specific antibodies may help to inform the design of HIV-1 immunogens to elicit broadly neutralizing antibodies. In this study, we isolate a narrowly neutralizing N276 glycan-dependent antibody and use X-ray crystallography and viral deep sequencing to describe how gp120 lacking glycans in V5 might have elicited these early glycan-dependent CD4 binding site antibodies. These data highlight how glycan holes can play a role in the elicitation of B-cell lineages targeting the CD4 binding site. Copyright © 2016 Wibmer et al.

  9. Structure of an N276-Dependent HIV-1 Neutralizing Antibody Targeting a Rare V5 Glycan Hole Adjacent to the CD4 Binding Site

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wibmer, Constantinos Kurt; Gorman, Jason; Anthony, Colin S.

    ABSTRACT All HIV-1-infected individuals develop strain-specific neutralizing antibodies to their infecting virus, which in some cases mature into broadly neutralizing antibodies. Defining the epitopes of strain-specific antibodies that overlap conserved sites of vulnerability might provide mechanistic insights into how broadly neutralizing antibodies arise. We previously described an HIV-1 clade C-infected donor, CAP257, who developed broadly neutralizing plasma antibodies targeting an N276 glycan-dependent epitope in the CD4 binding site. The initial CD4 binding site response potently neutralized the heterologous tier 2 clade B viral strain RHPA, which was used to design resurfaced gp120 antigens for single-B-cell sorting. Here we report the isolation and structural characterization of CAP257-RH1, an N276 glycan-dependent CD4 binding site antibody representative of the early CD4 binding site plasma response in donor CAP257. The cocrystal structure of CAP257-RH1 bound to RHPA gp120 revealed critical interactions with the N276 glycan, loop D, and V5, but not with aspartic acid 368, similarly to HJ16 and 179NC75. The CAP257-RH1 monoclonal antibody was derived from the immunoglobulin-variable IGHV3-33 and IGLV3-10 genes and neutralized RHPA but not the transmitted/founder virus from donor CAP257. Its narrow neutralization breadth was attributed to a binding angle that was incompatible with glycosylated V5 loops present in almost all HIV-1 strains, including the CAP257 transmitted/founder virus. Deep sequencing of autologous CAP257 viruses, however, revealed minority variants early in infection that lacked V5 glycans. These glycan-free V5 loops are unusual holes in the glycan shield that may have been necessary for initiating this N276 glycan-dependent CD4 binding site B-cell lineage.
IMPORTANCEThe conserved CD4 binding site on gp120 is a major target for HIV-1 vaccine design, but key events in the elicitation and maturation of different antibody lineages to this site remain elusive. Studies have shown that strain-specific antibodies can evolve into broadly neutralizing antibodies or in some cases act as helper lineages. Therefore, characterizing the epitopes of strain-specific antibodies may help to inform the design of HIV-1 immunogens to elicit broadly neutralizing antibodies. In this study, we isolate a narrowly neutralizing N276 glycan-dependent antibody and use X-ray crystallography and viral deep sequencing to describe how gp120 lacking glycans in V5 might have elicited these early glycan-dependent CD4 binding site antibodies. These data highlight how glycan holes can play a role in the elicitation of B-cell lineages targeting the CD4 binding site.« less

  10. Structure of an N276-Dependent HIV-1 Neutralizing Antibody Targeting a Rare V5 Glycan Hole Adjacent to the CD4 Binding Site

    PubMed Central

    Wibmer, Constantinos Kurt; Gorman, Jason; Anthony, Colin S.; Mkhize, Nonhlanhla N.; Druz, Aliaksandr; York, Talita; Schmidt, Stephen D.; Labuschagne, Phillip; Louder, Mark K.; Bailer, Robert T.; Abdool Karim, Salim S.; Mascola, John R.; Williamson, Carolyn; Moore, Penny L.

    2016-01-01

    ABSTRACT All HIV-1-infected individuals develop strain-specific neutralizing antibodies to their infecting virus, which in some cases mature into broadly neutralizing antibodies. Defining the epitopes of strain-specific antibodies that overlap conserved sites of vulnerability might provide mechanistic insights into how broadly neutralizing antibodies arise. We previously described an HIV-1 clade C-infected donor, CAP257, who developed broadly neutralizing plasma antibodies targeting an N276 glycan-dependent epitope in the CD4 binding site. The initial CD4 binding site response potently neutralized the heterologous tier 2 clade B viral strain RHPA, which was used to design resurfaced gp120 antigens for single-B-cell sorting. Here we report the isolation and structural characterization of CAP257-RH1, an N276 glycan-dependent CD4 binding site antibody representative of the early CD4 binding site plasma response in donor CAP257. The cocrystal structure of CAP257-RH1 bound to RHPA gp120 revealed critical interactions with the N276 glycan, loop D, and V5, but not with aspartic acid 368, similarly to HJ16 and 179NC75. The CAP257-RH1 monoclonal antibody was derived from the immunoglobulin-variable IGHV3-33 and IGLV3-10 genes and neutralized RHPA but not the transmitted/founder virus from donor CAP257. Its narrow neutralization breadth was attributed to a binding angle that was incompatible with glycosylated V5 loops present in almost all HIV-1 strains, including the CAP257 transmitted/founder virus. Deep sequencing of autologous CAP257 viruses, however, revealed minority variants early in infection that lacked V5 glycans. These glycan-free V5 loops are unusual holes in the glycan shield that may have been necessary for initiating this N276 glycan-dependent CD4 binding site B-cell lineage. 
IMPORTANCE The conserved CD4 binding site on gp120 is a major target for HIV-1 vaccine design, but key events in the elicitation and maturation of different antibody lineages to this site remain elusive. Studies have shown that strain-specific antibodies can evolve into broadly neutralizing antibodies or in some cases act as helper lineages. Therefore, characterizing the epitopes of strain-specific antibodies may help to inform the design of HIV-1 immunogens to elicit broadly neutralizing antibodies. In this study, we isolate a narrowly neutralizing N276 glycan-dependent antibody and use X-ray crystallography and viral deep sequencing to describe how gp120 lacking glycans in V5 might have elicited these early glycan-dependent CD4 binding site antibodies. These data highlight how glycan holes can play a role in the elicitation of B-cell lineages targeting the CD4 binding site. PMID:27581986

  11. Molecular identification and transcriptional regulation of porcine IFIT2 gene.

    PubMed

    Yang, Xiuqin; Jing, Xiaoyan; Song, Yanfang; Zhang, Caixia; Liu, Di

    2018-04-06

    IFN-induced protein with tetratricopeptide repeats 2 (IFIT2) plays important roles in host defense against viral infection, as revealed by studies in humans and mice. However, little is known about porcine IFIT2 (pIFIT2). Here, we performed molecular cloning, expression profiling, and transcriptional regulation analysis of pIFIT2. The pIFIT2 gene, located on chromosome 14, is composed of two exons and has a complete coding sequence of 1407 bp. The encoded polypeptide, 468 aa in length, has three tetratricopeptide repeat motifs. The pIFIT2 gene was expressed unevenly across all eleven tissues studied, with the highest abundance in spleen. Poly(I:C) treatment strongly upregulated the mRNA level and promoter activity of the pIFIT2 gene. The upstream sequence of 1759 bp from the start codon, which was assigned +1 here, has promoter activity, and deltaEF1 acts as a transcription repressor through binding to sequences at positions -1774 to -1764. The minimal promoter region lies between nucleotide positions -162 and -126. Two adjacent interferon-stimulated response elements (ISREs) and two nuclear factor (NF)-κB binding sites were identified between positions -310 and -126. The ISRE elements act alone and in synergy, with the one closer to the start codon having more strength; so do the NF-κB binding sites. A synergistic effect was also found between the ISRE and NF-κB binding sites. Additionally, a third ISRE element was identified within positions -1661 to -1579. These findings will contribute to clarifying the antiviral effect and underlying mechanisms of pIFIT2.
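
    The relationship between the reported coding sequence length (1407 bp) and polypeptide length (468 aa) can be checked with simple arithmetic. The sketch below assumes that the 1407 bp figure includes the stop codon, which the abstract does not state explicitly; the function name is hypothetical.

```python
def protein_length_from_cds(cds_bp, includes_stop=True):
    """Return the number of amino acids encoded by a coding sequence.

    Assumption: the CDS length is a whole number of codons and, when
    includes_stop is True, the final codon is a stop codon that does
    not contribute an amino acid.
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


# 1407 bp / 3 = 469 codons; minus one stop codon gives 468 residues.
print(protein_length_from_cds(1407))
```

    Under that assumption the two numbers in the abstract agree with each other.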

  12. Regulation of expression of the ada gene controlling the adaptive response. Interactions with the ada promoter of the Ada protein and RNA polymerase.

    PubMed

    Sakumi, K; Sekiguchi, M

    1989-01-20

    The Ada protein of Escherichia coli catalyzes transfer of methyl groups from methylated DNA to its own molecule, and the methylated form of Ada protein promotes transcription of its own gene, ada. Using an in vitro reconstituted system, we found that both the sigma factor and the methylated Ada protein are required for transcription of the ada gene. To elucidate molecular mechanisms involved in the regulation of the ada transcription, we investigated interactions of the non-methylated and methylated forms of Ada protein and the RNA polymerase holo enzyme (the core enzyme and sigma factor) with a DNA fragment carrying the ada promoter region. Footprinting analyses revealed that the methylated Ada protein binds to a region from positions -63 to -31, which includes the ada regulatory sequence AAAGCGCA. No firm binding was observed with the non-methylated Ada protein, although some DNase I-hypersensitive sites were produced in the promoter by both types of Ada protein. RNA polymerase did bind to the promoter once the methylated Ada protein had bound to the upstream sequence. To correlate these phenomena with the process in vivo, we used the DNAs derived from promoter-defective mutants. No binding of Ada protein nor of RNA polymerase occurred with a mutant DNA having a C to G substitution at position -47 within the ada regulatory sequence. In the case of a -35 box mutant with a T to A change at position -34, the methylated Ada protein did bind to the ada regulatory sequence, yet there was no RNA polymerase binding. Thus, the binding of the methylated Ada protein to the upstream region apparently facilitates binding of the RNA polymerase to the proper region of the promoter. The Ada protein possesses two known methyl acceptor sites, Cys69 and Cys321. The role of methylation of each cysteine residue was investigated using mutant forms of the Ada protein. 
The Ada protein with the cysteine residue at position 69 replaced by alanine was incapable of binding to the ada promoter even when the cysteine residue at position 321 of the protein was methylated. When the Ada protein with alanine at position 321 was methylated, it acquired the potential to bind to the ada promoter. These results are compatible with the notion that methylation of the cysteine residue at position 69 causes a conformational change of the Ada protein, thereby facilitating binding of the protein to the upstream regulatory sequence.

  13. Prediction of TF target sites based on atomistic models of protein-DNA complexes

    PubMed Central

    Angarica, Vladimir Espinosa; Pérez, Abel González; Vasconcelos, Ana T; Collado-Vides, Julio; Contreras-Moreira, Bruno

    2008-01-01

    Background The specific recognition of genomic cis-regulatory elements by transcription factors (TFs) plays an essential role in the regulation of coordinated gene expression. Studying the mechanisms determining binding specificity in protein-DNA interactions is thus an important goal. Most current approaches for modeling TF specific recognition rely on the knowledge of large sets of cognate target sites and consider only the information contained in their primary sequence. Results Here we describe a structure-based methodology for predicting sequence motifs starting from the coordinates of a TF-DNA complex. Our algorithm combines information regarding the direct and indirect readout of DNA into an atomistic statistical model, which is used to estimate the interaction potential. We first measure the ability of our method to correctly estimate the binding specificities of eight prokaryotic and eukaryotic TFs that belong to different structural superfamilies. Secondly, the method is applied to two homology models, finding that sampling of interface side-chain rotamers remarkably improves the results. Thirdly, the algorithm is compared with a reference structural method based on contact counts, obtaining comparable predictions for the experimental complexes and more accurate sequence motifs for the homology models. Conclusion Our results demonstrate that atomic-detail structural information can be feasibly used to predict TF binding sites. The computational method presented here is universal and might be applied to other systems involving protein-DNA recognition. PMID:18922190

  14. Null alleles and sequence variations at primer binding sites of STR loci within multiplex typing systems.

    PubMed

    Yao, Yining; Yang, Qinrui; Shao, Chengchen; Liu, Baonian; Zhou, Yuxiang; Xu, Hongmei; Zhou, Yueqin; Tang, Qiqun; Xie, Jianhui

    2018-01-01

    Rare variants are widely observed in the human genome, and sequence variations at primer binding sites might impair PCR amplification, resulting in allele dropouts known as null alleles. In this study, 5 cases from routine paternity testing using the PowerPlex® 21 System for STR genotyping were considered to harbor null alleles at TH01, FGA, D5S818, D8S1179, and D16S539, respectively. The allele dropout was confirmed using the alternative commercial kits AGCU Expressmarker 22 PCR amplification kit and AmpFℓSTR® Identifiler® Plus Kit, and sequencing results revealed a single base variation at the primer binding site of each STR locus. Results collected from previous reports show that null alleles at D5S818 were frequently observed in populations typed with two PowerPlex® typing systems and that null alleles at D19S433 were mostly observed in the Japanese population typed with two AmpFℓSTR™ typing systems. Furthermore, the most common mutation type appeared to be the transition from C to T, together with G to A, which might have a potential relationship with DNA methylation. Altogether, these results can provide helpful information in forensic practice for the elimination of genotyping discrepancies and the development of primer sets. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Structural and functional conservation of CLEC-2 with the species-specific regulation of transcript expression in evolution.

    PubMed

    Wang, Lan; Ren, Shifang; Zhu, Haiyan; Zhang, Dongmei; Hao, Yuqing; Ruan, Yuanyuan; Zhou, Lei; Lee, Chiayu; Qiu, Lin; Yun, Xiaojing; Xie, Jianhui

    2012-08-01

    CLEC-2 was first identified by sequence similarity to C-type lectin-like molecules with immune functions and has been reported as a receptor for the platelet-aggregating snake venom toxin rhodocytin and the endogenous sialoglycoprotein podoplanin. Recent research indicates that CLEC-2 deficiency in mice is lethal at the embryonic stage and is associated with disorganized, blood-filled lymphatic vessels and severe edema. In view of the necessary role of CLEC-2 in individual development, it is of interest to investigate its phylogenetic homology and highly conserved functional regions. In this work, we report that CLEC-2 from different species is extraordinarily conserved, as shown by sequence alignment and phylogenetic tree analysis. The functional structures, including N-linked oligosaccharide sites and the ligand-binding domain, are structurally and functionally conserved in a variety of species. The glycosylation sites (N120 and N134) are necessary for the surface expression of CLEC-2. CLEC-2 from different species binds mouse podoplanin. Nevertheless, the expression of CLEC-2 is regulated in a species-specific manner. The alternative splicing of pre-mRNA, a regulatory mechanism of gene expression, and the promoter binding sites for several key transcription factors vary between species. Therefore, CLEC-2 shares high sequence homology and functional identity across species; however, its transcript expression might be tightly regulated by different mechanisms in evolution.

  16. RIPiT-Seq: A high-throughput approach for footprinting RNA:protein complexes

    PubMed Central

    Singh, Guramrit; Ricci, Emiliano P.; Moore, Melissa J.

    2013-01-01

    Development of high-throughput approaches to map the RNA interaction sites of individual RNA binding proteins (RBPs) transcriptome-wide is rapidly transforming our understanding of post-transcriptional gene regulatory mechanisms. Here we describe a ribonucleoprotein (RNP) footprinting approach we recently developed for identifying occupancy sites of both individual RBPs and multi-subunit RNP complexes. RNA:protein immunoprecipitation in tandem (RIPiT) yields highly specific RNA footprints of cellular RNPs isolated via two sequential purifications; the resulting RNA footprints can then be identified by high-throughput sequencing (Seq). RIPiT-Seq is broadly applicable to all RBPs regardless of their RNA binding mode and thus provides a means to map the RNA binding sites of RBPs with poor inherent ultraviolet (UV) crosslinkability. Further, among current high-throughput approaches, RIPiT has the unique capacity to differentiate binding sites of RNPs with overlapping protein composition. It is therefore particularly suited for studying dynamic RNP assemblages whose composition evolves as gene expression proceeds. PMID:24096052

  17. Programmable Oligomers Targeting 5′-GGGG-3′ in the Minor Groove of DNA and NF-κB Binding Inhibition

    PubMed Central

    Chenoweth, David M.; Poposki, Julie A.; Marques, Michael A.; Dervan, Peter B.

    2009-01-01

    A series of hairpin oligomers containing benzimidazole (Bi) and imidazopyridine (Ip) rings were synthesized and screened to target 5′-WGGGGW-3′, a core sequence in the DNA binding site of NF-κB, a prolific transcription factor important in biology and disease. Five Bi and Ip containing oligomers bound to the 5′-WGGGGW-3′ site with high affinity. One of the oligomers (Im-Im-Im-Im-γ-PyBi-PyBi-β-Dp) was able to inhibit DNA binding by the transcription factor NF-κB. PMID:17095230
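
    As an illustration of what the degenerate site 5′-WGGGGW-3′ means in practice, the following minimal sketch scans a DNA string for it on both strands. It relies only on the IUPAC convention that W stands for A or T; the function names are hypothetical and the code is not taken from the study.

```python
import re

# IUPAC code W stands for A or T. The lookahead lets overlapping
# occurrences of the 6-bp site be reported as separate hits.
SITE = re.compile(r"(?=([AT]GGGG[AT]))")
SITE_LENGTH = 6
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_wggggw(seq):
    """Return (strand, start, site) tuples for every 5'-WGGGGW-3' hit.

    start is the 0-based position of the leftmost base of the hit
    on the forward strand, for both "+" and "-" hits.
    """
    seq = seq.upper()
    hits = [("+", m.start(), m.group(1)) for m in SITE.finditer(seq)]
    for m in SITE.finditer(reverse_complement(seq)):
        forward_start = len(seq) - m.start() - SITE_LENGTH
        hits.append(("-", forward_start, m.group(1)))
    return hits


print(find_wggggw("CCAGGGGTCC"))
```

    A site found on the minus strand is reported with its forward-strand coordinate, so hits from both strands can be compared directly.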

  18. Mapping the transcription start points of the Staphylococcus aureus eap, emp, and vwb promoters reveals a conserved octanucleotide sequence that is essential for expression of these genes.

    PubMed

    Harraghy, Niamh; Homerova, Dagmar; Herrmann, Mathias; Kormanec, Jan

    2008-01-01

    Mapping the transcription start points of the eap, emp, and vwb promoters revealed a conserved octanucleotide sequence (COS). Deleting this sequence abolished the expression of eap, emp, and vwb. However, electrophoretic mobility shift assays gave no evidence that this sequence was a binding site for SarA or SaeR, known regulators of eap and emp.

  19. Cloning of cDNA sequences encoding cowpea (Vigna unguiculata) vicilins: Computational simulations suggest a binding mode of cowpea vicilins to chitin oligomers.

    PubMed

    Rocha, Antônio J; Sousa, Bruno L; Girão, Matheus S; Barroso-Neto, Ito L; Monteiro-Júnior, José E; Oliveira, José T A; Nagano, Celso S; Carneiro, Rômulo F; Monteiro-Moreira, Ana C O; Rocha, Bruno A M; Freire, Valder N; Grangeiro, Thalles B

    2018-05-27

    Vicilins are 7S globulins which constitute the major seed storage proteins in leguminous species. Variant vicilins showing differential binding affinities for chitin have been implicated in the resistance and susceptibility of cowpea to the bruchid Callosobruchus maculatus. These proteins are members of the cupin superfamily, which includes a wide variety of enzymes and non-catalytic seed storage proteins. The cupin fold does not share similarity with any known chitin-binding domain. Therefore, it is poorly understood how these storage proteins bind to chitin. In this work, partial cDNA sequences encoding β-vignin, the major component of cowpea vicilins, were obtained from developing seeds. Three-dimensional molecular models of β-vignin showed the characteristic cupin fold and computational simulations revealed that each vicilin trimer contained 3 chitin-binding sites. Interaction models showed that chito-oligosaccharides bound to β-vignin were stabilized mainly by hydrogen bonds, a common structural feature of typical carbohydrate-binding proteins. Furthermore, many of the residues involved in the chitin-binding sites of β-vignin are conserved in other 7S globulins. These results support previous experimental evidence on the ability of vicilin-like proteins from cowpea and other leguminous species to bind in vitro to chitin as well as in vivo to chitinous structures of larval C. maculatus midgut. Copyright © 2018. Published by Elsevier B.V.

  20. Efficient computation of optimal oligo-RNA binding.

    PubMed

    Hodas, Nathan O; Aalberts, Daniel P

    2004-01-01

    We present an algorithm that calculates the optimal binding conformation and free energy of two RNA molecules, one or both oligomeric. This algorithm has applications to modeling DNA microarrays, RNA splice-site recognition and other antisense problems. Although other recent algorithms perform the same calculation in time proportional to the sum of the lengths cubed, O((N1 + N2)^3), our oligomer binding algorithm, called bindigo, scales as the product of the sequence lengths, O(N1*N2). The algorithm performs well in practice with the aid of a heuristic for large asymmetric loops. To demonstrate its speed and utility, we use bindigo to investigate the binding proclivities of U1 snRNA to mRNA donor splice sites.
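
    To make the O(N1*N2) scaling concrete, here is a deliberately simplified dynamic-programming sketch. It is not the bindigo algorithm: it ignores loops, bulges, stacking and real free energies, and only finds the best uninterrupted antiparallel helix between two RNAs using made-up pair scores. The table it fills has one cell per pair of positions, which is where the product of the two lengths comes from.

```python
# Toy pair scores (assumed for illustration, not thermodynamic parameters).
PAIR_SCORE = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 1, ("U", "G"): 1,
}


def best_helix_score(rna1, rna2):
    """Score of the best gap-free antiparallel helix between two RNAs.

    rna2 is reversed so that consecutive base pairs lie on a diagonal
    of the table. Each cell extends the helix ending at the previous
    diagonal cell, so the work is proportional to len(rna1) * len(rna2).
    """
    reversed2 = rna2[::-1]
    previous_row = [0] * (len(reversed2) + 1)
    best = 0
    for base1 in rna1:
        current_row = [0] * (len(reversed2) + 1)
        for j, base2 in enumerate(reversed2, start=1):
            score = PAIR_SCORE.get((base1, base2))
            if score is not None:
                # Extend the helix that ended one step back on the diagonal.
                current_row[j] = previous_row[j - 1] + score
                best = max(best, current_row[j])
        previous_row = current_row
    return best


print(best_helix_score("GGGAAA", "UUUCCC"))
```

    Only two rows of the table are kept at a time, so memory grows with the shorter dimension chosen for the inner loop while time still grows with the product of the lengths.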

  1. Cholesterol-Binding Sites in GIRK Channels: The Devil is in the Details.

    PubMed

    Rosenhouse-Dantsker, Avia

    2018-01-01

    In recent years, it has become evident that cholesterol plays a direct role in the modulation of a variety of ion channels. In most cases, cholesterol downregulates channel activity. In contrast, our earlier studies have demonstrated that atrial G protein inwardly rectifying potassium (GIRK) channels are upregulated by cholesterol. Recently, we have shown that hippocampal GIRK currents are also upregulated by cholesterol. A combined computational-experimental approach pointed to putative cholesterol-binding sites in the transmembrane domain of the GIRK2 channel, the primary subunit in hippocampal GIRK channels. In particular, the principal cholesterol-binding site was located in the center of the transmembrane domain in between the inner and outer α-helices of 2 adjacent subunits. Further studies pointed to a similar cholesterol-binding site in GIRK4, a major subunit in atrial GIRK channels. However, a close look at a sequence alignment of the transmembrane helices of the 2 channels reveals surprising differences among the residues that interact with the cholesterol molecule in these 2 channels. Here, we compare the residues that form putative cholesterol-binding sites in GIRK2 and GIRK4 and discuss the similarities and differences among them.

  2. Nuclear factor ETF specifically stimulates transcription from promoters without a TATA box.

    PubMed

    Kageyama, R; Merlino, G T; Pastan, I

    1989-09-15

    Transcription factor ETF stimulates the expression of the epidermal growth factor receptor (EGFR) gene which does not have a TATA box in the promoter region. Here, we show that ETF recognizes various GC-rich sequences including stretches of deoxycytidine or deoxyguanosine residues and GC boxes with similar affinities. ETF also binds to TATA boxes but with a lower affinity. ETF stimulated in vitro transcription from several promoters without TATA boxes but had little or no effect on TATA box-containing promoters even though they had strong ETF-binding sites. These inactive ETF-binding sites became functional when placed upstream of the EGFR promoter whose own ETF-binding sites were removed. Furthermore, when a TATA box was introduced into the EGFR promoter, the responsiveness to ETF was abolished. These results indicate that ETF is a specific transcription factor for promoters which do not contain TATA elements.

  3. Identification of Nucleic Acid Binding Sites on Translin-Associated Factor X (TRAX) Protein

    PubMed Central

    Gupta, Gagan Deep; Kumar, Vinay

    2012-01-01

    Translin and TRAX proteins play roles in important cellular processes such as DNA recombination, spatial and temporal expression of mRNA, and siRNA processing. Translin forms a homomeric nucleic acid binding complex and binds to ssDNA and RNA. However, a mutant translin construct that forms a homomeric complex lacking nucleic acid binding activity is able to form a fully active heteromeric translin-TRAX complex when co-expressed with TRAX. Substantial progress has been made in identifying translin sites that mediate its binding activity, while TRAX was thought not to bind DNA or RNA on its own. Here we demonstrate, for the first time, nucleic acid binding to TRAX by crosslinking radiolabeled ssDNA to the heteromeric translin-TRAX complex using a UV laser. TRAX and translin, photochemically crosslinked with ssDNA, were individually detected on SDS-PAGE. We mutated two motifs in TRAX and translin, designated B2 and B3, to help define the nucleic acid binding sites in the TRAX sequence. The most pronounced effect was observed in mutants of the B3 motif, which impaired the nucleic acid binding activity of the heteromeric complexes. We suggest that both translin and TRAX are binding competent and contribute to the nucleic acid binding activity. PMID:22427937

  4. BiPPred: Combined sequence- and structure-based prediction of peptide binding to the Hsp70 chaperone BiP.

    PubMed

    Schneider, Markus; Rosam, Mathias; Glaser, Manuel; Patronov, Atanas; Shah, Harpreet; Back, Katrin Christiane; Daake, Marina Angelika; Buchner, Johannes; Antes, Iris

    2016-10-01

    Substrate binding to Hsp70 chaperones is involved in many biological processes, and the identification of potential substrates is important for a comprehensive understanding of these events. We present a multi-scale pipeline for an accurate, yet efficient prediction of peptides binding to the Hsp70 chaperone BiP by combining sequence-based prediction with molecular docking and MMPBSA calculations. First, we measured the binding of 15mer peptides from known substrate proteins of BiP by peptide array (PA) experiments and performed an accuracy assessment of the PA data by fluorescence anisotropy studies. Several sequence-based prediction models were fitted using this and other peptide binding data. A structure-based position-specific scoring matrix (SB-PSSM) derived solely from structural modeling data forms the core of all models. The matrix elements are based on a combination of binding energy estimations, molecular dynamics simulations, and analysis of the BiP binding site, which led to new insights into the peptide binding specificities of the chaperone. Using this SB-PSSM, peptide binders could be predicted with high selectivity even without training of the model on experimental data. Additional training further increased the prediction accuracies. Subsequent molecular docking (DynaDock) and MMGBSA/MMPBSA-based binding affinity estimations for predicted binders allowed the identification of the correct binding mode of the peptides as well as the calculation of nearly quantitative binding affinities. The general concept behind the developed multi-scale pipeline can readily be applied to other protein-peptide complexes with linearly bound peptides, for which sufficient experimental binding data for the training of classical sequence-based prediction models is not available. Proteins 2016; 84:1390-1407. © 2016 Wiley Periodicals, Inc.
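
    The way a position-specific scoring matrix is typically applied to a longer peptide can be sketched as a sliding-window sum. The matrix below is a three-position toy with invented values, and the function names are hypothetical; it is not the SB-PSSM from the study, whose width and entries are not given in the abstract.

```python
def score_window(window, pssm, default=0.0):
    """Sum the per-position scores of one window of residues.

    pssm is a list of dicts, one per position, mapping a residue
    letter to its score. Residues missing from a column get `default`.
    """
    return sum(column.get(residue, default)
               for residue, column in zip(window, pssm))


def best_window(peptide, pssm):
    """Return (score, start) of the highest-scoring window in a peptide.

    Ties are resolved in favour of the earliest window.
    """
    width = len(pssm)
    if len(peptide) < width:
        raise ValueError("peptide is shorter than the scoring matrix")
    best_score, best_start = None, None
    for start in range(len(peptide) - width + 1):
        score = score_window(peptide[start:start + width], pssm)
        if best_score is None or score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


# Toy matrix with invented values, for illustration only.
toy_pssm = [{"L": 1.0, "A": -0.5}, {"W": 2.0}, {"L": 1.0}]
print(best_window("AALWLG", toy_pssm))
```

    Scoring every window of a 15mer this way gives one number per candidate binding register, which is the kind of output that could then be passed to a more expensive docking step.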

  5. Mapping a nucleolar targeting sequence of an RNA binding nucleolar protein, Nop25

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fujiwara, Takashi; Suzuki, Shunji; Kanno, Motoko

    2006-06-10

    Nop25 is a putative RNA binding nucleolar protein associated with rRNA transcription. The present study was undertaken to determine the mechanism of Nop25 localization in the nucleolus. Deletion experiments of Nop25 amino acid sequence showed Nop25 to contain a nuclear targeting sequence in the N-terminal and a nucleolar targeting sequence in the C-terminal. By expressing derivative peptides from the C-terminal as GFP-fusion proteins in the cells, a lysine and arginine residue-enriched peptide (KRKHPRRAQDSTKKPPSATRTSKTQRRRR) allowed a GFP-fusion protein to be transported and fully retained in the nucleolus. When the peptide was fused with cMyc epitope and expressed in the cells, a cMyc epitope was then detected in the nucleolus. Nop25 did not localize in the nucleolus by deletion of the peptide from Nop25. Furthermore, deletion of a subdomain (KRKHPRRAQ) in the peptide or amino acid substitution of lysine and arginine residues in the subdomain resulted in the loss of Nop25 nucleolar localization. These results suggest that the lysine and arginine residue-enriched peptide is the most prominent nucleolar targeting sequence of Nop25 and that the long stretch of basic residues might play an important role in the nucleolar localization of Nop25. Although Nop25 contained putative SUMOylation, phosphorylation and glycosylation sites, the amino acid substitution in these sites had no effect on the nucleolar localization, thus suggesting that these post-translational modifications did not contribute to the localization of Nop25 in the nucleolus. The treatment of the cells, which expressed a GFP-fusion protein with a nucleolar targeting sequence of Nop25, with RNase A resulted in a complete dislocation of the protein from the nucleolus.
These data suggested that the nucleolar targeting sequence might therefore play an important role in the binding of Nop25 to RNA molecules and that the RNA binding of Nop25 might be essential for the nucleolar localization of Nop25.

  6. Sequence-Specific Targeting of Dosage Compensation in Drosophila Favors an Active Chromatin Context

    PubMed Central

    Gelbart, Marnie; Tolstorukov, Michael Y.; Plachetka, Annette; Kharchenko, Peter V.; Jung, Youngsook L.; Gorchakov, Andrey A.; Larschan, Erica; Gu, Tingting; Minoda, Aki; Riddle, Nicole C.; Schwartz, Yuri B.; Elgin, Sarah C. R.; Karpen, Gary H.; Pirrotta, Vincenzo; Kuroda, Mitzi I.; Park, Peter J.

    2012-01-01

    The Drosophila MSL complex mediates dosage compensation by increasing transcription of the single X chromosome in males approximately two-fold. This is accomplished through recognition of the X chromosome and subsequent acetylation of histone H4K16 on X-linked genes. Initial binding to the X is thought to occur at “entry sites” that contain a consensus sequence motif (“MSL recognition element” or MRE). However, this motif is only ∼2 fold enriched on X, and only a fraction of the motifs on X are initially targeted. Here we ask whether chromatin context could distinguish between utilized and non-utilized copies of the motif, by comparing their relative enrichment for histone modifications and chromosomal proteins mapped in the modENCODE project. Through a comparative analysis of the chromatin features in male S2 cells (which contain MSL complex) and female Kc cells (which lack the complex), we find that the presence of active chromatin modifications, together with an elevated local GC content in the surrounding sequences, has strong predictive value for functional MSL entry sites, independent of MSL binding. We tested these sites for function in Kc cells by RNAi knockdown of Sxl, resulting in induction of MSL complex. We show that ectopic MSL expression in Kc cells leads to H4K16 acetylation around these sites and a relative increase in X chromosome transcription. Collectively, our results support a model in which a pre-existing active chromatin environment, coincident with H3K36me3, contributes to MSL entry site selection. The consequences of MSL targeting of the male X chromosome include increase in nucleosome lability, enrichment for H4K16 acetylation and JIL-1 kinase, and depletion of linker histone H1 on active X-linked genes. Our analysis can serve as a model for identifying chromatin and local sequence features that may contribute to selection of functional protein binding sites in the genome. PMID:22570616

  7. TRX-LOGOS - a graphical tool to demonstrate DNA information content dependent upon backbone dynamics in addition to base sequence.

    PubMed

    Fortin, Connor H; Schulze, Katharina V; Babbitt, Gregory A

    2015-01-01

    It is now widely accepted that DNA sequences defining DNA-protein interactions functionally depend upon local biophysical features of the DNA backbone that are important in defining sites of binding interaction in the genome (e.g. DNA shape, charge and intrinsic dynamics). However, these physical features of the DNA polymer are not directly apparent when analyzing and viewing Shannon information content calculated at single nucleobases in a traditional sequence logo plot. Thus, sequence logo plots are severely limited in that they convey no explicit information regarding the structural dynamics of the DNA backbone, a feature often critical to binding specificity. We present TRX-LOGOS, an R software package and Perl wrapper code that interfaces the JASPAR database for computational regulatory genomics. TRX-LOGOS extends the traditional sequence logo plot to include Shannon information content calculated with regard to the dinucleotide-based BI-BII conformation shifts in phosphate linkages on the DNA backbone, thereby adding a visual measure of intrinsic DNA flexibility that can be critical for many DNA-protein interactions. TRX-LOGOS is available as an R graphics module offered at both SourceForge and as a download supplement at this journal. To demonstrate the general utility of TRX logo plots, we first calculated the information content for 416 Saccharomyces cerevisiae transcription factor binding sites functionally confirmed in the Yeastract database and matched to previously published yeast genomic alignments. We discovered that flanking regions contain significantly higher information content at phosphate linkages than can be observed at nucleobases. We also examined broader transcription factor classifications defined by the JASPAR database, and discovered that many general signatures of transcription factor binding are locally more information rich at the level of DNA backbone dynamics than nucleobase sequence.
We used TRX-logos in combination with MEGA 6.0 software for molecular evolutionary genetics analysis to visually compare the human Forkhead box/FOX protein evolution to its binding site evolution. We also compared the DNA binding signatures of human TP53 tumor suppressor determined by two different laboratory methods (SELEX and ChIP-seq). Further analysis of the entire yeast genome, center aligned at the start codon, also revealed a distinct sequence-independent 3 bp periodic pattern in information content, present only in coding region, and perhaps indicative of the non-random organization of the genetic code. TRX-LOGOS is useful in any situation in which important information content in DNA can be better visualized at the positions of phosphate linkages (i.e. dinucleotides) where the dynamic properties of the DNA backbone functions to facilitate DNA-protein interaction.
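
    The contrast drawn in this abstract between information content at nucleobases and at phosphate linkages can be illustrated with a minimal Python sketch. It is not the TRX-LOGOS implementation: it scores plain dinucleotide frequencies at each step rather than BI-BII conformational states, applies no small-sample correction, and the function names and toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(symbols, alphabet_size):
    """Information content in bits of one column: log2(K) minus Shannon entropy."""
    counts = Counter(symbols)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_information(sites):
    """Per-base and per-dinucleotide-step information for aligned, equal-length sites.

    Assumption: the dinucleotide step stands in for the phosphate linkage;
    the published method maps steps onto a backbone flexibility scale instead.
    """
    length = len(sites[0])
    base_ic = [column_information([s[i] for s in sites], 4) for i in range(length)]
    step_ic = [
        column_information([s[i:i + 2] for s in sites], 16)
        for i in range(length - 1)
    ]
    return base_ic, step_ic


# Hypothetical toy alignment of four binding sites.
sites = ["ACGT", "ACGA", "ACGT", "ACGC"]
base_ic, step_ic = logo_information(sites)
```

    For an alignment of length L this gives L base values (at most 2 bits each) and L-1 step values (at most 4 bits each), which is the pair of tracks an extended logo would draw.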

  8. Detection of the CLOCK/BMAL1 heterodimer using a nucleic acid probe with cycling probe technology.

    PubMed

    Nakagawa, Kazuhiro; Yamamoto, Takuro; Yasuda, Akio

    2010-09-15

    An isothermal signal amplification technique for specific DNA sequences, known as cycling probe technology (CPT), has enabled rapid acquisition of genomic information. Here we report an analogous technique for the detection of an activated transcription factor, a transcription element-binding assay with fluorescent amplification by apurinic/apyrimidinic (AP) site lysis cycle (TEFAL). This simple amplification assay can detect activated transcription factors by using a unique nucleic acid probe containing a consensus binding sequence and an AP site, which enables the CPT reaction with AP endonuclease. In this article, we demonstrate that this method detects the functional CLOCK/BMAL1 heterodimer via the TEFAL probe containing the E-box consensus sequence to which the CLOCK/BMAL1 heterodimer binds. Using TEFAL combined with immunoassays, we measured oscillations in the amount of CLOCK/BMAL1 heterodimer in serum-stimulated HeLa cells. Furthermore, we succeeded in measuring the circadian accumulation of the functional CLOCK/BMAL1 heterodimer in human buccal mucosa cells. TEFAL contributes greatly to the study of transcription factor activation in mammalian tissues and cell extracts and is a powerful tool for less invasive investigation of human circadian rhythms. 2010 Elsevier Inc. All rights reserved.

  9. Homo sapiens-Specific Binding Site Variants within Brain Exclusive Enhancers Are Subject to Accelerated Divergence across Human Population.

    PubMed

    Zehra, Rabail; Abbasi, Amir Ali

    2018-03-01

    Empirical assessments of human accelerated noncoding DNA fragments have delineated the presence of many cis-regulatory elements. Enhancers make up an important category of such accelerated cis-regulatory elements that efficiently control the spatiotemporal expression of many developmental genes. Establishing plausible reasons for accelerated enhancer sequence divergence in Homo sapiens has been termed significant in various previously published studies. This acceleration by including closely related primates and archaic human data has the potential to open up evolutionary avenues for deducing present-day brain structure. This study relied on empirically confirmed brain exclusive enhancers to avoid any misjudgments about their regulatory status and categorized among them a subset of enhancers with an exceptionally accelerated rate of lineage specific divergence in humans. In this assorted set, 13 distinct transcription factor binding sites were located that were unique to humans. Three of 13 such sites belonging to transcription factors SOX2, RUNX1/3, and FOS/JUND possessed single nucleotide variants that made them unique to H. sapiens upon comparisons with Neandertal and Denisovan orthologous sequences. These variants modifying the binding sites in the modern human lineage were further substantiated as single nucleotide polymorphisms using 1000 Genomes Project Phase 3 data. Long range haplotype based tests provided evidence of positive selection acting in the African population on two of the modern human motif modifying alleles, with the strongest results for the SOX2 binding site. In sum, our study acknowledges acceleration in the noncoding regulatory landscape of the genome and highlights functional parts within it that have undergone accelerated divergence in the present-day human population.

  10. GuiTope: an application for mapping random-sequence peptides to protein sequences.

    PubMed

    Halperin, Rebecca F; Stafford, Phillip; Emery, Jack S; Navalkar, Krupa Arun; Johnston, Stephen Albert

    2012-01-03

    Random-sequence peptide libraries are a commonly used tool to identify novel ligands for binding antibodies, other proteins, and small molecules. It is often of interest to compare the selected peptide sequences to the natural protein binding partners to infer the exact binding site or the importance of particular residues. The ability to search a set of sequences for similarity to a set of peptides may sometimes enable the prediction of an antibody epitope or a novel binding partner. We have developed a software application designed specifically for this task. GuiTope provides a graphical user interface for aligning peptide sequences to protein sequences. All alignment parameters are accessible to the user including the ability to specify the amino acid frequency in the peptide library; these frequencies often differ significantly from those assumed by popular alignment programs. It also includes a novel feature to align di-peptide inversions, which we have found improves the accuracy of antibody epitope prediction from peptide microarray data and shows utility in analyzing phage display datasets. Finally, GuiTope can randomly select peptides from a given library to estimate a null distribution of scores and calculate statistical significance. GuiTope provides a convenient method for comparing selected peptide sequences to protein sequences, including flexible alignment parameters, novel alignment features, ability to search a database, and statistical significance of results. The software is available as an executable (for PC) at http://www.immunosignature.com/software and ongoing updates and source code will be available at sourceforge.net.
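
    The null-distribution step described above (randomly selecting peptides from a library to estimate a null distribution of scores) can be sketched as follows. This is an illustration under stated assumptions, not GuiTope's algorithm: the scoring function is a toy ungapped identity count, and all names (`best_window_identity`, `library_null`, `empirical_p_value`) are hypothetical.

```python
import random


def best_window_identity(peptide, protein):
    """Toy score: most identical residues over all ungapped placements.

    Assumes the protein is at least as long as the peptide.
    """
    n = len(peptide)
    return max(
        sum(a == b for a, b in zip(peptide, protein[i:i + n]))
        for i in range(len(protein) - n + 1)
    )


def library_null(library, protein, n_draws, seed=0):
    """Score n_draws peptides drawn at random (with replacement) from the library."""
    rng = random.Random(seed)
    return [best_window_identity(rng.choice(library), protein) for _ in range(n_draws)]


def empirical_p_value(observed, null_scores):
    """Share of null scores at least as large as observed, with a +1 pseudocount."""
    exceed = sum(score >= observed for score in null_scores)
    return (exceed + 1) / (len(null_scores) + 1)
```

    The pseudocount keeps the estimate above zero when no random peptide reaches the observed score, so the smallest reportable p-value is 1/(n_draws + 1).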

  11. Albumin Redhill (-1 Arg, 320 Ala yields Thr): A glycoprotein variant of human serum albumin whose precursor has an aberrant signal peptidase cleavage site

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brennan, S.O.; Myles, T.; Peach, R.J.

    1990-01-01

    Albumin Redhill is an electrophoretically slow genetic variant of human serum albumin that does not bind ⁶³Ni²⁺ and has a molecular mass 2.5 kDa higher than normal albumin. Its inability to bind Ni²⁺ was explained by the finding of an additional residue of Arg at position -1. This did not explain the molecular basis of the genetic variation or the increase in apparent molecular mass. Fractionation of tryptic digests on concanavalin A-Sepharose followed by peptide mapping of the bound and unbound fractions and sequence analysis of the glycopeptides identified a mutation of 320 Ala → Thr. This introduces an Asn-Tyr-Thr oligosaccharide attachment sequence centered on Asn-318 and explains the increase in molecular mass. This, however, did not satisfactorily explain the presence of the additional Arg residue at position -1. DNA sequencing of polymerase chain reaction-amplified genomic DNA encoding the prepro sequence of albumin indicated an additional mutation of -2 Arg → Cys. The authors propose that the new Phe-Cys-Arg sequence in the propeptide is an aberrant signal peptidase cleavage site and that the signal peptidase cleaves the propeptide of albumin Redhill in the lumen of the endoplasmic reticulum before it reaches the Golgi vesicles, the site of the diarginyl-specific proalbumin convertase.

  12. Comparison and correlation of binding mode of ATP in the kinase domains of Hexokinase family

    PubMed Central

    Kumar, Yellapu Nanda; Kumar, Pasupuleti Santhosh; Sowjenya, Gopal; Rao, Valasani Koteswara; Yeswanth, Sthanikam; Prasad, Uppu Venkateswara; Pradeepkiran, Jangampalli Adi; Sarma, PVGK; Bhaskar, Matcha

    2012-01-01

    Hexokinases (HKs) are enzymes that catalyse the ATP-dependent phosphorylation of hexose sugars to hexose-6-phosphate (Hex-6-P). There exist four different forms of HKs, namely HK-I, HK-II, HK-III and HK-IV, and all of them share a common ATP binding site core surrounded by more variable sequence that determines substrate affinities. Although they share a common binding site, they differ in their kinetic functions; hence the present study aims to analyze the binding mode of ATP. The analysis revealed that the four ATP binding domains show 13 identical, 7 similar and 6 dissimilar residues with similar structural conformation. Molecular docking of ATP into the kinase domains using the Molecular Operating Environment (MOE) software tool clearly showed variation in the binding mode of ATP, with variable docking scores. This probably explains the variable phosphorylation rates among the hexokinase family. PMID:22829728

  13. Shape-selective recognition of DNA abasic sites by metallohelices: inhibition of human AP endonuclease 1.

    PubMed

    Malina, Jaroslav; Scott, Peter; Brabec, Viktor

    2015-06-23

    Loss of a base in DNA, leading to creation of an abasic (AP) site and leaving a deoxyribose residue in the strand, is a frequent lesion that may occur spontaneously or under the action of various physical and chemical agents. Progress in the understanding of the chemistry and enzymology of abasic DNA largely relies upon the study of AP sites in synthetic duplexes. We report here on interactions of diastereomerically pure metallo-helical 'flexicate' complexes, bimetallic triple-stranded ferro-helicates [Fe2(NN-NN)3](4+) incorporating the common NN-NN bis(bidentate) helicand, with short DNA duplexes containing AP sites in different sequence contexts. The results show that the flexicates bind to AP sites in DNA duplexes in a shape-selective manner. They preferentially bind to AP sites flanked by purines on both sides and their binding is enhanced when a pyrimidine is placed in opposite orientation to the lesion. Notably, the Λ-enantiomer binds to all tested AP sites with higher affinity than the Δ-enantiomer. In addition, the binding of the flexicates to AP sites inhibits the activity of human AP endonuclease 1, which is a valid anticancer drug target. Hence, this finding indicates the potential of utilizing well-defined metallo-helical complexes for cancer chemotherapy. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Probing binding hot spots at protein-RNA recognition sites.

    PubMed

    Barik, Amita; Nithin, Chandran; Karampudi, Naga Bhushana Rao; Mukherjee, Sunandan; Bahadur, Ranjit Prasad

    2016-01-29

    We use evolutionary conservation derived from structure alignment of polypeptide sequences along with structural and physicochemical attributes of protein-RNA interfaces to probe the binding hot spots at protein-RNA recognition sites. We find that the degree of conservation varies across the RNA binding proteins; some evolve rapidly compared to others. Additionally, irrespective of the structural class of the complexes, residues at the RNA binding sites are evolutionarily better conserved than those at the solvent exposed surfaces. For recognitions involving duplex RNA, residues interacting with the major groove are better conserved than those interacting with the minor groove. We identify multi-interface residues participating simultaneously in protein-protein and protein-RNA interfaces in complexes where more than one polypeptide is involved in RNA recognition, and show that they are better conserved compared to any other RNA binding residues. We find that the residues at water preservation sites are better conserved than those at hydrated or at dehydrated sites. Finally, we develop a Random Forests model using structural and physicochemical attributes for predicting binding hot spots. The model accurately predicts 80% of the instances of experimental ΔΔG values in a particular class, and provides a stepping-stone towards the engineering of protein-RNA recognition sites with desired affinity. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. Identification and characterization of the sodium-binding site of activated protein C.

    PubMed

    He, X; Rezaie, A R

    1999-02-19

    Activated protein C (APC) requires both Ca2+ and Na+ for its optimal catalytic function. In contrast to the Ca2+-binding sites, the Na+-binding site(s) of APC has not been identified. Based on a recent study with thrombin, the 221-225 loop is predicted to be a potential Na+-binding site in APC. The sequence of this loop is not conserved in trypsin. We engineered a Gla domainless form of protein C (GDPC) in which the 221-225 loop was replaced with the corresponding loop of trypsin. We found that activated GDPC (aGDPC) required Na+ (or other alkali cations) for its amidolytic activity with dissociation constant (Kd(app)) = 44.1 +/- 8.6 mM. In the presence of Ca2+, however, the requirement for Na+ by aGDPC was eliminated, and Na+ stimulated the cleavage rate 5-6-fold with Kd(app) = 2.3 +/- 0.3 mM. Both cations were required for efficient factor Va inactivation by aGDPC. In the presence of Ca2+, the catalytic function of the mutant was independent of Na+. Unlike aGDPC, the mutant did not discriminate among monovalent cations. We conclude that the 221-225 loop is a Na+-binding site in APC and that an allosteric link between the Na+ and Ca2+ binding loops modulates the structure and function of this anticoagulant enzyme.

  16. Quantifying the Effect of DNA Packaging on Gene Expression Level

    NASA Astrophysics Data System (ADS)

    Kim, Harold

    2010-10-01

    Gene expression, the process by which the genetic code comes alive in the form of proteins, is one of the most important biological processes in living cells, and begins when transcription factors bind to specific DNA sequences in the promoter region upstream of a gene. The relationship between gene expression output and transcription factor input, which is termed the gene regulation function, is specific to each promoter, and predicting this gene regulation function from the locations of transcription factor binding sites is one of the challenges in biology. In eukaryotic organisms (for example, animals, plants and fungi), DNA is highly compacted into nucleosomes, 147-bp segments of DNA tightly wrapped around a histone protein core, and therefore the accessibility of transcription factor binding sites depends on their locations with respect to nucleosomes: sites inside nucleosomes are less accessible than those outside nucleosomes. To understand how transcription factor binding sites contribute to gene expression in a quantitative manner, we obtain gene regulation functions of promoters with various configurations of transcription factor binding sites by using fluorescent protein reporters to measure transcription factor input and gene expression output in single yeast cells. In this talk, I will show that the affinity of a transcription factor binding site inside and outside the nucleosome controls different aspects of the gene regulation function, and explain this finding based on a mass-action kinetic model that includes competition between nucleosomes and transcription factors.
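
    The competition between nucleosomes and transcription factors mentioned above can be illustrated with a deliberately simplified equilibrium sketch. It is not the model from the talk: it assumes a single site that is either free, bound by the transcription factor, or covered by a nucleosome, and the parameter names (`kd`, `nucleosome_weight`) are hypothetical.

```python
def tf_occupancy(tf_conc, kd, nucleosome_weight=0.0):
    """Equilibrium probability that the site is bound by the transcription factor.

    Three mutually exclusive states with statistical weights:
      free site          -> 1
      TF-bound site      -> tf_conc / kd
      nucleosome-covered -> nucleosome_weight (0 means no nucleosome competition)
    """
    tf_weight = tf_conc / kd
    return tf_weight / (1.0 + tf_weight + nucleosome_weight)


# Same site affinity, without and with nucleosome competition (hypothetical values).
outside = tf_occupancy(tf_conc=10.0, kd=10.0)
inside = tf_occupancy(tf_conc=10.0, kd=10.0, nucleosome_weight=8.0)
```

    In this sketch a nucleosome-covered site needs a higher transcription factor concentration to reach the same occupancy, which is one way to read the statement that sites inside nucleosomes are less accessible.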

  17. Evolution of the herpes thymidine kinase: identification and comparison of the equine herpesvirus 1 thymidine kinase gene reveals similarity to a cell-encoded thymidylate kinase.

    PubMed Central

    Robertson, G R; Whalley, J M

    1988-01-01

    We have identified the equine herpesvirus 1 (EHV-1) thymidine kinase gene (TK) by DNA-mediated transformation and by DNA sequencing. Alignment of the amino acid sequence of the EHV-1 TK with the TKs from 3 other herpesviruses revealed regions of homology, some of which correspond to the previously identified substrate binding sites, while others have, as yet, no assigned function. In particular, the strict conservation of an aspartate within the proposed nucleoside binding site suggests a role in ATP binding for this residue. Comparison of 5 herpes TKs with the thymidylate kinase of yeast revealed significant similarity, which was strongest in those regions important to catalytic activity of the herpes TKs, and we therefore propose that the herpes TK may be derived from a cellular thymidylate kinase. The implications for the evolution of enzyme activities within a pathway of nucleotide metabolism are discussed. PMID:2849761

  18. RNA regulatory networks diversified through curvature of the PUF protein scaffold

    DOE PAGES

    Wilinski, Daniel; Qiu, Chen; Lapointe, Christopher P.; ...

    2015-09-14

    Proteins bind and control mRNAs, directing their localization, translation and stability. Members of the PUF family of RNA-binding proteins control multiple mRNAs in a single cell, and play key roles in development, stem cell maintenance and memory formation. Here we identified the mRNA targets of a S. cerevisiae PUF protein, Puf5p, by ultraviolet-crosslinking-affinity purification and high-throughput sequencing (HITS-CLIP). The binding sites recognized by Puf5p are diverse, with variable spacer lengths between two specific sequences. Each length of site correlates with a distinct biological function. Crystal structures of Puf5p–RNA complexes reveal that the protein scaffold presents an exceptionally flat and extended interaction surface relative to other PUF proteins. In complexes with RNAs of different lengths, the protein is unchanged. A single PUF protein repeat is sufficient to induce broadening of specificity. Changes in protein architecture, such as alterations in curvature, may lead to evolution of mRNA regulatory networks.

  19. CC-1065 functional analogues possessing different electron-withdrawing substituents and leaving groups: synthesis, kinetics, and sequence specificity of reaction with DNA and biological evaluation.

    PubMed

    Wang, Y; Gupta, R; Huang, L; Lown, J W

    1993-12-24

    Antitumor agent CC-1065 functional analogues possessing different electron-withdrawing substituents and leaving groups have been synthesized. The extent and the relative rates of DNA cleavage following alkylation by these CPI structures and thermal treatment were determined independently by an ethidium binding assay and by agarose gel electrophoresis experiments. The anticipated preferential covalent binding to adenine sites within the minor groove was confirmed by sequencing determination of selected agents on high-resolution gels. Certain of the synthetic agents, unlike CC-1065, also bind covalently to G sites with weaker intensity. The cytotoxicities of these compounds were also determined against KB cells in vitro. Compounds bearing a bromo or nitro group in the benzene ring and a methylsulfonyl as a leaving group are 10 and 5 times more potent than their unsubstituted counterparts, respectively. Compounds bearing a methylsulfonyl as a leaving group are more potent than those bearing a chlorine.

  20. RNA regulatory networks diversified through curvature of the PUF protein scaffold

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wilinski, Daniel; Qiu, Chen; Lapointe, Christopher P.

    Proteins bind and control mRNAs, directing their localization, translation and stability. Members of the PUF family of RNA-binding proteins control multiple mRNAs in a single cell, and play key roles in development, stem cell maintenance and memory formation. Here we identified the mRNA targets of a S. cerevisiae PUF protein, Puf5p, by ultraviolet-crosslinking-affinity purification and high-throughput sequencing (HITS-CLIP). The binding sites recognized by Puf5p are diverse, with variable spacer lengths between two specific sequences. Each length of site correlates with a distinct biological function. Crystal structures of Puf5p–RNA complexes reveal that the protein scaffold presents an exceptionally flat and extended interaction surface relative to other PUF proteins. In complexes with RNAs of different lengths, the protein is unchanged. A single PUF protein repeat is sufficient to induce broadening of specificity. Changes in protein architecture, such as alterations in curvature, may lead to evolution of mRNA regulatory networks.

  1. Binding sites for abundant nuclear factors modulate RNA polymerase I-dependent enhancer function in Saccharomyces cerevisiae.

    PubMed

    Kang, J J; Yokoi, T J; Holland, M J

    1995-12-01

    The 190-base pair (bp) rDNA enhancer within the intergenic spacer sequences of Saccharomyces cerevisiae rRNA cistrons activates synthesis of the 35S-rRNA precursor about 20-fold in vivo (Mestel, R., Yip, M., Holland, J. P., Wang, E., Kang, J., and Holland, M. J. (1989) Mol. Cell. Biol. 9, 1243-1254). We now report identification and analysis of transcriptional activities mediated by three cis-acting sites within a 90-bp portion of the rDNA enhancer designated the modulator region. In vivo, these sequences mediated termination of transcription by RNA polymerase I and potentiated the activity of the rDNA enhancer element. Two trans-acting factors, REB1 and REB2, bind independently to sites within the modulator region (Morrow, B. E., Johnson, S. P., and Warner, J. R. (1989) J. Biol. Chem. 264, 9061-9068). We show that REB2 is identical to the ABF1 protein. Site-directed mutagenesis of REB1 and ABF1 binding sites demonstrated uncoupling of RNA polymerase I-dependent termination from transcriptional activation in vivo. We conclude that REB1 and ABF1 are required for RNA polymerase I-dependent termination and enhancer function, respectively. Since REB1 and ABF1 proteins also regulate expression of class II genes and other nuclear functions, our results suggest further similarities between RNA polymerase I and II regulatory mechanisms. Two rDNA enhancers flanking a rDNA minigene stimulated RNA polymerase I transcription in a "multiplicative" fashion. Deletion mapping analysis showed that similar cis-acting sequences were required for enhancer function when positioned upstream or downstream from a rDNA minigene.

  2. A comprehensive analysis of 3′ end sequencing data sets reveals novel polyadenylation signals and the repressive role of heterogeneous ribonucleoprotein C on cleavage and polyadenylation

    PubMed Central

    Gruber, Andreas J.; Schmidt, Ralf; Gruber, Andreas R.; Martin, Georges; Ghosh, Souvik; Belmadani, Manuel; Keller, Walter

    2016-01-01

    Alternative polyadenylation (APA) is a general mechanism of transcript diversification in mammals, which has been recently linked to proliferative states and cancer. Different 3′ untranslated region (3′ UTR) isoforms interact with different RNA-binding proteins (RBPs), which modify the stability, translation, and subcellular localization of the corresponding transcripts. Although the heterogeneity of pre-mRNA 3′ end processing has been established with high-throughput approaches, the mechanisms that underlie systematic changes in 3′ UTR lengths remain to be characterized. Through a uniform analysis of a large number of 3′ end sequencing data sets, we have uncovered 18 signals, six of which are novel, whose positioning with respect to pre-mRNA cleavage sites indicates a role in pre-mRNA 3′ end processing in both mouse and human. With 3′ end sequencing we have demonstrated that the heterogeneous ribonucleoprotein C (HNRNPC), which binds the poly(U) motif whose frequency also peaks in the vicinity of polyadenylation (poly(A)) sites, has a genome-wide effect on poly(A) site usage. HNRNPC-regulated 3′ UTRs are enriched in ELAV-like RBP 1 (ELAVL1) binding sites and include those of the CD47 gene, which participate in the recently discovered mechanism of 3′ UTR–dependent protein localization (UDPL). Our study thus establishes an up-to-date, high-confidence catalog of 3′ end processing sites and poly(A) signals, and it uncovers an important role of HNRNPC in regulating 3′ end processing. It further suggests that U-rich elements mediate interactions with multiple RBPs that regulate different stages in a transcript's life cycle. PMID:27382025

  3. Multiple cis-acting elements involved in up-regulation of a cytochrome P450 gene conferring resistance to deltamethrin in small brown planthopper, Laodelphax striatellus (Fallén).

    PubMed

    Pu, Jian; Sun, Haina; Wang, Jinda; Wu, Min; Wang, Kangxu; Denholm, Ian; Han, Zhaojun

    2016-11-01

    As well as arising from single point mutations in binding sites or detoxifying enzymes, it is likely that insecticide resistance mechanisms are frequently controlled by multiple genetic factors, resulting in resistance being inherited as a quantitative trait. However, empirical evidence for this is still rare. Here we analyse the causes of up-regulation of CYP6FU1, a monooxygenase implicated in resistance to deltamethrin in the rice pest Laodelphax striatellus. The 5'-flanking region of this gene was cloned and sequenced from individuals of a susceptible and a resistant strain. A luminescent reporter assay was used to evaluate different 5'-flanking regions and their fragments for promoter activity. Mutations enhancing promoter activity in various fragments were characterized, singly and in combination, by site mutation recovery. Nucleotide diversity in flanking sequences was greatly reduced in deltamethrin-resistant insects compared to susceptible ones. Phylogenetic sequence analysis found that CYP6FU1 had five different types of 5'-flanking region. All five types were present in a susceptible strain but only a single type showing the highest promoter activity was present in a resistant strain. Four cis-acting elements were identified whose influence on up-regulation was much more pronounced in combination than when present singly. Of these, two were new transcription factor (TF) binding sites produced by mutations, another was a new TF binding site altered from an existing one, and the fourth was a unique transcription start site. These results demonstrate that multiple cis-acting elements are involved in up-regulating CYP6FU1 to generate a resistance phenotype. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. Identification of the DNA-Binding Domains of Human Replication Protein A That Recognize G-Quadruplex DNA

    PubMed Central

    Prakash, Aishwarya; Natarajan, Amarnath; Marky, Luis A.; Ouellette, Michel M.; Borgstahl, Gloria E. O.

    2011-01-01

    Replication protein A (RPA), a key player in DNA metabolism, has 6 single-stranded DNA-(ssDNA-) binding domains (DBDs) A-F. SELEX experiments with the DBDs-C, -D, and -E retrieve a 20-nt G-quadruplex forming sequence. Binding studies show that RPA-DE binds preferentially to the G-quadruplex DNA, a unique preference not observed with other RPA constructs. Circular dichroism experiments show that RPA-CDE-core can unfold the G-quadruplex while RPA-DE stabilizes it. Binding studies show that RPA-C binds pyrimidine- and purine-rich sequences similarly. This difference between RPA-C and RPA-DE binding was also indicated by the inability of RPA-CDE-core to unfold an oligonucleotide containing a TC-region 5′ to the G-quadruplex. Molecular modeling studies of RPA-DE and telomere-binding proteins Pot1 and Stn1 reveal structural similarities between the proteins and illuminate potential DNA-binding sites for RPA-DE and Stn1. These data indicate that DBDs of RPA have different ssDNA recognition properties. PMID:21772997

  5. Structural Basis for Sialoglycan Binding by the Streptococcus sanguinis SrpA Adhesin

    PubMed Central

    Bensing, Barbara A.; Loukachevitch, Lioudmila V.; McCulloch, Kathryn M.; Yu, Hai; Vann, Kendra R.; Wawrzak, Zdzislaw; Anderson, Spencer; Chen, Xi; Sullam, Paul M.; Iverson, T. M.

    2016-01-01

    Streptococcus sanguinis is a leading cause of infective endocarditis, a life-threatening infection of the cardiovascular system. An important interaction in the pathogenesis of infective endocarditis is attachment of the organisms to host platelets. S. sanguinis expresses a serine-rich repeat adhesin, SrpA, similar in sequence to platelet-binding adhesins associated with increased virulence in this disease. In this study, we determined the first crystal structure of the putative binding region of SrpA (SrpABR) both unliganded and in complex with a synthetic disaccharide ligand at 1.8 and 2.0 Å resolution, respectively. We identified a conserved Thr-Arg motif that orients the sialic acid moiety and is required for binding to platelet monolayers. Furthermore, we propose that sequence insertions in closely related family members contribute to the modulation of structural and functional properties, including the quaternary structure, the tertiary structure, and the ligand-binding site. PMID:26833566

  6. Genomic Heat Shock Element Sequences Drive Cooperative Human Heat Shock Factor 1 DNA Binding and Selectivity

    PubMed Central

    Jaeger, Alex M.; Makley, Leah N.; Gestwicki, Jason E.; Thiele, Dennis J.

    2014-01-01

    The heat shock transcription factor 1 (HSF1) activates expression of a variety of genes involved in cell survival, including protein chaperones, the protein degradation machinery, anti-apoptotic proteins, and transcription factors. Although HSF1 activation has been linked to amelioration of neurodegenerative disease, cancer cells exhibit a dependence on HSF1 for survival. Indeed, HSF1 drives a program of gene expression in cancer cells that is distinct from that activated in response to proteotoxic stress, and HSF1 DNA binding activity is elevated in cycling cells as compared with arrested cells. Active HSF1 homotrimerizes and binds to a DNA sequence consisting of inverted repeats of the pentameric sequence nGAAn, known as heat shock elements (HSEs). Recent comprehensive ChIP-seq experiments demonstrated that the architecture of HSEs is very diverse in the human genome, with deviations from the consensus sequence in the spacing, orientation, and extent of HSE repeats that could influence HSF1 DNA binding efficacy and the kinetics and magnitude of target gene expression. To understand the mechanisms that dictate binding specificity, HSF1 was purified as either a monomer or trimer and used to evaluate DNA-binding site preferences in vitro using fluorescence polarization and thermal denaturation profiling. These results were compared with quantitative chromatin immunoprecipitation assays in vivo. We demonstrate a role for specific orientations of extended HSE sequences in driving preferential HSF1 DNA binding to target loci in vivo. These studies provide a biochemical basis for understanding differential HSF1 target gene recognition and transcription in neurodegenerative disease and in cancer. PMID:25204655
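
    The consensus heat shock element described above (inverted repeats of the pentamer nGAAn) can be located with a short scan. This is a sketch only: it looks for contiguous pentamers that alternate between nGAAn and its reverse complement nTTCn, the three-unit minimum is an assumption chosen for illustration, and the function names and test sequence are hypothetical. It does not capture the spacing and orientation variants the abstract reports in the human genome.

```python
def orientation(pentamer):
    """Classify a pentamer as '+' (nGAAn), '-' (nTTCn), or None."""
    core = pentamer[1:4]
    if core == "GAA":
        return "+"
    if core == "TTC":
        return "-"
    return None


def find_hse_candidates(seq, min_units=3):
    """Return (start, matched_sequence) for each window of min_units contiguous
    pentamers whose orientations strictly alternate (inverted repeats)."""
    seq = seq.upper()
    span = 5 * min_units
    hits = []
    for start in range(len(seq) - span + 1):
        units = [
            orientation(seq[start + 5 * k:start + 5 * k + 5])
            for k in range(min_units)
        ]
        if None in units:
            continue
        if all(units[k] != units[k + 1] for k in range(min_units - 1)):
            hits.append((start, seq[start:start + span]))
    return hits


# Hypothetical sequence containing one three-unit element.
hits = find_hse_candidates("ccAGAACGTTCTAGAACgg")
```

    Relaxing the exact-match test in `orientation` (for example, allowing one mismatch in the core) would be the natural place to explore the non-consensus architectures mentioned in the abstract.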

  7. Genotype to Phenotype Mapping of the E. coli lac Promoter

    NASA Astrophysics Data System (ADS)

    Otwinowski, Jakub; Nemenman, Ilya

    2014-03-01

    Genotype-to-phenotype maps and the related fitness landscapes that include epistatic interactions are difficult to measure because of their high dimensional structure. Here we construct such a map using the recently collected corpora of high-throughput sequence data from the 75-base-pair-long mutagenized E. coli lac promoter region, where each sequence is associated with induced transcriptional activity measured by a fluorescent reporter. We find that the additive (non-epistatic) contributions of individual mutations account for about two-thirds of the explainable phenotype variance, while pairwise epistasis explains about 7% of the variance for the full mutagenized sequence and about 15% for the subsequence associated with protein binding sites. Surprisingly, there is no evidence for third order epistatic contributions, and our inferred fitness landscape is essentially single peaked, with a small amount of antagonistic epistasis. We identify transcription factor (CRP) and RNA polymerase binding sites in the promoter region and their interactions. We conclude with a cautionary note that inferred properties of fitness landscapes may be severely influenced by biases in the sequence data. Funded in part by HFSP and James S. McDonnell Foundation.

  8. Pharmacological lineage analysis revealed the binding affinity of broad-spectrum substance P antagonists to receptors for gonadotropin-releasing peptide.

    PubMed

    Arai, Kazune; Kashiwazaki, Aki; Fujiwara, Yoko; Tsuchiya, Hiroyoshi; Sakai, Nobuya; Shibata, Katsushi; Koshimizu, Taka-aki

    2015-02-15

    A group of synthetic substance P (SP) antagonists, such as [Arg(6),D-Trp(7,9),N(Me)Phe(8)]-substance P(6-11) and [D-Arg(1),D-Phe(5),D-Trp(7,9),Leu(11)]-substance P, bind to a range of distinct G-protein-coupled receptor (GPCR) family members, including V1a vasopressin receptors, and they competitively inhibit agonist binding. This extended accessibility enabled us to identify a GPCR subset with a partially conserved binding site structure. By combining pharmacological data and amino acid sequence homology matrices, a pharmacological lineage of GPCRs that are sensitive to these two SP antagonists was constructed. We found that sensitivity to the SP antagonists was not limited to the Gq-protein-coupled V1a and V1b receptors; Gs-coupled V2 receptors and oxytocin receptors, which couple with both Gq and Gi, also demonstrated sensitivity. Unexpectedly, a dendrogram based on the amino acid sequences of 222 known GPCRs showed that a group of receptors sensitive to the SP antagonists are located in close proximity to vasopressin/oxytocin receptors. Gonadotropin-releasing peptide receptors, located near the vasopressin receptors in the dendrogram, were also sensitive to the SP analogs, whereas α1B adrenergic receptors, located more distantly from the vasopressin receptors, were not sensitive. Our finding suggests that pharmacological lineage analysis is useful in selecting subsets of candidate receptors that contain a conserved binding site for a ligand with broad-spectrum binding abilities. The knowledge that the binding site of the two broad-spectrum SP analogs partially overlaps with that of distinct peptide agonists is valuable for understanding the specificity/broadness of peptide ligands. Copyright © 2015 Elsevier B.V. All rights reserved.

  9. An RRM–ZnF RNA recognition module targets RBM10 to exonic sequences to promote exon exclusion

    PubMed Central

    Collins, Katherine M.; Kainov, Yaroslav A.; Christodolou, Evangelos; Ray, Debashish; Morris, Quaid; Hughes, Timothy; Taylor, Ian A.

    2017-01-01

    RBM10 is an RNA-binding protein that plays an essential role in development and is frequently mutated in the context of human disease. RBM10 recognizes a diverse set of RNA motifs in introns and exons and regulates alternative splicing. However, the molecular mechanisms underlying this seemingly relaxed sequence specificity are not understood and functional studies have focused on 3΄ intronic sites only. Here, we dissect the RNA code recognized by RBM10 and relate it to the splicing regulatory function of this protein. We show that a two-domain RRM1–ZnF unit recognizes a GGA-centered motif enriched in RBM10 exonic sites with high affinity and specificity and test that the interaction with these exonic sequences promotes exon skipping. Importantly, a second RRM domain (RRM2) of RBM10 recognizes a C-rich sequence, which explains its known interaction with the intronic 3΄ site of NUMB exon 9 contributing to regulation of the Notch pathway in cancer. Together, these findings explain RBM10's broad RNA specificity and suggest that RBM10 functions as a splicing regulator using two RNA-binding units with different specificities to promote exon skipping. PMID:28379442

  10. An RRM-ZnF RNA recognition module targets RBM10 to exonic sequences to promote exon exclusion.

    PubMed

    Collins, Katherine M; Kainov, Yaroslav A; Christodolou, Evangelos; Ray, Debashish; Morris, Quaid; Hughes, Timothy; Taylor, Ian A; Makeyev, Eugene V; Ramos, Andres

    2017-06-20

    RBM10 is an RNA-binding protein that plays an essential role in development and is frequently mutated in the context of human disease. RBM10 recognizes a diverse set of RNA motifs in introns and exons and regulates alternative splicing. However, the molecular mechanisms underlying this seemingly relaxed sequence specificity are not understood and functional studies have focused on 3΄ intronic sites only. Here, we dissect the RNA code recognized by RBM10 and relate it to the splicing regulatory function of this protein. We show that a two-domain RRM1-ZnF unit recognizes a GGA-centered motif enriched in RBM10 exonic sites with high affinity and specificity and test that the interaction with these exonic sequences promotes exon skipping. Importantly, a second RRM domain (RRM2) of RBM10 recognizes a C-rich sequence, which explains its known interaction with the intronic 3΄ site of NUMB exon 9 contributing to regulation of the Notch pathway in cancer. Together, these findings explain RBM10's broad RNA specificity and suggest that RBM10 functions as a splicing regulator using two RNA-binding units with different specificities to promote exon skipping. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. Expression cloning and characterization of a novel gene that encodes the RNA-binding protein FAU-1 from Pyrococcus furiosus.

    PubMed Central

    Kanai, Akio; Oida, Hanako; Matsuura, Nana; Doi, Hirofumi

    2003-01-01

    We systematically screened a genomic DNA library to identify proteins of the hyperthermophilic archaeon Pyrococcus furiosus using an expression cloning method. One gene product, which we named FAU-1 (P. furiosus AU-binding), demonstrated the strongest binding activity of all the genomic library-derived proteins tested against an AU-rich RNA sequence. The protein was purified to near homogeneity as a 54 kDa single polypeptide, and the gene locus corresponding to this FAU-1 activity was also sequenced. The FAU-1 gene encoded a 472-amino-acid protein that was characterized by highly charged domains consisting of both acidic and basic amino acids. The N-terminal half of the gene had a degree of similarity (25%) with RNase E from Escherichia coli. Five rounds of RNA-binding-site selection and footprinting analysis showed that the FAU-1 protein binds specifically to the AU-rich sequence in a loop region of a possible RNA ligand. Moreover, we demonstrated that the FAU-1 protein acts as an oligomer, and mainly as a trimer. These results showed that the FAU-1 protein is a novel heat-stable protein with an RNA loop-binding characteristic. PMID:12614195

  12. The Arabidopsis class I TCP transcription factor AtTCP11 is a developmental regulator with distinct DNA-binding properties due to the presence of a threonine residue at position 15 of the TCP domain.

    PubMed

    Viola, Ivana L; Uberti Manassero, Nora G; Ripoll, Rodrigo; Gonzalez, Daniel H

    2011-04-01

    The TCP domain is a DNA-binding domain present in plant transcription factors that modulate different processes. In the present study, we show that Arabidopsis class I TCP proteins are able to interact with a dyad-symmetric sequence composed of two GTGGG half-sites. TCP20 establishes symmetric interactions with the 5' half of each strand, whereas TCP11 interacts mainly with the 3' half. SELEX (systematic evolution of ligands by exponential enrichment) experiments with TCP15 and TCP20 indicated that these proteins have similar, although not identical, DNA-binding preferences and are able to interact with non-palindromic binding sites of the type GTGGGNCCNN. TCP11 shows a different DNA-binding specificity, with a preference for the sequence GTGGGCCNNN. The distinct DNA-binding properties of TCP11 are due to the presence of a threonine residue at position 15 of the TCP domain, a position that is occupied by an arginine residue in most TCP proteins. TCP11 also forms heterodimers with TCP15 that have increased DNA-binding efficiency. The expression in plants of a repressor form of TCP11 demonstrated that this protein is a developmental regulator that influences the growth of leaves, stems and petioles, and pollen development. The results suggest that changes in DNA-binding preferences may be one of the mechanisms through which class I TCP proteins achieve functional specificity.
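
    The two consensus sequences quoted above, GTGGGNCCNN and GTGGGCCNNN, differ only in where the wildcard N falls, which is easy to see by matching both against the same DNA. The following is a minimal sketch that expands IUPAC codes into a regular expression and scans one strand; the helper names are hypothetical, and a match here says nothing about binding affinity.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a regex that finds overlapping hits."""
    body = "".join(IUPAC[code] for code in consensus.upper())
    # The lookahead lets finditer report overlapping matches.
    return re.compile("(?=(" + body + "))")


def find_sites(seq, consensus):
    """Return (start, matched_sequence) for every hit on the given strand."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]
```

    For example, the sequence GTGGGACCTT fits GTGGGNCCNN but not GTGGGCCNNN, whereas GTGGGCCCAT fits both. Scanning the reverse complement would be needed to cover both strands.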

  13. Mapping the binding site of aflatoxin B1 in DNA: systematic analysis of the reactivity of aflatoxin B1 with guanines in different DNA sequences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Benasutti, M.; Ejadi, S.; Whitlow, M.D.

    The mutagenic and carcinogenic chemical aflatoxin B1 (AFB1) reacts almost exclusively at the N(7)-position of guanine following activation to its reactive form, the 8,9-epoxide (AFB1 oxide). In general N(7)-guanine adducts yield DNA strand breaks when heated in base, a property that serves as the basis for the Maxam-Gilbert DNA sequencing reaction specific for guanine. Using DNA sequencing methods, other workers have shown that AFB1 oxide gives strand breaks at positions of guanines; however, the guanine bands varied in intensity. This phenomenon has been used to infer that AFB1 oxide prefers to react with guanines in some sequence contexts more than in others and has been referred to as sequence specificity of binding. Herein, data on the reaction of AFB1 oxide with several synthetic DNA polymers with different sequences are presented, and (following hydrolysis) adduct levels are determined by high-pressure liquid chromatography. These results reveal that for AFB1 oxide (1) the N(7)-guanine adduct is the major adduct found in all of the DNA polymers, (2) adduct levels vary in different sequences, and, thus, sequence specificity is also observed by this more direct method, and (3) the intensity of bands in DNA sequencing gels is likely to reflect adduct levels formed at the N(7)-position of guanine. Knowing this, a reinvestigation of the reactivity of guanines in different DNA sequences using DNA sequencing methods was undertaken. Methods are developed to determine the X (5'-side) base and the Y (3'-side) base are most influential in determining guanine reactivity. These rules in conjunction with molecular modeling studies were used to assess the binding sites that might be utilized by AFB1 oxide in its reaction with DNA.

  14. Deciphering the combinatorial architecture of a Drosophila homeotic gene enhancer

    PubMed Central

    Drewell, Robert A.; Nevarez, Michael J.; Kurata, Jessica S.; Winkler, Lauren N.; Li, Lily; Dresch, Jacqueline M.

    2013-01-01

    In Drosophila, the 330 kb bithorax complex regulates cellular differentiation along the anterior-posterior axis during development in the thorax and abdomen and is comprised of three homeotic genes: Ultrabithorax, abdominal-A, and Abdominal-B. The expression of each of these genes is in turn controlled through interactions between transcription factors and a number of cis-regulatory modules in the neighboring intergenic regions. In this study, we examine how the sequence architecture of transcription factor binding sites mediates the functional activity of one of these cis-regulatory modules. Using computational, mathematical modeling and experimental molecular genetic approaches we investigate the IAB7b enhancer, which regulates Abdominal-B expression specifically in the presumptive seventh and ninth abdominal segments of the early embryo. A cross-species comparison of the IAB7b enhancer reveals an evolutionarily conserved signature motif containing two FUSHI-TARAZU activator transcription factor binding sites. We find that the transcriptional repressors KNIRPS, KRUPPEL and GIANT are able to restrict reporter gene expression to the posterior abdominal segments, using different molecular mechanisms including short-range repression and competitive binding. Additionally, we show the functional importance of the spacing between the two FUSHI-TARAZU binding sites and discuss the potential importance of cooperativity for transcriptional activation. Our results demonstrate that the transcriptional output of the IAB7b cis-regulatory module relies on a complex set of combinatorial inputs mediated by specific transcription factor binding and that the sequence architecture at this enhancer is critical to maintain robust regulatory function. PMID:24514265

  15. Sieve analysis of breakthrough HIV-1 sequences in HVTN 505 identifies vaccine pressure targeting the CD4 binding site of Env-gp120.

    PubMed

    deCamp, Allan C; Rolland, Morgane; Edlefsen, Paul T; Sanders-Buell, Eric; Hall, Breana; Magaret, Craig A; Fiore-Gartland, Andrew J; Juraska, Michal; Carpp, Lindsay N; Karuna, Shelly T; Bose, Meera; LePore, Steven; Miller, Shana; O'Sullivan, Annemarie; Poltavee, Kultida; Bai, Hongjun; Dommaraju, Kalpana; Zhao, Hong; Wong, Kim; Chen, Lennie; Ahmed, Hasan; Goodman, Derrick; Tay, Matthew Z; Gottardo, Raphael; Koup, Richard A; Bailer, Robert; Mascola, John R; Graham, Barney S; Roederer, Mario; O'Connell, Robert J; Michael, Nelson L; Robb, Merlin L; Adams, Elizabeth; D'Souza, Patricia; Kublin, James; Corey, Lawrence; Geraghty, Daniel E; Frahm, Nicole; Tomaras, Georgia D; McElrath, M Juliana; Frenkel, Lisa; Styrchak, Sheila; Tovanabutra, Sodsai; Sobieszczyk, Magdalena E; Hammer, Scott M; Kim, Jerome H; Mullins, James I; Gilbert, Peter B

    2017-01-01

    Although the HVTN 505 DNA/recombinant adenovirus type 5 vector HIV-1 vaccine trial showed no overall efficacy, analysis of breakthrough HIV-1 sequences in participants can help determine whether vaccine-induced immune responses impacted viruses that caused infection. We analyzed 480 HIV-1 genomes sampled from 27 vaccine and 20 placebo recipients and found that intra-host HIV-1 diversity was significantly lower in vaccine recipients (P ≤ 0.04, Q-values ≤ 0.09) in Gag, Pol, Vif and envelope glycoprotein gp120 (Env-gp120). Furthermore, Env-gp120 sequences from vaccine recipients were significantly more distant from the subtype B vaccine insert than sequences from placebo recipients (P = 0.01, Q-value = 0.12). These vaccine effects were associated with signatures mapping to CD4 binding site and CD4-induced monoclonal antibody footprints. These results suggest either (i) no vaccine efficacy to block acquisition of any viral genotype but vaccine-accelerated Env evolution post-acquisition; or (ii) vaccine efficacy against HIV-1s with Env sequences closest to the vaccine insert combined with increased acquisition due to other factors, potentially including the vaccine vector.

  16. Sieve analysis of breakthrough HIV-1 sequences in HVTN 505 identifies vaccine pressure targeting the CD4 binding site of Env-gp120

    PubMed Central

    Edlefsen, Paul T.; Sanders-Buell, Eric; Hall, Breana; Magaret, Craig A.; Fiore-Gartland, Andrew J.; Juraska, Michal; Carpp, Lindsay N.; Karuna, Shelly T.; Bose, Meera; LePore, Steven; Miller, Shana; O'Sullivan, Annemarie; Poltavee, Kultida; Bai, Hongjun; Dommaraju, Kalpana; Zhao, Hong; Wong, Kim; Chen, Lennie; Ahmed, Hasan; Goodman, Derrick; Tay, Matthew Z.; Gottardo, Raphael; Koup, Richard A.; Bailer, Robert; Mascola, John R.; Graham, Barney S.; Roederer, Mario; O’Connell, Robert J.; Michael, Nelson L.; Robb, Merlin L.; Adams, Elizabeth; D’Souza, Patricia; Kublin, James; Corey, Lawrence; Geraghty, Daniel E.; Frahm, Nicole; Tomaras, Georgia D.; McElrath, M. Juliana; Frenkel, Lisa; Styrchak, Sheila; Tovanabutra, Sodsai; Sobieszczyk, Magdalena E.; Hammer, Scott M.; Kim, Jerome H.; Mullins, James I.; Gilbert, Peter B.

    2017-01-01

    Although the HVTN 505 DNA/recombinant adenovirus type 5 vector HIV-1 vaccine trial showed no overall efficacy, analysis of breakthrough HIV-1 sequences in participants can help determine whether vaccine-induced immune responses impacted viruses that caused infection. We analyzed 480 HIV-1 genomes sampled from 27 vaccine and 20 placebo recipients and found that intra-host HIV-1 diversity was significantly lower in vaccine recipients (P ≤ 0.04, Q-values ≤ 0.09) in Gag, Pol, Vif and envelope glycoprotein gp120 (Env-gp120). Furthermore, Env-gp120 sequences from vaccine recipients were significantly more distant from the subtype B vaccine insert than sequences from placebo recipients (P = 0.01, Q-value = 0.12). These vaccine effects were associated with signatures mapping to CD4 binding site and CD4-induced monoclonal antibody footprints. These results suggest either (i) no vaccine efficacy to block acquisition of any viral genotype but vaccine-accelerated Env evolution post-acquisition; or (ii) vaccine efficacy against HIV-1s with Env sequences closest to the vaccine insert combined with increased acquisition due to other factors, potentially including the vaccine vector. PMID:29149197

  17. Activation of both acfA and acfD transcription by Vibrio cholerae ToxT requires binding to two centrally located DNA sites in an inverted repeat conformation.

    PubMed

    Withey, Jeffrey H; DiRita, Victor J

    2005-05-01

    The Gram-negative bacterium Vibrio cholerae is the infectious agent responsible for the disease Asiatic cholera. The genes required for V. cholerae virulence, such as those encoding the cholera toxin (CT) and toxin-coregulated pilus (TCP), are controlled by a cascade of transcriptional activators. Ultimately, the direct transcriptional activator of the majority of V. cholerae virulence genes is the AraC/XylS family member ToxT protein, the expression of which is activated by the ToxR and TcpP proteins. Previous studies have identified the DNA sites to which ToxT binds upstream of the ctx operon, encoding CT, and the tcpA operon, encoding, among other products, the major subunit of the TCP. These known ToxT binding sites are seemingly dissimilar in sequence other than being A/T rich. Further results suggested that ctx and tcpA each has a pair of ToxT binding sites arranged in a direct repeat orientation upstream of the core promoter elements. In this work, using both transcriptional lacZ fusions and in vitro copper-phenanthroline footprinting experiments, we have identified the ToxT binding sites between the divergently transcribed acfA and acfD genes, which encode components of the accessory colonization factor required for efficient intestinal colonization by V. cholerae. Our results indicate that ToxT binds to a pair of DNA sites between acfA and acfD in an inverted repeat orientation. Moreover, a mutational analysis of the ToxT binding sites indicates that both binding sites are required by ToxT for transcriptional activation of both acfA and acfD. Using copper-phenanthroline footprinting to assess the occupancy of ToxT on DNA having mutations in one of these binding sites, we found that protection by ToxT of the unaltered binding site was not affected, whereas protection by ToxT of the mutant binding site was significantly reduced in the region of the mutations. The results of further footprinting experiments using DNA templates having +5 bp and +10 bp insertions between the two ToxT binding sites indicate that both binding sites are occupied by ToxT regardless of their positions relative to each other. Based on these results, we propose that ToxT binds independently to two DNA sites between acfA and acfD to activate transcription of both genes.

  18. Two new insulator proteins, Pita and ZIPIC, target CP190 to chromatin.

    PubMed

    Maksimenko, Oksana; Bartkuhn, Marek; Stakhov, Viacheslav; Herold, Martin; Zolotarev, Nickolay; Jox, Theresa; Buxa, Melanie K; Kirsch, Ramona; Bonchuk, Artem; Fedotova, Anna; Kyrchanova, Olga; Renkawitz, Rainer; Georgiev, Pavel

    2015-01-01

    Insulators are multiprotein-DNA complexes that regulate the nuclear architecture. The Drosophila CP190 protein is a cofactor for the DNA-binding insulator proteins Su(Hw), CTCF, and BEAF-32. The fact that CP190 has been found at genomic sites devoid of either of the known insulator factors has until now been unexplained. We have identified two DNA-binding zinc-finger proteins, Pita, and a new factor named ZIPIC, that interact with CP190 in vivo and in vitro at specific interaction domains. Genomic binding sites for these proteins are clustered with CP190 as well as with CTCF and BEAF-32. Model binding sites for Pita or ZIPIC demonstrate a partial enhancer-blocking activity and protect gene expression from PRE-mediated silencing. The function of the CTCF-bound MCP insulator sequence requires binding of Pita. These results identify two new insulator proteins and emphasize the unifying function of CP190, which can be recruited by many DNA-binding insulator proteins. © 2015 Maksimenko et al.; Published by Cold Spring Harbor Laboratory Press.

  19. The influence of repressor DNA binding site architecture on transcriptional control.

    PubMed

    Park, Dan M; Kiley, Patricia J

    2014-08-26

    How the architecture of DNA binding sites dictates the extent of repression of promoters is not well understood. Here, we addressed the importance of the number and information content of the three direct repeats (DRs) in the binding and repression of the icdA promoter by the phosphorylated form of the global Escherichia coli repressor ArcA (ArcA-P). We show that decreasing the information content of the two sites with the highest information (DR1 and DR2) eliminated ArcA binding to all three DRs and ArcA repression of icdA. Unexpectedly, we also found that DR3 occupancy functions principally in repression, since mutation of this low-information-content site both eliminated DNA binding to DR3 and significantly weakened icdA repression, despite the fact that binding to DR1 and DR2 was intact. In addition, increasing the information content of any one of the three DRs or addition of a fourth DR increased ArcA-dependent repression but perturbed signal-dependent regulation of repression. Thus, our data show that the information content and number of DR elements are critical architectural features for maintaining a balance between high-affinity binding and signal-dependent regulation of icdA promoter function in response to changes in ArcA-P levels. Optimization of such architectural features may be a common strategy to either dampen or enhance the sensitivity of DNA binding among the members of the large OmpR/PhoB family of regulators as well as other transcription factors. In Escherichia coli, the response regulator ArcA maintains homeostasis of redox carriers under O2-limiting conditions through a comprehensive repression of carbon oxidation pathways that require aerobic respiration to recycle redox carriers. Although a binding site architecture comprised of a variable number of sequence recognition elements has been identified within the promoter regions of ArcA-repressed operons, it is unclear how this variable architecture dictates transcriptional regulation. By dissecting the role of multiple sequence elements within the icdA promoter, we provide insight into the design principles that allow ArcA to repress transcription within diverse promoter contexts. Our data suggest that the arrangement of recognition elements is tailored to achieve sufficient repression of a given promoter while maintaining appropriate signal-dependent regulation of repression, providing insight into how diverse binding site architectures link changes in O2 with the fine-tuning of carbon oxidation pathway levels. Copyright © 2014 Park and Kiley.
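
    The "information content" of a binding site referred to above is commonly computed per alignment column as 2 bits minus the Shannon entropy of the base frequencies. The sketch below applies that standard formula to a set of aligned sites. It assumes a uniform background, DNA bases only, and no small-sample correction; the abstract does not state which variant the authors used, and the function names are hypothetical.

```python
import math
from collections import Counter


def column_information(column):
    """Information (bits) of one alignment column: 2 - Shannon entropy."""
    counts = Counter(column)
    n = len(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 2.0 - entropy


def site_information(sites):
    """Total information content (bits) summed over all aligned positions."""
    if len({len(s) for s in sites}) != 1:
        raise ValueError("all sites must have the same length")
    # zip(*sites) yields one tuple of bases per alignment column.
    return sum(column_information(col) for col in zip(*sites))
```

    A fully conserved position contributes 2 bits and a position where all four bases are equally frequent contributes 0 bits, so mutating conserved positions of a repeat lowers its total, which is the sense in which a site's information content is decreased in the abstract.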

  20. DNA binding specificity of the basic-helix-loop-helix protein MASH-1.

    PubMed

    Meierhan, D; el-Ariss, C; Neuenschwander, M; Sieber, M; Stackhouse, J F; Allemann, R K

    1995-09-05

    Despite the high degree of sequence similarity in their basic-helix-loop-helix (BHLH) domains, MASH-1 and MyoD are involved in different biological processes. In order to define possible differences between the DNA binding specificities of these two proteins, we investigated the DNA binding properties of MASH-1 by circular dichroism spectroscopy and by electrophoretic mobility shift assays (EMSA). Upon binding to DNA, the BHLH domain of MASH-1 underwent a conformational change from a mainly unfolded to a largely alpha-helical form, and surprisingly, this change was independent of the specific DNA sequence. The same conformational transition could be induced by the addition of 20% 2,2,2-trifluoroethanol. The apparent dissociation constants (KD) of the complexes of full-length MASH-1 with various oligonucleotides were determined from half-saturation points in EMSAs. MASH-1 bound as a dimer to DNA sequences containing an E-box with high affinity (KD = 1.4-4.1 x 10^-14 M^2). However, the specificity of DNA binding was low. The dissociation constant for the complex between MASH-1 and the highest affinity E-box sequence (KD = 1.4 x 10^-14 M^2) was only a factor of 10 smaller than for completely unrelated DNA sequences (KD = approximately 1 x 10^-13 M^2). The DNA binding specificity of MASH-1 was not significantly increased by the formation of a heterodimer with the ubiquitous E12 protein. MASH-1 and MyoD displayed similar binding site preferences, suggesting that their different target gene specificities cannot be explained solely by differential DNA binding. An explanation for these findings is provided on the basis of the known crystal structure of the BHLH domain of MyoD.

  1. The evolution of energy-transducing systems: Studies with archaebacteria

    NASA Technical Reports Server (NTRS)

    Stan-Lotter, Helga

    1993-01-01

    N-ethylmaleimide (NEM) inhibits the ATPase of H. saccharovorum in a nucleotide-protectable manner. The bulk of 14C-NEM is incorporated into subunit 1. Inhibition kinetics indicated a single binding site. To determine the sequence around this site, cyanogen bromide peptides of NEM-labeled ATPase enzyme were prepared and separated on Tris-Tricine gels. Autoradiography indicated that the NEM binding site is probably located in a fragment of Mr 10-12 K. This result will be confirmed by N-terminal sequencing of the peptide. Since the cysteinyl residue, to which NEM is bound, may be located at the C-terminal end, purification and proteolytic treatment of the 10 K peptide will be required. One inhibitor of V-type ATPases, fluorescein isothiocyanate (FITC), also inhibited the ATPase of H. saccharovorum. Preliminary results indicated protection against inhibition by nucleotides. Localization of the binding sites to the major subunits is in progress. An extraction procedure for the membrane sector of the ATPase complex of H. saccharovorum yielded a preparation which was enriched in a peptide of Mr 5,500. Experiments to test the immunological crossreaction with subunit c from the Escherichia coli F-type ATPase and the labeling with 14C-DCCD are currently being carried out. Polyclonal antiserum to the smaller of the major subunits of the ATPase from H. saccharovorum (subunit II) reacts strongly in Western blots with the alpha and beta subunits of the F1 ATPase of E. coli, suggesting highly conserved regions on both types of ATPases. To elucidate further the regions of homology, cyanogen bromide peptides of the beta subunits were prepared for sequence analysis.

  2. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    PubMed Central

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
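
    The "kmer sequence features" mentioned above are, at their simplest, counts of every length-k substring in a sequence. The sketch below shows only that feature-extraction idea, turning a DNA sequence into a fixed-length vector of normalized k-mer frequencies that a classifier such as an SVM could take as input. It is not the kmer-SVM implementation: the function names are hypothetical, and details such as reverse-complement handling, normalization, and kernel choice in the published tool are not described in the abstract.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_feature_vector(seq, k):
    """Return normalized frequencies for all 4**k DNA k-mers, in a fixed order.

    k-mers containing characters other than A, C, G, T are ignored.
    """
    counts = kmer_counts(seq, k)
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]
```

    With k = 2 the vector has 16 entries ordered AA, AC, AG, and so on; bound and unbound regions would each be converted this way before training a classifier on the resulting vectors.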

  3. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets.

    PubMed

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S; Beer, Michael A

    2013-07-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167-80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org.

  4. Binding sites for interaction of peroxiredoxin 6 with surfactant protein A

    PubMed Central

    Krishnaiah, Saikumari Y; Dodia, Chandra; Sorokina, Elena M; Li, Haitao; Feinstein, Sheldon I; Fisher, Aron B

    2016-01-01

    Peroxiredoxin 6 (Prdx6) is a bifunctional enzyme with peroxidase and phospholipase A2 (PLA2) activities. This protein participates in the degradation and remodeling of internalized dipalmitoylphosphatidylcholine (DPPC), the major phospholipid component of lung surfactant. We have shown previously that the PLA2 activity of Prdx6 is inhibited by the lung surfactant-associated protein called surfactant protein A (SP-A) through direct protein-protein interaction. Docking of SP-A and Prdx6 was modeled using the ZDOCK (zlab.bu.edu) program in order to predict molecular sites for binding of the two proteins. The predicted peptide sequences were evaluated for binding to the opposite protein using isothermal titration calorimetry and circular dichroism measurement followed by determination of the effect of the SP-A peptide on the PLA2 activity of Prdx6. The sequences 195EEEAKKLFPK204 in the Prdx6 helix and 83DEELQTELYEIKHQIL99 in SP-A were identified as the sites for hydrophobic interaction and H+-bonding between the two proteins. Treatment of mouse endothelial cells with the SP-A peptide inhibited their recovery from lipid peroxidation associated with oxidative stress indicating inhibition of Prdx6 activity by the peptide in the intact cell. PMID:26723227

  5. A stochastic context free grammar based framework for analysis of protein sequences

    PubMed Central

    Dyrka, Witold; Nebel, Jean-Christophe

    2009-01-01

    Background In the last decade, there have been many applications of formal language theory in bioinformatics such as RNA structure prediction and detection of patterns in DNA. However, in the field of proteomics, the size of the protein alphabet and the complexity of relationship between amino acids have mainly limited the application of formal language theory to the production of grammars whose expressive power is not higher than stochastic regular grammars. However, these grammars, like other state of the art methods, cannot cover any higher-order dependencies such as nested and crossing relationships that are common in proteins. In order to overcome some of these limitations, we propose a Stochastic Context Free Grammar based framework for the analysis of protein sequences where grammars are induced using a genetic algorithm. Results This framework was implemented in a system aiming at the production of binding site descriptors. These descriptors not only allow detection of protein regions that are involved in these sites, but also provide insight in their structure. Grammars were induced using quantitative properties of amino acids to deal with the size of the protein alphabet. Moreover, we imposed some structural constraints on grammars to reduce the extent of the rule search space. Finally, grammars based on different properties were combined to convey as much information as possible. Evaluation was performed on sites of various sizes and complexity described either by PROSITE patterns, domain profiles or a set of patterns. Results show the produced binding site descriptors are human-readable and, hence, highlight biologically meaningful features. Moreover, they achieve good accuracy in both annotation and detection. In addition, findings suggest that, unlike current state-of-the-art methods, our system may be particularly suited to deal with patterns shared by non-homologous proteins. 
    Conclusion A new Stochastic Context Free Grammar based framework has been introduced allowing the production of binding site descriptors for analysis of protein sequences. Experiments have shown that not only is this new approach valid, but it also produces human-readable descriptors for binding sites which have been beyond the capability of current machine learning techniques. PMID:19814800

  6. Oxidation-induced Structural Changes of Ceruloplasmin Foster NGR Motif Deamidation That Promotes Integrin Binding and Signaling

    PubMed Central

    Barbariga, Marco; Curnis, Flavio; Spitaleri, Andrea; Andolfo, Annapaola; Zucchelli, Chiara; Lazzaro, Massimo; Magnani, Giuseppe; Musco, Giovanna; Corti, Angelo; Alessio, Massimo

    2014-01-01

    Asparagine deamidation occurs spontaneously in proteins during aging; deamidation of Asn-Gly-Arg (NGR) sites can lead to the formation of isoAsp-Gly-Arg (isoDGR), a motif that can recognize the RGD-binding site of integrins. Ceruloplasmin (Cp), a ferroxidase present in the cerebrospinal fluid (CSF), contains two NGR sites in its sequence: one exposed on the protein surface (568NGR) and the other buried in the tertiary structure (962NGR). Considering that Cp can undergo oxidative modifications in the CSF of neurodegenerative diseases, we investigated the effect of oxidation on the deamidation of both NGR motifs and, consequently, on the acquisition of integrin binding properties. We observed that the exposed 568NGR site can deamidate under conditions mimicking accelerated Asn aging. In contrast, the hidden 962NGR site can deamidate exclusively when aging occurs under oxidative conditions, suggesting that oxidation-induced structural changes foster deamidation at this site. NGR deamidation in Cp was associated with gain of integrin-binding function, intracellular signaling, and cell pro-adhesive activity. Finally, Cp aging in the CSF from Alzheimer disease patients, but not in control CSF, causes Cp deamidation with gain of integrin-binding function, suggesting that this transition might also occur in pathological conditions. In conclusion, both Cp NGR sites can deamidate during aging under oxidative conditions, likely as a consequence of oxidative-induced structural changes, thereby promoting a gain of function in integrin binding, signaling, and cell adhesion. PMID:24366863

  7. Structural and functional analysis of an enhancer GPEI having a phorbol 12-O-tetradecanoate 13-acetate responsive element-like sequence found in the rat glutathione transferase P gene.

    PubMed

    Okuda, A; Imagawa, M; Maeda, Y; Sakai, M; Muramatsu, M

    1989-10-05

    We have recently identified a typical enhancer, termed GPEI, located about 2.5 kilobases upstream from the transcription initiation site of the rat glutathione transferase P gene. Analyses of 5' and 3' deletion mutants revealed that the cis-acting sequence of GPEI contained a phorbol 12-O-tetradecanoate 13-acetate responsive element (TRE)-like sequence. For the maximal activity, however, GPEI required an adjacent upstream sequence of about 19 base pairs in addition to the TRE-like sequence. With the DNA binding gel-shift assay, we could detect protein(s) that specifically bind to the TRE-like sequence of the GPEI fragment, possibly a c-jun/c-fos complex or a similar protein complex. The sequence immediately upstream of the TRE-like sequence did not have any activity by itself, but augmented the latter activity by about 5-fold.

  8. Location analysis for the estrogen receptor-α reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements

    PubMed Central

    Mason, Christopher E.; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M.; Kallen, Roland G.; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B.

    2010-01-01

    Location analysis for estrogen receptor-α (ERα)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERα-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: <10% and 10–20% nucleotide deviation from the canonical ERE sequence. We demonstrate that ∼50% of all ERα-bound loci do not have a discernable ERE and show that most ERα-bound EREs are not perfect consensus EREs. Approximately one-third of all ERα-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERα-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERα binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers. PMID:20047966
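
    The idea of scoring a candidate site by its nucleotide deviation from a canonical ERE, and of checking its 3-bp spacer, can be sketched as follows. Several things here are assumptions that the abstract does not state: the canonical ERE is taken to be the palindrome GGTCAnnnTGACC, deviation is computed over the 10 half-site positions only, and the function names `ere_deviation` and `spacer_is_cwg` are hypothetical. The authors' actual scoring procedure may differ.

```python
# Assumed canonical half-sites of the palindromic ERE GGTCAnnnTGACC.
LEFT_HALF = "GGTCA"
RIGHT_HALF = "TGACC"


def ere_deviation(site):
    """Fraction of half-site positions that differ from the canonical ERE.

    `site` must be a 13-bp candidate: 5-bp half-site, 3-bp spacer,
    5-bp half-site. The spacer is ignored when counting mismatches.
    """
    site = site.upper()
    if len(site) != 13:
        raise ValueError("expected a 13-bp candidate site")
    observed = site[:5] + site[8:]
    canonical = LEFT_HALF + RIGHT_HALF
    mismatches = sum(a != b for a, b in zip(observed, canonical))
    return mismatches / len(canonical)


def spacer_is_cwg(site):
    """True if the 3-bp spacer matches C(A/T)G."""
    spacer = site.upper()[5:8]
    return len(spacer) == 3 and spacer[0] == "C" and spacer[1] in "AT" and spacer[2] == "G"
```

    Under these assumptions a perfect consensus site scores 0.0, and each mismatched half-site base adds 0.1 to the deviation.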

  9. Location analysis for the estrogen receptor-alpha reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements.

    PubMed

    Mason, Christopher E; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M; Kallen, Roland G; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B

    2010-04-01

    Location analysis for estrogen receptor-alpha (ERalpha)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERalpha-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: <10% and 10-20% nucleotide deviation from the canonical ERE sequence. We demonstrate that approximately 50% of all ERalpha-bound loci do not have a discernable ERE and show that most ERalpha-bound EREs are not perfect consensus EREs. Approximately one-third of all ERalpha-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERalpha-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERalpha binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers.

  10. Simultaneously learning DNA motif along with its position and sequence rank preferences through expectation maximization algorithm.

    PubMed

    Zhang, ZhiZhuo; Chang, Cheng Wei; Hugo, Willy; Cheung, Edwin; Sung, Wing-Kin

    2013-03-01

    Although de novo motifs can be discovered through mining over-represented sequence patterns, this approach misses some real motifs and generates many false positives. To improve accuracy, one solution is to consider some additional binding features (i.e., position preference and sequence rank preference). This information is usually required from the user. This article presents a de novo motif discovery algorithm called SEME (sampling with expectation maximization for motif elicitation), which uses a pure probabilistic mixture model to model the motif's binding features and uses expectation maximization (EM) algorithms to simultaneously learn the sequence motif, position, and sequence rank preferences without asking for any prior knowledge from the user. SEME is both efficient and accurate thanks to two important techniques: the variable motif length extension and importance sampling. Using 75 large-scale synthetic datasets, 32 metazoan compendium benchmark datasets, and 164 chromatin immunoprecipitation sequencing (ChIP-Seq) libraries, we demonstrated the superior performance of SEME over existing programs in finding transcription factor (TF) binding sites. SEME is further applied to a more difficult problem of finding the co-regulated TF (coTF) motifs in 15 ChIP-Seq libraries. It identified significantly more correct coTF motifs and, at the same time, predicted coTF motifs with better matching to the known motifs. Finally, we show that the learned position and sequence rank preferences of each coTF reveal potential interaction mechanisms between the primary TF and the coTF within these sites. Some of these findings were further validated by the ChIP-Seq experiments of the coTFs. The application is available online.

  11. Design of Cyclic Peptide Based Glucose Receptors and Their Application in Glucose Sensing.

    PubMed

    Li, Chao; Chen, Xin; Zhang, Fuyuan; He, Xingxing; Fang, Guozhen; Liu, Jifeng; Wang, Shuo

    2017-10-03

    Glucose assay is of great scientific significance in clinical diagnostics and bioprocess monitoring, and designing new glucose receptors is necessary for the development of more sensitive, selective, and robust glucose detection techniques. Herein, a series of cyclic peptide (CP) glucose receptors were designed to mimic the binding sites of glucose binding protein (GBP), and the CPs' sequences contained the amino acid sites Asp, Asn, His, Asp, and Arg, which constitute the first-layer interactions of GBP. The properties of these CPs used as a glucose receptor or substitute for the GBP were studied by using a quartz crystal microbalance (QCM) technique. It was found that CPs can form a self-assembled monolayer at the Au quartz electrode surface, and the monolayer's properties were characterized by using cyclic voltammetry, electrochemical impedance spectroscopy, and atomic force microscopy. The CPs' binding affinity to saccharides (i.e., galactose, fructose, lactose, sucrose, and maltose) was investigated, and the CPs' sensitivity and selectivity toward glucose were found to be dependent upon the configuration, i.e., the amino acid sequence of the CPs. The cyclic unit with a cyclo[-CNDNHCRDNDC-] sequence gave the highest selectivity and sensitivity for glucose sensing. This work suggests that a synthetic peptide bearing a particular functional sequence could be applied for developing a new generation of glucose receptors and would find huge application in biological, life science, and clinical diagnostics fields.

  12. Molecular determinants of origin discrimination by Orc1 initiators in archaea.

    PubMed

    Dueber, Erin C; Costa, Alessandro; Corn, Jacob E; Bell, Stephen D; Berger, James M

    2011-05-01

    Unlike bacteria, many eukaryotes initiate DNA replication from genomic sites that lack apparent sequence conservation. These loci are identified and bound by the origin recognition complex (ORC), and subsequently activated by a cascade of events that includes recruitment of an additional factor, Cdc6. Archaeal organisms generally possess one or more Orc1/Cdc6 homologs, belonging to the Initiator clade of ATPases associated with various cellular activities (AAA(+)) superfamily; however, these proteins recognize specific sequences within replication origins. Atomic resolution studies have shown that archaeal Orc1 proteins contact double-stranded DNA through an N-terminal AAA(+) domain and a C-terminal winged-helix domain (WHD), but use remarkably few base-specific contacts. To investigate the biochemical effects of these associations, we mutated the DNA-interacting elements of the Orc1-1 and Orc1-3 paralogs from the archaeon Sulfolobus solfataricus, and tested their effect on origin binding and deformation. We find that the AAA(+) domain has an unpredicted role in controlling the sequence selectivity of DNA binding, despite an absence of base-specific contacts to this region. Our results show that both the WHD and ATPase region influence origin recognition by Orc1/Cdc6, and suggest that not only DNA sequence, but also local DNA structure help define archaeal initiator binding sites. © The Author(s) 2011. Published by Oxford University Press.

  13. PredictProtein—an open resource for online prediction of protein structural and functional features

    PubMed Central

    Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard

    2014-01-01

    PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431

  14. Improved bioactivity of G-rich triplex-forming oligonucleotides containing modified guanine bases

    PubMed Central

    Rogers, Faye A; Lloyd, Janice A; Tiwari, Meetu Kaushik

    2014-01-01

    Triplex structures generated by sequence-specific triplex-forming oligonucleotides (TFOs) have proven to be promising tools for gene targeting strategies. In addition, triplex technology has been highly utilized to study the molecular mechanisms of DNA repair, recombination and mutagenesis. However, triplex formation utilizing guanine-rich oligonucleotides as third strands can be inhibited by potassium-induced self-association resulting in G-quadruplex formation. We report here that guanine-rich TFOs partially substituted with 8-aza-7-deaza-guanine (PPG) have improved target site binding in potassium compared with TFOs containing the natural guanine base. We designed PPG-substituted TFOs to bind to a polypurine sequence in the supFG1 reporter gene. The binding efficiency of PPG-substituted TFOs to the target sequence was analyzed using electrophoresis mobility gel shift assays. We have determined that in the presence of potassium, the non-substituted TFO, AG30, did not bind to its target sequence; however, binding was observed with the PPG-substituted AG30 under conditions with up to 140 mM KCl. The PPG-TFOs were able to maintain their ability to induce genomic modifications as measured by an assay for gene-targeted mutagenesis. In addition, these compounds were capable of triplex-induced DNA double strand breaks, which resulted in activation of apoptosis. PMID:25483840

  15. Discovery of 12-mer peptides that bind to wood lignin

    PubMed Central

    Yamaguchi, Asako; Isozaki, Katsuhiro; Nakamura, Masaharu; Takaya, Hikaru; Watanabe, Takashi

    2016-01-01

    Lignin, an abundant terrestrial polymer, is the only large-volume renewable feedstock composed of an aromatic skeleton. Lignin has been used mostly as an energy source during paper production; however, recent interest in replacing fossil fuels with renewable resources has highlighted its potential value in providing aromatic chemicals. Highly selective degradation of lignin is pivotal for industrial production of paper, biofuels, chemicals, and materials. However, few studies have examined natural and synthetic molecular components recognizing the heterogeneous aromatic polymer. Here, we report the first identification of lignin-binding peptides possessing characteristic sequences using a phage display technique. The consensus sequence HFPSP was found in several lignin-binding peptides, and the outer amino acid sequence affected the binding affinity of the peptides. Substitution of phenylalanine7 with Ile in the lignin-binding peptide C416 (HFPSPIFQRHSH) decreased the affinity of the peptide for softwood lignin without changing its affinity for hardwood lignin, indicating that C416 recognised structural differences between the lignins. Circular dichroism spectroscopy demonstrated that this peptide adopted a highly flexible random coil structure, allowing key residues to be appropriately arranged in relation to the binding site in lignin. These results provide a useful platform for designing synthetic and biological catalysts that selectively bind to lignin. PMID:26903196

  16. Characterization of BreR Interaction with the Bile Response Promoters breAB and breR in Vibrio cholerae

    PubMed Central

    Cerda-Maira, Francisca A.; Kovacikova, Gabriela; Jude, Brooke A.; Skorupski, Karen

    2013-01-01

    The Vibrio cholerae BreR protein is a transcriptional repressor of the breAB efflux system operon, which encodes proteins involved in bile resistance. In a previous study (F. A. Cerda-Maira, C. S. Ringelberg, and R. K. Taylor, J. Bacteriol. 190:7441–7452, 2008), we used gel mobility shift assays to determine that BreR binds at two independent binding sites at the breAB promoter and a single site at its own promoter. Here it is shown, by DNase I footprinting and site-directed mutagenesis, that BreR is able to bind at a distal and a proximal site in the breAB promoter. However, only one of these sites, the proximal 29-bp site, is necessary for BreR-mediated transcriptional repression of breAB expression. In addition, it was determined that BreR represses its own expression by recognizing a 28-bp site at the breR promoter. These sites comprise regions of dyad symmetry within which residues critical for BreR function could be identified. The BreR consensus sequence AANGTANAC-N6-GTNTACNTT overlaps the −35 region at both promoters, implying that the repression of gene expression is achieved by interfering with RNA polymerase binding at these promoters. PMID:23144245
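
    The consensus AANGTANAC-N6-GTNTACNTT quoted in this abstract can be turned into a pattern matcher, and its dyad symmetry can be checked directly, as in the sketch below. The helper names (`reverse_complement`, `consensus_to_regex`, `find_sites`) are hypothetical and the code is only an illustration of how such a consensus might be scanned for; it is not the authors' method. N is treated as any of A, C, G or T.

```python
import re

# Minimal IUPAC table: only the symbols that occur in the consensus.
IUPAC_REGEX = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

LEFT_HALF = "AANGTANAC"
RIGHT_HALF = "GTNTACNTT"
SPACER_LENGTH = 6


def reverse_complement(site):
    """Reverse-complement a sequence written with A, C, G, T and N."""
    return "".join(COMPLEMENT[base] for base in reversed(site))


def consensus_to_regex(left, spacer_length, right):
    """Compile a two-half-site consensus with a fixed-length spacer."""
    def body(half):
        return "".join(IUPAC_REGEX[base] for base in half)
    return re.compile(body(left) + "[ACGT]{%d}" % spacer_length + body(right))


BRER_PATTERN = consensus_to_regex(LEFT_HALF, SPACER_LENGTH, RIGHT_HALF)


def find_sites(seq):
    """Return 0-based start positions of non-overlapping consensus matches."""
    return [m.start() for m in BRER_PATTERN.finditer(seq.upper())]
```

    The right half-site is the exact reverse complement of the left one, which is what dyad symmetry means at the sequence level, so `reverse_complement(LEFT_HALF)` returns `RIGHT_HALF`.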

  17. Principles of regulatory information conservation between mouse and human.

    PubMed

    Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; Wu, Weisheng; Cayting, Philip; Boyle, Alan P; Sundaram, Vasavi; Xing, Xiaoyun; Dogan, Nergiz; Li, Jingjing; Euskirchen, Ghia; Lin, Shin; Lin, Yiing; Visel, Axel; Kawli, Trupti; Yang, Xinqiong; Patacsil, Dorrelyn; Keller, Cheryl A; Giardine, Belinda; Kundaje, Anshul; Wang, Ting; Pennacchio, Len A; Weng, Zhiping; Hardison, Ross C; Snyder, Michael P

    2014-11-20

    To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.

  18. Mechanism of pathogen recognition by human dectin-2.

    PubMed

    Feinberg, Hadar; Jégouzo, Sabine A F; Rex, Maximus J; Drickamer, Kurt; Weis, William I; Taylor, Maureen E

    2017-08-11

    Dectin-2, a C-type lectin on macrophages and other cells of the innate immune system, functions in response to pathogens, particularly fungi. The carbohydrate-recognition domain (CRD) in dectin-2 is linked to a transmembrane sequence that interacts with the common Fc receptor γ subunit to initiate immune signaling. The molecular mechanism by which dectin-2 selectively binds to pathogens has been investigated by characterizing the CRD expressed in a bacterial system. Competition binding studies indicated that the CRD binds to monosaccharides with modest affinity and that affinity was greatly enhanced for mannose linked α1-2 or α1-4 to a second mannose residue. Glycan array analysis confirmed selective binding of the CRD to glycans that contain Manα1-2Man epitopes. Crystals of the CRD in complex with a mammalian-type high-mannose Man9GlcNAc2 oligosaccharide exhibited interaction with Manα1-2Man on two different termini of the glycan, with the reducing-end mannose residue ligated to Ca2+ in a primary binding site and the nonreducing terminal mannose residue occupying an adjacent secondary site. Comparison of the binding sites in DC-SIGN and langerin, two other pathogen-binding receptors of the innate immune system, revealed why these two binding sites accommodate only terminal Manα1-2Man structures, whereas dectin-2 can bind Manα1-2Man in internal positions in mannans and other polysaccharides. The specificity and geometry of the dectin-2-binding site provide the molecular mechanism for binding of dectin-2 to fungal mannans and also to bacterial lipopolysaccharides, capsular polysaccharides, and lipoarabinomannans that contain the Manα1-2Man disaccharide unit. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  19. Crystal structure of the DNA-binding domain of the LysR-type transcriptional regulator CbnR in complex with a DNA fragment of the recognition-binding site in the promoter region.

    PubMed

    Koentjoro, Maharani Pertiwi; Adachi, Naruhiko; Senda, Miki; Ogawa, Naoto; Senda, Toshiya

    2018-03-01

    LysR-type transcriptional regulators (LTTRs) are among the most abundant transcriptional regulators in bacteria. CbnR is an LTTR derived from Cupriavidus necator (formerly Alcaligenes eutrophus or Ralstonia eutropha) NH9 and is involved in transcriptional activation of the cbnABCD genes encoding chlorocatechol degradative enzymes. CbnR interacts with a cbnA promoter region of approximately 60 bp in length that contains the recognition-binding site (RBS) and activation-binding site (ABS). Upon inducer binding, CbnR seems to undergo conformational changes, leading to the activation of the transcription. Since the interaction of an LTTR with RBS is considered to be the first step of the transcriptional activation, the CbnR-RBS interaction is responsible for the selectivity of the promoter to be activated. To understand the sequence selectivity of CbnR, we determined the crystal structure of the DNA-binding domain of CbnR in complex with RBS of the cbnA promoter at 2.55 Å resolution. The crystal structure revealed details of the interactions between the DNA-binding domain and the promoter DNA. A comparison with the previously reported crystal structure of the DNA-binding domain of BenM in complex with its cognate RBS showed several differences in the DNA interactions, despite the structural similarity between CbnR and BenM. These differences explain the observed promoter sequence selectivity between CbnR and BenM. Particularly, the difference between Thr33 in CbnR and Ser33 in BenM appears to affect the conformations of neighboring residues, leading to the selective interactions with DNA. Atomic coordinates and structure factors for the DNA-binding domain of Cupriavidus necator NH9 CbnR in complex with RBS are available in the Protein Data Bank under the accession code 5XXP. © 2018 Federation of European Biochemical Societies.

  20. Convergent transmission of RNAi guide-target mismatch information across Argonaute internal allosteric network.

    PubMed

    Joseph, Thomas T; Osman, Roman

    2012-01-01

    In RNA interference, a guide strand derived from a short dsRNA such as a microRNA (miRNA) is loaded into Argonaute, the central protein in the RNA Induced Silencing Complex (RISC) that silences messenger RNAs on a sequence-specific basis. The positions of any mismatched base pairs in an miRNA determine which Argonaute subtype is used. Subsequently, the Argonaute-guide complex binds and silences complementary target mRNAs; certain Argonautes cleave the target. Mismatches between guide strand and the target mRNA decrease cleavage efficiency. Thus, loading and silencing both require that signals about the presence of a mismatched base pair are communicated from the mismatch site to effector sites. These effector sites include the active site, to prevent target cleavage; the binding groove, to modify nucleic acid binding affinity; and surface allosteric sites, to control recruitment of additional proteins to form the RISC. To examine how such signals may be propagated, we analyzed the network of internal allosteric pathways in Argonaute exhibited through correlations of residue-residue interactions. The emerging network can be described as a set of pathways emanating from the core of the protein near the active site, distributed into the bulk of the protein, and converging upon a distributed cluster of surface residues. Nucleotides in the guide strand "seed region" have a stronger relationship with the protein than other nucleotides, concordant with their importance in sequence selectivity. Finally, any of several seed region guide-target mismatches cause certain Argonaute residues to have modified correlations with the rest of the protein. This arises from the aggregation of relatively small interaction correlation changes distributed across a large subset of residues. 
    These residues are in effector sites: the active site, binding groove, and surface, implying that direct functional consequences of guide-target mismatches are mediated through the cumulative effects of a large number of internal allosteric pathways.

  1. Convergent Transmission of RNAi Guide-Target Mismatch Information across Argonaute Internal Allosteric Network

    PubMed Central

    Joseph, Thomas T.; Osman, Roman

    2012-01-01

    In RNA interference, a guide strand derived from a short dsRNA such as a microRNA (miRNA) is loaded into Argonaute, the central protein in the RNA Induced Silencing Complex (RISC) that silences messenger RNAs on a sequence-specific basis. The positions of any mismatched base pairs in an miRNA determine which Argonaute subtype is used. Subsequently, the Argonaute-guide complex binds and silences complementary target mRNAs; certain Argonautes cleave the target. Mismatches between guide strand and the target mRNA decrease cleavage efficiency. Thus, loading and silencing both require that signals about the presence of a mismatched base pair are communicated from the mismatch site to effector sites. These effector sites include the active site, to prevent target cleavage; the binding groove, to modify nucleic acid binding affinity; and surface allosteric sites, to control recruitment of additional proteins to form the RISC. To examine how such signals may be propagated, we analyzed the network of internal allosteric pathways in Argonaute exhibited through correlations of residue-residue interactions. The emerging network can be described as a set of pathways emanating from the core of the protein near the active site, distributed into the bulk of the protein, and converging upon a distributed cluster of surface residues. Nucleotides in the guide strand “seed region” have a stronger relationship with the protein than other nucleotides, concordant with their importance in sequence selectivity. Finally, any of several seed region guide-target mismatches cause certain Argonaute residues to have modified correlations with the rest of the protein. This arises from the aggregation of relatively small interaction correlation changes distributed across a large subset of residues. 
    These residues are in effector sites: the active site, binding groove, and surface, implying that direct functional consequences of guide-target mismatches are mediated through the cumulative effects of a large number of internal allosteric pathways. PMID:23028290

  2. Cryptic glucocorticoid receptor-binding sites pervade genomic NF-κB response elements.

    PubMed

    Hudson, William H; Vera, Ian Mitchelle S de; Nwachukwu, Jerome C; Weikum, Emily R; Herbst, Austin G; Yang, Qin; Bain, David L; Nettles, Kendall W; Kojetin, Douglas J; Ortlund, Eric A

    2018-04-06

    Glucocorticoids (GCs) are potent repressors of NF-κB activity, making them a preferred choice for treatment of inflammation-driven conditions. Despite the widespread use of GCs in the clinic, current models are inadequate to explain the role of the glucocorticoid receptor (GR) within this critical signaling pathway. GR binding directly to NF-κB itself (tethering in a DNA binding-independent manner) represents the standing model of how GCs inhibit NF-κB-driven transcription. We demonstrate that direct binding of GR to genomic NF-κB response elements (κBREs) mediates GR-driven repression of inflammatory gene expression. We report five crystal structures and solution NMR data of GR DBD-κBRE complexes, which reveal that GR recognizes a cryptic response element between the binding footprints of NF-κB subunits within κBREs. These cryptic sequences exhibit high sequence and functional conservation, suggesting that GR binding to κBREs is an evolutionarily conserved mechanism of controlling the inflammatory response.

  3. A novel Arg H52/Tyr H33 conservative motif in antibodies: A correlation between sequence of antibodies and antigen binding.

    PubMed

    Petrov, Artem; Arzhanik, Vladimir; Makarov, Gennady; Koliasnikov, Oleg

    2016-08-01

    Antibodies are a family of proteins responsible for antigen recognition. Computational modeling of the interaction between an antigen and an antibody is very important when a crystallographic structure is unavailable. In this research, we discovered a correlation between the amino acid sequence of an antibody and its specific binding characteristics, using the example of a novel conservative binding motif that consists of four residues: Arg H52, Tyr H33, Thr H59, and Glu H61. These residues are specifically oriented in the binding site and interact with each other in a specific manner. The residues of the binding motif interact strictly with negatively charged groups of antigens and form a binding complex. The mechanism of interaction and the characteristics of the complex were also determined. The results of this research can be used to increase the accuracy of computational antibody-antigen interaction modeling and for post-modeling quality control of the modeled structures.

  4. Generation of tumour-necrosis-factor-alpha-specific affibody molecules capable of blocking receptor binding in vitro.

    PubMed

    Jonsson, Andreas; Wållberg, Helena; Herne, Nina; Ståhl, Stefan; Frejd, Fredrik Y

    2009-08-17

    Affibody molecules specific for human TNF-alpha (tumour necrosis factor-alpha) were selected by phage-display technology from a library based on the 58-residue Protein A-derived Z domain. TNF-alpha is a proinflammatory cytokine involved in several inflammatory diseases and, to this day, four TNF-alpha-blocking protein pharmaceuticals have been approved for clinical use. The phage selection generated 18 unique cysteine-free affibody sequences of which 12 were chosen, after sequence cluster analysis, for characterization as proteins. Biosensor binding studies of the 12 Escherichia coli-produced and IMAC (immobilized-metal-ion affinity chromatography)-purified affibody molecules revealed three variants that demonstrated the strongest binding to human TNF-alpha. These three affibody molecules were subjected to kinetic binding analysis and also tested for their binding to mouse, rat and pig TNF-alpha. For ZTNF-alpha:185, subnanomolar affinity (KD=0.1-0.5 nM) for human TNF-alpha was demonstrated, as well as significant binding to TNF-alpha from the other species. Furthermore, the binding site was found to overlap with the binding site for the TNF-alpha receptor, since this interaction could be efficiently blocked by the ZTNF-alpha:185 affibody. When investigating six dimeric affibody constructs with different linker lengths, and one trimeric construct, it was found that the inhibition of the TNF-alpha binding to its receptor could be further improved by using dimers with extended linkers and/or a trimeric affibody construct. The potential implication of the results for the future design of affibody-based reagents for the diagnosis of inflammation is discussed.

  5. ETS target genes: Identification of Egr1 as a target by RNA differential display and whole genome PCR techniques

    PubMed Central

    Robinson, Lois; Panayiotakis, Alexandra; Papas, Takis S.; Kola, Ismail; Seth, Arun

    1997-01-01

    ETS transcription factors play important roles in hematopoiesis, angiogenesis, and organogenesis during murine development. The ETS genes also have a role in neoplasia, for example in Ewing’s sarcomas and retrovirally induced cancers. The ETS genes encode transcription factors that bind to specific DNA sequences and activate transcription of various cellular and viral genes. To isolate novel ETS target genes, we used two approaches. In the first approach, we isolated genes by the RNA differential display technique. Previously, we have shown that the overexpression of ETS1 and ETS2 genes effects transformation of NIH 3T3 cells and specific transformants produce high levels of the ETS proteins. To isolate ETS1 and ETS2 responsive genes in these transformed cells, we prepared RNA from ETS1, ETS2 transformants, and normal NIH 3T3 cell lines and converted it into cDNA. This cDNA was amplified by PCR and displayed on sequencing gels. The differentially displayed bands were subcloned into plasmid vectors. By Northern blot analysis, several clones showed differential patterns of mRNA expression in the NIH 3T3-, ETS1-, and ETS2-expressing cell lines. Sixteen clones were analyzed by DNA sequence analysis, and 13 of them appeared to be unique because their DNA sequences did not match with any of the known genes present in the gene bank. Three known genes were found to be identical to the CArG box binding factor, phospholipase A2-activating protein, and early growth response 1 (Egr1) genes. In the second approach, to isolate ETS target promoters directly, we performed ETS1 binding with MboI-cleaved genomic DNA in the presence of a specific mAb followed by whole genome PCR. The immune complex-bound ETS binding sites containing DNA fragments were amplified and subcloned into pBluescript and subjected to DNA sequence and computer analysis. We found that, of a large number of clones isolated, 43 represented unique sequences not previously identified. 
Three clones turned out to contain regulatory sequences derived from human serglycin, preproapolipoprotein C II, and Egr1 genes. The ETS binding sites derived from these three regulatory sequences showed specific binding with recombinant ETS proteins. Of interest, Egr1 was identified by both of these techniques, suggesting strongly that it is indeed an ETS target gene. PMID:9207063

  6. Interaction of the Transcription Start Site Core Region and Transcription Factor YY1 Determine Ascorbate Transporter SVCT2 Exon 1a Promoter Activity

    PubMed Central

    Qiao, Huan; May, James M.

    2012-01-01

    Transcription of the ascorbate transporter, SVCT2, is driven by two distinct promoters in exon 1 of the transporter sequence. The exon 1a promoter lacks a classical transcription start site and little is known about regulation of promoter activity in the transcription start site core (TSSC) region. Here we present evidence that the TSSC binds the multifunctional initiator-binding protein YY1. Electrophoresis shift assays using YY1 antibody showed that YY1 is present as one of two major complexes that specifically bind to the TSSC. The other complex contains the transcription factor NF-Y. Mutations in the TSSC that decreased YY1 binding also impaired the exon 1a promoter activity despite the presence of an upstream activating NF-Y/USF complex, suggesting that YY1 is involved in the regulation of the exon 1a transcription. Furthermore, YY1 interaction with NF-Y and/or USF synergistically enhanced the exon 1a promoter activity in transient transfections and co-activator p300 enhanced their synergistic activation. We propose that the TSSC plays a vital role in the exon 1a transcription and that this function is partially carried out by the transcription factor YY1. Moreover, co-activator p300 might be able to synergistically enhance the TSSC function via a “bridge” mechanism with upstream sequences. PMID:22532872

  7. From reads to regions: a Bioconductor workflow to detect differential binding in ChIP-seq data

    PubMed Central

    Lun, Aaron T. L.; Smyth, Gordon K.

    2016-01-01

    Chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq) is widely used to identify the genomic binding sites for a protein of interest. Most conventional approaches to ChIP-seq data analysis involve the detection of the absolute presence (or absence) of a binding site. However, an alternative strategy is to identify changes in the binding intensity between two biological conditions, i.e., differential binding (DB). This may yield more relevant results than conventional analyses, as changes in binding can be associated with the biological difference being investigated. The aim of this article is to facilitate the implementation of DB analyses, by comprehensively describing a computational workflow for the detection of DB regions from ChIP-seq data. The workflow is based primarily on R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, from alignment of read sequences to interpretation and visualization of putative DB regions. In particular, detection of DB regions will be conducted using the counts for sliding windows from the csaw package, with statistical modelling performed using methods in the edgeR package. Analyses will be demonstrated on real histone mark and transcription factor data sets. This will provide readers with practical usage examples that can be applied in their own studies. PMID:26834993
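
    The window-based counting that this workflow builds on can be illustrated outside R. The following is a minimal Python sketch, not the csaw implementation: the function name count_reads_in_windows, the window width, and the step size are assumptions chosen for illustration, and each read is reduced to a single start position.

```python
def count_reads_in_windows(read_starts, chrom_length, width=10, step=5):
    """Count read start positions that fall in each sliding window.

    Windows are half-open intervals [start, start + width) placed every
    `step` bases along a sequence of length `chrom_length`.
    Hypothetical helper for illustration only.
    """
    counts = []
    last_start = max(chrom_length - width, 0)
    for start in range(0, last_start + 1, step):
        end = start + width
        # Count the reads whose start position lies inside this window.
        n = sum(1 for pos in read_starts if start <= pos < end)
        counts.append((start, end, n))
    return counts


if __name__ == "__main__":
    # Toy read positions (hypothetical data, not from the article).
    for window in count_reads_in_windows([1, 2, 12], chrom_length=20):
        print(window)
```

    In a differential binding analysis, per-window counts of this kind would be computed for every sample and then passed to a statistical model; that modelling step is not shown here.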

  8. Polar bear hemoglobin and human Hb A0: same 2,3-diphosphoglycerate binding site but asymmetry of the binding?

    PubMed

    Pomponi, Massimo; Bertonati, Claudia; Patamia, Maria; Marta, Maurizio; Derocher, Andrew E; Lydersen, Christian; Kovacs, Kit M; Wiig, Oystein; Bårdgard, Astrid J

    2002-11-01

    Polar bear (Ursus maritimus) hemoglobin (Hb) shows a low response to 2,3-diphosphoglycerate (2,3-DPG), compared to human Hb A0, even though these proteins have the same 2,3-DPG-binding site. In addition, polar bear Hb shows a high response to chloride and an alkaline Bohr effect (Δlog P50/ΔpH) that is significantly greater than that of human Hb A0. The difference in sequence Pro (Hb A0) → Gly (polar bear Hb) at position A2 in the A helix seems to be critical for reduced binding of 2,3-DPG. Our results also show that the A2 position may influence not only the flexibility of the A helix, but that differences in flexibility of the first turn of the A helix may affect the unloading of oxygen for the intrinsic ligand affinities of the alpha and beta chains. However, preferential binding to either chain can only take place if there is appreciable asymmetric binding of the phosphoric effector. Regarding this point, 31P NMR data suggest a loss of symmetry of the 2,3-DPG-binding site in the deoxyHb-2,3-DPG complex.

  9. Microfabricated, flowthrough porous apparatus for discrete detection of binding reactions

    DOEpatents

    Beattie, Kenneth L.

    1998-01-01

    An improved microfabricated apparatus for conducting a multiplicity of individual and simultaneous binding reactions is described. The apparatus comprises a substrate on which are located discrete and isolated sites for binding reactions. The apparatus is characterized by discrete and isolated regions that extend through said substrate and terminate on a second surface thereof such that when a test sample is applied to the substrate, it is capable of penetrating through each such region during the course of said binding reaction. The apparatus is especially useful for sequencing by hybridization of DNA molecules.

  10. TRANSFAC: an integrated system for gene expression regulation.

    PubMed

    Wingender, E; Chen, X; Hehl, R; Karas, H; Liebich, I; Matys, V; Meinhardt, T; Prüss, M; Reuter, I; Schacherer, F

    2000-01-01

    TRANSFAC is a database on transcription factors, their genomic binding sites and DNA-binding profiles (http://transfac.gbf.de/TRANSFAC/). Its content has been enhanced, in particular by information about training sequences used for the construction of nucleotide matrices as well as by data on plant sites and factors. Moreover, TRANSFAC has been extended by two new modules: PathoDB provides data on pathologically relevant mutations in regulatory regions and transcription factor genes, whereas S/MARt DB compiles features of scaffold/matrix attached regions (S/MARs) and the proteins binding to them. Additionally, the databases TRANSPATH, about signal transduction, and CYTOMER, about organs and cell types, have been extended and are increasingly integrated with the TRANSFAC data sources.
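
    The relationship between training sequences and a nucleotide matrix can be shown with a small example. The following Python sketch is an assumption-laden illustration and does not reproduce the TRANSFAC file format or any TRANSFAC tool: the function name build_count_matrix and the toy sites are hypothetical, and the sites are assumed to be pre-aligned and of equal length.

```python
def build_count_matrix(aligned_sites, alphabet="ACGT"):
    """Build a position-specific count matrix from aligned binding sites.

    Returns one dict per alignment column, mapping each base to the
    number of training sequences that carry it at that position.
    """
    if not aligned_sites:
        raise ValueError("at least one training sequence is required")
    width = len(aligned_sites[0])
    if any(len(site) != width for site in aligned_sites):
        raise ValueError("all training sequences must have the same length")

    matrix = [{base: 0 for base in alphabet} for _ in range(width)]
    for site in aligned_sites:
        for column, base in enumerate(site.upper()):
            if base not in matrix[column]:
                raise ValueError(f"unexpected symbol {base!r}")
            matrix[column][base] += 1
    return matrix


if __name__ == "__main__":
    # Hypothetical training sequences, for illustration only.
    for column in build_count_matrix(["ACG", "ACT", "AGG"]):
        print(column)
```

    Dividing each column by the number of training sequences, usually after adding pseudocounts, would turn these counts into the frequencies from which a weight matrix is commonly derived.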

  11. Human T-cell leukemia virus type 1 Tax requires direct access to DNA for recruitment of CREB binding protein to the viral promoter.

    PubMed

    Lenzmeier, B A; Giebler, H A; Nyborg, J K

    1998-02-01

    Efficient human T-cell leukemia virus type 1 (HTLV-1) replication and viral gene expression are dependent upon the virally encoded oncoprotein Tax. To activate HTLV-1 transcription, Tax interacts with the cellular DNA binding protein cyclic AMP-responsive element binding protein (CREB) and recruits the coactivator CREB binding protein (CBP), forming a nucleoprotein complex on the three viral cyclic AMP-responsive elements (CREs) in the HTLV-1 promoter. Short stretches of dG-dC-rich (GC-rich) DNA, immediately flanking each of the viral CREs, are essential for Tax recruitment of CBP in vitro and Tax transactivation in vivo. Although the importance of the viral CRE-flanking sequences is well established, several studies have failed to identify an interaction between Tax and the DNA. The mechanistic role of the viral CRE-flanking sequences has therefore remained enigmatic. In this study, we used high resolution methidiumpropyl-EDTA iron(II) footprinting to show that Tax extended the CREB footprint into the GC-rich DNA flanking sequences of the viral CRE. The Tax-CREB footprint was enhanced but not extended by the KIX domain of CBP, suggesting that the coactivator increased the stability of the nucleoprotein complex. Conversely, the footprint pattern of CREB on a cellular CRE lacking GC-rich flanking sequences did not change in the presence of Tax or Tax plus KIX. The minor-groove DNA binding drug chromomycin A3 bound to the GC-rich flanking sequences and inhibited the association of Tax and the Tax-CBP complex without affecting CREB binding. Tax specifically cross-linked to the viral CRE in the 5'-flanking sequence, and this cross-link was blocked by chromomycin A3. Together, these data support a model where Tax interacts directly with both CREB and the minor-groove viral CRE-flanking sequences to form a high-affinity binding site for the recruitment of CBP to the HTLV-1 promoter.

  12. CasA mediates Cas3-catalyzed target degradation during CRISPR RNA-guided interference.

    PubMed

    Hochstrasser, Megan L; Taylor, David W; Bhat, Prashant; Guegler, Chantal K; Sternberg, Samuel H; Nogales, Eva; Doudna, Jennifer A

    2014-05-06

    In bacteria, the clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) DNA-targeting complex Cascade (CRISPR-associated complex for antiviral defense) uses CRISPR RNA (crRNA) guides to bind complementary DNA targets at sites adjacent to a trinucleotide signature sequence called the protospacer adjacent motif (PAM). The Cascade complex then recruits Cas3, a nuclease-helicase that catalyzes unwinding and cleavage of foreign double-stranded DNA (dsDNA) bearing a sequence matching that of the crRNA. Cascade comprises the CasA-E proteins and one crRNA, forming a structure that binds and unwinds dsDNA to form an R loop in which the target strand of the DNA base pairs with the 32-nt RNA guide sequence. Single-particle electron microscopy reconstructions of dsDNA-bound Cascade with and without Cas3 reveal that Cascade positions the PAM-proximal end of the DNA duplex at the CasA subunit and near the site of Cas3 association. The finding that the DNA target and Cas3 colocalize with CasA implicates this subunit in a key target-validation step during DNA interference. We show biochemically that base pairing of the PAM region is unnecessary for target binding but critical for Cas3-mediated degradation. In addition, the L1 loop of CasA, previously implicated in PAM recognition, is essential for Cas3 activation following target binding by Cascade. Together, these data show that the CasA subunit of Cascade functions as an essential partner of Cas3 by recognizing DNA target sites and positioning Cas3 adjacent to the PAM to ensure cleavage.

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pasek, Marta; Boeggeman, Elizabeth; Ramakrishnan, Boopathy

    The expression of recombinant proteins in Escherichia coli often leads to inactive aggregated proteins known as the inclusion bodies. To date, the best available tool has been the use of fusion tags, including the carbohydrate-binding protein; e.g., the maltose-binding protein (MBP) that enhances the solubility of recombinant proteins. However, none of these fusion tags work universally with every partner protein. We hypothesized that galectins, which are also carbohydrate-binding proteins, may help as fusion partners in folding the mammalian proteins in E. coli. Here we show for the first time that a small soluble lectin, human galectin-1, one member of a large galectin family, can function as a fusion partner to produce soluble folded recombinant human glycosyltransferase, β-1,4-galactosyltransferase-7 (β4Gal-T7), in E. coli. The enzyme β4Gal-T7 transfers galactose to xylose during the synthesis of the tetrasaccharide linker sequence attached to a Ser residue of proteoglycans. Without a fusion partner, β4Gal-T7 is expressed in E. coli as inclusion bodies. We have designed a new vector construct, pLgals1, from pET-23a that includes the sequence for human galectin-1, followed by the Tev protease cleavage site, a 6x His-coding sequence, and a multi-cloning site where a cloned gene is inserted. After lactose affinity column purification of galectin-1-β4Gal-T7 fusion protein, the unique protease cleavage site allows the protein β4Gal-T7 to be cleaved from galectin-1 that binds and elutes from UDP-agarose column. The eluted protein is enzymatically active, and shows CD spectra comparable to the folded β4Gal-T1. The engineered galectin-1 vector could prove to be a valuable tool for expressing other proteins in E. coli.

  14. Cloning and characterization of a novel human STAR domain containing cDNA KHDRBS2.

    PubMed

    Wang, Liu; Xu, Jian; Zeng, Li; Ye, Xin; Wu, Qihan; Dai, Jianfeng; Ji, Chaoneng; Gu, Shaohua; Zhao, Chunhua; Xie, Yi; Mao, Yumin

    2002-12-01

    KHDRBS2 (KH domain containing, RNA binding, signal transduction associated 2) is an RNA-binding protein that is tyrosine phosphorylated by Src during mitosis. It contains a KH domain, which is embedded in a larger conserved domain called the STAR domain. This protein has 99% sequence identity with rat SLM-1 (the Sam68-like mammalian protein 1) and 98% sequence identity with mouse SLM-1 in its STAR domain. KHDRBS2 has the characteristic Sam68 SH2 and SH3 domain binding sites. RT-PCR analysis showed that its transcript is ubiquitously expressed. The characterization of KHDRBS2 indicates that it may link tyrosine kinase signaling cascades with some aspect of RNA metabolism.

  15. Identification of a factor in HeLa cells specific for an upstream transcriptional control sequence of an EIA-inducible adenovirus promoter and its relative abundance in infected and uninfected cells.

    PubMed Central

    SivaRaman, L; Subramanian, S; Thimmappaya, B

    1986-01-01

    Utilizing the gel electrophoresis/DNA binding assay, a factor specific for the upstream transcriptional control sequence of the EIA-inducible adenovirus EIIA-early promoter has been detected in HeLa cell nuclear extract. Analysis of linker-scanning mutants of the promoter by DNA binding assays and methylation-interference experiments show that the factor binds to the 17-nucleotide sequence 5' TGGAGATGACGTAGTTT 3' located between positions -66 and -82 upstream from the cap site. This sequence has been shown to be essential for transcription of this promoter. The EIIA-early-promoter specific factor was found to be present at comparable levels in uninfected HeLa cells and in cells infected with either wild-type adenovirus or the EIA-deletion mutant dl312 under conditions in which the EIA proteins are induced to high levels [7 or 20 hr after infection in the presence of arabinonucleoside (cytosine arabinoside)]. Based on the quantitation in DNA binding assays, it appears that the mechanism of EIA-activated transcription of the EIIA-early promoter does not involve a net change in the amounts of this factor. PMID:2942943
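
    The stated site length and its coordinates can be checked against each other with simple arithmetic. The short Python sketch below is only a consistency check and assumes that both boundary positions are inclusive; the helper name span_length is hypothetical.

```python
# Binding site sequence as quoted in the abstract (5' to 3').
SITE = "TGGAGATGACGTAGTTT"


def span_length(first, last):
    """Number of positions between two coordinates, both ends inclusive."""
    return abs(last - first) + 1


if __name__ == "__main__":
    # Positions -66 and -82 are taken from the abstract.
    print(len(SITE), span_length(-66, -82))
```

    Under the inclusive-boundary assumption, the sequence length and the coordinate span agree with the 17 nucleotides stated in the abstract.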

  16. In vitro fluorescence studies of transcription factor IIB-DNA interaction.

    PubMed

    Górecki, Andrzej; Figiel, Małgorzata; Dziedzicka-Wasylewska, Marta

    2015-01-01

    General transcription factor TFIIB is one of the basal constituents of the preinitiation complex of eukaryotic RNA polymerase II, acting as a bridge between the preinitiation complex and the polymerase, and binding promoter DNA in an asymmetric manner, thereby defining the direction of transcription. Methods of fluorescence spectroscopy together with circular dichroism spectroscopy were used to observe conformational changes in the structure of recombinant human TFIIB after binding to a specific DNA sequence. To facilitate the exploration of the structural changes, several site-directed mutations were introduced that alter the fluorescence properties of the protein. Our observations showed that binding of specific DNA sequences changed the protein structure and dynamics, and that TFIIB may exist in two conformational states, which can be described by a different microenvironment of W52. Fluorescence studies using both intrinsic and exogenous fluorophores showed that these changes significantly depended on the recognition sequence and concerned various regions of the protein, including those interacting with other transcription factors and RNA polymerase II. DNA binding can cause rearrangements in regions of proteins interacting with the polymerase in a manner dependent on the recognized sequences and can therefore influence gene expression.

  17. GBshape: a genome browser database for DNA shape annotations

    PubMed Central

    Chiu, Tsu-Pei; Yang, Lin; Zhou, Tianyin; Main, Bradley J.; Parker, Stephen C.J.; Nuzhdin, Sergey V.; Tullius, Thomas D.; Rohs, Remo

    2015-01-01

    Many regulatory mechanisms require a high degree of specificity in protein-DNA binding. Nucleotide sequence does not provide an answer to the question of why a protein binds only to a small subset of the many putative binding sites in the genome that share the same core motif. Whereas higher-order effects, such as chromatin accessibility, cooperativity and cofactors, have been described, DNA shape recently gained attention as another feature that fine-tunes the DNA binding specificities of some transcription factor families. Our Genome Browser for DNA shape annotations (GBshape; freely available at http://rohslab.cmb.usc.edu/GBshape/) provides minor groove width, propeller twist, roll, helix twist and hydroxyl radical cleavage predictions for the entire genomes of 94 organisms. Additional genomes can easily be added using the GBshape framework. GBshape can be used to visualize DNA shape annotations qualitatively in a genome browser track format, and to download quantitative values of DNA shape features as a function of genomic position at nucleotide resolution. As biological applications, we illustrate the periodicity of DNA shape features that are present in nucleosome-occupied sequences from human, fly and worm, and we demonstrate structural similarities between transcription start sites in the genomes of four Drosophila species. PMID:25326329

  18. Structural analysis of the 5′ region of mouse and human Huntington disease genes reveals conservation of putative promoter region and di- and trinucleotide polymorphisms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Biaoyang; Nasir, J.; Kalchman, M.A.

    1995-02-10

    We have previously cloned and characterized the murine homologue of the Huntington disease (HD) gene and shown that it maps to mouse chromosome 5 within a region of conserved synteny with human chromosome 4p16.3. Here we present a detailed comparison of the sequence of the putative promoter and the organization of the 5′ genomic region of the murine (Hdh) and human HD genes encompassing the first five exons. We show that in this region these two genes share identical exon boundaries, but have different-size introns. Two dinucleotide (CT) and one trinucleotide intronic polymorphism in Hdh and an intronic CA polymorphism in the HD gene were identified. Comparison of 940-bp sequence 5′ to the putative translation start site reveals a highly conserved region (78.8% nucleotide identity) between Hdh and the HD gene from nucleotide -56 to -206 (of Hdh). Neither Hdh nor the HD gene have typical TATA or CCAAT elements, but both show one putative AP2 binding site and numerous potential Sp1 binding sites. The high sequence identity between Hdh and the HD gene for approximately 200 bp 5′ to the putative translation start site indicates that these sequences may play a role in regulating expression of the Huntington disease gene. 30 refs., 4 figs., 2 tabs.
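
    A nucleotide identity figure such as the one quoted above is a ratio of matching alignment columns to compared columns. The abstract does not say how gaps were handled, so the following Python sketch rests on stated assumptions: the two sequences are already aligned to equal length, a gap opposite a base counts as a mismatch, and columns with a gap in both sequences are ignored. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    # Drop columns where both sequences have a gap (assumed convention).
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no comparable alignment columns")

    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy alignment (hypothetical, not the Hdh/HD sequences).
    print(percent_identity("ACGTAC-T", "ACGAACGT"))
```

    Other conventions, such as dividing by the length of the shorter sequence, give different percentages for the same alignment, so the denominator should always be reported with the value.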

  19. Generalized theory on the mechanism of site-specific DNA-protein interactions

    NASA Astrophysics Data System (ADS)

    Niranjani, G.; Murugan, R.

    2016-05-01

    We develop a generalized theoretical framework on the binding of transcription factor proteins (TFs) with specific sites on DNA that takes into account the interplay of various factors regarding overall electrostatic potential at the DNA-protein interface, occurrence of kinetic traps along the DNA sequence, presence of other roadblock protein molecules along DNA and crowded environment, conformational fluctuations in the DNA binding domains (DBDs) of TFs, and the conformational state of the DNA. Starting from a Smoluchowski-type theoretical framework on site-specific binding of TFs we logically build our model by adding the effects of these factors one by one. Our generalized two-step model suggests that the electrostatic attractive forces between the positively charged DBDs of TFs and the negatively charged phosphate backbone of DNA, along with the counteracting shielding effects of solvent ions, are the core factor that creates a fluidic-type environment at the DNA-protein interface. This in turn facilitates various one-dimensional diffusion (1Dd) processes such as sliding, hopping and intersegmental transfers. These facilitating processes as well as flipping dynamics of conformational states of DBDs of TFs between stationary and mobile states can enhance the 1Dd coefficient on a par with three-dimensional diffusion (3Dd). The random coil conformation of DNA also plays critical roles in enhancing the site-specific association rate. The extent of enhancement over the 3Dd controlled rate seems to be directly proportional to the maximum possible 1Dd length. We show that the overall site-specific binding rate scales with the length of DNA in an asymptotic way. For relaxed DNA, the specific binding rate will be independent of the length of DNA as length increases towards infinity. For condensed DNA as in in vivo conditions, the specific binding rate depends on the length of DNA in a turnover way with a maximum. 
This maximum rate seems to scale with the maximum possible 1Dd length of TFs in a square root manner. Results suggest that 1Dd processes contribute much less to the enhancement of specific binding rate under in vivo conditions for condensed DNA. There exists a critical length of binding stretch of TFs beyond which the probability associated with the random occurrence of similar specific binding sites will be close to zero. TFs in natural systems from prokaryotes to eukaryotes seem to handle sequence-mediated kinetic traps via increasing the length of their recognition stretch or combinatorial binding. TFs overcome the hurdles of roadblocks via switching efficiently between sliding, hopping and intersegmental transfer modes. The site-specific binding rate as well as the maximum possible 1Dd length seem to be directly proportional to the square root of the probability (p_R) of finding a nonspecific binding site to be free from dynamic roadblocks. Here p_R seems to be a function of the number of nonspecific binding sites (nsbs) available per DNA binding protein (ϕ) inside the living cell. It seems that p_R > 0.8 when ϕ > 10, which is true for the Escherichia coli cell system.
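
    The idea of a critical recognition length can be motivated with a back-of-the-envelope calculation. The Python sketch below is not the authors' model: it assumes a genome of independent, equiprobable bases, exact matching, and a single strand, under which a specific site of length L is expected to occur about (N - L + 1) / 4^L times by chance in a sequence of length N. The function names, the threshold of one expected occurrence, and the genome length of 4.6 million bp (roughly the size of the E. coli K-12 genome) are illustrative assumptions.

```python
def expected_random_sites(genome_length, site_length):
    """Expected chance occurrences of one exact site on a single strand,
    assuming independent, equiprobable bases (a simplifying assumption)."""
    positions = genome_length - site_length + 1
    if positions <= 0:
        return 0.0
    return positions * 0.25 ** site_length


def critical_site_length(genome_length, threshold=1.0):
    """Smallest site length whose expected chance count is below `threshold`."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    length = 1
    while expected_random_sites(genome_length, length) >= threshold:
        length += 1
    return length


if __name__ == "__main__":
    genome = 4_600_000  # assumed genome length in bp
    for site_length in (8, 10, 12, 14):
        print(site_length, expected_random_sites(genome, site_length))
    print("critical length:", critical_site_length(genome))
```

    Real binding sites tolerate mismatches and genomes are not compositionally uniform, so this estimate only indicates the order of magnitude at which chance matches become rare.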

  20. Improve the prediction of RNA-binding residues using structural neighbours.

    PubMed

    Li, Quan; Cao, Zanxia; Liu, Haiyan

    2010-03-01

    The interactions between RNA-binding proteins (RBPs) and RNA play key roles in managing some of the cell's basic functions. The identification and prediction of RNA binding sites is important for understanding the RNA-binding mechanism. Computational approaches are being developed to predict RNA-binding residues based on sequence- or structure-derived features. To achieve higher prediction accuracy, improvements on current prediction methods are necessary. We found that the structural neighbors of RNA-binding and non-RNA-binding residues have different amino acid compositions. Combining this structure-derived feature with evolutionary (PSSM) and other structural information (secondary structure and solvent accessibility) significantly improves the predictions over existing methods. Using a multiple linear regression approach and 6-fold cross validation, our best model achieves an overall correct rate of 87.8% and an MCC of 0.47, with a specificity of 93.4%, and correctly predicts 52.4% of the RNA-binding residues in a dataset containing 107 non-homologous RNA-binding proteins. Compared with existing methods, including the amino acid compositions of structural neighbors leads to a clear improvement. A web server was developed for predicting RNA binding residues in a protein sequence (or structure), which is available at http://mcgill.3322.org/RNA/.
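
    The performance measures quoted in this abstract are all derived from the four cells of a binary confusion matrix. The following Python sketch uses the standard definitions of accuracy, sensitivity, specificity, and the Matthews correlation coefficient (MCC); the function name binary_metrics is hypothetical, and the counts in the usage example are invented for illustration, not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    # Sensitivity: fraction of true binding residues that are predicted.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-binding residues correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for an imbalanced dataset (not from the paper).
    print(binary_metrics(tp=50, fp=60, tn=840, fn=50))
```

    When non-binding residues greatly outnumber binding residues, accuracy and specificity can be high while sensitivity and MCC remain moderate, which is why all four values are usually reported together.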
