Sample records for minimal binding sequence

  1. Functional display of platelet-binding VWF fragments on filamentous bacteriophage.

    PubMed

    Yee, Andrew; Tan, Fen-Lai; Ginsburg, David

    2013-01-01

    von Willebrand factor (VWF) tethers platelets to sites of vascular injury via interaction with the platelet surface receptor, GPIb. To further define the VWF sequences required for VWF-platelet interaction, a phage library displaying random VWF protein fragments was screened against formalin-fixed platelets. After 3 rounds of affinity selection, DNA sequencing of platelet-bound clones identified VWF peptides mapping exclusively to the A1 domain. Aligning these sequences defined a minimal, overlapping segment spanning P1254-A1461, which encompasses the C1272-C1458 cystine loop. Analysis of phage carrying a mutated A1 segment (C1272/1458A) confirmed the requirement of the cystine loop for optimal binding. Four rounds of affinity maturation of a randomly mutagenized A1 phage library identified 10 and 14 unique mutants associated with enhanced platelet binding in the presence and absence of botrocetin, respectively, with 2 mutants (S1370G and I1372V) common to both conditions. These results demonstrate the utility of filamentous phage for studying VWF protein structure-function and identify a minimal, contiguous peptide that binds to formalin-fixed platelets, confirming the importance of the VWF A1 domain with no evidence for another independently platelet-binding segment within VWF. These findings also point to key structural elements within the A1 domain that regulate VWF-platelet adhesion.
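
    The alignment step in this abstract, where overlapping platelet-bound fragments define a minimal shared segment, can be pictured as an interval intersection. The sketch below is only an illustration: the function name and the clone coordinates are hypothetical and are not taken from the study; they were chosen so that the shared range matches the reported P1254-A1461 span.

```python
def minimal_overlap(fragments):
    """Return the (start, end) residue range shared by every fragment, or None.

    Each fragment is an inclusive (start, end) pair of residue numbers.
    """
    start = max(s for s, _ in fragments)  # latest start among all fragments
    end = min(e for _, e in fragments)    # earliest end among all fragments
    return (start, end) if start <= end else None


# Hypothetical clone coordinates, for illustration only (not data from the study)
clones = [(1200, 1461), (1254, 1500), (1230, 1480)]
print(minimal_overlap(clones))
```

    A result of None would mean the fragments share no common residues, which would point to more than one independent binding segment.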

  2. In vitro selection using a dual RNA library that allows primerless selection

    PubMed Central

    Jarosch, Florian; Buchner, Klaus; Klussmann, Sven

    2006-01-01

    High affinity target-binding aptamers are identified from random oligonucleotide libraries by an in vitro selection process called Systematic Evolution of Ligands by EXponential enrichment (SELEX). Since the SELEX process includes a PCR amplification step, the randomized region of the oligonucleotide libraries needs to be flanked by two fixed primer binding sequences. These primer binding sites are often difficult to truncate because they may be necessary to maintain the structure of the aptamer or may even be part of the target binding motif. We designed a novel type of RNA library that carries fixed sequences which constrain the oligonucleotides into a partly double-stranded structure, thereby minimizing the risk that the primer binding sequences become part of the target-binding motif. Moreover, the specific design of the library including the use of tandem RNA Polymerase promoters allows the selection of oligonucleotides without any primer binding sequences. The library was used to select aptamers to the mirror-image peptide of ghrelin. Ghrelin is a potent stimulator of growth-hormone release and food intake. After selection, the identified aptamer sequences were directly synthesized in their mirror-image configuration. The final 44 nt-Spiegelmer, named NOX-B11-3, blocks ghrelin action in a cell culture assay displaying an IC50 of 4.5 nM at 37°C. PMID:16855281

  3. Systematic optimization model and algorithm for binding sequence selection in computational enzyme design

    PubMed Central

    Huang, Xiaoqiang; Han, Kehang; Zhu, Yushan

    2013-01-01

    A systematic optimization model for binding sequence selection in computational enzyme design was developed based on the transition state theory of enzyme catalysis and graph-theoretical modeling. The saddle point on the free energy surface of the reaction system was represented by catalytic geometrical constraints, and the binding energy between the active site and transition state was minimized to reduce the activation energy barrier. The resulting hyperscale combinatorial optimization problem was tackled using a novel heuristic global optimization algorithm, which was inspired and tested by the protein core sequence selection problem. The sequence recapitulation tests on native active sites for two enzyme catalyzed hydrolytic reactions were applied to evaluate the predictive power of the design methodology. The results of the calculation show that most of the native binding sites can be successfully identified if the catalytic geometrical constraints and the structural motifs of the substrate are taken into account. Reliably predicting active site sequences may have significant implications for the creation of novel enzymes that are capable of catalyzing targeted chemical reactions. PMID:23649589

  4. Architecture of a Fur Binding Site: a Comparative Analysis

    PubMed Central

    Lavrrar, Jennifer L.; McIntosh, Mark A.

    2003-01-01

    Fur is an iron-binding transcriptional repressor that recognizes a 19-bp consensus site of the sequence 5′-GATAATGATAATCATTATC-3′. This site can be defined as three adjacent hexamers of the sequence 5′-GATAAT-3′, with the third being slightly imperfect (an F-F-F configuration), or as two hexamers in the forward orientation separated by one base pair from a third hexamer in the reverse orientation (an F-F-x-R configuration). Although Fur can bind synthetic DNA sequences containing the F-F-F arrangement, most natural binding sites are variations of the F-F-x-R arrangement. The studies presented here compared the ability of Fur to recognize synthetic DNA sequences containing two to four adjacent hexamers with binding to sequences containing variations of the F-F-x-R arrangement (including natural operator sequences from the entS and fepB promoter regions of Escherichia coli). Gel retardation assays showed that the F-F-x-R architecture was necessary for high-affinity Fur-DNA interactions and that contiguous hexamers were not recognized as effectively. In addition, the stoichiometry of Fur at each binding site was determined, showing that Fur interacted with its minimal 19-bp binding site as two overlapping dimers. These data confirm the proposed overlapping-dimer binding model, where the unit of interaction with a single Fur dimer is two inverted hexamers separated by a C:G base pair, with two overlapping units comprising the 19-bp consensus binding site required for the high-affinity interaction with two Fur dimers. PMID:12644489
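
    The two readings of the 19-bp consensus described above can be checked directly on the sequence. The following is a small sketch, with helper names chosen here for illustration; it splits the consensus both ways and confirms that the third hexamer of the F-F-x-R reading is the reverse complement of 5′-GATAAT-3′, while the third hexamer of the F-F-F reading differs from it at two positions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


CONSENSUS = "GATAATGATAATCATTATC"  # 19-bp Fur consensus from the abstract
HEXAMER = "GATAAT"

# F-F-x-R reading: two forward hexamers, one spacer base, one reverse hexamer
f1, f2 = CONSENSUS[0:6], CONSENSUS[6:12]
spacer, r = CONSENSUS[12], CONSENSUS[13:19]

# F-F-F reading: three adjacent hexamers (the 19th base is left over)
third = CONSENSUS[12:18]

print(f1, f2, spacer, r, reverse_complement(r))
print(third, mismatches(third, HEXAMER))
```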

  5. Adenovirus sequences required for replication in vivo.

    PubMed Central

    Wang, K; Pearson, G D

    1985-01-01

    We have studied the in vivo replication properties of plasmids carrying deletion mutations within cloned adenovirus terminal sequences. Deletion mapping located the adenovirus DNA replication origin entirely within the first 67 bp of the adenovirus inverted terminal repeat. This region could be further subdivided into two functional domains: a minimal replication origin and an adjacent auxiliary region which boosted the efficiency of replication by more than 100-fold. The minimal origin occupies the first 18 to 21 bp and includes sequences conserved between all adenovirus serotypes. The adjacent auxiliary region extends past nucleotide 36 but not past nucleotide 67 and contains the binding site for nuclear factor I. PMID:2991857

  6. Convergent evolution of adenosine aptamers spanning bacterial, human, and random sequences revealed by structure-based bioinformatics and genomic SELEX

    PubMed Central

    Vu, Michael M. K.; Jameson, Nora E.; Masuda, Stuart J.; Lin, Dana; Larralde-Ridaura, Rosa; Lupták, Andrej

    2012-01-01

    SUMMARY Aptamers are structured macromolecules in vitro evolved to bind molecular targets, whereas in nature they form the ligand-binding domains of riboswitches. Adenosine aptamers of a single structural family were isolated several times from random pools but they have not been identified in genomic sequences. We used two unbiased methods, structure-based bioinformatics and human genome-based in vitro selection, to identify aptamers that form the same adenosine-binding structure in a bacterium, and several vertebrates, including humans. Two of the human aptamers map to introns of RAB3C and FGD3 genes. The RAB3C aptamer binds ATP with dissociation constants about ten times lower than physiological ATP concentration, while the minimal FGD3 aptamer binds ATP only co-transcriptionally. PMID:23102219

  7. BIPAD: A web server for modeling bipartite sequence elements

    PubMed Central

    Bi, Chengpeng; Rogan, Peter K

    2006-01-01

    Background: Many dimeric protein complexes bind cooperatively to families of bipartite nucleic acid sequence elements, which consist of pairs of conserved half-site sequences separated by intervening distances that vary among individual sites. Results: We introduce the Bipad Server, a web interface to predict sequence elements embedded within unaligned sequences. Either a bipartite model, consisting of a pair of one-block position weight matrices (PWMs) with a gap distribution, or a single PWM for contiguous single block motifs may be produced. The Bipad program performs multiple local alignment by entropy minimization and cyclic refinement using a stochastic greedy search strategy. The best models are refined by maximizing incremental information contents among a set of potential models with varying half site and gap lengths. Conclusion: The web service generates information positional weight matrices, identifies binding site motifs, graphically represents the set of discovered elements as a sequence logo, and depicts the gap distribution as a histogram. Server performance was evaluated by generating a collection of bipartite models for distinct DNA binding proteins. PMID:16503993
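
    To make the notion of information content concrete, the sketch below scores one block of already-aligned sites column by column. It is not the Bipad algorithm, which searches for the alignment itself; it only shows the standard Shannon measure (2 bits minus the column entropy, assuming a uniform background and no small-sample correction). The function name and the example sites are hypothetical.

```python
import math
from collections import Counter


def information_content(sites, alphabet="ACGT"):
    """Per-column information in bits for equal-length aligned DNA sites.

    Computed as log2(alphabet size) minus the Shannon entropy of each column,
    assuming a uniform background and applying no small-sample correction.
    """
    n = len(sites)
    max_bits = math.log2(len(alphabet))
    bits = []
    for column in zip(*sites):
        counts = Counter(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        bits.append(max_bits - entropy)
    return bits


# Hypothetical aligned half-sites, for illustration only
sites = ["GATAAT", "GATAAT", "GATTAT", "CATAAT"]
per_column = information_content(sites)
print([round(b, 2) for b in per_column], round(sum(per_column), 2))
```

    Fully conserved columns score 2 bits and variable columns score less; a bipartite model would hold two such blocks plus a distribution over the gap lengths between them.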

  8. The General Definition of the p97/Valosin-containing Protein (VCP)-interacting Motif (VIM) Delineates a New Family of p97 Cofactors

    PubMed Central

    Stapf, Christopher; Cartwright, Edward; Bycroft, Mark; Hofmann, Kay; Buchberger, Alexander

    2011-01-01

    Cellular functions of the essential, ubiquitin-selective AAA ATPase p97/valosin-containing protein (VCP) are controlled by regulatory cofactors determining substrate specificity and fate. Most cofactors bind p97 through a ubiquitin regulatory X (UBX) or UBX-like domain or linear sequence motifs, including the hitherto ill-defined p97/VCP-interacting motif (VIM). Here, we present the new, minimal consensus sequence RX5AAX2R as a general definition of the VIM that unites a novel family of known and putative p97 cofactors, among them UBXD1 and ZNF744/ANKZF1. We demonstrate that this minimal VIM consensus sequence is necessary and sufficient for p97 binding. Using NMR chemical shift mapping, we identified several residues of the p97 N-terminal domain (N domain) that are critical for VIM binding. Importantly, we show that cellular stress resistance conferred by the yeast VIM-containing cofactor Vms1 depends on the physical interaction between its VIM and the critical N domain residues of the yeast p97 homolog, Cdc48. Thus, the VIM-N domain interaction characterized in this study is required for the physiological function of Vms1 and most likely other members of the newly defined VIM family of cofactors. PMID:21896481
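
    The consensus RX5AAX2R reads as arginine, any five residues, two alanines, any two residues, arginine, which translates directly into a regular expression. The sketch below scans a protein sequence for candidate matches; the function name and the toy sequence are hypothetical, and a pattern hit is only a candidate, not evidence of p97 binding.

```python
import re

# RX5AAX2R: Arg, any five residues, Ala-Ala, any two residues, Arg.
# The lookahead lets overlapping candidates be reported as well.
VIM_PATTERN = re.compile(r"(?=(R.{5}AA.{2}R))")


def find_vim_candidates(protein_seq):
    """Return (1-based start position, matched 11-residue segment) pairs."""
    return [(m.start(1) + 1, m.group(1)) for m in VIM_PATTERN.finditer(protein_seq)]


# Hypothetical toy sequence, for illustration only
toy = "MKRLEELSAAQLRGT"
print(find_vim_candidates(toy))
```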

  9. The NH2-terminal php domain of the alpha subunit of the Escherichia coli replicase binds the epsilon proofreading subunit.

    PubMed

    Wieczorek, Anna; McHenry, Charles S

    2006-05-05

    The alpha subunit of the replicase of all bacteria contains a php domain, initially identified by its similarity to histidinol phosphatase but of otherwise unknown function (Aravind, L., and Koonin, E. V. (1998) Nucleic Acids Res. 26, 3746-3752). Deletion of 60 residues from the NH2 terminus of the alpha php domain destroys epsilon binding. The minimal 255-residue php domain, estimated by sequence alignment with homolog YcdX, is insufficient for epsilon binding. However, a 320-residue segment including sequences that immediately precede the polymerase domain binds epsilon with the same affinity as the 1160-residue full-length alpha subunit. A subset of mutations of a conserved acidic residue (Asp43 in Escherichia coli alpha) present in the php domain of all bacterial replicases resulted in defects in epsilon binding. Using sequence alignments, we show that the prototypical gram+ Pol C, which contains the polymerase and proofreading activities within the same polypeptide chain, has an epsilon-like sequence inserted in a surface loop near the center of the homologous YcdX protein. These findings suggest that the php domain serves as a platform to enable coordination of proofreading and polymerase activities during chromosomal replication.

  10. Electrostatically Biased Binding of Kinesin to Microtubules

    PubMed Central

    Zheng, Wenjun; Alonso, Maria; Huber, Gary; Dlugosz, Maciej; McCammon, J. Andrew; Cross, Robert A.

    2011-01-01

    The minimum motor domain of kinesin-1 is a single head. Recent evidence suggests that such minimal motor domains generate force by a biased binding mechanism, in which they preferentially select binding sites on the microtubule that lie ahead in the progress direction of the motor. A specific molecular mechanism for biased binding has, however, so far been lacking. Here we use atomistic Brownian dynamics simulations combined with experimental mutagenesis to show that incoming kinesin heads undergo electrostatically guided diffusion-to-capture by microtubules, and that this produces directionally biased binding. Kinesin-1 heads are initially rotated by the electrostatic field so that their tubulin-binding sites face inwards, and then steered towards a plus-endwards binding site. In tethered kinesin dimers, this bias is amplified. A 3-residue sequence (RAK) in kinesin helix alpha-6 is predicted to be important for electrostatic guidance. Real-world mutagenesis of this sequence powerfully influences kinesin-driven microtubule sliding, with one mutant producing a 5-fold acceleration over wild type. We conclude that electrostatic interactions play an important role in the kinesin stepping mechanism, by biasing the diffusional association of kinesin with microtubules. PMID:22140358

  11. H-2RIIBP, a member of the nuclear hormone receptor superfamily that binds to both the regulatory element of major histocompatibility class I genes and the estrogen response element.

    PubMed

    Hamada, K; Gleason, S L; Levi, B Z; Hirschfeld, S; Appella, E; Ozato, K

    1989-11-01

    Transcription of major histocompatibility complex (MHC) class I genes is regulated by the conserved MHC class I regulatory element (CRE). The CRE has two factor-binding sites, region I and region II, both of which elicit enhancer function. By screening a mouse lambda gt 11 library with the CRE as a probe, we isolated a cDNA clone that encodes a protein capable of binding to region II of the CRE. This protein, H-2RIIBP (H-2 region II binding protein), bound to the native region II sequence, but not to other MHC cis-acting sequences or to mutant region II sequences, similar to the naturally occurring region II factor in mouse cells. The deduced amino acid sequence of H-2RIIBP revealed two putative zinc fingers homologous to the DNA-binding domain of steroid/thyroid hormone receptors. Although sequence similarity in other regions was minimal, H-2RIIBP has apparent modular domains characteristic of the nuclear hormone receptors. Further analyses showed that both H-2RIIBP and the natural region II factor bind to the estrogen response element (ERE) of the vitellogenin A2 gene. The ERE is composed of a palindrome, and half of this palindrome resembles the region II binding site of the MHC CRE. These results indicate that H-2RIIBP (i) is a member of the superfamily of nuclear hormone receptors and (ii) may regulate not only MHC class I genes but also genes containing the ERE and related sequences. Sequences homologous to the H-2RIIBP gene are widely conserved in the animal kingdom. H-2RIIBP mRNA is expressed in many mouse tissues, in agreement with the distribution of the natural region II factor.

  12. Methylene blue binding to DNA with alternating AT base sequence: minor groove binding is favored over intercalation.

    PubMed

    Rohs, Remo; Sklenar, Heinz

    2004-04-01

    The results presented in this paper on methylene blue (MB) binding to DNA with AT alternating base sequence complement the data obtained in two former modeling studies of MB binding to GC alternating DNA. In the light of the large amount of experimental data for both systems, this theoretical study is focused on a detailed energetic analysis and comparison in order to understand their different behavior. Since experimental high-resolution structures of the complexes are not available, the analysis is based on energy minimized structural models of the complexes in different binding modes. For both sequences, four different intercalation structures and two models for MB binding in the minor and major groove have been proposed. Solvent electrostatic effects were included in the energetic analysis by using electrostatic continuum theory, and the dependence of MB binding on salt concentration was investigated by solving the non-linear Poisson-Boltzmann equation. We find that the relative stability of the different complexes is similar for the two sequences, in agreement with the interpretation of spectroscopic data. Subtle differences, however, are seen in energy decompositions and can be attributed to the change from symmetric 5'-YpR-3' intercalation to minor groove binding with increasing salt concentration, which is experimentally observed for the AT sequence at lower salt concentration than for the GC sequence. According to our results, this difference is due to the significantly lower non-electrostatic energy for the minor groove complex with AT alternating DNA, whereas the slightly lower binding energy to this sequence is caused by a higher deformation energy of DNA. The energetic data are in agreement with the conclusions derived from different spectroscopic studies and can also be structurally interpreted on the basis of the modeled complexes. The simple static modeling technique and the neglect of entropy terms and of non-electrostatic solute-solvent interactions, which are assumed to be nearly constant for the compared complexes of MB with DNA, seem to be justified by the results.

  13. De novo design and engineering of functional metal and porphyrin-binding protein domains

    NASA Astrophysics Data System (ADS)

    Everson, Bernard H.

    In this work, I describe an approach to the rational, iterative design and characterization of two functional cofactor-binding protein domains. First, a hybrid computational/experimental method was developed with the aim of algorithmically generating a suite of porphyrin-binding protein sequences with minimal mutual sequence information. This method was explored by generating libraries of sequences, which were then expressed and evaluated for function. One successful sequence is shown to bind a variety of porphyrin-like cofactors, and exhibits light-activated electron transfer in mixed hemin:chlorin e6 and hemin:Zn(II)-protoporphyrin IX complexes. These results imply that many sophisticated functions such as cofactor binding and electron transfer require only a very small number of residue positions in a protein sequence to be fixed. Net charge and hydrophobic content are important in determining protein solubility and stability. Accordingly, rational modifications were made to the aforementioned design procedure in order to improve its overall success rate. The effects of these modifications are explored using two 'next-generation' sequence libraries, which were separately expressed and evaluated. Particular modifications to these design parameters are demonstrated to effectively double the purification success rate of the procedure. Finally, I describe the redesign of the artificial di-iron protein DF2 into CDM13, a single chain di-manganese four-helix bundle. CDM13 acts as a functional model of natural manganese catalase, exhibiting a kcat of 0.08 s⁻¹ under steady-state conditions. The bound manganese cofactors have a reduction potential of +805 mV vs NHE, which is too high for efficient dismutation of hydrogen peroxide. These results indicate that as a high-potential manganese complex, CDM13 may represent a promising first step toward a polypeptide model of the Oxygen Evolving Complex of the photosynthetic enzyme Photosystem II.

  14. Functional Requirements for Fab-7 Boundary Activity in the Bithorax Complex

    PubMed Central

    Wolle, Daniel; Cleard, Fabienne; Aoki, Tsutomu; Deshpande, Girish; Karch, Francois

    2015-01-01

    Chromatin boundaries are architectural elements that determine the three-dimensional folding of the chromatin fiber and organize the chromosome into independent units of genetic activity. The Fab-7 boundary from the Drosophila bithorax complex (BX-C) is required for the parasegment-specific expression of the Abd-B gene. We have used a replacement strategy to identify sequences that are necessary and sufficient for Fab-7 boundary function in the BX-C. Fab-7 boundary activity is known to depend on factors that are stage specific, and we describe a novel ∼700-kDa complex, the late boundary complex (LBC), that binds to Fab-7 sequences that have insulator functions in late embryos and adults. We show that the LBC is enriched in nuclear extracts from late, but not early, embryos and that it contains three insulator proteins, GAF, Mod(mdg4), and E(y)2. Its DNA binding properties are unusual in that it requires a minimal sequence of >65 bp; however, other than a GAGA motif, the three Fab-7 LBC recognition elements display few sequence similarities. Finally, we show that mutations which abrogate LBC binding in vitro inactivate the Fab-7 boundary in the BX-C. PMID:26303531

  15. Short Stat5-Interacting Peptide Derived from Phospholipase C-β3 Inhibits Hematopoietic Cell Proliferation and Myeloid Differentiation

    PubMed Central

    Yasudo, Hiroki; Ando, Tomoaki; Xiao, Wenbin; Kawakami, Yuko; Kawakami, Toshiaki

    2011-01-01

    Constitutive activation of the transcription factor Stat5 in hematopoietic stem/progenitor cells leads to various hematopoietic malignancies including myeloproliferative neoplasm (MPN). Our recent study found that phospholipase C (PLC)-β3 is a novel tumor suppressor involved in MPN, lymphoma and other tumors. Stat5 activity is negatively regulated by the SH2 domain-containing protein phosphatase SHP-1 in a PLC-β3-dependent manner. PLC-β3 can form the multimolecular SPS complex together with SHP-1 and Stat5. The close physical proximity of SHP-1 and Stat5 brought about by interacting with the C-terminal segment of PLC-β3 (PLC-β3-CT) accelerates SHP-1-mediated dephosphorylation of Stat5. Here we identify the minimal sequences within PLC-β3-CT required for its tumor suppressor function. Two of the three Stat5-binding noncontiguous regions, one of which also binds SHP-1, substantially inhibited in vitro proliferation of Ba/F3 cells. Surprisingly, an 11-residue Stat5-binding peptide (residues 988-998) suppressed Stat5 activity in Ba/F3 cells and in vivo proliferation and myeloid differentiation of hematopoietic stem/progenitor cells. Therefore, this study further defines PLC-β3-CT as the Stat5- and SHP-1-binding domain by identifying minimal functional sequences of PLC-β3 for its tumor suppressor function and implies their potential utility in the control of hematopoietic malignancies. PMID:21949826

  16. Nuclear magnetic resonance-based model of a TF1/HmU-DNA complex.

    PubMed

    Silva, M V; Pasternack, L B; Kearns, D R

    1997-12-15

    Transcription factor 1 (TF1), a type II DNA-binding protein encoded by the Bacillus subtilis bacteriophage SPO1, has the capacity for sequence-selective DNA binding and a preference for 5-hydroxymethyl-2'-deoxyuridine (HmU)-containing DNA. In NMR studies of the TF1/HmU-DNA complex, intermolecular NOEs indicate that the flexible beta-ribbon and C-terminal alpha-helix are involved in the DNA-binding site of TF1, placing it in the beta-sheet category of DNA-binding proteins proposed to bind by wrapping two beta-ribbon "arms" around the DNA. Intermolecular and intramolecular NOEs were used to generate an energy-minimized model of the protein-DNA complex in which both DNA bending and protein structure changes are evident.

  17. Role for a region of helically unstable DNA within the Epstein-Barr virus latent cycle origin of DNA replication oriP in origin function

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Polonskaya, Zhanna; Benham, Craig J.; Hearing, Janet

    The minimal replicator of the Epstein-Barr virus (EBV) latent cycle origin of DNA replication oriP is composed of two binding sites for the Epstein-Barr virus nuclear antigen-1 (EBNA-1) and flanking inverted repeats that bind the telomere repeat binding factor TRF2. Although not required for minimal replicator activity, additional binding sites for EBNA-1 and TRF2 and one or more auxiliary elements located to the right of the EBNA-1/TRF2 sites are required for the efficient replication of oriP plasmids. Another region of oriP that is predicted to be destabilized by DNA supercoiling is shown here to be an important functional component of oriP. The ability of DNA fragments of unrelated sequence and possessing supercoiled-induced DNA duplex destabilized (SIDD) structures, but not fragments characterized by helically stable DNA, to substitute for this component of oriP demonstrates a role for the SIDD region in the initiation of oriP-plasmid DNA replication.

  18. Prospective identification of parasitic sequences in phage display screens

    PubMed Central

    Matochko, Wadim L.; Cory Li, S.; Tang, Sindy K.Y.; Derda, Ratmir

    2014-01-01

    Phage display empowered the development of proteins with new function and ligands for clinically relevant targets. In this report, we use next-generation sequencing to analyze phage-displayed libraries and uncover a strong bias induced by amplification preferences of phage in bacteria. This bias favors fast-growing sequences that collectively constitute <0.01% of the available diversity. Specifically, a library of 10⁹ random 7-mer peptides (Ph.D.-7) includes a few thousand sequences that grow quickly (the ‘parasites’), which are the sequences that are typically identified in phage display screens published to date. A similar collapse was observed in other libraries. Using Illumina and Ion Torrent sequencing and multiple biological replicates of amplification of Ph.D.-7 library, we identified a focused population of 770 ‘parasites’. In all, 197 sequences from this population have been identified in literature reports that used Ph.D.-7 library. Many of these enriched sequences have confirmed function (e.g. target binding capacity). The bias in the literature, thus, can be viewed as a selection with two different selection pressures: (i) target-binding selection, and (ii) amplification-induced selection. Enrichment of parasitic sequences could be minimized if amplification bias is removed. Here, we demonstrate that emulsion amplification in libraries of ∼10⁶ diverse clones prevents the biased selection of parasitic clones. PMID:24217917

  19. Bacteroides fragilis mobilizable transposon Tn5520 requires a 71 base pair origin of transfer sequence and a single mobilization protein for relaxosome formation during conjugation.

    PubMed

    Vedantam, Gayatri; Knopf, Sarah; Hecht, David W

    2006-01-01

    Tn5520 is the smallest known bacterial mobilizable transposon and was isolated from an antibiotic resistant Bacteroides fragilis clinical isolate. When a conjugation apparatus is provided in trans, Tn5520 is mobilized (transferred) efficiently within, and from, both Bacteroides spp. and Escherichia coli. Only two genes are present on Tn5520; one encodes an integrase, and the other a multifunctional mobilization (Mob) protein BmpH. BmpH is essential for Tn5520 mobility. The focus of this study was to identify the Tn5520 origin of conjugative transfer (oriT) and to study BmpH-oriT binding. We delimited the functional Tn5520 oriT to a 71 bp sequence upstream of the bmpH gene. A plasmid vector harbouring this minimal 71 bp oriT was mobilized at the same frequency as that of intact Tn5520. The minimal oriT contains one 17 bp inverted repeat (IR) sequence. We constructed and tested multiple IR mutants and showed that the IR was essential in its entirety for mobilization. A nick site sequence (5'-GCTAC-3') was also identified within the minimal oriT; this sequence resembled nick sites found in plasmids of Gram positive origin. We further showed that mutation of a highly conserved GC dinucleotide in the nick site sequence completely abolished mobilization. We also purified BmpH and showed that it specifically bound a Tn5520 oriT fragment in electrophoretic mobility shift assays. We also identified non-nick site sequences within the minimal oriT that were essential for mobilization. We hypothesize that transposon-based single Mob protein systems may contribute to efficient gene dissemination from Bacteroides spp., because fewer DNA processing proteins are required for relaxosome formation.

  20. Single-Nucleotide-Specific Targeting of the Tf1 Retrotransposon Promoted by the DNA-Binding Protein Sap1 of Schizosaccharomyces pombe.

    PubMed

    Hickey, Anthony; Esnault, Caroline; Majumdar, Anasuya; Chatterjee, Atreyi Ghatak; Iben, James R; McQueen, Philip G; Yang, Andrew X; Mizuguchi, Takeshi; Grewal, Shiv I S; Levin, Henry L

    2015-11-01

    Transposable elements (TEs) constitute a substantial fraction of the eukaryotic genome and, as a result, have a complex relationship with their host that is both adversarial and dependent. To minimize damage to cellular genes, TEs possess mechanisms that target integration to sequences of low importance. However, the retrotransposon Tf1 of Schizosaccharomyces pombe integrates with a surprising bias for promoter sequences of stress-response genes. The clustering of integration in specific promoters suggests that Tf1 possesses a targeting mechanism that is important for evolutionary adaptation to changes in environment. We report here that Sap1, an essential DNA-binding protein, plays an important role in Tf1 integration. A mutation in Sap1 resulted in a 10-fold drop in Tf1 transposition, and measures of transposon intermediates support the argument that the defect occurred in the process of integration. Published ChIP-Seq data on Sap1 binding combined with high-density maps of Tf1 integration that measure independent insertions at single-nucleotide positions show that 73.4% of all integration occurs at genomic sequences bound by Sap1. This represents high selectivity because Sap1 binds just 6.8% of the genome. A genome-wide analysis of promoter sequences revealed that Sap1 binding and amounts of integration correlate strongly. More important, an alignment of the DNA-binding motif of Sap1 revealed integration clustered on both sides of the motif and showed high levels specifically at positions +19 and -9. These data indicate that Sap1 contributes to the efficiency and position of Tf1 integration. Copyright © 2015 by the Genetics Society of America.
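
    The selectivity claim in this abstract follows from simple arithmetic on the two reported figures: 73.4% of integration falls in the 6.8% of the genome bound by Sap1. The sketch below works through that ratio under a uniform-insertion null model, which is an assumption made here for illustration and not the authors' statistical analysis; the function name is hypothetical.

```python
def fold_enrichment(fraction_of_events, fraction_of_genome):
    """Observed share of events divided by the share expected if events were uniform."""
    return fraction_of_events / fraction_of_genome


# Figures reported in the abstract
sap1_integration = 0.734  # fraction of integration at Sap1-bound sequences
sap1_genome = 0.068       # fraction of the genome bound by Sap1

inside = fold_enrichment(sap1_integration, sap1_genome)
outside = fold_enrichment(1 - sap1_integration, 1 - sap1_genome)

# Enrichment in bound regions, depletion elsewhere, and their ratio
print(round(inside, 1), round(outside, 2), round(inside / outside, 1))
```

    Under this null model, bound regions receive roughly eleven times more integration than expected by chance, while unbound regions receive well under a third of their expected share.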

  1. Single-Nucleotide-Specific Targeting of the Tf1 Retrotransposon Promoted by the DNA-Binding Protein Sap1 of Schizosaccharomyces pombe

    PubMed Central

    Hickey, Anthony; Esnault, Caroline; Majumdar, Anasuya; Chatterjee, Atreyi Ghatak; Iben, James R.; McQueen, Philip G.; Yang, Andrew X.; Mizuguchi, Takeshi; Grewal, Shiv I. S.; Levin, Henry L.

    2015-01-01

    Transposable elements (TEs) constitute a substantial fraction of the eukaryotic genome and, as a result, have a complex relationship with their host that is both adversarial and dependent. To minimize damage to cellular genes, TEs possess mechanisms that target integration to sequences of low importance. However, the retrotransposon Tf1 of Schizosaccharomyces pombe integrates with a surprising bias for promoter sequences of stress-response genes. The clustering of integration in specific promoters suggests that Tf1 possesses a targeting mechanism that is important for evolutionary adaptation to changes in environment. We report here that Sap1, an essential DNA-binding protein, plays an important role in Tf1 integration. A mutation in Sap1 resulted in a 10-fold drop in Tf1 transposition, and measures of transposon intermediates support the argument that the defect occurred in the process of integration. Published ChIP-Seq data on Sap1 binding combined with high-density maps of Tf1 integration that measure independent insertions at single-nucleotide positions show that 73.4% of all integration occurs at genomic sequences bound by Sap1. This represents high selectivity because Sap1 binds just 6.8% of the genome. A genome-wide analysis of promoter sequences revealed that Sap1 binding and amounts of integration correlate strongly. More important, an alignment of the DNA-binding motif of Sap1 revealed integration clustered on both sides of the motif and showed high levels specifically at positions +19 and −9. These data indicate that Sap1 contributes to the efficiency and position of Tf1 integration. PMID:26358720

  2. Interactions of Human Nucleotide Excision Repair Protein XPA with DNA and RPA70 Delta c327: Chemical Shift Mapping and N-15 NMR Relaxation Studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buchko, Garry W.; Daughdrill, Gary W.; De Lorimier, Robert

    1999-12-28

    Human XPA is an essential component in the multienzyme nucleotide excision repair (NER) pathway. The solution structure of the minimal DNA binding domain of XPA (XPA-MBD: M98-F219) was recently determined [Buchko et al. (1998) Nucleic Acids Res. 26, 2779-2788, Ikegami et al (1998) Nat. Struct. Biol. 5, 701-706] and shown to consist of a compact zinc-binding core and a loop-rich C-terminal subdomain connected by a linker sequence.

  3. Full trans-activation mediated by the immediate-early protein of equine herpesvirus 1 requires a consensus TATA box, but not its cognate binding sequence.

    PubMed

    Kim, Seong K; Shakya, Akhalesh K; O'Callaghan, Dennis J

    2016-01-04

    The immediate-early protein (IEP) of equine herpesvirus 1 (EHV-1) has extensive homology to the IEP of alphaherpesviruses and possesses domains essential for trans-activation, including an acidic trans-activation domain (TAD) and binding domains for DNA, TFIIB, and TBP. Our data showed that the IEP directly interacted with transcription factor TFIIA, which is known to stabilize the binding of TBP and TFIID to the TATA box of core promoters. When the TATA box of the EICP0 promoter was mutated to a nonfunctional TATA box, IEP-mediated trans-activation was reduced from 22-fold to 7-fold. The IEP trans-activated the viral promoters in a TATA motif-dependent manner. Our previous data showed that the IEP is able to repress its own promoter when the IEP-binding sequence (IEBS) is located within 26-bp from the TATA box. When the IEBS was located at 100 bp upstream of the TATA box, IEP-mediated trans-activation was very similar to that of the minimal IE(nt -89 to +73) promoter lacking the IEBS. As the distance from the IEBS to the TATA box decreased, IEP-mediated trans-activation progressively decreased, indicating that the IEBS located within 100 bp from the TATA box sequence functions as a distance-dependent repressive element. These results indicated that IEP-mediated full trans-activation requires a consensus TATA box of core promoters, but not its binding to the cognate sequence (IEBS). Copyright © 2015 Elsevier B.V. All rights reserved.

  4. Full trans–activation mediated by the immediate–early protein of equine herpesvirus 1 requires a consensus TATA box, but not its cognate binding sequence

    PubMed Central

    Kim, Seong K.; Shakya, Akhalesh K.; O'Callaghan, Dennis J.

    2015-01-01

    The immediate-early protein (IEP) of equine herpesvirus 1 (EHV-1) has extensive homology to the IEP of alphaherpesviruses and possesses domains essential for trans-activation, including an acidic trans-activation domain (TAD) and binding domains for DNA, TFIIB, and TBP. Our data showed that the IEP directly interacted with transcription factor TFIIA, which is known to stabilize the binding of TBP and TFIID to the TATA box of core promoters. When the TATA box of the EICP0 promoter was mutated to a nonfunctional TATA box, IEP-mediated trans-activation was reduced from 22-fold to 7-fold. The IEP trans-activated the viral promoters in a TATA motif-dependent manner. Our previous data showed that the IEP is able to repress its own promoter when the IEP-binding sequence (IEBS) is located within 26-bp from the TATA box. When the IEBS was located at 100 bp upstream of the TATA box, IEP-mediated trans-activation was very similar to that of the minimal IE(nt −89 to +73) promoter lacking the IEBS. As the distance from the IEBS to the TATA box decreased, IEP-mediated trans-activation progressively decreased, indicating that the IEBS located within 100 bp from the TATA box sequence functions as a distance-dependent repressive element. These results indicated that IEP-mediated full trans-activation requires a consensus TATA box of core promoters, but not its binding to the cognate sequence (IEBS). PMID:26541315

  5. Discovery and validation of information theory-based transcription factor and cofactor binding site motifs.

    PubMed

    Lu, Ruipeng; Mucaki, Eliseos J; Rogan, Peter K

    2017-03-17

    Data from ChIP-seq experiments can derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach, while obviating the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity and detects adjacent cofactor binding sites that coordinate with the targets of primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, discovered 23 cofactor motifs for 127 TFs and revealed six high-confidence novel motifs. The reliability and accuracy of these iPWMs were determined via four independent validation methods, including the detection of experimentally proven binding sites, explanation of effects of characterized SNPs, comparison with previously published motifs and statistical analyses. We also predict previously unreported TF coregulatory interactions (e.g. TF complexes). These iPWMs constitute a powerful tool for predicting the effects of sequence variants in known binding sites, performing mutation analysis on regulatory SNPs and predicting previously unrecognized binding sites and target genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
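    As a rough illustration of what an information theory-based weight matrix measures, the sketch below computes the per-position information content (in bits) of a set of aligned DNA sites, using the textbook formula of 2 bits minus the Shannon entropy of the observed base frequencies. It is not the authors' pipeline: the function name `information_content` and the example sites are hypothetical, and the small-sample correction is omitted.

```python
import math
from collections import Counter


def information_content(sites: list[str]) -> list[float]:
    """Return the information content (bits) at each column of aligned DNA sites.

    Uses R_i = 2 - H_i, where H_i is the Shannon entropy of the base
    frequencies in column i. No small-sample correction is applied.
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    bits = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        total = sum(counts.values())
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
        bits.append(2.0 - entropy)
    return bits


# Hypothetical aligned binding sites (illustrative only).
example_sites = ["ACGT", "ACGA", "ACCT", "ACTG"]
for position, value in enumerate(information_content(example_sites), start=1):
    print(f"position {position}: {value:.2f} bits")
```

    Fully conserved columns score 2 bits and columns with no base preference score 0, so summing the values gives a simple measure of how specific a motif is.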

  6. GATA: A graphic alignment tool for comparative sequence analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nix, David A.; Eisen, Michael B.

    2005-01-01

    Several problems exist with current methods used to align DNA sequences for comparative sequence analysis. Most dynamic programming algorithms assume that conserved sequence elements are collinear. This assumption appears valid when comparing orthologous protein coding sequences. Functional constraints on proteins provide strong selective pressure against sequence inversions, and minimize sequence duplications and feature shuffling. For non-coding sequences this collinearity assumption is often invalid. For example, enhancers contain clusters of transcription factor binding sites that change in number, orientation, and spacing during evolution yet the enhancer retains its activity. Dotplot analysis is often used to estimate non-coding sequence relatedness. Yet dot plots do not actually align sequences and thus cannot account well for base insertions or deletions. Moreover, they lack an adequate statistical framework for comparing sequence relatedness and are limited to pairwise comparisons. Lastly, dot plots and dynamic programming text outputs fail to provide an intuitive means for visualizing DNA alignments.

  7. Functional Requirements for Fab-7 Boundary Activity in the Bithorax Complex.

    PubMed

    Wolle, Daniel; Cleard, Fabienne; Aoki, Tsutomu; Deshpande, Girish; Schedl, Paul; Karch, Francois

    2015-11-01

    Chromatin boundaries are architectural elements that determine the three-dimensional folding of the chromatin fiber and organize the chromosome into independent units of genetic activity. The Fab-7 boundary from the Drosophila bithorax complex (BX-C) is required for the parasegment-specific expression of the Abd-B gene. We have used a replacement strategy to identify sequences that are necessary and sufficient for Fab-7 boundary function in the BX-C. Fab-7 boundary activity is known to depend on factors that are stage specific, and we describe a novel ∼700-kDa complex, the late boundary complex (LBC), that binds to Fab-7 sequences that have insulator functions in late embryos and adults. We show that the LBC is enriched in nuclear extracts from late, but not early, embryos and that it contains three insulator proteins, GAF, Mod(mdg4), and E(y)2. Its DNA binding properties are unusual in that it requires a minimal sequence of >65 bp; however, other than a GAGA motif, the three Fab-7 LBC recognition elements display few sequence similarities. Finally, we show that mutations which abrogate LBC binding in vitro inactivate the Fab-7 boundary in the BX-C. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  8. A dimer of the lymphoid protein RAG1 recognizes the recombination signal sequence and the complex stably incorporates the high mobility group protein HMG2.

    PubMed

    Rodgers, K K; Villey, I J; Ptaszek, L; Corbett, E; Schatz, D G; Coleman, J E

    1999-07-15

    RAG1 and RAG2 are the two lymphoid-specific proteins required for the cleavage of DNA sequences known as the recombination signal sequences (RSSs) flanking V, D or J regions of the antigen-binding genes. Previous studies have shown that RAG1 alone is capable of binding to the RSS, whereas RAG2 only binds as a RAG1/RAG2 complex. We have expressed recombinant core RAG1 (amino acids 384-1008) in Escherichia coli and demonstrated catalytic activity when combined with RAG2. This protein was then used to determine its oligomeric forms and the dissociation constant of binding to the RSS. Electrophoretic mobility shift assays show that up to three oligomeric complexes of core RAG1 form with a single RSS. Core RAG1 was found to exist as a dimer both when free in solution and as the minimal species bound to the RSS. Competition assays show that RAG1 recognizes both the conserved nonamer and heptamer sequences of the RSS. Zinc analysis shows the core to contain two zinc ions. The purified RAG1 protein overexpressed in E.coli exhibited the expected cleavage activity when combined with RAG2 purified from transfected 293T cells. The high mobility group protein HMG2 is stably incorporated into the recombinant RAG1/RSS complex and can increase the affinity of RAG1 for the RSS in the absence of RAG2.

  9. Development of peptoid-based ligands for the removal of cadmium from biological media

    DOE PAGES

    Knight, Abigail S.; Zhou, Effie Y.; Francis, Matthew B.

    2015-05-14

    Cadmium poisoning poses a serious health concern due to cadmium's increasing industrial use, yet there is currently no recommended treatment. The selective coordination of cadmium in a biological environment—i.e. in the presence of serum ions, small molecules, and proteins—is a difficult task. To address this challenge, a combinatorial library of peptoid-based ligands has been evaluated to identify structures that selectively bind to cadmium in human serum with minimal chelation of essential metal ions. Eighteen unique ligands were identified in this screening procedure, and the binding affinity of each was measured using metal titrations monitored by UV-vis spectroscopy. To evaluate the significance of each chelating moiety, sequence rearrangements and substitutions were examined. Analysis of a metal–ligand complex by NMR spectroscopy highlighted the importance of particular residues. Depletion experiments were performed in serum mimetics and human serum with exogenously added cadmium. These depletion experiments were used to compare and demonstrate the ability of these peptoids to remove cadmium from blood-like mixtures. In one of these depletion experiments, the peptoid sequence was able to deplete the cadmium to a level comparable to the reported acute toxicity limit. Evaluation of the metal selectivity in buffered solution and in human serum was performed to verify minimal off-target binding. These studies highlight a screening platform for the identification of metal–ligands that are capable of binding in a complex environment. They additionally demonstrate the potential utility of biologically-compatible ligands for the treatment of heavy metal poisoning.

  10. Development of peptoid-based ligands for the removal of cadmium from biological media

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Knight, Abigail S.; Zhou, Effie Y.; Francis, Matthew B.

    Cadmium poisoning poses a serious health concern due to cadmium's increasing industrial use, yet there is currently no recommended treatment. The selective coordination of cadmium in a biological environment—i.e. in the presence of serum ions, small molecules, and proteins—is a difficult task. To address this challenge, a combinatorial library of peptoid-based ligands has been evaluated to identify structures that selectively bind to cadmium in human serum with minimal chelation of essential metal ions. Eighteen unique ligands were identified in this screening procedure, and the binding affinity of each was measured using metal titrations monitored by UV-vis spectroscopy. To evaluate the significance of each chelating moiety, sequence rearrangements and substitutions were examined. Analysis of a metal–ligand complex by NMR spectroscopy highlighted the importance of particular residues. Depletion experiments were performed in serum mimetics and human serum with exogenously added cadmium. These depletion experiments were used to compare and demonstrate the ability of these peptoids to remove cadmium from blood-like mixtures. In one of these depletion experiments, the peptoid sequence was able to deplete the cadmium to a level comparable to the reported acute toxicity limit. Evaluation of the metal selectivity in buffered solution and in human serum was performed to verify minimal off-target binding. These studies highlight a screening platform for the identification of metal–ligands that are capable of binding in a complex environment. They additionally demonstrate the potential utility of biologically-compatible ligands for the treatment of heavy metal poisoning.

  11. A 31-residue peptide induces aggregation of tau's microtubule-binding region in cells

    NASA Astrophysics Data System (ADS)

    Stöhr, Jan; Wu, Haifan; Nick, Mimi; Wu, Yibing; Bhate, Manasi; Condello, Carlo; Johnson, Noah; Rodgers, Jeffrey; Lemmin, Thomas; Acharya, Srabasti; Becker, Julia; Robinson, Kathleen; Kelly, Mark J. S.; Gai, Feng; Stubbs, Gerald; Prusiner, Stanley B.; Degrado, William F.

    2017-09-01

    The self-propagation of misfolded conformations of tau underlies neurodegenerative diseases, including Alzheimer's. There is considerable interest in discovering the minimal sequence and active conformational nucleus that defines this self-propagating event. The microtubule-binding region, spanning residues 244-372, reproduces much of the aggregation behaviour of tau in cells and animal models. Further dissection of the amyloid-forming region to a hexapeptide from the third microtubule-binding repeat resulted in a peptide that rapidly forms fibrils in vitro. We show that this peptide lacks the ability to seed aggregation of tau244-372 in cells. However, as the hexapeptide is gradually extended to 31 residues, the peptides aggregate more slowly and gain potent activity to induce aggregation of tau244-372 in cells. X-ray fibre diffraction, hydrogen-deuterium exchange and solid-state NMR studies map the beta-forming region to a 25-residue sequence. Thus, the nucleus for self-propagating aggregation of tau244-372 in cells is packaged in a remarkably small peptide.

  12. In-silico studies of neutral drift for functional protein interaction networks

    NASA Astrophysics Data System (ADS)

    Ali, Md Zulfikar; Wingreen, Ned S.; Mukhopadhyay, Ranjan

    We have developed a minimal physically-motivated model of protein-protein interaction networks. Our system consists of two classes of enzymes, activators (e.g. kinases) and deactivators (e.g. phosphatases), and the enzyme-mediated activation/deactivation rates are determined by sequence-dependent binding strengths between enzymes and their targets. The network is evolved by introducing random point mutations in the binding sequences where we assume that each new mutation is either fixed or entirely lost. We apply this model to studies of neutral drift in networks that yield oscillatory dynamics, where we start, for example, with a relatively simple network and allow it to evolve by adding nodes and connections while requiring that dynamics be conserved. Our studies demonstrate both the importance of employing a sequence-based evolutionary scheme and the relative rapidity (in evolutionary time) for the redistribution of function over new nodes via neutral drift. Surprisingly, in addition to this redistribution time we discovered another much slower timescale for network evolution, reflecting hidden order in sequence space that we interpret in terms of sparsely connected domains.

  13. Screening and Characterization of a Novel RNA Aptamer That Specifically Binds to Human Prostatic Acid Phosphatase and Human Prostate Cancer Cells

    PubMed Central

    Kong, Hoon Young; Byun, Jonghoe

    2015-01-01

    Prostatic acid phosphatase (PAP) expression increases proportionally with prostate cancer progression, making it useful in prognosticating intermediate to high-risk prostate cancers. A novel ligand that can specifically bind to PAP would be very helpful for guiding prostate cancer therapy. RNA aptamers bind to target molecules with high specificity and have key advantages such as low immunogenicity and easy synthesis. Here, human PAP-specific aptamers were screened from a 2′-fluoropyrimidine (FY)-modified RNA library by SELEX. The candidate aptamer families were identified within six rounds followed by analysis of their sequences and PAP-specific binding. A gel shift assay was used to identify PAP binding aptamers and the 6N aptamer specifically bound to PAP with a Kd value of 118 nM. RT-PCR and fluorescence labeling analyses revealed that the 6N aptamer bound to PAP-positive mammalian cells, such as PC-3 and LNCaP. IMR-90 negative control cells did not bind the 6N aptamer. Systematic minimization analyses revealed that 50 nucleotide sequences and their two hairpin structures in the 6N 2′-FY RNA aptamer were equally important for PAP binding. Renewed interest in PAP combined with the versatility of RNA aptamers, including conjugation of anti-cancer drugs and nano-imaging probes, could open up a new route for early theragnosis of prostate cancer. PMID:25591398

  14. POZ domain transcription factor, FBI-1, represses transcription of ADH5/FDH by interacting with the zinc finger and interfering with DNA binding activity of Sp1.

    PubMed

    Lee, Dong-Kee; Suh, Dongchul; Edenberg, Howard J; Hur, Man-Wook

    2002-07-26

    The POZ domain is a protein-protein interaction motif that is found in many transcription factors, which are important for development, oncogenesis, apoptosis, and transcription repression. We cloned the POZ domain transcription factor, FBI-1, that recognizes the cis-element (bp -38 to -22) located just upstream of the core Sp1 binding sites (bp -22 to +22) of the ADH5/FDH minimal promoter (bp -38 to +61) in vitro and in vivo, as revealed by electrophoretic mobility shift assay and chromatin immunoprecipitation assay. The ADH5/FDH minimal promoter is potently repressed by the FBI-1. Glutathione S-transferase fusion protein pull-down showed that the POZ domains of FBI-1, Plzf, and Bcl-6 directly interact with the zinc finger DNA binding domain of Sp1. DNase I footprinting assays showed that the interaction prevents binding of Sp1 to the GC boxes of the ADH5/FDH promoter. Gal4-POZ domain fusions targeted proximal to the GC boxes repress transcription of the Gal4 upstream activator sequence-Sp1-adenovirus major late promoter. Our data suggest that POZ domain represses transcription by interacting with Sp1 zinc fingers and by interfering with the DNA binding activity of Sp1.

  15. The right half of the Escherichia coli replication origin is not essential for viability, but facilitates multi-forked replication

    PubMed Central

    Stepankiw, Nicholas; Kaidow, Akihiro; Boye, Erik; Bates, David

    2010-01-01

    Replication initiation is a key event in the cell cycle of all organisms and oriC, the replication origin in Escherichia coli, serves as the prototypical model for this process. The minimal sequence required for oriC function was originally determined entirely from plasmid studies using cloned origin fragments, which have previously been shown to differ dramatically in sequence requirement from the chromosome. Using an in vivo recombineering strategy to exchange wt oriCs for mutated ones regardless of whether they are functional origins or not, we have determined the minimal origin sequence that will support chromosome replication. Nearly the entire right half of oriC could be deleted without loss of origin function, demanding a reassessment of existing models for initiation. Cells carrying the new DnaA box-depleted 163 bp minimal oriC exhibited little or no loss of fitness under slow-growth conditions, but were sensitive to rich medium, suggesting that the dense packing of initiator binding sites that is a hallmark of prokaryotic origins, has likely evolved to support the increased demands of multi-forked replication. PMID:19737351

  16. Attractors in Sequence Space: Agent-Based Exploration of MHC I Binding Peptides.

    PubMed

    Jäger, Natalie; Wisniewska, Joanna M; Hiss, Jan A; Freier, Anja; Losch, Florian O; Walden, Peter; Wrede, Paul; Schneider, Gisbert

    2010-01-12

    Ant Colony Optimization (ACO) is a meta-heuristic that utilizes a computational analogue of ant trail pheromones to solve combinatorial optimization problems. The size of the ant colony and the representation of the ants' pheromone trails are specific to the given optimization problem. In the present study, we employed ACO to generate novel peptides that stabilize MHC I protein on the plasma membrane of a murine lymphoma cell line. A jury of feedforward neural network classifiers served as fitness function for peptide design by ACO. Bioactive murine MHC I H-2K(b) stabilizing as well as nonstabilizing octapeptides were designed, synthesized and tested. These peptides reveal residue motifs that are relevant for MHC I receptor binding. We demonstrate how the performance of the implemented ACO algorithm depends on the colony size and the size of the search space. The actual peptide design process by ACO constitutes a search path in sequence space that can be visualized as trajectories on a self-organizing map (SOM). By projecting the sequence space on a SOM we visualize the convergence of the different solutions that emerge during the optimization process in sequence space. The SOM representation reveals attractors in sequence space for MHC I binding peptides. The combination of ACO and SOM enables systematic peptide optimization. This technique allows for the rational design of various types of bioactive peptides with minimal experimental effort. Here, we demonstrate its successful application to the design of MHC-I binding and nonbinding peptides which exhibit substantial bioactivity in a cell-based assay. Copyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Screening of Pro-Asp Sequences Exposed on Bacteriophage M13 as an Ideal Anchor for Gold Nanocubes.

    PubMed

    Lee, Hwa Kyoung; Lee, Yujean; Kim, Hyori; Lee, Hye-Eun; Chang, Hyejin; Nam, Ki Tae; Jeong, Dae Hong; Chung, Junho

    2017-09-15

    Bacteriophages are thought to be ideal vehicles for linking antibodies to nanoparticles. Here, we define the sequence of peptides exposed as a fusion protein on M13 bacteriophages to yield optimal binding of gold nanocubes and efficient bacteriophage amplification. We generated five helper bacteriophage libraries using AE(X)2DP, AE(X)3DP, AE(X)4DP, AE(X)5DP, and AE(X)6DP as the exposed portion of pVIII, in which X was a randomized amino acid residue encoded by the nucleotide sequence NNK. Efficient phage amplification was achievable only in the AE(X)2DP, AE(X)3DP, and AE(X)4DP libraries. Through biopanning with gold nanocubes, we enriched the phage clones and selected the clone with the highest fold change after enrichment. This clone displayed Pro-Asp on the surface of the bacteriophage and had amplification yields similar to those of the wild-type helper bacteriophage (VCSM13). The clone displayed even binding of gold nanocubes along its length and minimal aggregation after binding. We conclude that, for efficient amplification, the exposed pVIII amino acid length should be limited to six residues and Ala-Glu-Pro-Asp-Asp-Pro (AEPDDP) is the ideal fusion protein sequence for guaranteeing the optimal formation of a complex with gold nanocubes.
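    The NNK randomization mentioned in this abstract uses any base (N) at the first two codon positions and G or T (K) at the third. The sketch below enumerates those codons against the standard genetic code to show which residues such a library can encode. It is a generic illustration, not code from the study; the names `CODON_TABLE` and `nnk_codons` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons() -> list[str]:
    """Return every codon matching NNK (N = A/C/G/T, K = G/T)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


codons = nnk_codons()
encoded = {CODON_TABLE[c] for c in codons if CODON_TABLE[c] != "*"}
stops = sorted(c for c in codons if CODON_TABLE[c] == "*")

print(f"{len(codons)} codons, {len(encoded)} amino acids, stop codons: {stops}")
```

    Restricting the third position to G or T keeps all twenty amino acids available while leaving TAG as the only stop codon, which is why NNK is a common choice for randomized display libraries.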

  18. Rsp5 WW domains interact directly with the carboxyl-terminal domain of RNA polymerase II.

    PubMed

    Chang, A; Cheang, S; Espanel, X; Sudol, M

    2000-07-07

    RSP5 is an essential gene in Saccharomyces cerevisiae and was recently shown to form a physical and functional complex with RNA polymerase II (RNA pol II). The amino-terminal half of Rsp5 consists of four domains: a C2 domain, which binds membrane phospholipids; and three WW domains, which are protein interaction modules that bind proline-rich ligands. The carboxyl-terminal half of Rsp5 contains a HECT (homologous to E6-AP carboxyl terminus) domain that catalytically ligates ubiquitin to proteins and functionally classifies Rsp5 as an E3 ubiquitin-protein ligase. The C2 and WW domains are presumed to act as membrane localization and substrate recognition modules, respectively. We report that the second (and possibly third) Rsp5 WW domain mediates binding to the carboxyl-terminal domain (CTD) of the RNA pol II large subunit. The CTD comprises a heptamer (YSPTSPS) repeated 26 times and a PXY core that is critical for interaction with a specific group of WW domains. An analysis of synthetic peptides revealed a minimal CTD sequence that is sufficient to bind to the second Rsp5 WW domain (Rsp5 WW2) in vitro and in yeast two-hybrid assays. Furthermore, we found that specific "imperfect" CTD repeats can form a complex with Rsp5 WW2. In addition, we have shown that phosphorylation of this minimal CTD sequence on serine, threonine and tyrosine residues acts as a negative regulator of the Rsp5 WW2-CTD interaction. In view of the recent data pertaining to phosphorylation-driven interactions between the RNA pol II CTD and the WW domain of Ess1/Pin1, we suggest that CTD dephosphorylation may be a prerequisite for targeted RNA pol II degradation.

  19. The SANT domain of human MI-ER1 interacts with Sp1 to interfere with GC box recognition and repress transcription from its own promoter.

    PubMed

    Ding, Zhihu; Gillespie, Laura L; Mercer, F Corinne; Paterno, Gary D

    2004-07-02

    To gain insight into the regulation of hmi-er1 expression, we cloned a human genomic DNA fragment containing one of the two hmi-er1 promoters and consisting of 1460 bp upstream of the translation initiation codon of hMI-ER1. Computer-assisted sequence analysis revealed that the hmi-er1 promoter region contains a CpG island but lacks an identifiable TATA element, initiator sequence and downstream promoter element. This genomic DNA was able to direct transcription of a luciferase reporter gene in a variety of human cell lines, and the minimal promoter was shown to be located within -68/+144 bp. Several putative Sp1 binding sites were identified, and we show that Sp1 can bind to the hmi-er1 minimal promoter and increase transcription, suggesting that the level of hmi-er1 expression may depend on the availability of Sp1 protein. Functional analysis revealed that hMI-ER1 represses Sp1-activated transcription from the minimal promoter by a histone deacetylase-independent mechanism. Chromatin immunoprecipitation analysis demonstrated that both Sp1 and hMI-ER1 are associated with the chromatin of the hmi-er1 promoter and that overexpression of hMI-ER1 in cell lines that allow Tet-On-inducible expression resulted in loss of detectable Sp1 from the endogenous hmi-er1 promoter. The mechanism by which this occurs does not involve binding of hMI-ER1 to cis-acting elements. Instead, we show that hMI-ER1 physically associates with Sp1 and that endogenous complexes containing the two proteins could be detected in vivo. Furthermore, hMI-ER1 specifically interferes with binding of Sp1 to the hmi-er1 minimal promoter as well as to an Sp1 consensus oligonucleotide. Deletion analysis revealed that this interaction occurs through a region containing the SANT domain of hMI-ER1. Together, these data reveal a functional role for the SANT domain in the action of co-repressor regulatory factors and suggest that the association of hMI-ER1 with Sp1 represents a novel mechanism for the negative regulation of Sp1 target promoters.

  20. A simplified Sanger sequencing method for full genome sequencing of multiple subtypes of human influenza A viruses.

    PubMed

    Deng, Yi-Mo; Spirason, Natalie; Iannello, Pina; Jelley, Lauren; Lau, Hilda; Barr, Ian G

    2015-07-01

    Full genome sequencing of influenza A viruses (IAV), including those that arise from annual influenza epidemics, is undertaken to determine if reassorting has occurred or if other pathogenic traits are present. Traditionally IAV sequencing has been biased toward the major surface glycoproteins haemagglutinin and neuraminidase, while the internal genes are often ignored. Despite the development of next generation sequencing (NGS), many laboratories are still reliant on conventional Sanger sequencing to sequence IAV. The aim was to develop a minimal and robust set of primers for Sanger sequencing of the full genome of IAV currently circulating in humans. A set of 13 primer pairs was designed that enabled amplification of the six internal genes of multiple human IAV subtypes including the recent avian influenza A(H7N9) virus from China. Specific primers were designed to amplify the HA and NA genes of each IAV subtype of interest. Each of the primers also incorporated a binding site at its 5'-end for either a forward or reverse M13 primer, such that only two M13 primers were required for all subsequent sequencing reactions. This minimal set of primers was suitable for sequencing the six internal genes of all currently circulating human seasonal influenza A subtypes as well as the avian A(H7N9) viruses that have infected humans in China. This streamlined Sanger sequencing protocol could be used to generate full genome sequence data more rapidly and easily than existing influenza genome sequencing protocols. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.

  1. Characterization of the rat RALDH1 promoter. A functional CCAAT and octamer motif are critical for basal promoter activity.

    PubMed

    Guimond, Julie; Devost, Dominic; Brodeur, Helene; Mader, Sylvie; Bhat, Pangala V

    2002-12-12

    Retinal dehydrogenase type 1 (RALDH1) catalyzes the oxidation of retinal to retinoic acid (RA), a metabolite of vitamin A important for embryogenesis and tissue differentiation. Rat RALDH1 is expressed at high levels in developing kidney and in stomach and intestine epithelia. To understand the mechanisms of the transcriptional regulation of rat RALDH1, we cloned a 1360-base pair (bp) 5'-flanking region of the RALDH1 gene. Using luciferase reporter constructs transfected into HEK 293 and LLCPK (kidney-derived) cells, basal promoter activity was associated with sequences between -80 and +43. In this minimal promoter region, TATA and CCAAT cis-acting elements as well as SP1, AP1 and octamer (Oct)-binding sites were present. The CCAAT box and Oct-binding site, located between positions -72 and -68 and -56 and -49, respectively, were shown by deletion analysis and site-directed mutation to be critical for promoter activity. Nuclear extracts from kidney cells contain proteins specifically binding the Oct and CCAAT sequences, resulting in the formation of six complexes, while different patterns of complexes were observed with non-kidney cell extracts. Gel shift assays using either single or double mutations of the Oct and CCAAT sequences as well as super shift assays demonstrated single and double occupancy of these two sites by Oct-1 and CBF-A. In addition, unidentified proteins also bound the Oct motif specifically in the absence of CBF-A binding. These results demonstrate specific involvement of Oct and CCAAT-binding proteins in the regulation of the RALDH1 gene.

  2. The Minimal Replicator of Epstein-Barr Virus oriP

    PubMed Central

    Yates, John L.; Camiolo, Sarah M.; Bashaw, Jacqueline M.

    2000-01-01

    oriP is a 1.7-kb region of the Epstein-Barr virus (EBV) chromosome that supports the replication and stable maintenance of plasmids in human cells. oriP contains two essential components, called the DS and the FR, both of which contain multiple binding sites for the EBV-encoded protein, EBNA-1. The DS appears to function as the replicator of oriP, while the FR acts in conjunction with EBNA-1 to prevent the loss of plasmids from proliferating cells. Because of EBNA-1's role in stabilizing plasmids through the FR, it has not been entirely clear to what extent EBNA-1 might be required for replication from oriP per se, and a recent study has questioned whether EBNA-1 has any direct role in replication. In the present study we found that plasmids carrying oriP required EBNA-1 to replicate efficiently even when assayed only 2 days after plasmids were introduced into the cell lines 143B and 293. Significantly, using 293 cells it was demonstrated that the plasmid-retention function of EBNA-1 and the FR did not contribute significantly to the accumulation of replicated plasmids, and the DS supported efficient EBNA-1-dependent replication in the absence of the FR. The DS contains two pairs of closely spaced EBNA-1 binding sites, and a previous study had shown that both sites within either pair are required for activity. However, it was unclear from previous work what additional sequences within the DS might be required. We found that each “half” of the DS, including a pair of closely spaced EBNA-1 binding sites, had significant replicator activity when the other half had been deleted. The only significant DNA sequences that the two halves of the DS share in common, other than EBNA-1 binding sites, is a 9-bp sequence that is present twice in the “left half” and once in the “right half.” These nonamer repeats, while not essential for activity, contributed significantly to the activity of each half of the DS. 
    Two thymines occur at unique positions within EBNA-1 binding sites 1 and 4 at the DS and become sensitive to oxidation by permanganate when EBNA-1 binds, but mutation of each to the consensus base, adenine, actually improved the activity of each half of the DS slightly. In conclusion, the DS of oriP is an EBNA-1-dependent replicator, and its minimal active core appears to be simply two properly spaced EBNA-1 binding sites. PMID:10775587

  3. A 31-residue peptide induces aggregation of tau’s microtubule-binding region in cells

    PubMed Central

    Stöhr, Jan; Wu, Haifan; Nick, Mimi; Wu, Yibing; Bhate, Manasi; Condello, Carlo; Johnson, Noah; Rodgers, Jeffrey; Lemmin, Thomas; Achyraya, Srabasti; Becker, Julia; Robinson, Kathleen; Kelly, Mark J.S.; Gai, Feng; Stubbs, Gerald; Prusiner, Stanley B.; DeGrado, William F.

    2018-01-01

    The self-propagation of misfolded conformations of tau underlies neurodegenerative diseases, including Alzheimer’s disease. There is considerable interest in discovering the minimal sequence and active conformational nucleus that defines this self-propagating event. The microtubule-binding region, spanning residues 244-372, reproduces much of the aggregation behavior of tau in cells and animal models. Further dissection of the amyloid-forming region to a hexapeptide from the third microtubule-binding repeat resulted in a peptide that rapidly forms fibrils in vitro. We show here that this peptide lacks the ability to seed aggregation of tau244-372 in cells. However, as the hexapeptide is gradually extended to 31 residues, the peptides aggregate more slowly and gain potent activity to induce aggregation of tau244-372 in cells. X-ray fiber diffraction, hydrogen-deuterium exchange and solids NMR studies map the beta-forming region to a 25-residue sequence. Thus, the nucleus for self-propagating aggregation of tau244-372 in cells is packaged in a remarkably small peptide. PMID:28837163

  4. Epitope mapping of PR81 anti-MUC1 monoclonal antibody following PEPSCAN and phage display techniques.

    PubMed

    Mohammadi, Mohammad; Rasaee, Mohammad Javad; Rajabibazl, Masoumeh; Paknejad, Malihe; Zare, Mehrak; Mohammadzadeh, Sara

    2007-08-01

    PR81 is an anti-MUC1 monoclonal antibody (MAb) which was generated against human MUC1 mucin and reacted with breast cancer tissue, MUC1-positive cell lines (MCF-7, BT-20, and T-47D), and a synthetic peptide containing the tandem repeat sequence of MUC1. Here we characterized the binding properties of PR81 against the tandem repeat of MUC1 by two different epitope mapping techniques, namely, PEPSCAN and phage display. Epitope mapping of PR81 MAb by PEPSCAN revealed a minimal consensus binding sequence, PDTRP, which is found on the MUC1 peptide as the most important epitope. Using the phage display peptide library, we identified the motif PD(T/S/G)RP as an epitope and the motif AVGLSPDGSRGV as a mimotope recognized by PR81. Results of these two methods showed that the two residues, arginine and aspartic acid, have important roles in antibody binding and that threonine can be substituted by either glycine or serine. These results may be of importance in tailoring antigens used in immunoassays.
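
    The reported epitope motif PD(T/S/G)RP maps directly onto a regular expression. The sketch below scans a peptide for that motif. The 20-residue MUC1 tandem repeat used as input is an assumption brought in for illustration (it is not given in the abstract), and the helper name find_epitopes is hypothetical.

```python
import re

# Motif from the abstract: P-D-(T/S/G)-R-P.
EPITOPE = re.compile(r"PD[TSG]RP")


def find_epitopes(peptide: str):
    """Return (start index, matched residues) for each motif occurrence."""
    return [(m.start(), m.group()) for m in EPITOPE.finditer(peptide.upper())]


# Assumed input: one MUC1 tandem repeat unit, written twice in a row.
MUC1_REPEAT = "GVTSAPDTRPAPGSTAPPAH"
hits = find_epitopes(MUC1_REPEAT * 2)

# The mimotope from the abstract binds PR81 but does not contain the
# linear motif itself, so a plain sequence scan finds nothing in it.
mimotope_hits = find_epitopes("AVGLSPDGSRGV")
```

    A sequence scan of this kind only finds linear matches; it says nothing about mimotopes, which mimic the epitope without sharing its sequence.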

  5. Recent Progress in Aptamer-Based Functional Probes for Bioanalysis and Biomedicine.

    PubMed

    Zhang, Huimin; Zhou, Leiji; Zhu, Zhi; Yang, Chaoyong

    2016-07-11

    Nucleic acid aptamers are short synthetic DNA or RNA sequences that can bind to a wide range of targets with high affinity and specificity. In recent years, aptamers have attracted increasing research interest due to their unique features of high binding affinity and specificity, small size, excellent chemical stability, easy chemical synthesis, facile modification, and minimal immunogenicity. These properties make aptamers ideal recognition ligands for bioanalysis, disease diagnosis, and cancer therapy. This review highlights the recent progress in aptamer selection and the latest applications of aptamer-based functional probes in the fields of bioanalysis and biomedicine. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Secondary structure prediction and structure-specific sequence analysis of single-stranded DNA.

    PubMed

    Dong, F; Allawi, H T; Anderson, T; Neri, B P; Lyamichev, V I

    2001-08-01

    DNA sequence analysis by oligonucleotide binding is often affected by interference with the secondary structure of the target DNA. Here we describe an approach that improves DNA secondary structure prediction by combining enzymatic probing of DNA by structure-specific 5'-nucleases with an energy minimization algorithm that utilizes the 5'-nuclease cleavage sites as constraints. The method can identify structural differences between two DNA molecules caused by minor sequence variations such as a single nucleotide mutation. It also demonstrates the existence of long-range interactions between DNA regions separated by >300 nt and the formation of multiple alternative structures by a 244 nt DNA molecule. The differences in the secondary structure of DNA molecules revealed by 5'-nuclease probing were used to design structure-specific probes for mutation discrimination that target the regions of structural, rather than sequence, differences. We also demonstrate the performance of structure-specific 'bridge' probes complementary to non-contiguous regions of the target molecule. The structure-specific probes do not require the high stringency binding conditions necessary for methods based on mismatch formation and permit mutation detection at temperatures from 4 to 37 degrees C. Structure-specific sequence analysis is applied for mutation detection in the Mycobacterium tuberculosis katG gene and for genotyping of the hepatitis C virus.

  7. Molecular Evolution of the Oxygen-Binding Hemerythrin Domain

    PubMed Central

    Alvarez-Carreño, Claudia; Becerra, Arturo; Lazcano, Antonio

    2016-01-01

    Background The evolution of oxygenic photosynthesis during Precambrian times entailed the diversification of strategies minimizing reactive oxygen species-associated damage. Four families of oxygen-carrier proteins (hemoglobin, hemerythrin and the two non-homologous families of arthropodan and molluscan hemocyanins) are known to have evolved independently the capacity to bind oxygen reversibly, providing cells with strategies to cope with the evolutionary pressure of oxygen accumulation. Oxygen-binding hemerythrin was first studied in marine invertebrates but further research has made it clear that it is present in the three domains of life, strongly suggesting that its origin predated the emergence of eukaryotes. Results Oxygen-binding hemerythrins are a monophyletic sub-group of the hemerythrin/HHE (histidine, histidine, glutamic acid) cation-binding domain. Oxygen-binding hemerythrin homologs were unambiguously identified in 367/2236 bacterial, 21/150 archaeal and 4/135 eukaryotic genomes. Overall, oxygen-binding hemerythrin homologues were found in the same proportion as single-domain and as long protein sequences. The associated functions of protein domains in long hemerythrin sequences can be classified in three major groups: signal transduction, phosphorelay response regulation, and protein binding. This suggests that in many organisms the reversible oxygen-binding capacity was incorporated in signaling pathways. A maximum-likelihood tree of oxygen-binding hemerythrin homologues revealed a complex evolutionary history in which lateral gene transfer, duplications and gene losses appear to have played an important role. Conclusions Hemerythrin is an ancient protein domain with a complex evolutionary history. The distinctive iron-binding coordination site of oxygen-binding hemerythrins evolved first in prokaryotes, very likely prior to the divergence of Firmicutes and Proteobacteria, and spread into many bacterial, archaeal and eukaryotic species. 
    The later evolution of the oxygen-binding hemerythrin domain in both prokaryotes and eukaryotes led to a wide variety of functions, ranging from protection against oxidative damage in anaerobic and microaerophilic organisms, to oxygen supplying to particular enzymes and pathways in aerobic and facultative species. PMID:27336621

  8. The molecular mechanism for interaction of ceruloplasmin and myeloperoxidase

    NASA Astrophysics Data System (ADS)

    Bakhautdin, Bakytzhan; Bakhautdin, Esen Göksöy

    2016-04-01

    Ceruloplasmin (Cp) is a copper-containing ferroxidase with potent antioxidant activity. Cp is expressed by hepatocytes and activated macrophages and is known as a physiologic inhibitor of myeloperoxidase (MPO). The enzymatic activity of MPO produces anti-microbial agents and strong prooxidants such as hypochlorous acid and has the potential to damage host tissue at sites of inflammation and infection. Thus Cp-MPO interaction and inhibition of MPO has previously been suggested as an important control mechanism of excessive MPO activity. Our aim in this study was to identify the minimal Cp domain or peptide that interacts with MPO. We first confirmed Cp-MPO interaction by ELISA and surface plasmon resonance (SPR). SPR analysis of the interaction yielded an affinity of 30 nM between Cp and MPO. We then designed and synthesized 87 overlapping peptides spanning the entire amino acid sequence of Cp. Each peptide was tested for binding to MPO by direct binding ELISA. Two of the 87 peptides, P18 and P76, strongly interacted with MPO. Amino acid sequence analysis of the identified peptides revealed high sequence and structural homology between them. Further analysis of Cp's crystal structure with PyMOL software revealed that both peptides represent surface-exposed sites of Cp and face nearly the same direction. To confirm our finding we raised anti-P18 antiserum in rabbits and demonstrated that this antiserum disrupts Cp-MPO binding and rescues MPO activity. Collectively, our results confirm Cp-MPO interaction and identify two nearly identical sites on Cp that specifically bind MPO. We propose that inhibition of MPO by Cp requires two nearly identical sites on Cp to bind homodimeric MPO simultaneously and at an angle of at least 120 degrees, which, in turn, exerts tension on MPO and results in conformational change.
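
    An overlapping peptide library such as the one described here can be generated with a sliding window. The abstract gives neither the peptide length nor the overlap, so both are parameters in this sketch, and the function name and toy sequence are hypothetical.

```python
def overlapping_peptides(sequence: str, length: int, step: int):
    """Cut a protein sequence into windows of `length` residues.

    A new window starts every `step` residues, so consecutive peptides
    overlap by (length - step) residues. A final window anchored at the
    C-terminus is added when needed so the whole sequence is covered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        return [sequence]
    starts = list(range(0, len(sequence) - length + 1, step))
    last_start = len(sequence) - length
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]


# Toy sequence (letters only, not a real protein) to show the behaviour.
toy_peptides = overlapping_peptides("ABCDEFGHIJK", length=4, step=3)
```

    Choosing a step smaller than the peptide length means any binding site shorter than the overlap appears intact in at least one peptide, which is the usual reason for overlapping designs.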

  9. The structure of the SBP-Tag–streptavidin complex reveals a novel helical scaffold bridging binding pockets on separate subunits

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barrette-Ng, Isabelle H.; Wu, Sau-Ching; Tjia, Wai-Mui

    2013-05-01

    The structure of the SBP-Tag–streptavidin complex reveals a novel mode of peptide recognition in which a single peptide binds simultaneously to biotin-binding pockets from adjacent subunits of streptavidin. The molecular details of peptide recognition suggest how the SBP-Tag can be further modified to become an even more useful tag for a wider range of biotechnological applications. The 38-residue SBP-Tag binds to streptavidin more tightly (Kd ≃ 2.5–4.9 nM) than most if not all other known peptide sequences. Crystallographic analysis at 1.75 Å resolution shows that the SBP-Tag binds to streptavidin in an unprecedented manner by simultaneously interacting with biotin-binding pockets from two separate subunits. An N-terminal HVV peptide sequence (residues 12–14) and a C-terminal HPQ sequence (residues 31–33) form the bulk of the direct interactions between the SBP-Tag and the two biotin-binding pockets. Surprisingly, most of the peptide spanning these two sites (residues 17–28) adopts a regular α-helical structure that projects three leucine side chains into a groove formed at the interface between two streptavidin protomers. The crystal structure shows that residues 1–10 and 35–38 of the original SBP-Tag identified through in vitro selection and deletion analysis do not appear to contact streptavidin and thus may not be important for binding. A 25-residue peptide comprising residues 11–34 (SBP-Tag2) was synthesized and shown using surface plasmon resonance to bind streptavidin with very similar affinity and kinetics when compared with the SBP-Tag. The SBP-Tag2 was also added to the C-terminus of β-lactamase and was shown to be just as effective as the full-length SBP-Tag in affinity purification. These results validate the molecular structure of the SBP-Tag–streptavidin complex and establish a minimal bivalent streptavidin-binding tag from which further rational design and optimization can proceed.

  10. HEXIM1 is a promiscuous double-stranded RNA-binding protein and interacts with RNAs in addition to 7SK in cultured cells

    PubMed Central

    Li, Qintong; Cooper, Jeffrey J.; Altwerger, Gary H.; Feldkamp, Michael D.; Shea, Madeline A.; Price, David H.

    2007-01-01

    P-TEFb regulates eukaryotic gene expression at the level of transcription elongation, and is itself controlled by the reversible association of 7SK RNA and an RNA-binding protein HEXIM1 or HEXIM2. In an effort to determine the minimal region of 7SK needed to interact with HEXIM1 in vitro, we found that an oligo comprised of nucleotides 10–48 sufficed. A bid to further narrow down the minimal region of 7SK led to a surprising finding that HEXIM1 binds to double-stranded RNA in a sequence-independent manner. Both dsRNA and 7SK (10–48), but not dsDNA, competed efficiently with full-length 7SK for HEXIM1 binding in vitro. Upon binding dsRNA, a large conformational change was observed in HEXIM1 that allowed the recruitment and inhibition of P-TEFb. Both subcellular fractionation and immunofluorescence demonstrated that, while most HEXIM1 is found in the nucleus, a significant fraction is found in the cytoplasm. Immunoprecipitation experiments demonstrated that both nuclear and cytoplasmic HEXIM1 is associated with RNA. Interestingly, the one microRNA examined (mir-16) was found in HEXIM1 immunoprecipitates, while the small nuclear RNAs, U6 and U2, were not. Our study illuminates novel properties of HEXIM1 both in vitro and in vivo, and suggests that HEXIM1 may be involved in other nuclear and cytoplasmic processes besides controlling P-TEFb. PMID:17395637

  11. CRISPRdirect: software for designing CRISPR/Cas guide RNA with reduced off-target sites

    PubMed Central

    Naito, Yuki; Hino, Kimihiro; Bono, Hidemasa; Ui-Tei, Kumiko

    2015-01-01

    Summary: CRISPRdirect is a simple and functional web server for selecting rational CRISPR/Cas targets from an input sequence. The CRISPR/Cas system is a promising technique for genome engineering which allows target-specific cleavage of genomic DNA guided by Cas9 nuclease in complex with a guide RNA (gRNA), that complementarily binds to a ∼20 nt targeted sequence. The target sequence requirements are twofold. First, the 5′-NGG protospacer adjacent motif (PAM) sequence must be located adjacent to the target sequence. Second, the target sequence should be specific within the entire genome in order to avoid off-target editing. CRISPRdirect enables users to easily select rational target sequences with minimized off-target sites by performing exhaustive searches against genomic sequences. The server currently incorporates the genomic sequences of human, mouse, rat, marmoset, pig, chicken, frog, zebrafish, Ciona, fruit fly, silkworm, Caenorhabditis elegans, Arabidopsis, rice, Sorghum and budding yeast. Availability: Freely available at http://crispr.dbcls.jp/. Contact: y-naito@dbcls.rois.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25414360
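
    The first target requirement above (a 20-nt sequence immediately followed by an NGG PAM) can be illustrated with a short scan of both strands. This is a minimal sketch, not CRISPRdirect's implementation: it does not perform the genome-wide specificity search that the server adds, and the function names are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# A 20-nt target followed directly by an NGG PAM. The lookahead lets
# overlapping candidates be reported.
TARGET = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def find_targets(seq: str):
    """List (strand, start, target, pam) candidates on both strands.

    For the '-' strand, `start` is counted on the reverse-complemented
    sequence, not on the input coordinates.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in TARGET.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


# Toy input: twenty A's followed by a TGG PAM.
toy_hits = find_targets("A" * 20 + "TGG")
```

    Each candidate found this way would still need the second check described in the abstract, a search against the whole genome for near-identical sites, before being used as a guide.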

  12. Molecular identification and transcriptional regulation of porcine IFIT2 gene.

    PubMed

    Yang, Xiuqin; Jing, Xiaoyan; Song, Yanfang; Zhang, Caixia; Liu, Di

    2018-04-06

    IFN-induced protein with tetratricopeptide repeats 2 (IFIT2) plays important roles in host defense against viral infection, as revealed by studies in humans and mice. However, little is known about porcine IFIT2 (pIFIT2). Here, we performed molecular cloning, expression profiling, and transcriptional regulation analysis of pIFIT2. The pIFIT2 gene, located on chromosome 14, is composed of two exons and has a complete coding sequence of 1407 bp. The encoded polypeptide, 468 aa in length, has three tetratricopeptide repeat motifs. The pIFIT2 gene was expressed unevenly across all eleven tissues studied, with the highest abundance in spleen. Poly(I:C) treatment strongly upregulated the mRNA level and promoter activity of the pIFIT2 gene. The 1759-bp sequence upstream of the start codon, which was assigned +1 here, has promoter activity, and deltaEF1 acts as a transcription repressor by binding to sequences at positions -1774 to -1764. The minimal promoter region lies between nucleotide positions -162 and -126. Two adjacent interferon-stimulated response elements (ISREs) and two nuclear factor (NF)-κB binding sites were identified between positions -310 and -126. The ISRE elements act both alone and in synergy, with the one closer to the start codon being stronger, as do the NF-κB binding sites. A synergistic effect was also found between the ISRE and NF-κB binding sites. Additionally, a third ISRE element was identified between positions -1661 and -1579. These findings will contribute to clarifying the antiviral effect and underlying mechanisms of pIFIT2.

  13. A Versatile Platform for Nanotechnology Based on Circular Permutation of a Chaperonin Protein

    NASA Technical Reports Server (NTRS)

    Paavola, Chad; McMillan, Andrew; Trent, Jonathan; Chan, Suzanne; Mazzarella, Kellen; Li, Yi-Fen

    2004-01-01

    A number of protein complexes have been developed as nanoscale templates. These templates can be functionalized using the peptide sequences that bind inorganic materials. However, it is difficult to integrate peptides into a specific position within a protein template. Integrating intact proteins with desirable binding or catalytic activities is an even greater challenge. We present a general method for modifying protein templates using circular permutation so that additional peptide sequence can be added in a wide variety of specific locations. Circular permutation is a reordering of the polypeptide chain such that the original termini are joined and new termini are created elsewhere in the protein. New sequence can be joined to the protein termini without perturbing the protein structure and with minimal limitation on the size and conformation of the added sequence. We have used this approach to modify a chaperonin protein template, placing termini at five different locations distributed across the surface of the protein complex. These permutants are competent to form the double-ring structures typical of chaperonin proteins. The permuted double-rings also form the same assemblies as the unmodified protein. We fused a fluorescent protein to two representative permutants and demonstrated that it assumes its active structure and does not interfere with assembly of chaperonin double-rings.

  14. Efficient engineering of chromosomal ribosome binding site libraries in mismatch repair proficient Escherichia coli.

    PubMed

    Oesterle, Sabine; Gerngross, Daniel; Schmitt, Steven; Roberts, Tania Michelle; Panke, Sven

    2017-09-26

    Multiplexed gene expression optimization via modulation of gene translation efficiency through ribosome binding site (RBS) engineering is a valuable approach for optimizing artificial properties in bacteria, ranging from genetic circuits to production pathways. Established algorithms design smart RBS-libraries based on a single partially-degenerate sequence that efficiently samples the entire space of translation initiation rates. However, the sequence space that is accessible when integrating the library by CRISPR/Cas9-based genome editing is severely restricted by DNA mismatch repair (MMR) systems. MMR efficiency depends on the type and length of the mismatch and thus effectively removes potential library members from the pool. Rather than working in MMR-deficient strains, which accumulate off-target mutations, or depending on temporary MMR inactivation, which requires additional steps, we eliminate this limitation by developing a pre-selection rule of genome-library-optimized-sequences (GLOS) that enables introducing large functional diversity into MMR-proficient strains with sequences that are no longer subject to MMR-processing. We implement several GLOS-libraries in Escherichia coli and show that GLOS-libraries indeed retain diversity during genome editing and that such libraries can be used in complex genome editing operations such as concomitant deletions. We argue that this approach allows for stable and efficient fine tuning of chromosomal functions with minimal effort.
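
    A library defined by a single partially degenerate sequence can be enumerated by expanding each IUPAC ambiguity code into the bases it stands for. The sketch below shows only that expansion step; it does not implement the GLOS pre-selection rule or any model of mismatch repair, and the example RBS sequence is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(degenerate: str):
    """Enumerate every concrete sequence encoded by a degenerate sequence."""
    choices = [IUPAC[base] for base in degenerate.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical degenerate RBS core: R = A/G, N = any base.
library = expand("AGGRNG")
```

    The library size is the product of the number of bases allowed at each position (here 2 x 4 = 8), which is why a few degenerate positions are enough to span a wide range of variants.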

  15. Computational approaches for identification of conserved/unique binding pockets in the A chain of ricin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ecale Zhou, C L; Zemla, A T; Roe, D

    2005-01-29

    Specific and sensitive ligand-based protein detection assays that employ antibodies or small molecules such as peptides, aptamers, or other small molecules require that the corresponding surface region of the protein be accessible and that there be minimal cross-reactivity with non-target proteins. To reduce the time and cost of laboratory screening efforts for diagnostic reagents, we developed new methods for evaluating and selecting protein surface regions for ligand targeting. We devised combined structure- and sequence-based methods for identifying 3D epitopes and binding pockets on the surface of the A chain of ricin that are conserved with respect to a set of ricin A chains and unique with respect to other proteins. We (1) used structure alignment software to detect structural deviations and extracted from this analysis the residue-residue correspondence, (2) devised a method to compare corresponding residues across sets of ricin structures and structures of closely related proteins, (3) devised a sequence-based approach to determine residue infrequency in local sequence context, and (4) modified a pocket-finding algorithm to identify surface crevices in close proximity to residues determined to be conserved/unique based on our structure- and sequence-based methods. In applying this combined informatics approach to ricin A we identified a conserved/unique pocket in close proximity to (but not overlapping) the active site that is suitable for bi-dentate ligand development. These methods are generally applicable to identification of surface epitopes and binding pockets for development of diagnostic reagents, therapeutics, and vaccines.

  16. Lentiavidins: Novel avidin-like proteins with low isoelectric points from shiitake mushroom (Lentinula edodes).

    PubMed

    Takakura, Yoshimitsu; Sofuku, Kozue; Tsunashima, Masako; Kuwata, Shigeru

    2016-04-01

    A biotin-binding protein with a low isoelectric point (pI), which minimizes electrostatic non-specific binding to substances other than biotin, is potentially valuable. To obtain such a protein, we screened hundreds of mushrooms, and detected strong biotin-binding activity in the fruit bodies of Lentinula edodes, shiitake mushroom. Two cDNAs, each encoding a protein of 152 amino acids, termed lentiavidin 1 and lentiavidin 2 were cloned from L. edodes. The proteins shared sequence identities of 27%-49% with other biotin-binding proteins, and many residues that directly associate with biotin in streptavidin were conserved in lentiavidins. The pI values of lentiavidin 1 and lentiavidin 2 were 3.9 and 4.4, respectively; the former is the lowest pI of the known biotin-binding proteins. Lentiavidin 1 was expressed as a tetrameric protein with a molecular mass of 60 kDa in an insect cell-free expression system and showed biotin-binding activity. Lentiavidin 1, with its pI of 3.9, has a potential for broad applications as a novel biotin-binding protein. Copyright © 2015 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  17. [Construction of a general AAV vector regulated by minimal and artificial hypoxic-responsive element].

    PubMed

    Nie, Xiao-wei; Sun, Li-jun; Hao, Yue-wen; Yang, Guang-xiao; Wang, Quan-ying

    2011-03-01

    The aim was to synthesize a minimal, artificial hypoxia-responsive element (HRE), insert it upstream of the CMV promoter in an AAV plasmid, and then construct an HRE-regulated AAV vector by introducing three plasmids into 293 cells using the calcium phosphate method. The resulting HRE-regulated adeno-associated virus vector could potentially be used for gene therapy of ischemic cardiovascular and cerebrovascular disease. A 36-bp nucleotide sequence containing four tandem HIF-binding sites, A/GCGTG (4×HBS), together with a 35-bp spacer sequence, was synthesized, inserted upstream of the TATA box of the CMV promoter, and amplified by PCR. The cDNA fragment was confirmed by DNA sequencing. Routine molecular biology methods were used to construct an AAV vector regulated by the minimal HRE by replacing the normal CMV promoter in the AAV vector with the CMV promoter containing the minimal HRE. The NT4-6His-PR39 fusion peptide was then inserted into the multiple cloning site (MCS) of the plasmid, and the recombinant AAV vector was obtained by three-plasmid co-transfection of 293 cells, in which expression of 6×His under hypoxia was examined by immunochemistry. The artificial HRE was inserted upstream of the CMV promoter with the correct spacing between the HRE and the TATA box. DNA sequencing and restriction enzyme digestion indicated that the HRE-regulated AAV was successfully constructed. Compared with the control group, expression of 6×His was significantly increased in the experimental groups under hypoxia, confirming that the AAV was effectively regulated by the minimal HRE inserted upstream of the CMV promoter. The HRE was inserted upstream of the CMV promoter by PCR, without using a restriction enzyme recognition site, and a eukaryotic expression vector regulated by the HRE was constructed. The vector has important theoretical and practical value for the prevention and treatment of ischemic cardiovascular and cerebrovascular disease.

  18. Allergenic characterization of a novel allergen, homologous to chymotrypsin, from german cockroach.

    PubMed

    Jeong, Kyoung Yong; Son, Mina; Lee, Jae Hyun; Hong, Chein Soo; Park, Jung Won

    2015-05-01

    Cockroach feces are known to be rich in IgE-reactive components. Various protease allergens were identified by proteomic analysis of German cockroach fecal extract in a previous study. In this study, we characterized a novel allergen, a chymotrypsin-like serine protease. A cDNA sequence homologous to chymotrypsin was obtained by analysis of German cockroach expressed sequence tag (EST) clones. The recombinant chymotrypsins from the German cockroach and house dust mite (Der f 6) were expressed in Escherichia coli using the pEXP5NT/TOPO vector system, and their allergenicity was investigated by ELISA. The deduced amino acid sequence of German cockroach chymotrypsin showed 32.7 to 43.1% identity with mite group 3 (trypsin) and group 6 (chymotrypsin) allergens. Sera from 8 of 28 German cockroach allergy subjects (28.6%) showed IgE binding to the recombinant protein. IgE binding to the recombinant cockroach chymotrypsin was inhibited by house dust mite chymotrypsin Der f 6, while it minimally inhibited the German cockroach whole body extract. A novel allergen homologous to chymotrypsin was identified from the German cockroach and was cross-reactive with Der f 6.

  19. Characterization of a native hammerhead ribozyme derived from schistosomes

    PubMed Central

    OSBORNE, EDITH M.; SCHAAK, JANELL E.; DEROSE, VICTORIA J.

    2005-01-01

    A recent re-examination of the role of the helices surrounding the conserved core of the hammerhead ribozyme has identified putative loop–loop interactions between stems I and II in native hammerhead sequences. These extended hammerhead sequences are more active at low concentrations of divalent cations than are minimal hammerheads. The loop–loop interactions are proposed to stabilize a more active conformation of the conserved core. Here, a kinetic and thermodynamic characterization of an extended hammerhead sequence derived from Schistosoma mansoni is performed. Biphasic kinetics are observed, suggesting the presence of at least two conformers, one cleaving with a fast rate and the other with a slow rate. Replacing loop II with a poly(U) sequence designed to eliminate the interaction between the two loops results in greatly diminished activity, suggesting that the loop–loop interactions do aid in forming a more active conformation. Previous studies with minimal hammerheads have shown deleterious effects of Rp-phosphorothioate substitutions at the cleavage site and 5′ to A9, both of which could be rescued with Cd2+. Here, phosphorothioate modifications at the cleavage site and 5′ to A9 were made in the schistosome-derived sequence. In Mg2+, both phosphorothioate substitutions decreased the overall fraction cleaved without significantly affecting the observed rate of cleavage. The addition of Cd2+ rescued cleavage in both cases, suggesting that these are still putative metal binding sites in this native sequence. PMID:15659358
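
    Biphasic cleavage kinetics of the kind described here are often modelled as the sum of two single-exponential phases, one per conformer. The abstract does not state which equation was fitted, so the model below is an assumption, and all parameter values are hypothetical.

```python
import math


def fraction_cleaved(t, f_fast, k_fast, f_slow, k_slow):
    """Two-phase model: each population cleaves with its own rate constant.

    f_fast and f_slow are the fractions of ribozyme in the fast and slow
    populations; k_fast and k_slow are their first-order rate constants
    (same time units as t).
    """
    fast = f_fast * (1.0 - math.exp(-k_fast * t))
    slow = f_slow * (1.0 - math.exp(-k_slow * t))
    return fast + slow


# Hypothetical parameters: half the molecules cleave quickly, 30% slowly,
# and the remainder never cleaves.
params = dict(f_fast=0.5, k_fast=1.0, f_slow=0.3, k_slow=0.01)
time_course = [(t, fraction_cleaved(t, **params)) for t in (0, 1, 10, 100, 1000)]
```

    In this form, a change that lowers the final fraction cleaved without changing the observed rate shows up as smaller amplitudes (f_fast, f_slow) with unchanged rate constants.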

  20. Structural basis of recognition of farnesylated and methylated KRAS4b by PDEδ.

    PubMed

    Dharmaiah, Srisathiyanarayanan; Bindu, Lakshman; Tran, Timothy H; Gillette, William K; Frank, Peter H; Ghirlando, Rodolfo; Nissley, Dwight V; Esposito, Dominic; McCormick, Frank; Stephen, Andrew G; Simanshu, Dhirendra K

    2016-11-01

    Farnesylation and carboxymethylation of KRAS4b (Kirsten rat sarcoma isoform 4b) are essential for its interaction with the plasma membrane where KRAS-mediated signaling events occur. Phosphodiesterase-δ (PDEδ) binds to KRAS4b and plays an important role in targeting it to cellular membranes. We solved structures of human farnesylated-methylated KRAS4b in complex with PDEδ in two different crystal forms. In these structures, the interaction is driven by the C-terminal amino acids together with the farnesylated and methylated C185 of KRAS4b that binds tightly in the central hydrophobic pocket present in PDEδ. In crystal form II, we see the full-length structure of farnesylated-methylated KRAS4b, including the hypervariable region. Crystal form I reveals structural details of farnesylated-methylated KRAS4b binding to PDEδ, and crystal form II suggests the potential binding mode of geranylgeranylated-methylated KRAS4b to PDEδ. We identified a 5-aa-long sequence motif (Lys-Ser-Lys-Thr-Lys) in KRAS4b that may enable PDEδ to bind both forms of prenylated KRAS4b. Structure and sequence analysis of various prenylated proteins that have been previously tested for binding to PDEδ provides a rationale for why some prenylated proteins, such as KRAS4a, RalA, RalB, and Rac1, do not bind to PDEδ. Comparison of all four available structures of PDEδ complexed with various prenylated proteins/peptides shows the presence of additional interactions due to a larger protein-protein interaction interface in KRAS4b-PDEδ complex. This interface might be exploited for designing an inhibitor with minimal off-target effects.

  1. An ethylene-responsive enhancer element is involved in the senescence-related expression of the carnation glutathione-S-transferase (GST1) gene.

    PubMed

    Itzhaki, H; Maxson, J M; Woodson, W R

    1994-09-13

    The increased production of ethylene during carnation petal senescence regulates the transcription of the GST1 gene encoding a subunit of glutathione-S-transferase. We have investigated the molecular basis for this ethylene-responsive transcription by examining the cis elements and trans-acting factors involved in the expression of the GST1 gene. Transient expression assays following delivery of GST1 5' flanking DNA fused to a beta-glucuronidase reporter gene were used to functionally define sequences responsible for ethylene-responsive expression. Deletion analysis of the 5' flanking sequences of GST1 identified a single positive regulatory element of 197 bp between -667 and -470 necessary for ethylene-responsive expression. The sequences within this ethylene-responsive region were further localized to 126 bp between -596 and -470. The ethylene-responsive element (ERE) within this region conferred ethylene-regulated expression upon a minimal cauliflower mosaic virus-35S TATA-box promoter in an orientation-independent manner. Gel electrophoresis mobility-shift assays and DNase I footprinting were used to identify proteins that bind to sequences within the ERE. Nuclear proteins from carnation petals were shown to specifically interact with the 126-bp ERE and the presence and binding of these proteins were independent of ethylene or petal senescence. DNase I footprinting defined DNA sequences between -510 and -488 within the ERE specifically protected by bound protein. An 8-bp sequence (ATTTCAAA) within the protected region shares significant homology with promoter sequences required for ethylene responsiveness from the tomato fruit-ripening E4 gene.

  2. Adeno-associated virus type 2 binding study on model heparan sulfate surface

    NASA Astrophysics Data System (ADS)

    Negishi, Atsuko; Liu, Jian; McCarty, Douglas; Samulski, Jude; Superfine, Richard

    2003-11-01

    Understanding the mechanisms involved in virus infections is useful for applications in areas such as gene therapy, drug development and delivery, and biosensors. In collaboration with the UNC Gene Therapy Center and School of Pharmacy, we are specifically looking at the interaction between human parvovirus adeno-associated virus type 2 (AAV2), a potential viral vector, and heparan sulfate proteoglycan (HSPG), a known cell surface receptor for AAV2. Recent developments in glycobiology have shown that some protein-polysaccharide binding is sugar sequence dependent. Heparan sulfate (HS) is a polysaccharide chain of sulfated iduronic/glucuronic and sulfate glucosamine residues and can be differentiated into sequence specific structures by enzymes. These enzymatic modifications, known as heparan sulfate sulfotransferase modifications, have been shown to change the biological nature of heparan sulfate, such as its specific binding to proteins and viruses. To understand HS-assisted viral infection mechanisms, we are interested in investigating the binding affinity and stability of AAV to different HS structures. We have developed a model heparan sulfate surface on which AAV adsorption studies are done and analyzed using the atomic force microscope (AFM). In addition, a miniArray assay has been created to facilitate this study. Adsorption studies are done in 4 white LED wells with approximately 3 mm² reaction areas, which minimize sample use and waste.

  3. Electrostatic interactions guide the active site face of a structure-specific ribonuclease to its RNA substrate.

    PubMed

    Plantinga, Matthew J; Korennykh, Alexei V; Piccirilli, Joseph A; Correll, Carl C

    2008-08-26

    Restrictocin, a member of the alpha-sarcin family of site-specific endoribonucleases, uses electrostatic interactions to bind to the ribosome and to RNA oligonucleotides, including the minimal specific substrate, the sarcin/ricin loop (SRL) of 23S-28S rRNA. Restrictocin binds to the SRL by forming a ground-state E:S complex that is stabilized predominantly by Coulomb interactions and depends on neither the sequence nor structure of the RNA, suggesting a nonspecific complex. The 22 cationic residues of restrictocin are dispersed throughout this protein surface, complicating a priori identification of a Coulomb interacting surface. Structural studies have identified an enzyme-substrate interface, which is expected to overlap with the electrostatic E:S interface. Here, we identified restrictocin residues that contribute to binding in the E:S complex by determining the salt dependence [∂ log(k2/K1/2) / ∂ log[KCl

  4. Evolution of cyclohexadienyl dehydratase from an ancestral solute-binding protein.

    PubMed

    Clifton, Ben E; Kaczmarski, Joe A; Carr, Paul D; Gerth, Monica L; Tokuriki, Nobuhiko; Jackson, Colin J

    2018-04-23

    The emergence of enzymes through the neofunctionalization of noncatalytic proteins is ultimately responsible for the extraordinary range of biological catalysts observed in nature. Although the evolution of some enzymes from binding proteins can be inferred by homology, we have a limited understanding of the nature of the biochemical and biophysical adaptations along these evolutionary trajectories and the sequence in which they occurred. Here we reconstructed and characterized evolutionary intermediate states linking an ancestral solute-binding protein to the extant enzyme cyclohexadienyl dehydratase. We show how the intrinsic reactivity of a desolvated general acid was harnessed by a series of mutations radiating from the active site, which optimized enzyme-substrate complementarity and transition-state stabilization and minimized sampling of noncatalytic conformations. Our work reveals the molecular evolutionary processes that underlie the emergence of enzymes de novo, which are notably mirrored by recent examples of computational enzyme design and directed evolution.

  5. First principles design of a core bioenergetic transmembrane electron-transfer protein

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goparaju, Geetha; Fry, Bryan A.; Chobot, Sarah E.

    Here we describe the design, Escherichia coli expression and characterization of a simplified, adaptable and functionally transparent single chain 4-α-helix transmembrane protein frame that binds multiple heme and light activatable porphyrins. Such man-made cofactor-binding oxidoreductases, designed from first principles with minimal reference to natural protein sequences, are known as maquettes. This design is an adaptable frame aiming to uncover core engineering principles governing bioenergetic transmembrane electron-transfer function and recapitulate protein archetypes proposed to represent the origins of photosynthesis. This article is part of a Special Issue entitled Biodesign for Bioenergetics — the design and engineering of electronic transfer cofactors, proteins and protein networks, edited by Ronald L. Koder and J.L. Ross Anderson.

  6. A sequence upstream of canonical PDZ-binding motif within CFTR COOH-terminus enhances NHERF1 interaction.

    PubMed

    Sharma, Neeraj; LaRusch, Jessica; Sosnay, Patrick R; Gottschalk, Laura B; Lopez, Andrea P; Pellicore, Matthew J; Evans, Taylor; Davis, Emily; Atalar, Melis; Na, Chan-Hyun; Rosson, Gedge D; Belchis, Deborah; Milewski, Michal; Pandey, Akhilesh; Cutting, Garry R

    2016-12-01

    The development of cystic fibrosis transmembrane conductance regulator (CFTR) targeted therapy for cystic fibrosis has generated interest in maximizing membrane residence of mutant forms of CFTR by manipulating interactions with scaffold proteins, such as sodium/hydrogen exchange regulatory factor-1 (NHERF1). In this study, we explored whether COOH-terminal sequences in CFTR beyond the PDZ-binding motif influence its interaction with NHERF1. NHERF1 displayed minimal self-association in blot overlays (NHERF1, Kd = 1,382 ± 61.1 nM) at concentrations well above physiological levels, estimated at 240 nM from RNA-sequencing and 260 nM by liquid chromatography tandem mass spectrometry in sweat gland, a key site of CFTR function in vivo. However, NHERF1 oligomerized at considerably lower concentrations (10 nM) in the presence of the last 111 amino acids of CFTR (20 nM) in blot overlays and cross-linking assays and in coimmunoprecipitations using differently tagged versions of NHERF1. Deletion and alanine mutagenesis revealed that a six-amino acid sequence 1417EENKVR1422 and the terminal 1478TRL1480 (PDZ-binding motif) in the COOH-terminus were essential for the enhanced oligomerization of NHERF1. Full-length CFTR stably expressed in Madin-Darby canine kidney epithelial cells fostered NHERF1 oligomerization that was substantially reduced (∼5-fold) on alanine substitution of EEN, KVR, or EENKVR residues or deletion of the TRL motif. Confocal fluorescent microscopy revealed that the EENKVR and TRL sequences contribute to preferential localization of CFTR to the apical membrane. Together, these results indicate that COOH-terminal sequences mediate enhanced NHERF1 interaction and facilitate the localization of CFTR, a property that could be manipulated to stabilize mutant forms of CFTR at the apical surface to maximize the effect of CFTR-targeted therapeutics. Copyright © 2016 the American Physiological Society.

  7. A sequence upstream of canonical PDZ-binding motif within CFTR COOH-terminus enhances NHERF1 interaction

    PubMed Central

    Sharma, Neeraj; LaRusch, Jessica; Sosnay, Patrick R.; Gottschalk, Laura B.; Lopez, Andrea P.; Pellicore, Matthew J.; Evans, Taylor; Davis, Emily; Atalar, Melis; Na, Chan-Hyun; Rosson, Gedge D.; Belchis, Deborah; Milewski, Michal; Pandey, Akhilesh

    2016-01-01

    The development of cystic fibrosis transmembrane conductance regulator (CFTR) targeted therapy for cystic fibrosis has generated interest in maximizing membrane residence of mutant forms of CFTR by manipulating interactions with scaffold proteins, such as sodium/hydrogen exchange regulatory factor-1 (NHERF1). In this study, we explored whether COOH-terminal sequences in CFTR beyond the PDZ-binding motif influence its interaction with NHERF1. NHERF1 displayed minimal self-association in blot overlays (NHERF1, Kd = 1,382 ± 61.1 nM) at concentrations well above physiological levels, estimated at 240 nM from RNA-sequencing and 260 nM by liquid chromatography tandem mass spectrometry in sweat gland, a key site of CFTR function in vivo. However, NHERF1 oligomerized at considerably lower concentrations (10 nM) in the presence of the last 111 amino acids of CFTR (20 nM) in blot overlays and cross-linking assays and in coimmunoprecipitations using differently tagged versions of NHERF1. Deletion and alanine mutagenesis revealed that a six-amino acid sequence 1417EENKVR1422 and the terminal 1478TRL1480 (PDZ-binding motif) in the COOH-terminus were essential for the enhanced oligomerization of NHERF1. Full-length CFTR stably expressed in Madin-Darby canine kidney epithelial cells fostered NHERF1 oligomerization that was substantially reduced (∼5-fold) on alanine substitution of EEN, KVR, or EENKVR residues or deletion of the TRL motif. Confocal fluorescent microscopy revealed that the EENKVR and TRL sequences contribute to preferential localization of CFTR to the apical membrane. Together, these results indicate that COOH-terminal sequences mediate enhanced NHERF1 interaction and facilitate the localization of CFTR, a property that could be manipulated to stabilize mutant forms of CFTR at the apical surface to maximize the effect of CFTR-targeted therapeutics. PMID:27793802

  8. Discovery and information-theoretic characterization of transcription factor binding sites that act cooperatively.

    PubMed

    Clifford, Jacob; Adami, Christoph

    2015-09-02

    Transcription factor binding to the surface of DNA regulatory regions is one of the primary causes of regulating gene expression levels. A probabilistic approach to model protein-DNA interactions at the sequence level is through position weight matrices (PWMs) that estimate the joint probability of a DNA binding site sequence by assuming positional independence within the DNA sequence. Here we construct conditional PWMs that depend on the motif signatures in the flanking DNA sequence, by conditioning known binding site loci on the presence or absence of additional binding sites in the flanking sequence of each site's locus. Pooling known sites with similar flanking sequence patterns allows for the estimation of the conditional distribution function over the binding site sequences. We apply our model to the Dorsal transcription factor binding sites active in patterning the Dorsal-Ventral axis of Drosophila development. We find that those binding sites that cooperate with nearby Twist sites on average contain about 0.5 bits of information about the presence of Twist transcription factor binding sites in the flanking sequence. We also find that Dorsal binding site detectors conditioned on flanking sequence information make better predictions about what is a Dorsal site relative to background DNA than detection without information about flanking sequence features.
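
    The positional-independence assumption behind a PWM can be illustrated with a minimal sketch. This is a plain, unconditioned PWM and not the conditional model described in the abstract; the aligned sites, the pseudocount and the uniform background are all assumptions made up for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Estimate per-column base probabilities from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        pwm.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return pwm

def log_odds(pwm, seq, background=0.25):
    """Score a sequence in bits against a uniform background.

    The score is a plain sum over columns; that sum is exactly where
    the positional-independence assumption enters.
    """
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match the PWM width")
    return sum(math.log2(col[b] / background) for col, b in zip(pwm, seq))

# Hypothetical aligned binding sites, invented for this example only.
sites = ["GGGAAAACC", "GGGTTTTCC", "GGGAATTCC", "GGGATTTCC"]
pwm = build_pwm(sites)
site_score = log_odds(pwm, "GGGAATTCC")
background_score = log_odds(pwm, "ACACACACA")
```

    A conditional variant in the spirit of the abstract would, under the same assumptions, call build_pwm separately on sites pooled by the presence or absence of a neighbouring motif and compare the resulting matrices.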

  9. Exploring the sequence-function relationship in transcriptional regulation by the lac O1 operator.

    PubMed

    Maity, Tuhin S; Jha, Ramesh K; Strauss, Charlie E M; Dunbar, John

    2012-07-01

    Understanding how binding of a transcription factor to an operator is influenced by the operator sequence is an ongoing quest. It facilitates discovery of alternative binding sites as well as tuning of transcriptional regulation. We investigated the behavior of the Escherichia coli Lac repressor (LacI) protein with a large set of lac O(1) operator variants. The 114 variants examined contained a mean of 2.9 (range 0-4) mutations at positions -4, -2, +2 and +4 in the minimally required 17 bp operator. The relative affinity of LacI for the operators was examined by quantifying expression of a GFP reporter gene and Rosetta structural modeling. The combinations of mutations in the operator sequence created a wide range of regulatory behaviors. We observed variations in the GFP fluorescent signal among the operator variants of more than an order of magnitude under both uninduced and induced conditions. We found that a single nucleotide change may result in changes of up to six- and 12-fold in uninduced and induced GFP signals, respectively. Among the four positions mutated, we found that nucleotide G at position -4 is strongly correlated with strong repression. By Rosetta modeling, we found a significant correlation between the calculated binding energy and the experimentally observed transcriptional repression strength for many operators. However, exceptions were also observed, underscoring the necessity for further improvement in biophysical models of protein-DNA interactions. © 2012 The Authors Journal compilation © 2012 FEBS.

  10. Mature parasite-infected erythrocyte surface antigen (MESA) of Plasmodium falciparum binds to the 30-kDa domain of protein 4.1 in malaria-infected red blood cells.

    PubMed

    Waller, Karena L; Nunomura, Wataru; An, Xiuli; Cooke, Brian M; Mohandas, Narla; Coppel, Ross L

    2003-09-01

    The Plasmodium falciparum mature parasite-infected erythrocyte surface antigen (MESA) is exported from the parasite to the infected red blood cell (IRBC) membrane skeleton, where it binds to protein 4.1 (4.1R) via a 19-residue MESA sequence. Using purified RBC 4.1R and recombinant 4.1R fragments, we show MESA binds the 30-kDa region of RBC 4.1R, specifically to a 51-residue region encoded by exon 10 of the 4.1R gene. The 3D structure of this region reveals that the MESA binding site overlaps the region of 4.1R involved in the p55, glycophorin C, and 4.1R ternary complex. Further binding studies using p55, 4.1R, and MESA showed competition between p55 and MESA for 4.1R, implying that MESA bound at the IRBC membrane skeleton may modulate normal 4.1R and p55 interactions in vivo. Definition of minimal binding domains involved in critical protein interactions in IRBCs may aid the development of novel therapies for falciparum malaria.

  11. Efficient Identification of Murine M2 Macrophage Peptide Targeting Ligands by Phage Display and Next-Generation Sequencing.

    PubMed

    Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H

    2015-08-19

    Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.

  12. Accurate pan-specific prediction of peptide-MHC class II binding affinity with improved binding core identification.

    PubMed

    Andreatta, Massimo; Karosiene, Edita; Rasmussen, Michael; Stryhn, Anette; Buus, Søren; Nielsen, Morten

    2015-11-01

    A key event in the generation of a cellular response against malicious organisms through the endocytic pathway is binding of peptidic antigens by major histocompatibility complex class II (MHC class II) molecules. The bound peptide is then presented on the cell surface where it can be recognized by T helper lymphocytes. NetMHCIIpan is a state-of-the-art method for the quantitative prediction of peptide binding to any human or mouse MHC class II molecule of known sequence. In this paper, we describe an updated version of the method with improved peptide binding register identification. Binding register prediction is concerned with determining the minimal core region of nine residues directly in contact with the MHC binding cleft, a crucial piece of information both for the identification and design of CD4(+) T cell antigens. When applied to a set of 51 crystal structures of peptide-MHC complexes with known binding registers, the new method NetMHCIIpan-3.1 significantly outperformed the earlier 3.0 version. We illustrate the impact of accurate binding core identification for the interpretation of T cell cross-reactivity using tetramer double staining with a CMV epitope and its variants mapped to the epitope binding core. NetMHCIIpan is publicly available at http://www.cbs.dtu.dk/services/NetMHCIIpan-3.1 .
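
    The idea of a nine-residue binding register can be made concrete with a small sketch that lists every candidate core of a peptide. It only enumerates the registers a predictor would have to choose among; it is not the NetMHCIIpan method, and the function name and the example peptide are hypothetical.

```python
def candidate_cores(peptide, core_len=9):
    """Return (offset, core) pairs for every window of core_len residues."""
    if len(peptide) < core_len:
        return []
    return [
        (offset, peptide[offset:offset + core_len])
        for offset in range(len(peptide) - core_len + 1)
    ]

# Hypothetical 15-residue peptide, invented for illustration only.
cores = candidate_cores("ACDEFGHIKLMNPQR")
for offset, core in cores:
    print(offset, core)
```

    A peptide of length n has n - 8 possible registers, so a 15-mer offers seven candidate cores.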

  13. Avoidance of truncated proteins from unintended ribosome binding sites within heterologous protein coding sequences.

    PubMed

    Whitaker, Weston R; Lee, Hanson; Arkin, Adam P; Dueber, John E

    2015-03-20

    Genetic sequences ported into non-native hosts for synthetic biology applications can gain unexpected properties. In this study, we explored sequences functioning as ribosome binding sites (RBSs) within protein coding DNA sequences (CDSs) that cause internal translation, resulting in truncated proteins. Genome-wide prediction of bacterial RBSs, based on biophysical calculations employed by the RBS calculator, suggests a selection against internal RBSs within CDSs in Escherichia coli, but not those in Saccharomyces cerevisiae. Based on these calculations, silent mutations aimed at removing internal RBSs can effectively reduce truncation products from internal translation. However, a solution for complete elimination of internal translation initiation is not always feasible due to constraints of available coding sequences. Fluorescence assays and Western blot analysis showed that in genes with internal RBSs, increasing the strength of the intended upstream RBS had little influence on the internal translation strength. Another strategy to minimize truncated products from an internal RBS is to increase the relative strength of the upstream RBS with a concomitant reduction in promoter strength to achieve the same protein expression level. Unfortunately, lower transcription levels result in increased noise at the single cell level due to stochasticity in gene expression. At the low expression regimes desired for many synthetic biology applications, this problem becomes particularly pronounced. We found that balancing promoter strengths and upstream RBS strengths to intermediate levels can achieve the target protein concentration while avoiding both excessive noise and truncated protein.

  14. Transcriptional regulation of the human mitochondrial peptide deformylase (PDF).

    PubMed

    Pereira-Castro, Isabel; Costa, Luís Teixeira da; Amorim, António; Azevedo, Luisa

    2012-05-18

    The last years of research have been particularly dynamic in establishing the importance of peptide deformylase (PDF), a protein of the N-terminal methionine excision (NME) pathway that removes formyl-methionine from mitochondrial-encoded proteins. The genomic sequence of the human PDF gene is shared with the COG8 gene, which encodes a component of the oligomeric Golgi complex, a very unusual case in eukaryotic genomes. Since PDF is crucial in maintaining mitochondrial function and given the atypical short distance between the end of COG8 coding sequence and the PDF initiation codon, we investigated whether the regulation of the human PDF is affected by the COG8 overlapping partner. Our data reveal that PDF has several transcription start sites, the most important of which is only 18 bp from the initiation codon. Furthermore, luciferase-activation assays using differently-sized fragments defined a 97 bp minimal promoter region for human PDF, which is capable of very strong transcriptional activity. This fragment contains a potential Sp1 binding site highly conserved in mammalian species. We show that this binding site, whose mutation significantly reduces transcription activation, is a target for the Sp1 transcription factor, and possibly of other members of the Sp family. Importantly, the entire minimal promoter region is located after the end of COG8's coding region, strongly suggesting that the human PDF preserves an independent regulation from its overlapping partner. Copyright © 2012 Elsevier Inc. All rights reserved.

  15. Modulation of DNA-Polyamide Interaction by β-alanine Substitutions: A Study of Positional Effects on Binding Affinity, Kinetics and Thermodynamics

    PubMed Central

    Wang, Shuo; Aston, Karl; Koeller, Kevin J.; Harris, G. Davis; Rath, Nigam P.

    2014-01-01

    Hairpin polyamides (PAs) are an important class of sequence-specific DNA minor groove binders, and frequently employ a flexible motif, β-alanine (β), to reduce the molecular rigidity to maintain the DNA recognition register. To better understand the diverse effects β can have on DNA-PA binding affinity, selectivity, and especially kinetics, which have rarely been reported, we have initiated a detailed study for an eight-heterocyclic hairpin PA and its β derivatives with their cognate and mutant sequences. With these derivatives, all internal pyrroles of the parent PA are systematically substituted with single or double βs. A set of complementary experiments have been conducted to evaluate the molecular interactions in detail: UV-melting, biosensor-surface plasmon resonance, circular dichroism and isothermal titration calorimetry. The β substitutions generally weaken the binding affinities of these PAs with cognate DNA, and have large and diverse influences on PA binding kinetics in a position- and number-dependent manner. The DNA base mutations have also shown positional effects on binding of a single PA. Besides the β substitutions, the monocationic Dp group [3-(dimethylamino) propylamine] in parent PA has been modified into a dicationic Ta group (3, 3'-Diamino-N-methyldipropylamine) to minimize the frequently observed PA aggregation with ITC experiments. The results clearly show that the Ta modification not only maintains the DNA binding mode and affinity of PA, but also significantly reduces PA aggregation and allows the complete thermodynamic signature of eight-ring hairpin PA to be determined for the first time. This combined set of results significantly extends our understanding of the energetic basis of specific DNA recognition by PAs. PMID:25141096

  16. Isolation and characterization of target sequences of the chicken CdxA homeobox gene.

    PubMed Central

    Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A

    1993-01-01

    The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich, which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the genomic fragments isolated were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and it also shows that this protein can further activate transcription in cells in culture. PMID:7909943
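
    The consensus A, A/T, T, A/T, A, T, A/G reported above can be written as a regular expression and used to scan a sequence on both strands. The sketch below is only an illustration of that reading of the consensus: the function names and the test sequence are hypothetical, and an exact-match scan ignores the differences in affinity that the abstract describes.

```python
import re

# Consensus A, A/T, T, A/T, A, T, A/G; the lookahead lets overlapping hits be found.
CONSENSUS = re.compile(r"(?=(A[AT]T[AT]AT[AG]))")
SITE_LEN = 7
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(seq):
    """Return sorted (start, strand, site) hits, with start on the forward strand."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in CONSENSUS.finditer(seq)]
    rc = reverse_complement(seq)
    for m in CONSENSUS.finditer(rc):
        # Convert the reverse-strand offset back to forward-strand coordinates.
        start = len(seq) - (m.start() + SITE_LEN)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)

# Hypothetical sequence, invented for illustration only.
hits = find_sites("GGATTTATGCC")
print(hits)
```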

  17. Special AT-rich sequence binding protein 1 promotes tumor growth and metastasis of esophageal squamous cell carcinoma.

    PubMed

    Ma, Jun; Wu, Kaiming; Zhao, Zhenxian; Miao, Rong; Xu, Zhe

    2017-03-01

    Esophageal squamous cell carcinoma is one of the most aggressive malignancies worldwide. Special AT-rich sequence binding protein 1 is a nuclear matrix attachment region binding protein which participates in higher order chromatin organization and tissue-specific gene expression. However, the role of special AT-rich sequence binding protein 1 in esophageal squamous cell carcinoma remains unknown. In this study, western blot and quantitative real-time polymerase chain reaction analysis were performed to identify differentially expressed special AT-rich sequence binding protein 1 in a series of esophageal squamous cell carcinoma tissue samples. The effects of special AT-rich sequence binding protein 1 silencing by two short-hairpin RNAs on cell proliferation, migration, and invasion were assessed by the CCK-8 assay and transwell assays in esophageal squamous cell carcinoma in vitro. Special AT-rich sequence binding protein 1 was significantly upregulated in esophageal squamous cell carcinoma tissue samples and cell lines. Silencing of special AT-rich sequence binding protein 1 inhibited the proliferation of KYSE450 and EC9706 cells which have a relatively high level of special AT-rich sequence binding protein 1, and the ability of migration and invasion of KYSE450 and EC9706 cells was distinctly suppressed. Special AT-rich sequence binding protein 1 could be a potential target for the treatment of esophageal squamous cell carcinoma and inhibition of special AT-rich sequence binding protein 1 may provide a new strategy for the prevention of esophageal squamous cell carcinoma invasion and metastasis.

  18. A Gibbs sampler for motif detection in phylogenetically close sequences

    NASA Astrophysics Data System (ADS)

    Siddharthan, Rahul; van Nimwegen, Erik; Siggia, Eric

    2004-03-01

    Genes are regulated by transcription factors that bind to DNA upstream of genes and recognize short conserved "motifs" in a random intergenic "background". Motif-finders such as the Gibbs sampler compare the probability of these short sequences being represented by "weight matrices" to the probability of their arising from the background "null model", and explore this space (analogous to a free-energy landscape). But closely related species may show conservation not because of functional sites but simply because they have not had sufficient time to diverge, so conventional methods will fail. We introduce a new Gibbs sampler algorithm that accounts for common ancestry when searching for motifs, while requiring minimal "prior" assumptions on the number and types of motifs, assessing the significance of detected motifs by "tracking" clusters that stay together. We apply this scheme to motif detection in sporulation-cycle genes in the yeast S. cerevisiae, using recent sequences of other closely-related Saccharomyces species.

  19. New Insights in Thrombin Inhibition Structure-Activity Relationships by Characterization of Octadecasaccharides from Low Molecular Weight Heparin.

    PubMed

    Mourier, Pierre A J; Guichard, Olivier Y; Herman, Fréderic; Sizun, Philippe; Viskov, Christian

    2017-03-08

    Low Molecular Weight Heparins (LMWH) are complex anticoagulant drugs that mainly inhibit the blood coagulation cascade through indirect interaction with antithrombin. While inhibition of the factor Xa is well described, little is known about the polysaccharide structure inhibiting thrombin. In fact, a minimal chain length of 18 saccharide units, including an antithrombin (AT) binding pentasaccharide, is mandatory to form the active ternary complex for LMWH obtained by alkaline β-elimination (e.g., enoxaparin). However, the relationship between structure of octadecasaccharides and their thrombin inhibition has not yet been assessed on natural compounds due to technical hurdles to isolate sufficiently pure material. We report the preparation of five octadecasaccharides by using orthogonal separation methods including size exclusion, AT affinity, ion pairing and strong anion exchange chromatography. Each of these octadecasaccharides possesses two AT binding pentasaccharide sequences located at various positions. After structural elucidation using enzymatic sequencing and NMR, in vitro aFXa and aFIIa were determined. The biological activities reveal the critical role of each pentasaccharide sequence position within the octadecasaccharides and structural requirements to inhibit thrombin. Significant differences in potency, such as the twenty-fold magnitude difference observed between two regioisomers, further highlight the importance of depolymerisation process conditions on LMWH biological activity.

  20. Cloning of cellobiose phosphoenolpyruvate-dependent phosphotransferase genes: Functional expression in recombinant Escherichia coli and identification of a putative binding region for disaccharides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lai, Xiaokuang; Davis, F.C.; Ingram, L.O.

    1997-02-01

    Genomic libraries from nine cellobiose-metabolizing bacteria were screened for cellobiose utilization. Positive clones were recovered from six libraries, all of which encode phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS) proteins. Clones from Bacillus subtilis, Butyrivibrio fibrisolvens, and Klebsiella oxytoca allowed the growth of recombinant Escherichia coli in cellobiose-M9 minimal medium. The K. oxytoca clone, pLOI1906, exhibited an unusually broad substrate range (cellobiose, arbutin, salicin, and methylumbelliferyl derivatives of glucose, cellobiose, mannose, and xylose) and was sequenced. The insert in this plasmid encoded the carboxy-terminal region of a putative regulatory protein, cellobiose permease (single polypeptide), and phospho-β-glucosidase, which appear to form an operon (casRAB). Subclones allowed both casA and casB to be expressed independently, as evidenced by in vitro complementation. An analysis of the translated sequences from the EIIC domains of cellobiose, aryl-β-glucoside, and other disaccharide permeases allowed the identification of a 50-amino-acid conserved region. A disaccharide consensus sequence is proposed for the most conserved segment (13 amino acids), which may represent part of the EIIC active site for binding and phosphorylation. 63 refs., 4 figs., 4 tabs.

  1. A spectroscopic and voltammetric study of the pH-dependent Cu(II) coordination to the peptide GGGTH: relevance to the fifth Cu(II) site in the prion protein.

    PubMed

    Hureau, Christelle; Charlet, Laurent; Dorlet, Pierre; Gonnet, Florence; Spadini, Lorenzo; Anxolabéhère-Mallart, Elodie; Girerd, Jean-Jacques

    2006-09-01

    The GGGTH sequence has been proposed to be the minimal sequence involved in the binding of a fifth Cu(II) ion in addition to the octarepeat region of the prion protein (PrP) which binds four Cu(II) ions. Coordination of Cu(II) by the N- and C-protected Ac-GGGTH-NH(2) pentapeptide (P(5)) was investigated by using potentiometric titration, electrospray ionization mass spectrometry, UV-vis spectroscopy, electron paramagnetic resonance (EPR) spectroscopy and cyclic voltammetry experiments. Four different Cu(II) complexes were identified and characterized as a function of pH. The Cu(II) binding mode switches from NO(3) to N(4) for pH values ranging from 6.0 to 10.0. Quasi-reversible reduction of the [Cu(II)(P(5))H(-2)] complex formed at pH 6.7 occurs at E (1/2)=0.04 V versus Ag/AgCl, whereas reversible oxidation of the [Cu(II)(P(5))H(-3)](-) complex formed at pH 10.0 occurs at E (1/2)=0.66 V versus Ag/AgCl. Comparison of our EPR data with those of the rSHaPrP(90-231) (Burns et al. in Biochemistry 42:6794-6803, 2003) strongly suggests an N(3)O binding mode at physiological pH for the fifth Cu(II) site in the protein.

  2. A Biophysical Model of CRISPR/Cas9 Activity for Rational Design of Genome Editing and Gene Regulation

    PubMed Central

    Farasat, Iman; Salis, Howard M.

    2016-01-01

    The ability to precisely modify genomes and regulate specific genes will greatly accelerate several medical and engineering applications. The CRISPR/Cas9 (Type II) system binds and cuts DNA using guide RNAs, though the variables that control its on-target and off-target activity remain poorly characterized. Here, we develop and parameterize a system-wide biophysical model of Cas9-based genome editing and gene regulation to predict how changing guide RNA sequences, DNA superhelical densities, Cas9 and crRNA expression levels, organisms and growth conditions, and experimental conditions collectively control the dynamics of dCas9-based binding and Cas9-based cleavage at all DNA sites with both canonical and non-canonical PAMs. We combine statistical thermodynamics and kinetics to model Cas9:crRNA complex formation, diffusion, site selection, reversible R-loop formation, and cleavage, using large amounts of structural, biochemical, expression, and next-generation sequencing data to determine kinetic parameters and develop free energy models. Our results identify DNA supercoiling as a novel mechanism controlling Cas9 binding. Using the model, we predict Cas9 off-target binding frequencies across the lambda phage and human genomes, and explain why Cas9’s off-target activity can be so high. With this improved understanding, we propose several rules for designing experiments for minimizing off-target activity. We also discuss the implications for engineering dCas9-based genetic circuits. PMID:26824432

  3. Solution structure of the chick TGFbeta type II receptor ligand-binding domain.

    PubMed

    Marlow, Michael S; Brown, Christopher B; Barnett, Joey V; Krezel, Andrzej M

    2003-02-28

    The transforming growth factor beta (TGFbeta) signaling pathway influences cell proliferation, immune responses, and extracellular matrix reorganization throughout the vertebrate life cycle. The signaling cascade is initiated by ligand-binding to its cognate type II receptor. Here, we present the structure of the chick type II TGFbeta receptor determined by solution NMR methods. Distance and angular constraints were derived from 15N and 13C edited NMR experiments. Torsion angle dynamics was used throughout the structure calculations and refinement. The 20 final structures were energy minimized using the generalized Born solvent model. For these 20 structures, the average backbone root-mean-square distance from the average structure is below 0.6 Å. The overall fold of this 109-residue domain is conserved within the superfamily of these receptors. Chick receptors fully recognize and respond to human TGFbeta ligands despite only 60% identity at the sequence level. Comparison with the human TGFbeta receptor determined by X-ray crystallography reveals different conformations in several regions. Sequence divergence and crystal packing interactions under low pH conditions are likely causes. This solution structure identifies regions where structural changes, however subtle, may occur upon ligand-binding. We also identified two very well conserved molecular surfaces. One was found to bind ligand in the crystallized human TGFbeta3:TGFbeta type II receptor complex. The other, newly identified area can be the interaction site with type I and/or type III receptors of the TGFbeta signaling complex.

  4. Evolutionary and molecular foundations of multiple contemporary functions of the nitroreductase superfamily

    PubMed Central

    Akiva, Eyal; Copp, Janine N.; Tokuriki, Nobuhiko; Babbitt, Patricia C.

    2017-01-01

    Insight regarding how diverse enzymatic functions and reactions have evolved from ancestral scaffolds is fundamental to understanding chemical and evolutionary biology, and for the exploitation of enzymes for biotechnology. We undertook an extensive computational analysis using a unique and comprehensive combination of tools that include large-scale phylogenetic reconstruction to determine the sequence, structural, and functional relationships of the functionally diverse flavin mononucleotide-dependent nitroreductase (NTR) superfamily (>24,000 sequences from all domains of life, 54 structures, and >10 enzymatic functions). Our results suggest an evolutionary model in which contemporary subgroups of the superfamily have diverged in a radial manner from a minimal flavin-binding scaffold. We identified the structural design principle for this divergence: Insertions at key positions in the minimal scaffold that, combined with the fixation of key residues, have led to functional specialization. These results will aid future efforts to delineate the emergence of functional diversity in enzyme superfamilies, provide clues for functional inference for superfamily members of unknown function, and facilitate rational redesign of the NTR scaffold. PMID:29078300

  5. A putative carbohydrate-binding domain of the lactose-binding Cytisus sessilifolius anti-H(O) lectin has a similar amino acid sequence to that of the L-fucose-binding Ulex europaeus anti-H(O) lectin.

    PubMed

    Konami, Y; Yamamoto, K; Osawa, T; Irimura, T

    1995-04-01

    The complete amino acid sequence of a lactose-binding Cytisus sessilifolius anti-H(O) lectin II (CSA-II) was determined using a protein sequencer. After digestion of CSA-II with endoproteinase Lys-C or Asp-N, the resulting peptides were purified by reversed-phase high performance liquid chromatography (HPLC) and then subjected to sequence analysis. Comparison of the complete amino acid sequence of CSA-II with the sequences of other leguminous seed lectins revealed regions of extensive homology. The amino acid sequence of a putative carbohydrate-binding domain of CSA-II was found to be similar to those of several anti-H(O) leguminous lectins, especially to that of the L-fucose-binding Ulex europaeus lectin I (UEA-I).

  6. Authentic interdomain communication in an RNA helicase reconstituted by expressed protein ligation of two helicase domains.

    PubMed

    Karow, Anne R; Theissen, Bettina; Klostermeier, Dagmar

    2007-01-01

    RNA helicases mediate structural rearrangements of RNA or RNA-protein complexes at the expense of ATP hydrolysis. Members of the DEAD box helicase family consist of two flexibly connected helicase domains. They share nine conserved sequence motifs that are involved in nucleotide binding and hydrolysis, RNA binding, and helicase activity. Most of these motifs line the cleft between the two helicase domains, and extensive communication between them is required for RNA unwinding. The two helicase domains of the Bacillus subtilis RNA helicase YxiN were produced separately as intein fusions, and a functional RNA helicase was generated by expressed protein ligation. The ligated helicase binds adenine nucleotides with very similar affinities to the wild-type protein. Importantly, its intrinsically low ATPase activity is stimulated by RNA, and the Michaelis-Menten parameters are similar to those of the wild-type. Finally, ligated YxiN unwinds a minimal RNA substrate to an extent comparable to that of the wild-type helicase, confirming authentic interdomain communication.

  7. Preparation and properties of pure, full-length IclR protein of Escherichia coli. Use of time-of-flight mass spectrometry to investigate the problems encountered.

    PubMed Central

    Donald, L. J.; Chernushevich, I. V.; Zhou, J.; Verentchikov, A.; Poppe-Schriemer, N.; Hosfield, D. J.; Westmore, J. B.; Ens, W.; Duckworth, H. W.; Standing, K. G.

    1996-01-01

    IclR protein, the repressor of the aceBAK operon of Escherichia coli, has been examined by time-of-flight mass spectrometry, with ionization by matrix assisted laser desorption or by electrospray. The purified protein was found to have a smaller mass than that predicted from the base sequence of the cloned iclR gene. Additional measurements were made on mixtures of peptides derived from IclR by treatment with trypsin and cyanogen bromide. They showed that the amino acid sequence is that predicted from the gene sequence, except that the protein has suffered truncation by removal of the N-terminal eight or, in some cases, nine amino acid residues. The peptide bond whose hydrolysis would remove eight residues is a typical target for the E. coli protease OmpT. We find that, by taking precautions to minimize OmpT proteolysis, or by eliminating it through mutation of the host strain, we can isolate full-length IclR protein (lacking only the N-terminal methionine residue). Full-length IclR is a much better DNA-binding protein than the truncated versions: it binds the aceBAK operator sequence 44-fold more tightly, presumably because of additional contacts that the N-terminal residues make with the DNA. Our experience thus demonstrates the advantages of using mass spectrometry to characterize newly purified proteins produced from cloned genes, especially where proteolysis or other covalent modification is a concern. This technique gives mass spectra from complex peptide mixtures that can be analyzed completely, without any fractionation of the mixtures, by reference to the amino acid sequence inferred from the base sequence of the cloned gene. PMID:8844850

  8. Identification of amino acid residues in protein SRP72 required for binding to a kinked 5e motif of the human signal recognition particle RNA.

    PubMed

    Iakhiaeva, Elena; Iakhiaev, Alexei; Zwieb, Christian

    2010-11-13

    Human cells depend critically on the signal recognition particle (SRP) for the sorting and delivery of their proteins. The SRP is a ribonucleoprotein complex which binds to signal sequences of secretory polypeptides as they emerge from the ribosome. Among the six proteins of the eukaryotic SRP, the largest protein, SRP72, is essential for protein targeting and possesses a poorly characterized RNA binding domain. We delineated the minimal region of SRP72 capable of forming a stable complex with an SRP RNA fragment. The region encompassed residues 545 to 585 of the full-length human SRP72 and contained a lysine-rich cluster (KKKKKKKKGK) at positions 552 to 561 as well as a conserved Pfam motif with the sequence PDPXRWLPXXER at positions 572 to 583. We demonstrated by site-directed mutagenesis that both regions participated in the formation of a complex with the RNA. In agreement with biochemical data and results from chymotryptic digestion experiments, molecular modeling of SRP72 implied that the invariant W577 was located inside the predicted structure of an RNA binding domain. The 11-nucleotide 5e motif contained within the SRP RNA fragment was shown by comparative electrophoresis on native polyacrylamide gels to conform to an RNA kink-turn. The model of the complex suggested that the conserved A240 of the K-turn, previously identified as being essential for the binding to SRP72, could protrude into a groove of the SRP72 RNA binding domain, similar but not identical to how other K-turn recognizing proteins interact with RNA. The results from the presented experiments provided insights into the molecular details of a functionally important and structurally interesting RNA-protein interaction. A model for how a ligand binding pocket of SRP72 can accommodate a new RNA K-turn in the 5e region of the eukaryotic SRP RNA is proposed.

  9. Identification of amino acid residues in protein SRP72 required for binding to a kinked 5e motif of the human signal recognition particle RNA

    PubMed Central

    2010-01-01

    Background: Human cells depend critically on the signal recognition particle (SRP) for the sorting and delivery of their proteins. The SRP is a ribonucleoprotein complex which binds to signal sequences of secretory polypeptides as they emerge from the ribosome. Among the six proteins of the eukaryotic SRP, the largest protein, SRP72, is essential for protein targeting and possesses a poorly characterized RNA binding domain. Results: We delineated the minimal region of SRP72 capable of forming a stable complex with an SRP RNA fragment. The region encompassed residues 545 to 585 of the full-length human SRP72 and contained a lysine-rich cluster (KKKKKKKKGK) at positions 552 to 561 as well as a conserved Pfam motif with the sequence PDPXRWLPXXER at positions 572 to 583. We demonstrated by site-directed mutagenesis that both regions participated in the formation of a complex with the RNA. In agreement with biochemical data and results from chymotryptic digestion experiments, molecular modeling of SRP72 implied that the invariant W577 was located inside the predicted structure of an RNA binding domain. The 11-nucleotide 5e motif contained within the SRP RNA fragment was shown by comparative electrophoresis on native polyacrylamide gels to conform to an RNA kink-turn. The model of the complex suggested that the conserved A240 of the K-turn, previously identified as being essential for the binding to SRP72, could protrude into a groove of the SRP72 RNA binding domain, similar but not identical to how other K-turn recognizing proteins interact with RNA. Conclusions: The results from the presented experiments provided insights into the molecular details of a functionally important and structurally interesting RNA-protein interaction. A model for how a ligand binding pocket of SRP72 can accommodate a new RNA K-turn in the 5e region of the eukaryotic SRP RNA is proposed. PMID:21073748

  10. A novel class of plant-specific zinc-dependent DNA-binding protein that binds to A/T-rich DNA sequences

    PubMed Central

    Nagano, Yukio; Furuhashi, Hirofumi; Inaba, Takehito; Sasaki, Yukiko

    2001-01-01

    Complementary DNA encoding a DNA-binding protein, designated PLATZ1 (plant AT-rich sequence- and zinc-binding protein 1), was isolated from peas. The amino acid sequence of the protein is similar to those of other uncharacterized proteins predicted from the genome sequences of higher plants. However, no paralogous sequences have been found outside the plant kingdom. Multiple alignments among these paralogous proteins show that several cysteine and histidine residues are invariant, suggesting that these proteins are a novel class of zinc-dependent DNA-binding proteins with two distantly located regions, C-x2-H-x11-C-x2-C-x(4–5)-C-x2-C-x(3–7)-H-x2-H and C-x2-C-x(10–11)-C-x3-C. In an electrophoretic mobility shift assay, the zinc chelator 1,10-o-phenanthroline inhibited DNA binding, and two distant zinc-binding regions were required for DNA binding. A protein blot with 65ZnCl2 showed that both regions are required for zinc-binding activity. The PLATZ1 protein non-specifically binds to A/T-rich sequences, including the upstream region of the pea GTPase pra2 and plastocyanin petE genes. Expression of the PLATZ1 repressed those of the reporter constructs containing the coding sequence of luciferase gene driven by the cauliflower mosaic virus (CaMV) 35S90 promoter fused to the tandem repeat of the A/T-rich sequences. These results indicate that PLATZ1 is a novel class of plant-specific zinc-dependent DNA-binding protein responsible for A/T-rich sequence-mediated transcriptional repression. PMID:11600698
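
    The two spacing patterns quoted in this abstract use a compact notation (a residue letter, `x2` for two arbitrary residues, `x(4–5)` for a range). As a rough illustration of how such a pattern could be searched in a protein sequence, the sketch below translates it into a regular expression. The function name and the synthetic test sequence are hypothetical; this is not the authors' alignment procedure.

```python
import re

# One token is either x(lo-hi), xN, or a single uppercase residue letter.
# Both the en dash used in the abstract and a plain hyphen are accepted
# inside the parentheses.
_TOKEN = re.compile(r"x\((\d+)[–-](\d+)\)|x(\d+)|([A-Z])")


def pattern_to_regex(pattern):
    """Translate e.g. 'C-x2-C-x(10–11)-C-x3-C' into a regex string."""
    parts = []
    for match in _TOKEN.finditer(pattern):
        low, high, exact, residue = match.groups()
        if residue:
            parts.append(residue)                    # fixed residue
        elif exact:
            parts.append(".{%s}" % exact)            # exactly N residues
        else:
            parts.append(".{%s,%s}" % (low, high))   # variable-length gap
    return "".join(parts)


# The second zinc-binding region as written in the abstract.
SECOND_REGION = "C-x2-C-x(10–11)-C-x3-C"


def find_region(protein, pattern=SECOND_REGION):
    """Return the 0-based start of the first match, or None."""
    hit = re.search(pattern_to_regex(pattern), protein)
    return hit.start() if hit else None
```

    A match only shows that the cysteine and histidine spacing is present; it says nothing about whether the region actually binds zinc.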

  11. Repression of the Chromatin-Tethering Domain of Murine Leukemia Virus p12.

    PubMed

    Brzezinski, Jonathon D; Modi, Apexa; Liu, Mengdan; Roth, Monica J

    2016-12-15

    Murine leukemia virus (MLV) p12, encoded within Gag, binds the viral preintegration complex (PIC) to the mitotic chromatin. This acts to anchor the viral PIC in the nucleus as the nuclear envelope re-forms postmitosis. Mutations within the p12 C terminus (p12 PM13 to PM15) block early stages in viral replication. Within the p12 PM13 region (p12 ⁶⁰PSPMA⁶⁵), our studies indicated that chromatin tethering was not detected when the wild-type (WT) p12 protein (M63) was expressed as a green fluorescent protein (GFP) fusion; however, constructs bearing p12-I63 were tethered. N-terminal truncations of the activated p12-I63-GFP indicated that tethering increased further upon deletion of p12 ²⁵DLLTEDPPPY³⁴, which includes the late domain required for viral assembly. The p12 PM15 sequence (p12 ⁷⁰RREPP⁷⁴) is critical for wild-type viral viability; however, virions bearing the PM15 mutation (p12 ⁷⁰AAAAA⁷⁴) with a second M63I mutant were viable, with a titer 18-fold lower than that of the WT. The p12 M63I mutation amplified chromatin tethering and compensated for the loss of chromatin binding of p12 PM15. Rescue of the p12-M63-PM15 nonviable mutant with prototype foamy virus (PFV) and Kaposi's sarcoma herpesvirus (KSHV) tethering sequences confirmed the function of p12 70-74 in chromatin binding. Minimally, full-strength tethering was seen with only p12 ⁶¹SPIASRLRGRR⁷¹ fused to GFP. These results indicate that the p12 C terminus alone is sufficient for chromatin binding and that the presence of the p12 ²⁵DLLTEDPPPY³⁴ motif in the N terminus suppresses the ability to tether. This study defines a regulatory mechanism controlling the differential roles of the MLV p12 protein in early and late replication. During viral assembly and egress, the late domain within the p12 N terminus functions to bind host vesicle release factors. During viral entry, the C terminus of p12 is required for tethering to host mitotic chromosomes.
    Our studies indicate that the p12 domain including the PPPY late sequence temporally represses the p12 chromatin tethering motif. Maximal p12 tethering was identified with only an 11-amino-acid minimal chromatin tethering motif encoded at p12 61-71. Within this region, the p12-M63I substitution switches p12 into a tethering-competent state, partially rescuing the p12-PM15 tethering mutant. A model for how this conformational change regulates early versus late functions is presented. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  12. The HIP1 initiator element plays a role in determining the in vitro requirement of the dihydrofolate reductase gene promoter for the C-terminal domain of RNA polymerase II.

    PubMed

    Buermeyer, A B; Thompson, N E; Strasheim, L A; Burgess, R R; Farnham, P J

    1992-05-01

    We examined the ability of purified RNA polymerase (RNAP) II lacking the carboxy-terminal heptapeptide repeat domain (CTD), called RNAP IIB, to transcribe a variety of promoters in HeLa extracts in which endogenous RNAP II activity was inhibited with anti-CTD monoclonal antibodies. Not all promoters were efficiently transcribed by RNAP IIB, and transcription did not correlate with the in vitro strength of the promoter or with the presence of a consensus TATA box. This was best illustrated by the GC-rich, non-TATA box promoters of the bidirectional dihydrofolate reductase (DHFR)-REP-encoding locus. Whereas the REP promoter was transcribed by RNAP IIB, the DHFR promoter remained inactive after addition of RNAP IIB to the antibody-inhibited reactions. However, both promoters were efficiently transcribed when purified RNAP with an intact CTD was added. We analyzed a series of promoter deletions to identify which cis elements determine the requirement for the CTD of RNAP II. All of the promoter deletions of both DHFR and REP retained the characteristics of their respective full-length promoters, suggesting that the information necessary to specify the requirement for the CTD is contained within approximately 65 bp near the initiation site. Furthermore, a synthetic minimal promoter of DHFR, consisting of a single binding site for Sp1 and a binding site for the HIP1 initiator cloned into a bacterial vector sequence, required RNAP II with an intact CTD for activity in vitro. Since the synthetic minimal promoter of DHFR and the smallest REP promoter deletion are both activated by Sp1, the differential response in this assay does not result from upstream activators. However, the sequences around the start sites of DHFR and REP are not similar and our data suggest that they bind different proteins. Therefore, we propose that specific initiator elements are important for determination of the requirement of some promoters for the CTD.

  13. Cloning and characterization of an autonomous replication sequence from Coxiella burnetii.

    PubMed Central

    Suhan, M; Chen, S Y; Thompson, H A; Hoover, T A; Hill, A; Williams, J C

    1994-01-01

    A Coxiella burnetii chromosomal fragment capable of functioning as an origin for the replication of a kanamycin resistance (Kanr) plasmid was isolated by use of origin search methods utilizing an Escherichia coli host. The 5.8-kb fragment was subcloned into phagemid vectors and was deleted progressively by an exonuclease III-S1 technique. Plasmids containing progressively shorter DNA fragments were then tested for their capability to support replication by transformation of an E. coli polA strain. A minimal autonomous replication sequence (ARS) was delimited to 403 bp. Sequencing of the entire 5.8-kb region revealed that the minimal ARS contained two consensus DnaA boxes, three A + T-rich 21-mers, a transcriptional promoter leading rightwards, and potential integration host factor and factor of inversion stimulation binding sites. Database comparisons of deduced amino acid sequences revealed that open reading frames located around the ARS were homologous to genes often, but not always, found near bacterial chromosomal origins; these included identities with rpmH and rnpA in E. coli and identities with the 9K protein and 60K membrane protein in E. coli and Pseudomonas species. These and direct hybridization data suggested that the ARS was chromosomal and not associated with the resident plasmid QpH1. Two-dimensional agarose gel electrophoresis did not reveal the presence of initiating intermediates, indicating that the ARS did not initiate chromosome replication during laboratory growth of C. burnetii. PMID:8071197

  14. Basis of altered RNA-binding specificity by PUF proteins revealed by crystal structures of yeast Puf4p

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miller, Matthew T.; Higgin, Joshua J.; Hall, Traci M.Tanaka

    2008-06-06

    Pumilio/FBF (PUF) family proteins are found in eukaryotic organisms and regulate gene expression post-transcriptionally by binding to sequences in the 3' untranslated region of target transcripts. PUF proteins contain an RNA binding domain that typically comprises eight α-helical repeats, each of which recognizes one RNA base. Some PUF proteins, including yeast Puf4p, have altered RNA binding specificity and use their eight repeats to bind to RNA sequences with nine or ten bases. Here we report the crystal structures of Puf4p alone and in complex with a 9-nucleotide (nt) target RNA sequence, revealing that Puf4p accommodates an 'extra' nucleotide by modest adaptations allowing one base to be turned away from the RNA binding surface. Using structural information and sequence comparisons, we created a mutant Puf4p protein that preferentially binds to an 8-nt target RNA sequence over a 9-nt sequence and restores binding of each protein repeat to one RNA base.

  15. Molecular recognition of pyr mRNA by the Bacillus subtilis attenuation regulatory protein PyrR

    PubMed Central

    Bonner, Eric R.; D’Elia, John N.; Billips, Benjamin K.; Switzer, Robert L.

    2001-01-01

    The pyrimidine nucleotide biosynthesis (pyr) operon in Bacillus subtilis is regulated by transcriptional attenuation. The PyrR protein binds in a uridine nucleotide-dependent manner to three attenuation sites at the 5′-end of pyr mRNA. PyrR binds an RNA-binding loop, allowing a terminator hairpin to form and repressing the downstream genes. The binding of PyrR to defined RNA molecules was characterized by a gel mobility shift assay. Titration indicated that PyrR binds RNA in an equimolar ratio. PyrR bound more tightly to the binding loops from the second (BL2 RNA) and third (BL3 RNA) attenuation sites than to the binding loop from the first (BL1 RNA) attenuation site. PyrR bound BL2 RNA 4–5-fold tighter in the presence of saturating UMP or UDP and 150-fold tighter with saturating UTP, suggesting that UTP is the more important co-regulator. The minimal RNA that bound tightly to PyrR was 28 nt long. Thirty-one structural variants of BL2 RNA were tested for PyrR binding affinity. Two highly conserved regions of the RNA, the terminal loop and top of the upper stem and a purine-rich internal bulge and the base pairs below it, were crucial for tight binding. Conserved elements of RNA secondary structure were also required for tight binding. PyrR protected conserved areas of the binding loop in hydroxyl radical footprinting experiments. PyrR likely recognizes conserved RNA sequences, but only if they are properly positioned in the correct secondary structure. PMID:11726695

  16. Detecting cooperative sequences in the binding of RNA Polymerase-II

    NASA Astrophysics Data System (ADS)

    Glass, Kimberly; Rozenberg, Julian; Girvan, Michelle; Losert, Wolfgang; Ott, Ed; Vinson, Charles

    2008-03-01

    Regulation of the expression level of genes is a key biological process controlled largely by the 1000 base pair (bp) sequence preceding each gene (the promoter region). Within that region transcription factor binding sites (TFBS), 5-10 bp long sequences, act individually or cooperate together in the recruitment of, and therefore subsequent gene transcription by, RNA Polymerase-II (RNAP). We have measured the binding of RNAP to promoters on a genome-wide basis using Chromatin Immunoprecipitation (ChIP-on-Chip) microarray assays. Using all 8-base pair long sequences as a test set, we have identified the DNA sequences that are enriched in promoters with high RNAP binding values. We are able to demonstrate that virtually all sequences enriched in such promoters contain a CpG dinucleotide, indicating that TFBS that contain the CpG dinucleotide are involved in RNAP binding to promoters. Further analysis shows that the presence of pairs of CpG containing sequences cooperate to enhance the binding of RNAP to the promoter.
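
    The test set described in this abstract, all 8-base-pair sequences, and the CpG criterion applied to it are simple to enumerate. The sketch below is illustrative only: the function names are hypothetical and it does not reproduce the ChIP-on-Chip enrichment analysis, which needs the measured binding values.

```python
from itertools import product

BASES = "ACGT"


def all_kmers(k=8):
    """Yield every DNA sequence of length k (4**k sequences in total)."""
    for letters in product(BASES, repeat=k):
        yield "".join(letters)


def contains_cpg(kmer):
    """True if the k-mer contains a C immediately followed by a G."""
    return "CG" in kmer


def split_by_cpg(k=8):
    """Partition all k-mers into CpG-containing and CpG-free lists."""
    with_cpg, without_cpg = [], []
    for kmer in all_kmers(k):
        (with_cpg if contains_cpg(kmer) else without_cpg).append(kmer)
    return with_cpg, without_cpg
```

    Comparing the CpG-containing share among enriched 8-mers with its share among all 8-mers is one way to judge whether "virtually all enriched sequences contain CpG" exceeds what the test set alone would give.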

  17. Binding of resveratrol to the minor groove of DNA sequences with AATT and TTAA segments induces differential stability.

    PubMed

    Nair, Maya S; D'Mello, Samar; Pant, Rashmi; Poluri, Krishna Mohan

    2017-05-01

    Interactions of a natural stilbene compound, resveratrol with two DNA sequences containing AATT/TTAA segments have been studied. Resveratrol is found to interact with both the sequences. The mode of interaction has been studied using absorption, steady state fluorescence and circular dichroism spectroscopic techniques. UV-visible absorption and fluorescence studies provided the information regarding the binding constants and the stoichiometry of binding, whereas circular dichroism studies depicted the structural changes in DNA upon resveratrol binding. Our results evidenced that, though resveratrol showed similar affinity to both the sequences, the mode of interactions was different. The binding constants of resveratrol to AATT/TTAA sequences were found to be 7.55×10⁵ M⁻¹ and 5.42×10⁵ M⁻¹ respectively. Spectroscopic data evidenced for a groove binding interaction. Melting studies showed that the binding of resveratrol induces differential stability to the DNA sequences d(CGTTAACG)₂ and d(CGAATTCG)₂. Fluorescence data showed a stoichiometry of 1:1 for d(CGAATTCG)₂-resveratrol complex and 1:4 for d(CGTTAACG)₂-resveratrol complex. Molecular docking studies demonstrated that resveratrol binds to the minor groove region of both the sequences to form stable complexes with varied atomic contacts to the DNA bases or backbone. Both the complexes are stabilized by hydrogen bond formation. Our results evidenced that modulation of DNA sequence within the same bases can greatly alter the binding geometry and stability of the complex upon binding to small molecule inhibitor compounds like resveratrol. Copyright © 2017 Elsevier B.V. All rights reserved.
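
    To put the two reported binding constants on an energy scale, an association constant can be converted to a standard binding free energy with the textbook relation ΔG° = −RT ln K. The sketch below assumes a temperature of 298.15 K and a 1 M standard state, neither of which is stated in the abstract; the function and variable names are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618       # J mol^-1 K^-1
ASSUMED_TEMPERATURE = 298.15     # K; an assumption, not given in the abstract

# Association constants (M^-1) as quoted in the abstract.
K_AATT = 7.55e5
K_TTAA = 5.42e5


def binding_free_energy_kj(k_assoc, temperature=ASSUMED_TEMPERATURE):
    """Standard binding free energy in kJ/mol from an association constant."""
    if k_assoc <= 0:
        raise ValueError("association constant must be positive")
    return -GAS_CONSTANT * temperature * math.log(k_assoc) / 1000.0


def free_energy_difference_kj(k_a, k_b, temperature=ASSUMED_TEMPERATURE):
    """ΔG°(a) − ΔG°(b); negative when a binds more tightly than b."""
    return binding_free_energy_kj(k_a, temperature) - binding_free_energy_kj(
        k_b, temperature
    )
```

    Because the two constants differ by less than a factor of two, the free-energy gap between the sequences is small compared with either value, consistent with the abstract's remark that the affinities are similar while the binding modes differ.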

  18. A peptide sequence on carcinoembryonic antigen binds to a 80kD protein on Kupffer cells.

    PubMed

    Thomas, P; Petrick, A T; Toth, C A; Fox, E S; Elting, J J; Steele, G

    1992-10-30

    Clearance of carcinoembryonic antigen (CEA) from the circulation is by binding to Kupffer cells in the liver. We have shown that CEA binding to Kupffer cells occurs via a peptide sequence YPELPK representing amino acids 107-112 of the CEA sequence. This peptide sequence is located in the region between the N-terminal and the first immunoglobulin like loop domain. Using native CEA and peptides containing this sequence complexed with a heterobifunctional crosslinking agent and ligand blotting with biotinylated CEA and NCA we have shown binding to an 80kD protein on the Kupffer cell surface. This binding protein may be important in the development of hepatic metastases.

  19. In silico evolution of the Drosophila gap gene regulatory sequence under elevated mutational pressure.

    PubMed

    Chertkova, Aleksandra A; Schiffman, Joshua S; Nuzhdin, Sergey V; Kozlov, Konstantin N; Samsonova, Maria G; Gursky, Vitaly V

    2017-02-07

    Cis-regulatory sequences are often composed of many low-affinity transcription factor binding sites (TFBSs). Determining the evolutionary and functional importance of regulatory sequence composition is impeded without a detailed knowledge of the genotype-phenotype map. We simulate the evolution of regulatory sequences involved in Drosophila melanogaster embryo segmentation during early development. Natural selection evaluates gene expression dynamics produced by a computational model of the developmental network. We observe a dramatic decrease in the total number of transcription factor binding sites through the course of evolution. Despite a decrease in average sequence binding energies through time, the regulatory sequences tend towards organisations containing increased high affinity transcription factor binding sites. Additionally, the binding energies of separate sequence segments demonstrate ubiquitous mutual correlations through time. Fewer than 10% of initial TFBSs are maintained throughout the entire simulation, deemed 'core' sites. These sites have increased functional importance as assessed under wild-type conditions and their binding energy distributions are highly conserved. Furthermore, TFBSs within close proximity of core sites exhibit increased longevity, reflecting functional regulatory interactions with core sites. In response to elevated mutational pressure, evolution tends to sample regulatory sequence organisations with fewer, albeit on average, stronger functional transcription factor binding sites. These organisations are also shaped by the regulatory interactions among core binding sites with sites in their local vicinity.

  20. Hyperdiversity of Genes Encoding Integral Light-Harvesting Proteins in the Dinoflagellate Symbiodinium sp

    PubMed Central

    Boldt, Lynda; Yellowlees, David; Leggat, William

    2012-01-01

    The superfamily of light-harvesting complex (LHC) proteins comprises proteins with diverse functions in light-harvesting and photoprotection. LHC proteins bind chlorophyll (Chl) and carotenoids and include a family of LHCs that bind Chl a and c. Dinophytes (dinoflagellates) are predominantly Chl c binding algal taxa, bind peridinin or fucoxanthin as the primary carotenoid, and can possess a number of LHC subfamilies. Here we report 11 LHC sequences for the chlorophyll a-chlorophyll c2-peridinin protein complex (acpPC) subfamily isolated from Symbiodinium sp. C3, an ecologically important peridinin binding dinoflagellate taxon. Phylogenetic analysis of these proteins suggests the acpPC subfamily forms at least three clades within the Chl a/c binding LHC family: Clade 1 clusters with rhodophyte, cryptophyte and peridinin binding dinoflagellate sequences, Clade 2 with peridinin binding dinoflagellate sequences only, and Clade 3 with heterokontophytes, fucoxanthin and peridinin binding dinoflagellate sequences. PMID:23112815

  1. Structural and functional analysis of mouse Msx1 gene promoter: sequence conservation with human MSX1 promoter points at potential regulatory elements.

    PubMed

    Gonzalez, S M; Ferland, L H; Robert, B; Abdelhay, E

    1998-06-01

    Vertebrate Msx genes are related to one of the most divergent homeobox genes of Drosophila, the muscle segment homeobox (msh) gene, and are expressed in a well-defined pattern at sites of tissue interactions. This pattern of expression is conserved in vertebrates as diverse as quail, zebrafish, and mouse in a range of sites including neural crest, appendages, and craniofacial structures. In the present work, we performed structural and functional analyses in order to identify potential cis-acting elements that may be regulating Msx1 gene expression. To this end, a 4.9-kb segment of the 5'-flanking region was sequenced and analyzed for transcription-factor binding sites. Four regions showing a high concentration of these sites were identified. Transfection assays with fragments of regulatory sequences driving the expression of the bacterial lacZ reporter gene showed that a region of 4 kb upstream of the transcription start site contains positive and negative elements responsible for controlling gene expression. Interestingly, a fragment of 130 bp seems to contain the minimal elements necessary for gene expression, as its removal completely abolishes gene expression in cultured cells. These results are reinforced by comparison of this region with the human Msx1 gene promoter, which shows extensive conservation, including many consensus binding sites, suggesting a regulatory role for them.

  2. SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.

    PubMed

    Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael

    2018-05-25

    Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific Protein-RNA recognition and binding. Despite the great advance in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy for visual perception. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.

  3. Cloning and functional analysis of the promoter region of the human Disc large gene.

    PubMed

    Cavatorta, Ana Laura; Giri, Adriana A; Banks, Lawrence; Gardiol, Daniela

    2008-11-15

    A number of studies have demonstrated the involvement of human Disc large (DLG1) in the control of both cell polarity and maintenance of tissue architecture. However, the mechanisms controlling DLG1 transcription are not fully understood. This is relevant since DLG1 is lost in many tumours during the later stages of malignant progression. Therefore, we performed the cloning and functional analysis of a genomic 5' flanking region of the DLG1 open reading frame with promoter activity. We analyzed the activity of a series of 5' deletion constructs of the DLG1 promoter and determined the minimal essential sequences that are required for promoter activity as well as cis-elements that regulate transcription. We found, within the DLG1 promoter sequences, consensus-binding sites for the Snail family of transcription factors that repress the expression of epithelial markers and are up-regulated in a variety of tumours. Snail transcription factors repress the transcriptional activity of the DLG1 promoter, and ectopically expressed Snail proteins bind to the native DLG1 promoter. These data suggest a role for Snail transcription factors in the control of DLG1 expression and provide a basis for understanding the transcriptional regulation of DLG1.

  4. Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

    PubMed

    Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

    2014-11-01

    As many structures of protein-DNA complexes have become known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
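
    The three scores quoted in this abstract (accuracy, F-measure, MCC) are standard functions of a binary confusion matrix. The sketch below shows how they are computed; it uses the textbook definitions rather than anything taken from the paper, the function name is hypothetical, and the counts in the usage line are made up for illustration.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc


if __name__ == "__main__":
    # Made-up counts, not data from the study.
    acc, f1, mcc = binary_metrics(tp=70, fp=30, tn=65, fn=35)
    print(f"accuracy={acc:.3f} F-measure={f1:.3f} MCC={mcc:.3f}")
```

    An MCC near 0.33 alongside an accuracy near 67%, as reported above, is what one would expect from a classifier that is moderately better than chance on roughly balanced classes.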

  5. Xylan utilization in human gut commensal bacteria is orchestrated by unique modular organization of polysaccharide-degrading enzymes.

    PubMed

    Zhang, Meiling; Chekan, Jonathan R; Dodd, Dylan; Hong, Pei-Ying; Radlinski, Lauren; Revindran, Vanessa; Nair, Satish K; Mackie, Roderick I; Cann, Isaac

    2014-09-02

    Enzymes that degrade dietary and host-derived glycans represent the most abundant functional activities encoded by genes unique to the human gut microbiome. However, the biochemical activities of a vast majority of the glycan-degrading enzymes are poorly understood. Here, we use transcriptome sequencing to understand the diversity of genes expressed by the human gut bacteria Bacteroides intestinalis and Bacteroides ovatus grown in monoculture with the abundant dietary polysaccharide xylan. The most highly induced carbohydrate active genes encode a unique glycoside hydrolase (GH) family 10 endoxylanase (BiXyn10A or BACINT_04215 and BACOVA_04390) that is highly conserved in the Bacteroidetes xylan utilization system. The BiXyn10A modular architecture consists of a GH10 catalytic module disrupted by a 250 amino acid sequence of unknown function. Biochemical analysis of BiXyn10A demonstrated that such insertion sequences encode a new family of carbohydrate-binding modules (CBMs) that binds to xylose-configured oligosaccharide/polysaccharide ligands, the substrate of the BiXyn10A enzymatic activity. The crystal structures of CBM1 from BiXyn10A (1.8 Å), a cocomplex of BiXyn10A CBM1 with xylohexaose (1.14 Å), and the CBM from its homolog in the Prevotella bryantii B14 Xyn10C (1.68 Å) reveal an unanticipated mode for ligand binding. A minimal enzyme mix, composed of the gene products of four of the most highly up-regulated genes during growth on wheat arabinoxylan, depolymerizes the polysaccharide into its component sugars. The combined biochemical and biophysical studies presented here provide a framework for understanding fiber metabolism by an important group within the commensal bacterial population known to influence human health.

  6. Finding specific RNA motifs: Function in a zeptomole world?

    PubMed Central

    KNIGHT, ROB; YARUS, MICHAEL

    2003-01-01

    We have developed a new method for estimating the abundance of any modular (piecewise) RNA motif within a longer random region. We have used this method to estimate the size of the active motifs available to modern SELEX experiments (picomoles of unique sequences) and to a plausible RNA World (zeptomoles of unique sequences: 1 zmole = 602 sequences). Unexpectedly, activities such as specific isoleucine binding are almost certainly present in zeptomoles of molecules, and even ribozymes such as self-cleavage motifs may appear (depending on assumptions about the minimal structures). The number of specified nucleotides is not the only important determinant of a motif’s rarity: The number of modules into which it is divided, and the details of this division, are also crucial. We propose three maxims for easily isolated motifs: the Maxim of Minimization, the Maxim of Multiplicity, and the Maxim of the Median. These maxims together state that selected motifs should be small and composed of as many separate, equally sized modules as possible. For evenly divided motifs with four modules, the largest accessible activity in picomole scale (1–1000 pmole) pools of length 100 is about 34 nucleotides; while for zeptomole scale (1–1000 zmole) pools it is about 20 specific nucleotides (50% probability of occurrence). This latter figure includes some ribozymes and aptamers. Consequently, an RNA metabolism apparently could have begun with only zeptomoles of RNA molecules. PMID:12554865
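
    The conversion "1 zmole = 602 sequences" quoted in this abstract follows directly from the Avogadro constant. A minimal sketch of the arithmetic is given below; the helper name and the unit table are illustrative choices, not part of the paper.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)

# Power-of-ten exponents for the two amount units used in the abstract.
UNIT_EXPONENT = {"pmol": -12, "zmol": -21}


def molecules(amount, unit):
    """Convert an amount in pmol or zmol to a number of molecules."""
    return amount * 10.0 ** UNIT_EXPONENT[unit] * AVOGADRO


if __name__ == "__main__":
    for amount, unit in [(1, "zmol"), (1000, "zmol"), (1, "pmol")]:
        print(f"{amount} {unit} is about {molecules(amount, unit):.3g} molecules")
```

    On this scale the zeptomole pools discussed in the abstract (1 to 1000 zmol) hold roughly 6×10^2 to 6×10^5 molecules, against about 6×10^11 molecules per picomole.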

  7. Synthesis and fluorescence studies of multiple labeled oligonucleotides containing dansyl fluorophore covalently attached at 2'-terminus of cytidine via carbamate linkage.

    PubMed

    Misra, Arvind; Mishra, Satyendra; Misra, Krishna

    2004-01-01

    Synthesis of modified oligonucleotides in which specific cytidine nucleoside analogues are linked at the 2'-OH position via a carbamate bond with an amino ethyl derivative of dansyl fluorophore is reported. For the multiple labeling of oligonucleotides, a strategy involving prelabeling at the monomeric level followed by solid phase assembly of oligonucleotides to obtain regiospecifically labeled probes has been described. The labeled monomer was phosphitylated using 2-cyanoethyl-N,N,N',N'-tetraisopropyl-phosphoramidite (Bis-reagent) and pyridinium trifluoroacetate (Py.TFA) as an activator. To ascertain the minimal number of labeled monomers required for a specific length of oligonucleotide for detection and also to assess the effect of carbamate linkage on hybridization, hexamer and 20-mer sequences were selected. Both were labeled with 1, 2, and 3 monomers at the 5'-end and hybridized with normal (unmodified) complementary sequences. As compared to midsequence or 3'-terminal labeling reported earlier, the 5'-terminal labeling has been found to have minimal contact-mediated quenching on duplex formation. This may be due to complementary deoxyguanosine (dG) rich oligonucleotide sequences or CG base pairs at a terminus that is known to yield stronger binding. This is one reason for selecting cytidine for labeling. The results may aid rational design of multiple fluorescent DNA probes for nonradioactive detection of nucleic acids.

  8. The R35 residue of the influenza A virus NS1 protein has minimal effects on nuclear localization but alters virus replication through disrupting protein dimerization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lalime, Erin N.; Pekosz, Andrew, E-mail: apekosz@jhsph.edu

    The influenza A virus NS1 protein has a nuclear localization sequence (NLS) in the amino terminal region. This NLS overlaps sequences that are important for RNA binding as well as protein dimerization. To assess the significance of the NS1 NLS on influenza virus replication, the NLS amino acids were individually mutated to alanines and recombinant viruses encoding these mutations were rescued. Viruses containing NS1 proteins with mutations at R37, R38 and K41 displayed minimal changes in replication or NS1 protein nuclear localization. Recombinant viruses encoding NS1 R35A were not recovered but viruses containing second site mutations at position D39 in addition to the R35A mutation were isolated. The mutations at position 39 were shown to partially restore NS1 protein dimerization but had minimal effects on nuclear localization. These data indicate that the amino acids in the NS1 NLS region play a more important role in protein dimerization compared to nuclear localization. - Highlights: • Mutations were introduced into influenza NS1 NLS1. • NS1 R37A, R38A, K41A viruses had minimal changes in replication and NS1 localization. • Viruses from NS1 R35A rescue all contained additional mutations at D39. • NS1 R35A D39X mutations recover dimerization lost in NS1 R35A mutations. • These results reaffirm the importance of dimerization for NS1 protein function.

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hraber, Peter; Korber, Bette; Wagh, Kshitij

    Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. Here, with well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Finally, practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines.

  10. H3K9me2/3 Binding of the MBT Domain Protein LIN-61 Is Essential for Caenorhabditis elegans Vulva Development

    PubMed Central

    Koester-Eiserfunke, Nora; Fischle, Wolfgang

    2011-01-01

    MBT domain proteins are involved in developmental processes and tumorigenesis. In vitro binding and mutagenesis studies have shown that individual MBT domains within clustered MBT repeat regions bind mono- and dimethylated histone lysine residues with little to no sequence specificity but discriminate against the tri- and unmethylated states. However, the exact function of promiscuous histone methyl-lysine binding in the biology of MBT domain proteins has not been elucidated. Here, we show that the Caenorhabditis elegans four MBT domain protein LIN-61, in contrast to other MBT repeat factors, specifically interacts with histone H3 when methylated on lysine 9, displaying a strong preference for di- and trimethylated states (H3K9me2/3). Although the fourth MBT repeat is implicated in this interaction, H3K9me2/3 binding minimally requires MBT repeats two to four. Further, mutagenesis of residues conserved with other methyl-lysine binding MBT regions in the fourth MBT repeat does not abolish interaction, implicating a distinct binding mode. In vivo, H3K9me2/3 interaction of LIN-61 is required for C. elegans vulva development within the synMuvB pathway. Mutant LIN-61 proteins deficient in H3K9me2/3 binding fail to rescue lin-61 synMuvB function. Also, previously identified point mutant synMuvB alleles are deficient in H3K9me2/3 interaction although these target residues that are outside of the fourth MBT repeat. Interestingly, lin-61 genetically interacts with two other synMuvB genes, hpl-2, an HP1 homologous H3K9me2/3 binding factor, and met-2, a SETDB1 homologous H3K9 methyl transferase (H3K9MT), in determining C. elegans vulva development and fertility. Besides identifying the first sequence specific and di-/trimethylation binding MBT domain protein, our studies imply complex multi-domain regulation of ligand interaction of MBT domains. Our results also introduce a mechanistic link between LIN-61 function and biology, and they establish interplay of the H3K9me2/3 binding proteins, LIN-61 and HPL-2, as well as the H3K9MT MET-2 in distinct developmental pathways. PMID:21437264

  11. The dyad palindromic glutathione transferase P enhancer binds multiple factors including AP1.

    PubMed Central

    Diccianni, M B; Imagawa, M; Muramatsu, M

    1992-01-01

    Glutathione Transferase P (GST-P) gene expression is dominantly regulated by an upstream enhancer (GPEI) consisting of a dyad of palindromically oriented imperfect TPA (12-O-tetradecanoyl-phorbol-13-acetate)-responsive elements (TRE). GPEI is active in AP1-lacking F9 cells as well as in AP1-containing HeLa cells. Despite GPEI's similarity to a TRE, c-jun co-transfection has only a minimal effect on transactivation. Antisense c-jun and c-fos co-transfection experiments further demonstrate the lack of a role for AP1 in GPEI mediated trans-activation in F9 cells, although endogenously present AP1 can influence GPEI in HeLa cells. Co-transfection of delta fosB with c-jun, which forms an inactive c-Jun/delta FosB heterodimer that binds TRE sequences, inhibits GPEI-mediated transcription in AP1-lacking F9 cells as well as AP1-containing HeLa cells. These data suggest novel factor(s) other than AP1 are influencing GPEI. Binding studies reveal multiple nucleoproteins bind to GPEI. These factors are likely responsible for the high level of GPEI-mediated transcription observed in the absence of AP1 and during hepatocarcinogenesis. PMID:1408831

  12. The dyad palindromic glutathione transferase P enhancer binds multiple factors including AP1.

    PubMed

    Diccianni, M B; Imagawa, M; Muramatsu, M

    1992-10-11

    Glutathione Transferase P (GST-P) gene expression is dominantly regulated by an upstream enhancer (GPEI) consisting of a dyad of palindromically oriented imperfect TPA (12-O-tetradecanoyl-phorbol-13-acetate)-responsive elements (TRE). GPEI is active in AP1-lacking F9 cells as well as in AP1-containing HeLa cells. Despite GPEI's similarity to a TRE, c-jun co-transfection has only a minimal effect on transactivation. Antisense c-jun and c-fos co-transfection experiments further demonstrate the lack of a role for AP1 in GPEI mediated trans-activation in F9 cells, although endogenously present AP1 can influence GPEI in HeLa cells. Co-transfection of delta fosB with c-jun, which forms an inactive c-Jun/delta FosB heterodimer that binds TRE sequences, inhibits GPEI-mediated transcription in AP1-lacking F9 cells as well as AP1-containing HeLa cells. These data suggest novel factor(s) other than AP1 are influencing GPEI. Binding studies reveal multiple nucleoproteins bind to GPEI. These factors are likely responsible for the high level of GPEI-mediated transcription observed in the absence of AP1 and during hepatocarcinogenesis.

  13. Structure and Sequence Search on Aptamer-Protein Docking

    NASA Astrophysics Data System (ADS)

    Xiao, Jiajie; Bonin, Keith; Guthold, Martin; Salsbury, Freddie

    2015-03-01

    Interactions between proteins and deoxyribonucleic acid (DNA) play a significant role in living systems, especially through gene regulation. In addition, short nucleic acid sequences (aptamers) with specific binding affinity to specific proteins exhibit clinical potential as therapeutics. Our capillary and gel electrophoresis selection experiments show that specific sequences of aptamers can be selected that bind specific proteins. Computationally, given the experimentally-determined structure and sequence of a thrombin-binding aptamer, we can successfully dock the aptamer onto thrombin in agreement with experimental structures of the complex. In order to further study the conformational flexibility of this thrombin-binding aptamer and to potentially develop a predictive computational model of aptamer-binding, we use GPU-enabled molecular dynamics simulations both to examine the conformational flexibility of the aptamer in the absence of binding to thrombin and to determine our ability to fold an aptamer. This study should help further de-novo predictions of aptamer sequences by enabling the study of structural and sequence-dependent effects on aptamer-protein docking specificity.

  14. How proteins bind to DNA: target discrimination and dynamic sequence search by the telomeric protein TRF1

    PubMed Central

    2017-01-01

    Target search as performed by DNA-binding proteins is a complex process, in which multiple factors contribute to both thermodynamic discrimination of the target sequence from overwhelmingly abundant off-target sites and kinetic acceleration of dynamic sequence interrogation. TRF1, the protein that binds to telomeric tandem repeats, faces an intriguing variant of the search problem where target sites are clustered within short fragments of chromosomal DNA. In this study, we use extensive (>0.5 ms in total) MD simulations to study the dynamical aspects of sequence-specific binding of TRF1 at both telomeric and non-cognate DNA. For the first time, we describe the spontaneous formation of a sequence-specific native protein–DNA complex in atomistic detail, and study the mechanism by which proteins avoid off-target binding while retaining high affinity for target sites. Our calculated free energy landscapes reproduce the thermodynamics of sequence-specific binding, while statistical approaches allow for a comprehensive description of intermediate stages of complex formation. PMID:28633355

  15. 'Mitominis': multiplex PCR analysis of reduced size amplicons for compound sequence analysis of the entire mtDNA control region in highly degraded samples.

    PubMed

    Eichmann, Cordula; Parson, Walther

    2008-09-01

    The traditional protocol for forensic mitochondrial DNA (mtDNA) analyses involves the amplification and sequencing of the two hypervariable segments HVS-I and HVS-II of the mtDNA control region. The primers usually span fragment sizes of 300-400 bp for each region, which may result in weak or failed amplification in highly degraded samples. Here we introduce an improved and more stable approach using shortened amplicons in the fragment range between 144 and 237 bp. Ten such amplicons were required to produce overlapping fragments that cover the entire human mtDNA control region. These were co-amplified in two multiplex polymerase chain reactions and sequenced with the individual amplification primers. The primers were carefully selected to minimize binding on homoplasic and haplogroup-specific sites that would otherwise result in loss of amplification due to mis-priming. The multiplexes have successfully been applied to ancient and forensic samples such as bones and teeth that showed a high degree of degradation.

  16. Hierarchy and extremes in selections from pools of randomized proteins

    PubMed Central

    Boyer, Sébastien; Biswas, Dipanwita; Kumar Soshee, Ananda; Scaramozzino, Natale; Nizak, Clément; Rivoire, Olivier

    2016-01-01

    Variation and selection are the core principles of Darwinian evolution, but quantitatively relating the diversity of a population to its capacity to respond to selection is challenging. Here, we examine this problem at a molecular level in the context of populations of partially randomized proteins selected for binding to well-defined targets. We built several minimal protein libraries, screened them in vitro by phage display, and analyzed their response to selection by high-throughput sequencing. A statistical analysis of the results reveals two main findings. First, libraries with the same sequence diversity but built around different “frameworks” typically have vastly different responses; second, the distribution of responses of the best binders in a library follows a simple scaling law. We show how an elementary probabilistic model based on extreme value theory rationalizes the latter finding. Our results have implications for designing synthetic protein libraries, estimating the density of functional biomolecules in sequence space, characterizing diversity in natural populations, and experimentally investigating evolvability (i.e., the potential for future evolution). PMID:26969726

  17. Hierarchy and extremes in selections from pools of randomized proteins.

    PubMed

    Boyer, Sébastien; Biswas, Dipanwita; Kumar Soshee, Ananda; Scaramozzino, Natale; Nizak, Clément; Rivoire, Olivier

    2016-03-29

    Variation and selection are the core principles of Darwinian evolution, but quantitatively relating the diversity of a population to its capacity to respond to selection is challenging. Here, we examine this problem at a molecular level in the context of populations of partially randomized proteins selected for binding to well-defined targets. We built several minimal protein libraries, screened them in vitro by phage display, and analyzed their response to selection by high-throughput sequencing. A statistical analysis of the results reveals two main findings. First, libraries with the same sequence diversity but built around different "frameworks" typically have vastly different responses; second, the distribution of responses of the best binders in a library follows a simple scaling law. We show how an elementary probabilistic model based on extreme value theory rationalizes the latter finding. Our results have implications for designing synthetic protein libraries, estimating the density of functional biomolecules in sequence space, characterizing diversity in natural populations, and experimentally investigating evolvability (i.e., the potential for future evolution).

  18. Nuclear proteins that bind the human gamma-globin gene promoter: alterations in binding produced by point mutations associated with hereditary persistence of fetal hemoglobin.

    PubMed Central

    Gumucio, D L; Rood, K L; Gray, T A; Riordan, M F; Sartor, C I; Collins, F S

    1988-01-01

    The molecular mechanisms responsible for the human fetal-to-adult hemoglobin switch have not yet been elucidated. Point mutations identified in the promoter regions of gamma-globin genes from individuals with nondeletion hereditary persistence of fetal hemoglobin (HPFH) may mark cis-acting sequences important for this switch, and the trans-acting factors which interact with these sequences may be integral parts in the puzzle of gamma-globin gene regulation. We have used gel retardation and footprinting strategies to define nuclear proteins which bind to the normal gamma-globin promoter and to determine the effect of HPFH mutations on the binding of a subset of these proteins. We have identified five proteins in human erythroleukemia cells (K562 and HEL) which bind to the proximal promoter region of the normal gamma-globin gene. One factor, gamma CAAT, binds the duplicated CCAAT box sequences; the -117 HPFH mutation increases the affinity of interaction between gamma CAAT and its cognate site. Two proteins, gamma CAC1 and gamma CAC2, bind the CACCC sequence. These proteins require divalent cations for binding. The -175 HPFH mutation interferes with the binding of a fourth protein, gamma OBP, which binds an octamer sequence (ATGCAAAT) in the normal gamma-globin promoter. The HPFH phenotype of the -175 mutation indicates that the octamer-binding protein may play a negative regulatory role in this setting. A fifth protein, EF gamma a, binds to sequences which overlap the octamer-binding site. The erythroid-specific distribution of EF gamma a and its close approximation to an apparent repressor-binding site suggest that it may be important in gamma-globin regulation. PMID:2468996

  19. Low pathogenic avian influenza isolates from wild birds replicate and transmit via contact in ferrets without prior adaptation.

    PubMed

    Driskell, Elizabeth A; Pickens, Jennifer A; Humberd-Smith, Jennifer; Gordy, James T; Bradley, Konrad C; Steinhauer, David A; Berghaus, Roy D; Stallknecht, David E; Howerth, Elizabeth W; Tompkins, Stephen Mark

    2012-01-01

    Direct transmission of avian influenza viruses to mammals has become an increasingly investigated topic during the past decade; however, isolates that have been primarily investigated are typically ones originating from human or poultry outbreaks. Currently there is minimal comparative information on the behavior of the innumerable viruses that exist in the natural wild bird host. We have previously demonstrated the capacity of numerous North American avian influenza viruses isolated from wild birds to infect and induce lesions in the respiratory tract of mice. In this study, two isolates from shorebirds that were previously examined in mice (H1N9 and H6N1 subtypes) are further examined through experimental inoculations in the ferret with analysis of viral shedding, histopathology, and antigen localization via immunohistochemistry to elucidate pathogenicity and transmission of these viruses. Using sequence analysis and glycan binding analysis, we show that these avian viruses have the typical avian influenza binding pattern, with affinity for cell glycoproteins/glycolipids having terminal sialic acid (SA) residues with α 2,3 linkage [Neu5Ac(α2,3)Gal]. Despite the lack of α2,6 linked SA binding, these AIVs productively infected both the upper and lower respiratory tract of ferrets, resulting in nasal viral shedding and pulmonary lesions with minimal morbidity. Moreover, we show that one of the viruses is able to transmit to ferrets via direct contact, despite its binding affinity for α 2,3 linked SA residues. These results demonstrate that avian influenza viruses, which are endemic in aquatic birds, can potentially infect humans and other mammals without adaptation. Finally this work highlights the need for additional study of the wild bird subset of influenza viruses in regard to surveillance, transmission, and potential for reassortment, as they have zoonotic potential.

  20. Application of volcanic ash particles for protein affinity purification with a minimized silica-binding tag.

    PubMed

    Abdelhamid, Mohamed A A; Ikeda, Takeshi; Motomura, Kei; Tanaka, Tatsuya; Ishida, Takenori; Hirota, Ryuichi; Kuroda, Akio

    2016-11-01

    We recently reported that the spore coat protein, CotB1 (171 amino acids), from Bacillus cereus mediates silica biomineralization and that the polycationic C-terminal sequence of CotB1 (14 amino acids), designated CotB1p, serves as a silica-binding tag when fused to other proteins. Here, we reduced the length of this silica-binding tag to only seven amino acids (SB7 tag: RQSSRGR) while retaining its affinity for silica. Alanine scanning mutagenesis indicated that the three arginine residues in the SB7 tag play important roles in binding to a silica surface. Monomeric l-arginine, at concentrations of 0.3-0.5 M, was found to serve as a competitive eluent to release bound SB7-tagged proteins from silica surfaces. To develop a low-cost, silica-based affinity purification procedure, we used natural volcanic ash particles with a silica content of ∼70%, rather than pure synthetic silica particles, as an adsorbent for SB7-tagged proteins. Using green fluorescent protein, mCherry, and mKate2 as model proteins, our purification method achieved 75-90% recovery with ∼90% purity. These values are comparable to or even higher than those of the commonly used His-tag affinity purification. In addition to low cost, another advantage of our method is the use of l-arginine as the eluent because its protein-stabilizing effect would help minimize alteration of the intrinsic properties of the purified proteins. Our approach paves the way for the use of naturally occurring materials as adsorbents for simple, low-cost affinity purification. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
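
    As an illustration of the alanine scanning mentioned in this abstract, the following minimal sketch enumerates single alanine substitutions of the SB7 tag sequence (RQSSRGR, as given above). The function name `alanine_scan` and the position numbering (1-based within the tag) are assumptions made for this example; the abstract does not describe how the authors constructed or named their mutants.

```python
def alanine_scan(peptide: str) -> dict[str, str]:
    """Return {mutation_label: mutant_sequence} for each single alanine substitution.

    Labels use 1-based positions within the peptide (an assumption for this
    sketch). Positions that are already alanine are skipped.
    """
    variants = {}
    for i, residue in enumerate(peptide):
        if residue == "A":
            continue  # nothing to substitute at an existing alanine
        label = f"{residue}{i + 1}A"
        variants[label] = peptide[:i] + "A" + peptide[i + 1:]
    return variants


SB7 = "RQSSRGR"  # tag sequence as stated in the abstract
scan = alanine_scan(SB7)

for label, mutant in scan.items():
    print(label, mutant)
```

    In this sketch the three arginine positions correspond to the labels R1A, R5A, and R7A; comparing the silica binding of those mutants against the others is the kind of readout the abstract summarizes.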

  1. Functional analysis of the EspR binding sites upstream of espR in Mycobacterium tuberculosis.

    PubMed

    Cao, Guangxiang; Howard, Susan T; Zhang, Peipei; Hou, Guihua; Pang, Xiuhua

    2013-11-01

    The ESX-1 secretion system exports substrate proteins into host cells and is crucial for the pathogenesis of Mycobacterium tuberculosis. EspR is one of the characterized transcriptional regulators that modulates the ESX-1 system by binding the conserved EspR binding sites in the promoter of espA, the gene encoding EspA, which is also a substrate protein of the ESX-1 system and is required for ESX-1 activity. EspR is autoregulatory, and conserved EspR binding sites are present upstream of espR. In this study, we showed that these EspR sites had varying affinities for EspR, with site B being the strongest one. Point mutations of the DNA sequence at site B abolished binding of EspR to oligonucleotides containing site B alone or with other sites, further suggesting that site B is a major binding site for EspR. Complementation studies showed that constructs containing espR and the upstream intergenic region fully restored espR expression in a ΔespR mutant strain. Although recombinant strains with mutations at more than one EspR site showed minimal differences in espR expression, reduced expression of other EspR target genes was observed, suggesting that slight changes in EspR levels can have downstream regulatory effects. These findings contribute to our understanding of the regulation of the ESX-1 system.

  2. Phage display selection of peptides that target calcium-binding proteins.

    PubMed

    Vetter, Stefan W

    2013-01-01

    Phage display allows rapid identification of peptide sequences with binding affinity towards target proteins, for example, calcium-binding proteins (CBPs). Phage technology allows screening of 10^9 or more independent peptide sequences and can identify CBP-binding peptides within 2 weeks. Adjusting the screening conditions allows selection of CBP-binding peptides that are either calcium-dependent or calcium-independent. The peptide sequences obtained can be used to identify CBP target proteins based on sequence homology or to quickly obtain peptide-based CBP inhibitors that modulate CBP-target interactions. The protocol described here uses a commercially available phage display library, in which random 12-mer peptides are displayed on filamentous M13 phages. The library was screened against the calcium-binding protein S100B.

  3. Toward a General Approach for RNA-Templated Hierarchical Assembly of Split-Proteins

    PubMed Central

    Furman, Jennifer L.; Badran, Ahmed H.; Ajulo, Oluyomi; Porter, Jason R.; Stains, Cliff I.; Segal, David J.; Ghosh, Indraneel

    2010-01-01

    The ability to conditionally turn on a signal or induce a function in the presence of a user-defined RNA target has potential applications in medicine and synthetic biology. Although sequence-specific pumilio repeat proteins can target a limited set of ssRNA sequences, there are no general methods for targeting ssRNA with designed proteins. As a first step toward RNA recognition, we utilized the RNA binding domain of argonaute, implicated in RNA interference, for specifically targeting generic 2-nucleotide, 3' overhangs of any dsRNA. We tested the reassembly of a split-luciferase enzyme guided by argonaute-mediated recognition of newly generated nucleotide overhangs when ssRNA is targeted by a designed complementary guide sequence. This approach was successful when argonaute was utilized in conjunction with a pumilio repeat and expanded the scope of potential ssRNA targets. However, targeting any desired ssRNA remained elusive as two argonaute domains provided minimal reassembled split-luciferase. We next designed and tested a second hierarchical assembly, wherein ssDNA guides are appended to DNA hairpins that serve as a scaffold for high affinity zinc fingers attached to split-luciferase. In the presence of a ssRNA target containing adjacent sequences complementary to the guides, the hairpins are brought into proximity, allowing for zinc finger binding and concomitant reassembly of the fragmented luciferase. The scope of this new approach was validated by specifically targeting RNA encoding VEGF, hDM2, and HER2. These approaches provide potentially general design paradigms for the conditional reassembly of fragmented proteins in the presence of any desired ssRNA target. PMID:20681585

  4. Analysis of sequencing data for probing RNA secondary structures and protein-RNA binding in studying posttranscriptional regulations.

    PubMed

    Hu, Xihao; Wu, Yang; Lu, Zhi John; Yip, Kevin Y

    2016-11-01

    High-throughput sequencing has been used to study posttranscriptional regulations, where the identification of protein-RNA binding is a major and fast-developing sub-area, which in turn benefits from sequencing methods for whole-transcriptome probing of RNA secondary structures. In the study of RNA secondary structures using high-throughput sequencing, bases are modified or cleaved according to their structural features, which alter the resulting composition of sequencing reads. In the study of protein-RNA binding, methods have been proposed to immuno-precipitate (IP) protein-bound RNA transcripts in vitro or in vivo. By sequencing these transcripts, the protein-RNA interactions and the binding locations can be identified. For both types of data, read counts are affected by a combination of confounding factors, including expression levels of transcripts, sequence biases, mapping errors and the probing or IP efficiency of the experimental protocols. Careful processing of the sequencing data and proper extraction of important features are fundamentally important to a successful analysis. Here we review and compare different experimental methods for probing RNA secondary structures and binding sites of RNA-binding proteins (RBPs), and the computational methods proposed for analyzing the corresponding sequencing data. We suggest how these two types of data should be integrated to study the structural properties of RBP binding sites as a systematic way to better understand posttranscriptional regulations. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  5. CaMELS: In silico prediction of calmodulin binding proteins and their binding sites.

    PubMed

    Abbasi, Wajid Arshad; Asif, Amina; Andleeb, Saiqa; Minhas, Fayyaz Ul Amir Afsar

    2017-09-01

    Due to Ca2+-dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet-lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet-lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state-of-the-art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large-margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm which allows more effective learning over imprecisely annotated CaM-binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome-wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif-based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub-sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels. © 2017 Wiley Periodicals, Inc.

  6. Cryptic MCAT enhancer regulation in fibroblasts and smooth muscle cells. Suppression of TEF-1 mediated activation by the single-stranded DNA-binding proteins, Pur alpha, Pur beta, and MSY1.

    PubMed

    Carlini, Leslie E; Getz, Michael J; Strauch, Arthur R; Kelm, Robert J

    2002-03-08

    An asymmetric polypurine-polypyrimidine cis-element located in the 5' region of the mouse vascular smooth muscle alpha-actin gene serves as a binding site for multiple proteins with specific affinity for either single- or double-stranded DNA. Here, we test the hypothesis that single-stranded DNA-binding proteins are responsible for preventing a cryptic MCAT enhancer centered within this element from cooperating with a nearby serum response factor-interacting CArG motif to trans-activate the minimal promoter in fibroblasts and smooth muscle cells. DNA binding studies revealed that the core MCAT sequence mediates binding of transcription enhancer factor-1 to the double-stranded polypurine-polypyrimidine element while flanking nucleotides account for interaction of Pur alpha and Pur beta with the purine-rich strand and MSY1 with the complementary pyrimidine-rich strand. Mutations that selectively impaired high affinity single-stranded DNA binding by fibroblast or smooth muscle cell-derived Pur alpha, Pur beta, and MSY1 in vitro, released the cryptic MCAT enhancer from repression in transfected cells. Additional experiments indicated that Pur alpha, Pur beta, and MSY1 also interact specifically, albeit weakly, with double-stranded DNA and with transcription enhancer factor-1. These results are consistent with two plausible models of cryptic MCAT enhancer regulation by Pur alpha, Pur beta, and MSY1 involving either competitive single-stranded DNA binding or masking of MCAT-bound transcription enhancer factor-1.

  7. Two DNA-binding factors recognize specific sequences at silencers, upstream activating sequences, autonomously replicating sequences, and telomeres in Saccharomyces cerevisiae

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buchman, A.R.; Kimmerly, W.J.; Rine, J.

    1988-01-01

    Two DNA-binding factors from Saccharomyces cerevisiae have been characterized, GRFI (general regulatory factor I) and ABFI (ARS-binding factor I), that recognize specific sequences within diverse genetic elements. GRFI bound to sequences at the negative regulatory elements (silencers) of the silent mating type loci HML E and HMR E and to the upstream activating sequence (UAS) required for transcription of the MATα genes. A putative conserved UAS located at genes involved in translation (RPG box) was also recognized by GRFI. In addition, GRFI bound with high affinity to sequences within the (C1-3A)-repeat region at yeast telomeres. Binding sites for GRFI with the highest affinity appeared to be of the form 5'-(A/G)(A/C)ACCCAN NCA(T/C)(T/C)-3', where N is any nucleotide. ABFI-binding sites were located next to autonomously replicating sequences (ARSs) at controlling elements of the silent mating type loci HMR E, HMR I, and HML I and were associated with ARS1, ARS2, and the 2μm plasmid ARS. Two tandem ABFI binding sites were found between the HIS3 and DED1 genes, several kilobase pairs from any ARS, indicating that ABFI-binding sites are not restricted to ARSs. The sequences recognized by ABFI showed partial dyad-symmetry and appeared to be variations of the consensus 5'-TATCATTNNNNACGA-3'. GRFI and ABFI were both abundant DNA-binding factors and did not appear to be encoded by the SIR genes, whose products are required for repression of the silent mating type loci. Together, these results indicate that both GRFI and ABFI play multiple roles within the cell.
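
    To make the consensus notation in this abstract concrete, the sketch below translates the ABFI consensus 5'-TATCATTNNNNACGA-3' (N = any nucleotide) into a regular expression and scans a sequence for exact matches. The helper name `find_sites` and the test sequence are hypothetical, and the sketch searches only the given strand; because the abstract describes real sites as "variations of the consensus", exact matching will likely miss genuine binding sites.

```python
import re

# Translation table for the symbols used in the consensus; N matches any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

ABFI_CONSENSUS = "TATCATTNNNNACGA"  # consensus as stated in the abstract


def find_sites(sequence: str, consensus: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every exact consensus match.

    A lookahead is used so that overlapping matches are also reported.
    Only the strand passed in is searched.
    """
    core = "".join(IUPAC[base] for base in consensus)
    pattern = re.compile(f"(?=({core}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical sequence with one embedded site, for illustration only.
example = "GGTATCATTGCGAACGACC"
print(find_sites(example, ABFI_CONSENSUS))
```

    A position weight matrix would be a more tolerant alternative to exact matching, but building one needs the individual site sequences, which this abstract does not list.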

  8. Longitudinal Antigenic Sequences and Sites from Intra-Host Evolution (LASSIE) identifies immune-selected HIV variants

    DOE PAGES

    Hraber, Peter; Korber, Bette; Wagh, Kshitij; ...

    2015-10-21

    Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. Here, with well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Finally, practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines.

  9. Sequence specificity of single-stranded DNA-binding proteins: a novel DNA microarray approach

    PubMed Central

    Morgan, Hugh P.; Estibeiro, Peter; Wear, Martin A.; Max, Klaas E.A.; Heinemann, Udo; Cubeddu, Liza; Gallagher, Maurice P.; Sadler, Peter J.; Walkinshaw, Malcolm D.

    2007-01-01

    We have developed a novel DNA microarray-based approach for identification of the sequence-specificity of single-stranded nucleic-acid-binding proteins (SNABPs). For verification, we have shown that the major cold shock protein (CspB) from Bacillus subtilis binds with high affinity to pyrimidine-rich sequences, with a binding preference for the consensus sequence, 5′-GTCTTTG/T-3′. The sequence was modelled onto the known structure of CspB and a cytosine-binding pocket was identified, which explains the strong preference for a cytosine base at position 3. This microarray method offers a rapid high-throughput approach for determining the specificity and strength of ss DNA–protein interactions. Further screening of this newly emerging family of transcription factors will help provide an insight into their cellular function. PMID:17488853

  10. Antimicrobial and Antitumor Activities of Novel Peptides Derived from the Lipopolysaccharide- and β-1,3-Glucan Binding Protein of the Pacific Abalone Haliotis discus hannai.

    PubMed

    Nam, Bo-Hye; Moon, Ji Young; Park, Eun Hee; Kong, Hee Jeong; Kim, Young-Ok; Kim, Dong-Gyun; Kim, Woo-Jin; An, Chul Min; Seo, Jung-Kil

    2016-12-14

    Antimicrobial peptides are a pivotal component of the invertebrate innate immune system. In this study, we identified a lipopolysaccharide- and β-1,3-glucan-binding protein (LGBP) gene from the pacific abalone Haliotis discus hannai (HDH), which is involved in the pattern recognition mechanism and plays a vital role in the defense mechanism of the invertebrate immune system. The HDH-LGBP cDNA consisted of a 1263-bp open reading frame (ORF) encoding a polypeptide of 420 amino acids, with a 20-amino-acid signal sequence. The molecular mass of the protein portion was 45.5 kDa, and the predicted isoelectric point of the mature protein was 4.93. Characteristic potential polysaccharide binding motif, glucanase motif, and β-glucan recognition motif were identified in the LGBP of HDH. We used its polysaccharide-binding motif sequence to design two novel antimicrobial peptide analogs (HDH-LGBP-A1 and HDH-LGBP-A2). By substituting a positively charged amino acid and amidation at the C-terminus, the pI and net charge of the HDH-LGBP increased, and the proteins formed an α-helical structure. The HDH-LGBP analogs exhibited antibacterial and antifungal activity, with minimal effective concentrations ranging from 0.008 to 2.2 μg/mL. Additionally, both were toxic against human cervix (HeLa), lung (A549), and colon (HCT 116) carcinoma cell lines but much less so against human umbilical vein cells (HUVEC). Fluorescence-activated cell sorter (FACS) analysis showed that HDH-LGBP analogs disturb the cancer cell membrane and cause apoptotic cell death. These results suggest the use of HDH-LGBP analogs as multifunctional drugs.

  11. Antimicrobial and Antitumor Activities of Novel Peptides Derived from the Lipopolysaccharide- and β-1,3-Glucan Binding Protein of the Pacific Abalone Haliotis discus hannai

    PubMed Central

    Nam, Bo-Hye; Moon, Ji Young; Park, Eun Hee; Kong, Hee Jeong; Kim, Young-Ok; Kim, Dong-Gyun; Kim, Woo-Jin; An, Chul Min; Seo, Jung-Kil

    2016-01-01

    Antimicrobial peptides are a pivotal component of the invertebrate innate immune system. In this study, we identified a lipopolysaccharide- and β-1,3-glucan-binding protein (LGBP) gene from the pacific abalone Haliotis discus hannai (HDH), which is involved in the pattern recognition mechanism and plays a vital role in the defense mechanism of the invertebrate immune system. The HDH-LGBP cDNA consisted of a 1263-bp open reading frame (ORF) encoding a polypeptide of 420 amino acids, with a 20-amino-acid signal sequence. The molecular mass of the protein portion was 45.5 kDa, and the predicted isoelectric point of the mature protein was 4.93. Characteristic potential polysaccharide binding motif, glucanase motif, and β-glucan recognition motif were identified in the LGBP of HDH. We used its polysaccharide-binding motif sequence to design two novel antimicrobial peptide analogs (HDH-LGBP-A1 and HDH-LGBP-A2). By substituting a positively charged amino acid and amidation at the C-terminus, the pI and net charge of the HDH-LGBP increased, and the proteins formed an α-helical structure. The HDH-LGBP analogs exhibited antibacterial and antifungal activity, with minimal effective concentrations ranging from 0.008 to 2.2 μg/mL. Additionally, both were toxic against human cervix (HeLa), lung (A549), and colon (HCT 116) carcinoma cell lines but much less so against human umbilical vein cells (HUVEC). Fluorescence-activated cell sorter (FACS) analysis showed that HDH-LGBP analogs disturb the cancer cell membrane and cause apoptotic cell death. These results suggest the use of HDH-LGBP analogs as multifunctional drugs. PMID:27983632

  12. Intrinsic Pleckstrin Homology (PH) Domain Motion in Phospholipase C-β Exposes a Gβγ Protein Binding Site.

    PubMed

    Kadamur, Ganesh; Ross, Elliott M

    2016-05-20

    Mammalian phospholipase C-β (PLC-β) isoforms are stimulated by heterotrimeric G protein subunits and members of the Rho GTPase family of small G proteins. Although recent structural studies showed how Gαq and Rac1 bind PLC-β, there is a lack of consensus regarding the Gβγ binding site in PLC-β. Using FRET between cerulean fluorescent protein-labeled Gβγ and the Alexa Fluor 594-labeled PLC-β pleckstrin homology (PH) domain, we demonstrate that the PH domain is the minimal Gβγ binding region in PLC-β3. We show that the isolated PH domain can compete with full-length PLC-β3 for binding Gβγ but not Gαq. Using sequence conservation, structural analyses, and mutagenesis, we identify a hydrophobic face of the PLC-β PH domain as the Gβγ binding interface. This PH domain surface is not solvent-exposed in crystal structures of PLC-β, necessitating conformational rearrangement to allow Gβγ binding. Blocking PH domain motion in PLC-β by cross-linking it to the EF hand domain inhibits stimulation by Gβγ without altering basal activity or Gαq response. The fraction of PLC-β cross-linked is proportional to the fractional loss of Gβγ response. Cross-linked PLC-β does not bind Gβγ in a FRET-based Gβγ-PLC-β binding assay. We propose that unliganded PLC-β exists in equilibrium between a closed conformation observed in crystal structures and an open conformation where the PH domain moves away from the EF hands. Therefore, intrinsic movement of the PH domain in PLC-β modulates Gβγ access to its binding site. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  13. Intrinsic Pleckstrin Homology (PH) Domain Motion in Phospholipase C-β Exposes a Gβγ Protein Binding Site

    PubMed Central

    Kadamur, Ganesh

    2016-01-01

    Mammalian phospholipase C-β (PLC-β) isoforms are stimulated by heterotrimeric G protein subunits and members of the Rho GTPase family of small G proteins. Although recent structural studies showed how Gαq and Rac1 bind PLC-β, there is a lack of consensus regarding the Gβγ binding site in PLC-β. Using FRET between cerulean fluorescent protein-labeled Gβγ and the Alexa Fluor 594-labeled PLC-β pleckstrin homology (PH) domain, we demonstrate that the PH domain is the minimal Gβγ binding region in PLC-β3. We show that the isolated PH domain can compete with full-length PLC-β3 for binding Gβγ but not Gαq. Using sequence conservation, structural analyses, and mutagenesis, we identify a hydrophobic face of the PLC-β PH domain as the Gβγ binding interface. This PH domain surface is not solvent-exposed in crystal structures of PLC-β, necessitating conformational rearrangement to allow Gβγ binding. Blocking PH domain motion in PLC-β by cross-linking it to the EF hand domain inhibits stimulation by Gβγ without altering basal activity or Gαq response. The fraction of PLC-β cross-linked is proportional to the fractional loss of Gβγ response. Cross-linked PLC-β does not bind Gβγ in a FRET-based Gβγ-PLC-β binding assay. We propose that unliganded PLC-β exists in equilibrium between a closed conformation observed in crystal structures and an open conformation where the PH domain moves away from the EF hands. Therefore, intrinsic movement of the PH domain in PLC-β modulates Gβγ access to its binding site. PMID:27002154

  14. STAT1:DNA sequence-dependent binding modulation by phosphorylation, protein:protein interactions and small-molecule inhibition

    PubMed Central

    Bonham, Andrew J.; Wenta, Nikola; Osslund, Leah M.; Prussin, Aaron J.; Vinkemeier, Uwe; Reich, Norbert O.

    2013-01-01

    The DNA-binding specificity and affinity of the dimeric human transcription factor (TF) STAT1 were assessed by total internal reflectance fluorescence protein-binding microarrays (TIRF-PBM) to evaluate the effects of protein phosphorylation, higher-order polymerization and small-molecule inhibition. Active, phosphorylated STAT1 showed binding preferences consistent with prior characterization, whereas unphosphorylated STAT1 showed a weak-binding preference for one-half of the GAS consensus site, consistent with recent models of STAT1 structure and function in response to phosphorylation. This altered-binding preference was further tested by use of the inhibitor LLL3, which we show to disrupt STAT1 binding in a sequence-dependent fashion. To determine if this sequence-dependence is specific to STAT1 and not a general feature of human TF biology, the TF Myc/Max was analysed and tested with the inhibitor Mycro3. Myc/Max inhibition by Mycro3 is sequence independent, suggesting that the sequence-dependent inhibition of STAT1 may be specific to this system and a useful target for future inhibitor design. PMID:23180800

  15. Computational design of enzyme-ligand binding using a combined energy function and deterministic sequence optimization algorithm.

    PubMed

    Tian, Ye; Huang, Xiaoqiang; Zhu, Yushan

    2015-08-01

    Enzyme amino-acid sequences at ligand-binding interfaces are evolutionarily optimized for reactions, and the natural conformation of an enzyme-ligand complex must have a low free energy relative to alternative conformations in native-like or non-native sequences. Based on this assumption, a combined energy function was developed for enzyme design and then evaluated by recapitulating native enzyme sequences at ligand-binding interfaces for 10 enzyme-ligand complexes. In this energy function, the electrostatic interaction between polar or charged atoms at buried interfaces is described by an explicitly orientation-dependent hydrogen-bonding potential and a pairwise-decomposable generalized Born model based on the general side chain in the protein design framework. The energy function is augmented with a pairwise surface-area based hydrophobic contribution for nonpolar atom burial. Using this function, on average, 78% of the amino acids at ligand-binding sites were predicted correctly in the minimum-energy sequences, whereas 84% were predicted correctly in the most-similar sequences, which were selected from the top 20 sequences for each enzyme-ligand complex. Hydrogen bonds at the enzyme-ligand binding interfaces in the 10 complexes were usually recovered with the correct geometries. The binding energies calculated using the combined energy function helped to discriminate the active sequences from a pool of alternative sequences that were generated by repeatedly solving a series of mixed-integer linear programming problems for sequence selection with increasing integer cuts.
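
    The recovery percentages quoted in this abstract (78% and 84% of binding-site residues predicted correctly) can be read as a per-position identity score between a native and a designed sequence. The sketch below computes such a score under that assumption; the function name `sequence_recovery` and the two nine-residue sequences are hypothetical, and the abstract does not specify exactly how the authors counted positions.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Percentage of aligned positions where the designed residue equals the native one.

    Assumes both strings list the same binding-site positions in the same order.
    """
    if len(native) != len(designed):
        raise ValueError("native and designed sequences must have equal length")
    if not native:
        raise ValueError("sequences must not be empty")
    matches = sum(n == d for n, d in zip(native, designed))
    return 100.0 * matches / len(native)


# Hypothetical binding-site residues, for illustration only.
native_site = "HDSGYWKRE"
designed_site = "HDSGFWKRQ"
print(f"{sequence_recovery(native_site, designed_site):.1f}% recovered")
```

    Averaging this score over several complexes would give a figure comparable in form to the averages reported above, though not necessarily computed the same way.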

  16. Specific minor groove solvation is a crucial determinant of DNA binding site recognition

    PubMed Central

    Harris, Lydia-Ann; Williams, Loren Dean; Koudelka, Gerald B.

    2014-01-01

    The DNA sequence preferences of nearly all sequence specific DNA binding proteins are influenced by the identities of bases that are not directly contacted by protein. Discrimination between non-contacted base sequences is commonly based on the differential abilities of DNA sequences to allow narrowing of the DNA minor groove. However, the factors that govern the propensity of minor groove narrowing are not completely understood. Here we show that the differential abilities of various DNA sequences to support formation of a highly ordered and stable minor groove solvation network are a key determinant of non-contacted base recognition by a sequence-specific binding protein. In addition, disrupting the solvent network in the non-contacted region of the binding site alters the protein's ability to recognize contacted base sequences at positions 5–6 bases away. This observation suggests that DNA solvent interactions link contacted and non-contacted base recognition by the protein. PMID:25429976

  17. Functional evolution and structural conservation in chimeric cytochromes p450: calibrating a structure-guided approach.

    PubMed

    Otey, Christopher R; Silberg, Jonathan J; Voigt, Christopher A; Endelman, Jeffrey B; Bandara, Geethani; Arnold, Frances H

    2004-03-01

    Recombination generates chimeric proteins whose ability to fold depends on minimizing structural perturbations that result when portions of the sequence are inherited from different parents. These chimeric sequences can display functional properties characteristic of the parents or acquire entirely new functions. Seventeen chimeras were generated from two CYP102 members of the functionally diverse cytochrome p450 family. Chimeras predicted to have limited structural disruption, as defined by the SCHEMA algorithm, displayed CO binding spectra characteristic of folded p450s. Even this small population exhibited significant functional diversity: chimeras displayed altered substrate specificities, a wide range in thermostabilities, up to a 40-fold increase in peroxidase activity, and ability to hydroxylate a substrate toward which neither parent heme domain shows detectable activity. These results suggest that SCHEMA-guided recombination can be used to generate diverse p450s for exploring function evolution within the p450 structural framework.

  18. The evolutionarily conserved leprecan gene: its regulation by Brachyury and its role in the developing Ciona notochord.

    PubMed

    Dunn, Matthew P; Di Gregorio, Anna

    2009-04-15

    In Ciona intestinalis, leprecan was identified as a target of the notochord-specific transcription factor Ciona Brachyury (Ci-Bra) (Takahashi, H., Hotta, K., Erives, A., Di Gregorio, A., Zeller, R.W., Levine, M., Satoh, N., 1999. Brachyury downstream notochord differentiation in the ascidian embryo. Genes Dev. 13, 1519-1523). By screening approximately 14 kb of the Ci-leprecan locus for cis-regulatory activity, we have identified a 581-bp minimal notochord-specific cis-regulatory module (CRM) whose activity depends upon T-box binding sites located at the 3'-end of its sequence. These sites are specifically bound in vitro by a GST-Ci-Bra fusion protein, and mutations that abolish binding in vitro result in loss or decrease of regulatory activity in vivo. Serial deletions of the 581-bp notochord CRM revealed that this sequence is also able to direct expression in muscle cells through the same T-box sites that are utilized by Ci-Bra in the notochord, which are also bound in vitro by the muscle-specific T-box activators Ci-Tbx6b and Ci-Tbx6c. Additionally, we created plasmids aimed to interfere with the function of Ci-leprecan and categorized the resulting phenotypes, which consist of variable dislocations of notochord cells along the anterior-posterior axis. Together, these observations provide mechanistic insights generally applicable to T-box transcription factors and their target sequences, as well as a first set of clues on the function of Leprecan in early chordate development.

  19. DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.

  20. DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

    DOE PAGES

    Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...

    2016-03-09

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.

  1. Identification and application of self-binding zipper-like sequences in SARS-CoV spike protein.

    PubMed

    Zhang, Si Min; Liao, Ying; Neo, Tuan Ling; Lu, Yanning; Liu, Ding Xiang; Vahlne, Anders; Tam, James P

    2018-05-22

    Self-binding peptides containing zipper-like sequences, such as the Leu/Ile zipper sequence within the coiled coil regions of proteins and the cross-β spine steric zippers within the amyloid-like fibrils, could bind to the protein-of-origin through homophilic sequence-specific zipper motifs. These self-binding sequences represent opportunities for the development of biochemical tools and/or therapeutics. Here, we report on the identification of a putative self-binding β-zipper-forming peptide within the severe acute respiratory syndrome-associated coronavirus spike (S) protein and its application in viral detection. Peptide array scanning of overlapping peptides covering the entire length of S protein identified 34 putative self-binding peptides of six clusters, five of which contained octapeptide core consensus sequences. The Cluster I consensus octapeptide sequence GINITNFR was predicted by Eisenberg's 3D profile method to have high amyloid-like fibrillation potential through steric β-zipper formation. Peptide C6 containing the Cluster I consensus sequence was shown to oligomerize and form amyloid-like fibrils. Taking advantage of this, C6 was further applied to detect the S protein expression in vitro by fluorescence staining. Meanwhile, the coiled-coil-forming Leu/Ile heptad repeat sequences within the S protein were under-represented during peptide array scanning, consistent with the requirement for long peptide lengths to attain high helix-mediated interaction avidity. The data suggest that short β-zipper-like self-binding peptides within the S protein could be identified through combining the peptide scanning and predictive methods, and could be exploited as biochemical detection reagents for viral infection. Copyright © 2018. Published by Elsevier Ltd.

  2. Molecular simulations of polycation-DNA binding exploring the effect of peptide chemistry and sequence in nuclear localization sequence based polycations.

    PubMed

    Elder, Robert M; Jayaraman, Arthi

    2013-10-10

    Gene therapy relies on the delivery of DNA into cells, and polycations are one class of vectors enabling efficient DNA delivery. Nuclear localization sequences (NLS), cationic oligopeptides that target molecules for nuclear entry, can be incorporated into polycations to improve their gene delivery efficiency. We use simulations to study the effect of peptide chemistry and sequence on the DNA-binding behavior of NLS-grafted polycations by systematically mutating the residues in the grafts, which are based on the SV40 NLS (peptide sequence PKKKRKV). Replacing arginine (R) with lysine (K) reduces binding strength by eliminating arginine-DNA interactions, but placing R in a less hindered location (e.g., farther from the grafting point to the polycation backbone) has surprisingly little effect on polycation-DNA binding strength. Changing the positions of the hydrophobic proline (P) and valine (V) residues relative to the polycation backbone changes hydrophobic aggregation within the polycation and, consequently, changes the conformational entropy loss that occurs upon polycation-DNA binding. Since conformational entropy loss affects the free energy of binding, the positions of P and V in the grafts affect DNA binding affinity. The insight from this work guides synthesis of polycations with tailored DNA binding affinity and, in turn, efficient DNA delivery.

  3. A label-free aptamer-fluorophore assembly for rapid and specific detection of cocaine in biofluids.

    PubMed

    Roncancio, Daniel; Yu, Haixiang; Xu, Xiaowen; Wu, Shuo; Liu, Ran; Debord, Joshua; Lou, Xinhui; Xiao, Yi

    2014-11-18

    We report a rapid and specific aptamer-based method for one-step cocaine detection with minimal reagent requirements. The feasibility of aptamer-based detection has been demonstrated with sensors that operate via target-induced conformational change mechanisms, but these have generally exhibited limited target sensitivity. We have discovered that the cocaine-binding aptamer MNS-4.1 can also bind the fluorescent molecule 2-amino-5,6,7-trimethyl-1,8-naphthyridine (ATMND) and thereby quench its fluorescence. We subsequently introduced sequence changes into MNS-4.1 to engineer a new cocaine-binding aptamer (38-GC) that exhibits higher affinity to both ligands, with reduced background signal and increased signal gain. Using this aptamer, we have developed a new sensor platform that relies on the cocaine-mediated displacement of ATMND from 38-GC as a result of competitive binding. We demonstrate that our sensor can detect cocaine within seconds at concentrations as low as 200 nM, which is 50-fold lower than existing assays based on target-induced conformational change. More importantly, our assay achieves successful cocaine detection in body fluids, with a limit of detection of 10.4, 18.4, and 36 μM in undiluted saliva, urine, and serum samples, respectively.

  4. TALE-PvuII fusion proteins--novel tools for gene targeting.

    PubMed

    Yanik, Mert; Alzubi, Jamal; Lahaye, Thomas; Cathomen, Toni; Pingoud, Alfred; Wende, Wolfgang

    2013-01-01

    Zinc finger nucleases (ZFNs) consist of zinc fingers as DNA-binding module and the non-specific DNA-cleavage domain of the restriction endonuclease FokI as DNA-cleavage module. This architecture is also used by TALE nucleases (TALENs), in which the DNA-binding modules of the ZFNs have been replaced by DNA-binding domains based on transcription activator like effector (TALE) proteins. Both TALENs and ZFNs are programmable nucleases which rely on the dimerization of FokI to induce double-strand DNA cleavage at the target site after recognition of the target DNA by the respective DNA-binding module. TALENs seem to have an advantage over ZFNs, as the assembly of TALE proteins is easier than that of ZFNs. Here, we present evidence that variant TALENs can be produced by replacing the catalytic domain of FokI with the restriction endonuclease PvuII. These fusion proteins recognize only the composite recognition site consisting of the target site of the TALE protein and the PvuII recognition sequence (addressed site), but not isolated TALE or PvuII recognition sites (unaddressed sites), even at high excess of protein over DNA and long incubation times. In vitro, their preference for an addressed over an unaddressed site is > 34,000-fold. Moreover, TALE-PvuII fusion proteins are active in cellula with minimal cytotoxicity.

  5. Comparative genomics and evolution of the amylase-binding proteins of oral streptococci.

    PubMed

    Haase, Elaine M; Kou, Yurong; Sabharwal, Amarpreet; Liao, Yu-Chieh; Lan, Tianying; Lindqvist, Charlotte; Scannapieco, Frank A

    2017-04-20

    Successful commensal bacteria have evolved to maintain colonization in challenging environments. The oral viridans streptococci are pioneer colonizers of dental plaque biofilm. Some of these bacteria have adapted to life in the oral cavity by binding salivary α-amylase, which hydrolyzes dietary starch, thus providing a source of nutrition. Oral streptococcal species bind α-amylase by expressing a variety of amylase-binding proteins (ABPs). Here we determine the genotypic basis of amylase binding where proteins of diverse size and function share a common phenotype. ABPs were detected in culture supernatants of 27 of 59 strains representing 13 oral Streptococcus species screened using the amylase-ligand binding assay. N-terminal sequences from ABPs of diverse size were obtained from 18 strains representing six oral streptococcal species. Genome sequencing and BLAST searches using N-terminal sequences, protein size, and key words identified the gene associated with each ABP. Among the sequenced ABPs, 14 matched amylase-binding protein A (AbpA), 6 matched amylase-binding protein B (AbpB), and 11 unique ABPs were identified as peptidoglycan-binding, glutamine ABC-type transporter, hypothetical, or choline-binding proteins. Alignment and phylogenetic analyses performed to ascertain evolutionary relationships revealed that ABPs cluster into at least six distinct, unrelated families (AbpA, AbpB, and four novel ABPs) with no phylogenetic evidence that one group evolved from another, and no single ancestral gene found within each group. AbpA-like sequences can be divided into five subgroups based on the N-terminal sequences. Comparative genomics focusing on the abpA gene locus provides evidence of horizontal gene transfer. The acquisition of an ABP by oral streptococci provides an interesting example of adaptive evolution.

  6. Accurate Prediction of Inducible Transcription Factor Binding Intensities In Vivo

    PubMed Central

    Siepel, Adam; Lis, John T.

    2012-01-01

    DNA sequence and local chromatin landscape act jointly to determine transcription factor (TF) binding intensity profiles. To disentangle these influences, we developed an experimental approach, called protein/DNA binding followed by high-throughput sequencing (PB–seq), that allows the binding energy landscape to be characterized genome-wide in the absence of chromatin. We applied our methods to the Drosophila Heat Shock Factor (HSF), which inducibly binds a target DNA sequence element (HSE) following heat shock stress. PB–seq involves incubating sheared naked genomic DNA with recombinant HSF, partitioning the HSF–bound and HSF–free DNA, and then detecting HSF–bound DNA by high-throughput sequencing. We compared PB–seq binding profiles with ones observed in vivo by ChIP–seq and developed statistical models to predict the observed departures from idealized binding patterns based on covariates describing the local chromatin environment. We found that DNase I hypersensitivity and tetra-acetylation of H4 were the most influential covariates in predicting changes in HSF binding affinity. We also investigated the extent to which DNA accessibility, as measured by digital DNase I footprinting data, could be predicted from MNase–seq data and the ChIP–chip profiles for many histone modifications and TFs, and found GAGA element associated factor (GAF), tetra-acetylation of H4, and H4K16 acetylation to be the most predictive covariates. Lastly, we generated an unbiased model of HSF binding sequences, which revealed distinct biophysical properties of the HSF/HSE interaction and a previously unrecognized substructure within the HSE. These findings provide new insights into the interplay between the genomic sequence and the chromatin landscape in determining transcription factor binding intensity. PMID:22479205

  7. Complex binding of the FabR repressor of bacterial unsaturated fatty acid biosynthesis to its cognate promoters.

    PubMed

    Feng, Youjun; Cronan, John E

    2011-04-01

    Two transcriptional regulators, the FadR activator and the FabR repressor, control biosynthesis of unsaturated fatty acids in Escherichia coli. FabR represses expression of the two genes, fabA and fabB, required for unsaturated fatty acid synthesis and has been reported to require the presence of an unsaturated thioester (of either acyl carrier protein or CoA) in order to bind the fabA and fabB promoters in vitro. We report in vivo experiments in which unsaturated fatty acid synthesis was blocked in the absence of exogenous unsaturated fatty acids in a ΔfadR strain and found that the rates of transcription of fabA and fabB were unaffected by the lack of unsaturated thioesters. To examine the discrepancy between our in vivo results and the prior in vitro results we obtained active, natively folded forms of the E. coli and Vibrio cholerae FabRs by use of an in vitro transcription-translation system. We report that FabR bound the intact promoter regions of both fabA and fabB in the absence of unsaturated acyl thioesters, but bound the two promoters differently. Native FabR bound the fabA promoter region provided that the canonical FabR binding site is extended by inclusion of flanking sequences that overlap the neighbouring FadR binding site. In contrast, although binding to the fabB operator also required a flanking sequence, a non-specific sequence could suffice. However, unsaturated thioesters did allow FabR binding to the minimal FabR operator sites of both promoters which otherwise were not bound. Thus unsaturated thioester ligands were not essential for FabR/target DNA interaction, but acted to enhance binding. The gel mobility shift data plus in vivo expression data indicate that despite the remarkably similar arrangements of promoter elements, FadR predominately regulates fabA expression whereas FabR is the dominant regulator of fabB expression. We also report that E. coli fabR expression is not autoregulated. 
Complementation, qRT-PCR and fatty acid composition analyses demonstrated that V. cholerae FabR was a functional repressor of unsaturated fatty acid synthesis. However, in contrast to E. coli, gel mobility shift assays indicated that neither E. coli nor V. cholerae FabRs bound the V. cholerae fabB promoter, although both proteins efficiently bound the V. cholerae fabA promoter. This asymmetry was shown to be due to the lack of a FabR binding site within the V. cholerae fabB promoter region. © 2011 Blackwell Publishing Ltd.

  8. Proteome-wide Identification of Novel Ceramide-binding Proteins by Yeast Surface cDNA Display and Deep Sequencing.

    PubMed

    Bidlingmaier, Scott; Ha, Kevin; Lee, Nam-Kyung; Su, Yang; Liu, Bin

    2016-04-01

    Although the bioactive sphingolipid ceramide is an important cell signaling molecule, relatively few direct ceramide-interacting proteins are known. We used an approach combining yeast surface cDNA display and deep sequencing technology to identify novel proteins binding directly to ceramide. We identified 234 candidate ceramide-binding protein fragments and validated binding for 20. Most (17) bound selectively to ceramide, although a few (3) bound to other lipids as well. Several novel ceramide-binding domains were discovered, including the EF-hand calcium-binding motif, the heat shock chaperonin-binding motif STI1, the SCP2 sterol-binding domain, and the tetratricopeptide repeat region motif. Interestingly, four of the verified ceramide-binding proteins (HPCA, HPCAL1, NCS1, and VSNL1) and an additional three candidate ceramide-binding proteins (NCALD, HPCAL4, and KCNIP3) belong to the neuronal calcium sensor family of EF hand-containing proteins. We used mutagenesis to map the ceramide-binding site in HPCA and to create a mutant HPCA that does not bind to ceramide. We demonstrated selective binding to ceramide by mammalian cell-produced wild type but not mutant HPCA. Intriguingly, we also identified a fragment from prostaglandin D2 synthase that binds preferentially to ceramide 1-phosphate. The wide variety of proteins and domains capable of binding to ceramide suggests that many of the signaling functions of ceramide may be regulated by direct binding to these proteins. Based on the deep sequencing data, we estimate that our yeast surface cDNA display library covers ∼60% of the human proteome and our selection/deep sequencing protocol can identify target-interacting protein fragments that are present at extremely low frequency in the starting library. Thus, the yeast surface cDNA display/deep sequencing approach is a rapid, comprehensive, and flexible method for the analysis of protein-ligand interactions, particularly for the study of non-protein ligands. 
© 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  9. Predicting protein-binding regions in RNA using nucleotide profiles and compositions.

    PubMed

    Choi, Daesik; Park, Byungkyu; Chae, Hanju; Lee, Wook; Han, Kyungsook

    2017-03-14

    Motivated by the increased amount of data on protein-RNA interactions and the availability of complete genome sequences of several organisms, many computational methods have been proposed to predict binding sites in protein-RNA interactions. However, most computational methods are limited to finding RNA-binding sites in proteins instead of protein-binding sites in RNAs. Predicting protein-binding sites in RNA is more challenging than predicting RNA-binding sites in proteins. Recent computational methods for finding protein-binding sites in RNAs have several drawbacks for practical use. We developed a new support vector machine (SVM) model for predicting protein-binding regions in mRNA sequences. The model uses sequence profiles constructed from log-odds scores of mono- and di-nucleotides and nucleotide compositions. The model was evaluated by standard 10-fold cross validation, leave-one-protein-out (LOPO) cross validation and independent testing. Since actual mRNA sequences have more non-binding regions than protein-binding regions, we tested the model on several datasets with different ratios of protein-binding regions to non-binding regions. The best performance of the model was obtained in a balanced dataset of positive and negative instances. 10-fold cross validation with a balanced dataset achieved a sensitivity of 91.6%, a specificity of 92.4%, an accuracy of 92.0%, a positive predictive value (PPV) of 91.7%, a negative predictive value (NPV) of 92.3% and a Matthews correlation coefficient (MCC) of 0.840. LOPO cross validation showed a lower performance than the 10-fold cross validation, but the performance remained high (87.6% accuracy and 0.752 MCC). In testing the model on independent datasets, it achieved an accuracy of 82.2% and an MCC of 0.656. Testing of our model and other state-of-the-art methods on the same dataset showed that our model is better than the others. 
Sequence profiles of log-odds scores of mono- and di-nucleotides were much more powerful features than nucleotide compositions in finding protein-binding regions in RNA sequences. However, a slight performance gain was obtained when using the sequence profiles along with nucleotide compositions. These are preliminary results of ongoing research, but demonstrate the potential of our approach as a powerful predictor of protein-binding regions in RNA. The program and supporting data are available at http://bclab.inha.ac.kr/RBPbinding .
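    The performance measures quoted in the abstract above (sensitivity, specificity, accuracy, PPV, NPV, MCC) all derive from one confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The function name `binary_metrics` and the counts are hypothetical: the counts assume an exactly balanced set of 1000 binding and 1000 non-binding regions, chosen only to reproduce the reported sensitivity (91.6%) and specificity (92.4%).

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    ppv = tp / (tp + fp)                  # positive predictive value
    npv = tn / (tn + fn)                  # negative predictive value
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "ppv": ppv,
        "npv": npv,
        "mcc": mcc,
    }


# Hypothetical counts: 1000 positives and 1000 negatives (assumed, not from the paper).
m = binary_metrics(tp=916, fn=84, tn=924, fp=76)
print({name: round(value, 3) for name, value in m.items()})
```

    Under these assumed counts the accuracy and MCC come out at the values given in the abstract (92.0% and 0.840), which illustrates how the measures constrain one another on a balanced dataset.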

  10. Predicting protein-binding RNA nucleotides with consideration of binding partners.

    PubMed

    Tuvshinjargal, Narankhuu; Lee, Wook; Park, Byungkyu; Han, Kyungsook

    2015-06-01

    In recent years several computational methods have been developed to predict RNA-binding sites in protein. Most of these methods do not consider interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in protein, the problem of predicting protein-binding sites in RNA has received little attention mainly because it is much more difficult and shows a lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. In an effort to improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and a Matthews correlation coefficient (MCC) of 0.69 in a 10-fold cross validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, an NPV of 84.2% and an MCC of 0.48 in independent testing. For comparative purposes, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In a 10-fold cross validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, an NPV of 92.2% and an MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, an NPV of 85.2% and an MCC of 0.45. 
In both cross-validations and independent testing, the new model that used both RNA and protein sequences showed a better performance than the model that used RNA sequence data alone in most performance measures. To the best of our knowledge, this is the first sequence-based prediction of protein-binding nucleotides in RNA which considers the binding partner of RNA. The new model will provide valuable information for designing biochemical experiments to find putative protein-binding sites in RNA with unknown structure. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
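    PPV and NPV, unlike sensitivity and specificity, depend on the fraction of positive instances in the dataset. The sketch below applies the standard Bayes-rule relationship to the cross-validation sensitivity (86.5%) and specificity (86.2%) quoted above. The function name `predictive_values` and the positive fractions in the loop are hypothetical illustration values, not figures from the paper.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given fraction of positive instances."""
    tp = sensitivity * prevalence                 # expected true-positive fraction
    fn = (1 - sensitivity) * prevalence           # expected false-negative fraction
    tn = specificity * (1 - prevalence)           # expected true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)     # expected false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Assumed positive fractions, for illustration only.
for prevalence in (0.5, 0.3, 0.1):
    ppv, npv = predictive_values(0.865, 0.862, prevalence)
    print(f"positives={prevalence:.0%}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

    As the share of binding nucleotides falls, PPV drops and NPV rises even though sensitivity and specificity are held fixed, which is one reason results on datasets with different class ratios are not directly comparable.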

  11. An Evolutionary/Biochemical Connection Between Promoter- and Primer-Dependent Polymerases Revealed by Selective Evolution of Ligands by Exponential Enrichment (SELEX).

    PubMed

    Fenstermacher, Katherine J; Achuthan, Vasudevan; Schneider, Thomas D; DeStefano, Jeffrey J

    2018-01-16

    DNA polymerases (DNAPs) recognize 3' recessed termini on duplex DNA and carry out nucleotide catalysis. Unlike promoter-specific RNA polymerases (RNAPs), no sequence specificity is required for binding or initiation of catalysis. Despite this, previous results indicate that viral reverse transcriptases bind much more tightly to DNA primers that mimic the polypurine tract. In the current report, primer sequences that bind with high affinity to Taq and Klenow polymerases were identified using a modified Selective Evolution of Ligands by Exponential Enrichment (SELEX) approach. Two Taq-specific primers that bound ∼10 (Taq1) and over 100 (Taq2) times more stably than controls to Taq were identified. Taq1 contained 8 nucleotides (5'-CACTAAAG-3') that matched the phage T3 RNAP "core" promoter. Both primers dramatically outcompeted primers with similar binding thermodynamics in PCR reactions. Similarly, exonuclease minus Klenow polymerase also selected a high affinity primer that contained a related core promoter sequence from phage T7 RNAP (5'-ACTATAG-3'). For both Taq and Klenow, even small modifications to the sequence resulted in large losses in binding affinity, suggesting that binding was highly sequence-specific. The results are discussed in the context of possible effects on multi-primer (multiplex) PCR assays, molecular information theory, and the evolution of RNAPs and DNAPs. Importance: This work further demonstrates that primer-dependent DNA polymerases can have strong sequence biases leading to dramatically tighter binding to specific sequences. These may be related to biological function, or be a consequence of the structural architecture of the enzyme. New sequence specificities for Taq and Klenow polymerases were uncovered and among them were sequences that contained the core promoter elements from T3 and T7 phage RNA polymerase promoters. 
This suggests the intriguing possibility that phage RNA polymerases exploited intrinsic binding affinities of ancestral DNA polymerases to develop their promoters. Conversely, DNA polymerases could have evolved from related RNA polymerases and retained the intrinsic binding preference despite there being no clear function for such a preference in DNA biology. Copyright © 2018 American Society for Microbiology.

  12. Does TATA matter? A structural exploration of the selectivity determinants in its complexes with TATA box-binding protein.

    PubMed Central

    Pastor, N; Pardo, L; Weinstein, H

    1997-01-01

    The binding of the TATA box-binding protein (TBP) to a TATA sequence in DNA is essential for eukaryotic basal transcription. TBP binds in the minor groove of DNA, causing a large distortion of the DNA helix. Given the apparent stereochemical equivalence of AT and TA basepairs in the minor groove, DNA deformability must play a significant role in binding site selection, because not all AT-rich sequences are bound effectively by TBP. To gain insight into the precise role that the properties of the TATA sequence have in determining the specificity of the DNA substrates of TBP, the solution structure and dynamics of seven DNA dodecamers have been studied by using molecular dynamics simulations. The analysis of the structural properties of basepair steps in these TATA sequences suggests a reason for the preference for alternating pyrimidine-purine (YR) sequences, but indicates that these properties cannot be the sole determinant of the sequence specificity of TBP. Rather, recognition depends on the interplay between the inherent deformability of the DNA and steric complementarity at the molecular interface. PMID:9251783

  13. A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.

    PubMed

    Álvarez-Martos, Isabel; Ferapontova, Elena E

    2017-08-05

    A unique specificity of the aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57-nucleotide-long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases with their DNA analogues is not capable of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence does, and thus is not an aptamer and cannot be used for either in vivo or in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Purification and sequencing of the active site tryptic peptide from penicillin-binding protein 1b of Escherichia coli

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nicholas, R.A.; Suzuki, H.; Hirota, Y.

    This paper reports the sequence of the active site peptide of penicillin-binding protein 1b from Escherichia coli. Purified penicillin-binding protein 1b was labeled with [14C]penicillin G, digested with trypsin, and partially purified by gel filtration. Upon further purification by high-pressure liquid chromatography, two radioactive peaks were observed, and the major peak, representing over 75% of the applied radioactivity, was submitted to amino acid analysis and sequencing. The sequence Ser-Ile-Gly-Ser-Leu-Ala-Lys was obtained. The active site nucleophile was identified by digesting the purified peptide with aminopeptidase M and separating the radioactive products on high-pressure liquid chromatography. Amino acid analysis confirmed that the serine residue in the middle of the sequence was covalently bonded to the [14C]penicilloyl moiety. A comparison of this sequence to active site sequences of other penicillin-binding proteins and beta-lactamases is presented.

  15. Probing the electrostatics and pharmacologic modulation of sequence-specific binding by the DNA-binding domain of the ETS-family transcription factor PU.1: a binding affinity and kinetics investigation

    PubMed Central

    Munde, Manoj; Poon, Gregory M. K.; Wilson, W. David

    2013-01-01

    Members of the ETS family of transcription factors regulate a functionally diverse array of genes. All ETS proteins share a structurally-conserved but sequence-divergent DNA-binding domain, known as the ETS domain. Although the structure and thermodynamics of the ETS-DNA complexes are well known, little is known about the kinetics of sequence recognition, a facet that offers potential insight into its molecular mechanism. We have characterized DNA binding by the ETS domain of PU.1 by biosensor-surface plasmon resonance (SPR). SPR analysis revealed a striking kinetic profile for DNA binding by the PU.1 ETS domain. At low salt concentrations, it binds high-affinity cognate DNA with a very slow association rate constant (≤10⁵ M⁻¹ s⁻¹), compensated by a correspondingly small dissociation rate constant. The kinetics are strongly salt-dependent but mutually balance to produce a relatively weak dependence in the equilibrium constant. This profile contrasts sharply with reported data for other ETS domains (e.g., Ets-1, TEL) for which high-affinity binding is driven by rapid association (>10⁷ M⁻¹ s⁻¹). We interpret this difference in terms of the hydration properties of ETS-DNA binding and propose that at least two mechanisms of sequence recognition are employed by this family of DNA-binding domains. Additionally, we use SPR to demonstrate the potential for pharmacological inhibition of sequence-specific ETS-DNA binding, using the minor groove-binding distamycin as a model compound. Our work establishes SPR as a valuable technique for extending our understanding of the molecular mechanisms of ETS-DNA interactions as well as developing potential small-molecule agents for biotechnological and therapeutic purposes. PMID:23416556
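
    The compensation between slow association and slow dissociation described in this abstract follows from the standard relation for simple 1:1 binding, K_D = k_off / k_on. The sketch below is only an illustration under that assumption; the function name and the rate constants are hypothetical and are not values reported in the study.

```python
def dissociation_constant(k_on: float, k_off: float) -> float:
    """Equilibrium dissociation constant K_D (M) for simple 1:1 binding.

    k_on  : association rate constant, M^-1 s^-1
    k_off : dissociation rate constant, s^-1
    """
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


# Hypothetical rate constants, chosen only to illustrate the compensation.
slow_binder = dissociation_constant(k_on=1e5, k_off=1e-4)  # slow on, slow off
fast_binder = dissociation_constant(k_on=1e7, k_off=1e-2)  # fast on, fast off

print(f"slow on/off: K_D = {slow_binder:.1e} M")
print(f"fast on/off: K_D = {fast_binder:.1e} M")
```

    With these illustrative numbers both binders reach the same nanomolar affinity, although their association rates differ by two orders of magnitude.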

  16. Sequence2Vec: a novel embedding approach for modeling transcription factor binding affinity landscape.

    PubMed

    Dai, Hanjun; Umarov, Ramzan; Kuwahara, Hiroyuki; Li, Yu; Song, Le; Gao, Xin

    2017-11-15

    An accurate characterization of the transcription factor (TF)-DNA affinity landscape is crucial to a quantitative understanding of the molecular mechanisms underpinning endogenous gene regulation. While recent advances in biotechnology have brought the opportunity for building binding affinity prediction methods, the accurate characterization of the TF-DNA binding affinity landscape still remains a challenging problem. Here we propose a novel sequence embedding approach for modeling the transcription factor binding affinity landscape. Our method represents DNA binding sequences as a hidden Markov model which captures both position specific information and long-range dependency in the sequence. A cornerstone of our method is a novel message passing-like embedding algorithm, called Sequence2Vec, which maps these hidden Markov models into a common nonlinear feature space and uses these embedded features to build a predictive model. Our method is a novel combination of the strength of probabilistic graphical models, feature space embedding and deep learning. We conducted comprehensive experiments on over 90 large-scale TF-DNA datasets which were measured by different high-throughput experimental technologies. Sequence2Vec outperforms alternative machine learning methods as well as the state-of-the-art binding affinity prediction methods. Our program is freely available at https://github.com/ramzan1990/sequence2vec. Contact: xin.gao@kaust.edu.sa or lsong@cc.gatech.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  17. Phosphorylation-dependent mineral-type specificity for apatite-binding peptide sequences.

    PubMed

    Addison, William N; Miller, Sharon J; Ramaswamy, Janani; Mansouri, Ahmad; Kohn, David H; McKee, Marc D

    2010-12-01

    Apatite-binding peptides discovered by phage display provide an alternative design method for creating functional biomaterials for bone and tooth tissue repair. A limitation of this approach is the absence of display peptide phosphorylation--a post-translational modification important to mineral-binding proteins. To refine the material specificity of a recently identified apatite-binding peptide, and to determine critical design parameters (net charge, charge distribution, amino acid sequence and composition) controlling peptide affinity for mineral, we investigated the effects of phosphorylation and sequence scrambling on peptide adsorption to four different apatites (bone-like mineral, and three types of apatite containing initially 0, 5.6 and 10.5% carbonate). Phosphorylation of the VTKHLNQISQSY peptide (VTK peptide) led to a 10-fold increase in peptide adsorption (compared to nonphosphorylated peptide) to bone-like mineral, and a 2-fold increase in adsorption to the carbonated apatite, but there was no effect of phosphorylation on peptide affinity to pure hydroxyapatite (without carbonate). Sequence scrambling of the nonphosphorylated VTK peptide enhanced its specificity for the bone-like mineral, but scrambled phosphorylated VTK peptide (pVTK) did not significantly alter mineral-binding suggesting that despite the importance of sequence order and/or charge distribution to mineral-binding, the enhanced binding after phosphorylation exceeds any further enhancement by altered sequence order. Osteoblast culture mineralization was dose-dependently inhibited by pVTK and to a significantly lesser extent by scrambled pVTK, while the nonphosphorylated and scrambled forms had no effect, indicating that inhibition of osteoblast mineralization is dependent on both peptide sequence and charge. Computational modeling of peptide-mineral interactions indicated a favorable change in binding energy upon phosphorylation that was unaffected by scrambling. 
In conclusion, phosphorylation of serine residues increases peptide specificity for bone-like mineral, whose adsorption is determined primarily by sequence composition and net charge as opposed to sequence order. However, sequence order in addition to net charge modulates the mineralization of osteoblast cultures. The ability of such peptides to inhibit mineralization has potential utility in the management of pathologic calcification. Copyright © 2010 Elsevier Ltd. All rights reserved.

  18. Genetic diversity of the DBLalpha region in Plasmodium falciparum var genes among Asia-Pacific isolates.

    PubMed

    Fowler, Elizabeth V; Peters, Jennifer M; Gatton, Michelle L; Chen, Nanhua; Cheng, Qin

    2002-03-01

    In Plasmodium falciparum a highly polymorphic multi-copy gene family, var, encodes the variant surface antigen P. falciparum erythrocyte membrane protein 1 (PfEMP1), which has an important role in cytoadherence and immune evasion. Using previously described universal PCR primers for the first Duffy binding-like domain (DBLalpha) of var we analysed the DBLalpha repertoires of Dd2 (originally from Thailand) and eight isolates from the Solomon Islands (n=4), Philippines (n=2), Papua New Guinea (n=1) and Africa (n=1). We found 15-32 unique DBLalpha sequence types among these isolates and estimated detectable DBLalpha repertoire sizes ranging from 33-38 to 52-57 copies per genome. Our data suggest that var gene repertoires generally consist of 40-50 copies per genome. Eighteen DBLalpha sequences appeared in more than one Asia-Pacific isolate with the number of sequences shared between any two isolates ranging from 0 to 6 (mean=2.0 +/-1.6). At the amino acid level DBLalpha sequence similarity within isolates ranged from 45.2 +/- 7.1 to 50.2 +/- 6.9%, and was not significantly different from the DBLalpha amino acid sequence similarity among isolates (P>0.1). Comparisons with published sequences also revealed little overlap among DBLalpha sequences from different regions. High DBLalpha sequence diversity and minimal overlap among these isolates suggest that the global var gene repertoire is immense, and may potentially be selected for by the host's protective immune response to the var gene products, PfEMP1.

  19. A conserved mechanism for replication origin recognition and binding in archaea.

    PubMed

    Majerník, Alan I; Chong, James P J

    2008-01-15

    To date, methanogens are the only group within the archaea where firing DNA replication origins have not been demonstrated in vivo. In the present study we show that a previously identified cluster of ORB (origin recognition box) sequences do indeed function as an origin of replication in vivo in the archaeon Methanothermobacter thermautotrophicus. Although the consensus sequence of ORBs in M. thermautotrophicus is somewhat conserved when compared with ORB sequences in other archaea, the Cdc6-1 protein from M. thermautotrophicus (termed MthCdc6-1) displays sequence-specific binding that is selective for the MthORB sequence and does not recognize ORBs from other archaeal species. Stabilization of in vitro MthORB DNA binding by MthCdc6-1 requires additional conserved sequences 3' to those originally described for M. thermautotrophicus. By testing synthetic sequences bearing mutations in the MthORB consensus sequence, we show that Cdc6/ORB binding is critically dependent on the presence of an invariant guanine found in all archaeal ORB sequences. Mutation of a universally conserved arginine residue in the recognition helix of the winged helix domain of archaeal Cdc6-1 shows that specific origin sequence recognition is dependent on the interaction of this arginine residue with the invariant guanine. Recognition of a mutated origin sequence can be achieved by mutation of the conserved arginine residue to a lysine or glutamine residue. Thus despite a number of differences in protein and DNA sequences between species, the mechanism of origin recognition and binding appears to be conserved throughout the archaea.

  20. Human immunodeficiency virus type 1 LTR TATA and TAR region sequences required for transcriptional regulation.

    PubMed Central

    Garcia, J A; Harrich, D; Soultanakis, E; Wu, F; Mitsuyasu, R; Gaynor, R B

    1989-01-01

    The human immunodeficiency virus (HIV) type 1 LTR is regulated at the transcriptional level by both cellular and viral proteins. Using HeLa cell extracts, multiple regions of the HIV LTR were found to serve as binding sites for cellular proteins. An untranslated region binding protein UBP-1 has been purified and fractions containing this protein bind to both the TAR and TATA regions. To investigate the role of cellular proteins binding to both the TATA and TAR regions and their potential interaction with other HIV DNA binding proteins, oligonucleotide-directed mutagenesis of both these regions was performed followed by DNase I footprinting and transient expression assays. In the TATA region, two direct repeats TC/AAGC/AT/AGCTGC surround the TATA sequence. Mutagenesis of both of these direct repeats or of the TATA sequence interrupted binding over the TATA region on the coding strand, but only a mutation of the TATA sequence affected in vivo assays for tat-activation. In addition to TAR serving as the site of binding of cellular proteins, RNA transcribed from TAR is capable of forming a stable stem-loop structure. To determine the relative importance of DNA binding proteins as compared to secondary structure, oligonucleotide-directed mutations in the TAR region were studied. Local mutations that disrupted either the stem or loop structure were defective in gene expression. However, compensatory mutations which restored base pairing in the stem resulted in complete tat-activation. This indicated a significant role for the stem-loop structure in HIV gene expression. To determine the role of TAR binding proteins, mutations were constructed which extensively changed the primary structure of the TAR region, yet left stem base pairing, stem energy and the loop sequence intact. These mutations resulted in decreased protein binding to TAR DNA and defects in tat-activation, and revealed factor binding specifically to the loop DNA sequence. 
Further mutagenesis which inverted this stem and loop mutation relative to the HIV LTR mRNA start site resulted in even larger decreases in tat-activation. This suggests that multiple determinants, including protein binding, the loop sequence, and RNA or DNA secondary structure, are important in tat-activation and suggests that tat may interact with cellular proteins binding to DNA to increase HIV gene expression. PMID:2721501

  1. Proliferating cell nuclear antigen (Pcna) as a direct downstream target gene of Hoxc8

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Min, Hyehyun; Lee, Ji-Yeon; Bok, Jinwoong

    2010-02-19

    Hoxc8 is a member of the Hox family of transcription factors that play crucial roles in spatiotemporal body patterning during embryogenesis. Hox proteins contain a conserved 61 amino acid homeodomain, which is responsible for recognition and binding of the proteins onto Hox-specific DNA binding motifs and regulates expression of their target genes. Previously, using proteome analysis, we identified Proliferating cell nuclear antigen (Pcna) as one of the putative target genes of Hoxc8. Here, we asked whether Hoxc8 regulates Pcna expression by directly binding to the regulatory sequence of Pcna. In mouse embryos at embryonic day 11.5, the expression pattern of Pcna was similar to that of Hoxc8 along the anteroposterior body axis. Moreover, Pcna transcript levels as well as cell proliferation rate were increased by overexpression of Hoxc8 in C3H10T1/2 mouse embryonic fibroblast cells. Characterization of 2.3 kb genomic sequence upstream of Pcna coding region revealed that the upstream sequence contains several Hox core binding sequences and one Hox-Pbx binding sequence. Direct binding of Hoxc8 proteins to the Pcna regulatory sequence was verified by chromatin immunoprecipitation assay. Taken together, our data suggest that Pcna is a direct downstream target of Hoxc8.

  2. The FOXP2 forkhead domain binds to a variety of DNA sequences with different rates and affinities.

    PubMed

    Webb, Helen; Steeb, Olga; Blane, Ashleigh; Rotherham, Lia; Aron, Shaun; Machanick, Philip; Dirr, Heini; Fanucchi, Sylvia

    2017-07-01

    FOXP2 is a member of the P subfamily of FOX transcription factors, the DNA-binding domain of which is the winged helix forkhead domain (FHD). In this work we show that the FOXP2 FHD is able to bind to various DNA sequences, including a novel sequence identified in this work, with different affinities and rates as detected using surface plasmon resonance. Combining the experimental work with molecular docking, we show that high-affinity sequences remain bound to the protein for longer, form a greater number of interactions with the protein and induce a greater structural change in the protein than low-affinity sequences. We propose a binding model for the FOXP2 FHD that involves three types of binding sequence: low affinity sites which allow for rapid scanning of the genome by the protein in a partially unstructured state; moderate affinity sites which serve to locate the protein near target sites and high-affinity sites which secure the protein to the DNA and induce a conformational change necessary for functional binding and the possible initiation of downstream transcriptional events. © The Authors 2017. Published by Oxford University Press on behalf of the Japanese Biochemical Society. All rights reserved.

  3. Position specific variation in the rate of evolution in transcription factor binding sites

    PubMed Central

    Moses, Alan M; Chiang, Derek Y; Kellis, Manolis; Lander, Eric S; Eisen, Michael B

    2003-01-01

    Background The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution. Results Here we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms. Conclusion As has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. 
The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA. PMID:12946282
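
    The degeneracy of individual positions within a set of aligned binding sites, which this abstract relates to the position-specific rate of evolution, is commonly summarized as per-column information content. The sketch below shows one such measure; it is an assumption about how degeneracy could be quantified, not the authors' own procedure, and the function name and example sites are hypothetical.

```python
import math
from collections import Counter


def column_information(sites: list[str]) -> list[float]:
    """Information content (bits) of each column in a set of aligned DNA sites.

    2.0 bits = fully conserved position; 0.0 bits = all four bases equally
    frequent (maximally degenerate). No small-sample correction is applied.
    """
    if not sites or len({len(s) for s in sites}) != 1:
        raise ValueError("sites must be non-empty and of equal length")
    info = []
    for column in zip(*sites):
        counts = Counter(column)
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        info.append(2.0 - entropy)
    return info


# Hypothetical aligned sites, for illustration only.
example_sites = ["TAAT", "TAAT", "TACT", "TAGT"]
print(column_information(example_sites))
```

    Columns with low information content are the degenerate positions; under the abstract's finding, these would be expected to evolve faster than highly conserved columns.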

  4. Characterisation of a DNA sequence element that directs Dictyostelium stalk cell-specific gene expression.

    PubMed

    Ceccarelli, A; Zhukovskaya, N; Kawata, T; Bozzaro, S; Williams, J

    2000-12-01

    The ecmB gene of Dictyostelium is expressed at culmination both in the prestalk cells that enter the stalk tube and in ancillary stalk cell structures such as the basal disc. Stalk tube-specific expression is regulated by sequence elements within the cap-site proximal part of the promoter, the stalk tube (ST) promoter region. Dd-STATa, a member of the STAT transcription factor family, binds to elements present in the ST promoter-region and represses transcription prior to entry into the stalk tube. We have characterised an activatory DNA sequence element that lies distal to the repressor elements and that is both necessary and sufficient for expression within the stalk tube. We have mapped this activator to a 28 nucleotide region (the 28-mer) within which we have identified a GA-containing sequence element that is required for efficient gene transcription. The Dd-STATa protein binds to the 28-mer in an in vitro binding assay, and binding is dependent upon the GA-containing sequence. However, the ecmB gene is expressed in a Dd-STATa null mutant; therefore, Dd-STATa cannot be responsible for activating the 28-mer in vivo. Instead, we identified a distinct 28-mer binding activity in nuclear extracts from the Dd-STATa null mutant; this GA-binding activity is largely masked in wild type extracts by the high affinity binding of the Dd-STATa protein. We suggest that, in addition to the long range repression exerted by binding to the two known repressor sites, Dd-STATa inhibits transcription by direct competition with this putative activator for binding to the GA sequence.

  5. Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

    2004-08-06

    The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Measuring conservation of sequence features closely linked to function--such as binding-site clustering--makes better use of comparative sequence data than commonly used methods that examine only sequence identity.

  6. Dimeric PROP1 binding to diverse palindromic TAAT sequences promotes its transcriptional activity.

    PubMed

    Nakayama, Michie; Kato, Takako; Susa, Takao; Sano, Akiko; Kitahara, Kousuke; Kato, Yukio

    2009-08-13

    Mutations in the Prop1 gene are responsible for murine Ames dwarfism and human combined pituitary hormone deficiency with hypogonadism. Recently, we reported that PROP1 is a possible transcription factor for gonadotropin subunit genes through plural cis-acting sites composed of AT-rich sequences containing a TAAT motif which differs from its consensus binding sequence known as PRDQ9 (TAATTGAATTA). This study aimed to verify the binding specificity and sequence of PROP1 by applying the method of SELEX (Systematic Evolution of Ligands by EXponential enrichment), EMSA (electrophoretic mobility shift assay) and transient transfection assay. SELEX, after 5, 7 and 9 generations of selection using a random sequence library, showed that nucleotides containing one or two TAAT motifs were accumulated and accounted for 98.5% at the 9th generation. Aligned sequences and EMSA demonstrated that PROP1 binds preferentially to 11 nucleotides composed of an inverted TAAT motif separated by 3 nucleotides with variation in the half site of palindromic TAAT motifs and with preferential requirement of T at the nucleotide number 5 immediately 3' to a TAAT motif. Transient transfection assay demonstrated first that dimeric binding of PROP1 to an inverted TAAT motif and its cognates resulted in transcriptional activation, whereas monomeric binding of PROP1 to a single TAAT motif and an inverted ATTA motif did not mediate activation. Thus, this study demonstrated that dimeric binding of PROP1 is able to recognize diverse palindromic TAAT sequences separated by 3 nucleotides and to exhibit its transcriptional activity.
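
    The 11-nucleotide site described in this abstract (an inverted TAAT motif separated by 3 nucleotides, as in the PRDQ9 consensus TAATTGAATTA) can be located in a sequence with a simple pattern search. The sketch below uses a strict TAAT-NNN-ATTA pattern as a simplifying assumption; it does not model the half-site variation or the preference for T at position 5 reported in the study, and the function name and flanking bases are hypothetical.

```python
import re

# Inverted TAAT repeat separated by any 3 nucleotides (11 nt in total).
# The lookahead lets overlapping sites be reported as well.
PALINDROMIC_TAAT = re.compile(r"(?=(TAAT[ACGT]{3}ATTA))")


def find_palindromic_taat(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, site) for every TAAT-NNN-ATTA match."""
    return [
        (m.start(), m.group(1))
        for m in PALINDROMIC_TAAT.finditer(sequence.upper())
    ]


# PRDQ9 consensus from the abstract, embedded in hypothetical flanking bases.
hits = find_palindromic_taat("ggTAATTGAATTAcc")
print(hits)
```

    A position weight matrix built from the selected sequences would capture the tolerated half-site variation better than this exact-match pattern.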

  7. The structural basis of actinomycin D–binding induces nucleotide flipping out, a sharp bend and a left-handed twist in CGG triplet repeats

    PubMed Central

    Lo, Yu-Sheng; Tseng, Wen-Hsuan; Chuang, Chien-Ying; Hou, Ming-Hon

    2013-01-01

    The potent anticancer drug actinomycin D (ActD) functions by intercalating into DNA at GpC sites, thereby interrupting essential biological processes including replication and transcription. Certain neurological diseases are correlated with the expansion of (CGG)n trinucleotide sequences, which contain many contiguous GpC sites separated by a single G:G mispair. To characterize the binding of ActD to CGG triplet repeat sequences, the structural basis for the strong binding of ActD to neighbouring GpC sites flanking a G:G mismatch has been determined based on the crystal structure of ActD bound to ATGCGGCAT, which contains a CGG triplet sequence. The binding of ActD molecules to GCGGC causes many unexpected conformational changes including nucleotide flipping out, a sharp bend and a left-handed twist in the DNA helix via a two site-binding model. Heat denaturation, circular dichroism and surface plasmon resonance analyses showed that adjacent GpC sequences flanking a G:G mismatch are preferred ActD-binding sites. In addition, ActD was shown to bind the hairpin conformation of (CGG)16 in a pairwise combination and with greater stability than that of other DNA intercalators. Our results provide evidence of a possible biological consequence of ActD binding to CGG triplet repeat sequences. PMID:23408860

  8. CENP-B binds a novel centromeric sequence in the Asian mouse Mus caroli.

    PubMed Central

    Kipling, D; Mitchell, A R; Masumoto, H; Wilson, H E; Nicol, L; Cooke, H J

    1995-01-01

    Minor satellite DNA, found at Mus musculus centromeres, is not present in the genome of the Asian mouse Mus caroli. This repetitive sequence family is speculated to have a role in centromere function by providing an array of binding sites for the centromere-associated protein CENP-B. The apparent absence of CENP-B binding sites in the M. caroli genome poses a major challenge to this hypothesis. Here we describe two abundant satellite DNA sequences present at M. caroli centromeres. These satellites are organized as tandem repeat arrays, over 1 Mb in size, of either 60- or 79-bp monomers. All autosomes carry both satellites and small amounts of a sequence related to the M. musculus major satellite. The Y chromosome contains small amounts of both major satellite and the 60-bp satellite, whereas the X chromosome carries only major satellite sequences. M. caroli chromosomes segregate in M. caroli x M. musculus interspecific hybrid cell lines, indicating that the two sets of chromosomes can interact with the same mitotic spindle. Using a polyclonal CENP-B antiserum, we demonstrate that M. caroli centromeres can bind murine CENP-B in such an interspecific cell line, despite the absence of canonical 17-bp CENP-B binding sites in the M. caroli genome. Sequence analysis of the 79-bp M. caroli satellite reveals a 17-bp motif that contains all nine bases previously shown to be necessary for in vitro binding of CENP-B. This M. caroli motif binds CENP-B from HeLa cell nuclear extract in vitro, as indicated by gel mobility shift analysis. We therefore suggest that this motif also causes CENP-B to associate with M. caroli centromeres in vivo. Despite the sequence differences, M. caroli presents a third, novel mammalian centromeric sequence producing an array of binding sites for CENP-B. PMID:7623797

  9. Minimized virus binding for tests of barrier materials.

    PubMed Central

    Lytle, C D; Routson, L B

    1995-01-01

    Viruses are used to test the barrier properties of materials. Binding of virus particles during passage through holes in the material may yield misleading test results. The choices of challenge virus and suspending medium may be important for minimizing confounding effects that might arise from such binding. In this study, different surrogate viruses, as well as different support media, were evaluated to determine optimal test parameters. Two membranes with high-binding properties (nitrocellulose and cationic polysulfone) were used as filters to compare binding activities of different surrogate challenge viruses (MS2, phi X174, T7, PRD1, and phi 6) in different media. The media consisted of buffered saline with surfactants, serum, or culture broth as additives. In addition, elution rates of viruses that bound to the membranes were determined. The results suggest that viruses can bind by hydrophobic and electrostatic interactions, with phi X174 displaying the lowest level of binding by either process. The nonionic detergents Triton X-100 and Tween 80 (0.1%) equally minimized hydrophobic interactions. Neither anionic nor cationic surfactants were as effective at nontoxic levels. Serum was effective at reducing both hydrophobic and electrostatic binding, with 2% being sufficient for eliminating binding under our test conditions. Thus, phi X174 remains the best choice as a surrogate virus to test barrier materials, and Triton X-100 (0.1%) remains a good choice for reducing hydrophobic binding. In addition, binding of viruses by barrier materials is unlikely to prevent passage of blood-borne pathogens. PMID:7574603

  10. A statistical model for investigating binding probabilities of DNA nucleotide sequences using microarrays.

    PubMed

    Lee, Mei-Ling Ting; Bulyk, Martha L; Whitmore, G A; Church, George M

    2002-12-01

    There is considerable scientific interest in knowing the probability that a site-specific transcription factor will bind to a given DNA sequence. Microarray methods provide an effective means for assessing the binding affinities of a large number of DNA sequences as demonstrated by Bulyk et al. (2001, Proceedings of the National Academy of Sciences, USA 98, 7158-7163) in their study of the DNA-binding specificities of Zif268 zinc fingers using microarray technology. In a follow-up investigation, Bulyk, Johnson, and Church (2002, Nucleic Acids Research 30, 1255-1261) studied the interdependence of nucleotides on the binding affinities of transcription proteins. Our article is motivated by this pair of studies. We present a general statistical methodology for analyzing microarray intensity measurements reflecting DNA-protein interactions. The log probability of a protein binding to a DNA sequence on an array is modeled using a linear ANOVA model. This model is convenient because it employs familiar statistical concepts and procedures and also because it is effective for investigating the probability structure of the binding mechanism.
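
    The abstract does not give the model's terms, so the following is only an illustrative sketch of what a linear ANOVA model for the log binding probability could look like. The symbols (\mu, \alpha, \gamma, \varepsilon, site length L) are assumptions introduced here, not the authors' notation: \alpha terms are position-specific nucleotide main effects, and the optional \gamma terms are pairwise interactions that would capture interdependence between positions.

```latex
% Illustrative additive (main-effects) form for a site s = s_1 s_2 \dots s_L
\log p(s) = \mu + \sum_{j=1}^{L} \alpha_{j,\,s_j} + \varepsilon

% Hypothetical extension with pairwise interaction terms between positions j < k
\log p(s) = \mu + \sum_{j=1}^{L} \alpha_{j,\,s_j}
          + \sum_{j<k} \gamma_{jk,\,s_j s_k} + \varepsilon
```

    Under the first form, nucleotides contribute independently to binding; significant \gamma terms in the second form would indicate that they do not.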

  11. Sequestration of cAMP response element-binding proteins by transcription factor decoys causes collateral elaboration of regenerating Aplysia motor neuron axons.

    PubMed

    Dash, P K; Tian, L M; Moore, A N

    1998-07-07

    Axonal injury increases intracellular Ca2+ and cAMP and has been shown to induce gene expression, which is thought to be a key event for regeneration. Increases in intracellular Ca2+ and/or cAMP can alter gene expression via activation of a family of transcription factors that bind to and modulate the expression of CRE (Ca2+/cAMP response element) sequence-containing genes. We have used Aplysia motor neurons to examine the role of CRE-binding proteins in axonal regeneration after injury. We report that axonal injury increases the binding of proteins to a CRE sequence-containing probe. In addition, Western blot analysis revealed that the level of ApCREB2, a CRE sequence-binding repressor, was enhanced as a result of axonal injury. The sequestration of CRE-binding proteins by microinjection of CRE sequence-containing plasmids enhanced axon collateral formation (both number and length) as compared with control plasmid injections. These findings show that Ca2+/cAMP-mediated gene expression via CRE-binding transcription factors participates in the regeneration of motor neuron axons.

  12. DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.

    PubMed

    Ma, Wenxiu; Yang, Lin; Rohs, Remo; Noble, William Stafford

    2017-10-01

    Transcription factors (TFs) bind to specific DNA sequence motifs. Several lines of evidence suggest that TF-DNA binding is mediated in part by properties of the local DNA shape: the width of the minor groove, the relative orientations of adjacent base pairs, etc. Several methods have been developed to jointly account for DNA sequence and shape properties in predicting TF binding affinity. However, a limitation of these methods is that they typically require a training set of aligned TF binding sites. We describe a sequence + shape kernel that leverages DNA sequence and shape information to better understand protein-DNA binding preference and affinity. This kernel extends an existing class of k-mer based sequence kernels, based on the recently described di-mismatch kernel. Using three in vitro benchmark datasets, derived from universal protein binding microarrays (uPBMs), genomic context PBMs (gcPBMs) and SELEX-seq data, we demonstrate that incorporating DNA shape information improves our ability to predict protein-DNA binding affinity. In particular, we observe that (i) the k-spectrum + shape model performs better than the classical k-spectrum kernel, particularly for small k values; (ii) the di-mismatch kernel performs better than the k-mer kernel, for larger k; and (iii) the di-mismatch + shape kernel performs better than the di-mismatch kernel for intermediate k values. The software is available at https://bitbucket.org/wenxiu/sequence-shape.git. Contact: rohs@usc.edu or william-noble@uw.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
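
    To make the baseline concrete, here is a minimal sketch of the classical k-spectrum kernel that the abstract compares against: the similarity of two sequences is the inner product of their k-mer count vectors, so no alignment is needed. The function names are hypothetical, and the shape features and di-mismatch extension described by the authors are not reproduced here.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(seq_a, seq_b, k):
    """k-spectrum kernel: inner product of the two k-mer count vectors."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for k-mers that seq_b does not contain.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


if __name__ == "__main__":
    print(spectrum_kernel("ACGT", "ACGT", 2))
    print(spectrum_kernel("ACGT", "TTTT", 2))
```

    Because the kernel depends only on k-mer content, sequences of different lengths can be compared directly, which is the alignment-free property the abstract emphasizes.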

  13. Saccharomyces cerevisiae SSB1 protein and its relationship to nucleolar RNA-binding proteins.

    PubMed

    Jong, A Y; Clark, M W; Gilbert, M; Oehm, A; Campbell, J L

    1987-08-01

    To better define the function of Saccharomyces cerevisiae SSB1, an abundant single-stranded nucleic acid-binding protein, we determined the nucleotide sequence of the SSB1 gene and compared it with those of other proteins of known function. The amino acid sequence contains 293 amino acid residues and has an Mr of 32,853. There are several stretches of sequence characteristic of other eucaryotic single-stranded nucleic acid-binding proteins. At the amino terminus, residues 39 to 54 are highly homologous to a peptide in calf thymus UP1 and UP2 and a human heterogeneous nuclear ribonucleoprotein. Residues 125 to 162 constitute a fivefold tandem repeat of the sequence RGGFRG, the composition of which suggests a nucleic acid-binding site. Near the C terminus, residues 233 to 245 are homologous to several RNA-binding proteins. Of 18 C-terminal residues, 10 are acidic, a characteristic of the procaryotic single-stranded DNA-binding proteins and eucaryotic DNA- and RNA-binding proteins. In addition, examination of the subcellular distribution of SSB1 by immunofluorescence microscopy indicated that SSB1 is a nuclear protein, predominantly located in the nucleolus. Sequence homologies and the nucleolar localization make it likely that SSB1 functions in RNA metabolism in vivo, although an additional role in DNA metabolism cannot be excluded.

  14. Specific DNA binding of the two chicken Deformed family homeodomain proteins, Chox-1.4 and Chox-a.

    PubMed Central

    Sasaki, H; Yokoyama, E; Kuroiwa, A

    1990-01-01

    The cDNA clones encoding two chicken Deformed (Dfd) family homeobox-containing genes, Chox-1.4 and Chox-a, were isolated. Comparison of their amino acid sequences with another chicken Dfd family homeodomain protein and with those of mouse homologues revealed that strong homologies are located in the amino terminal regions and around the homeodomains. Although homologies in other regions were relatively low, some short conserved sequences were also identified. Full-length proteins produced in E. coli were purified and used for the production of specific antibodies and for DNA binding studies. The binding profiles of these proteins to the 5'-leader and 5'-upstream sequences of Chox-1.4 and Chox-a coding regions were analyzed by immunoprecipitation and DNase I footprint assays. These two Chox proteins bound to the same sites in the 5'-flanking sequences of their coding regions with various affinities and their binding affinities to each site were nearly the same. The consensus sequences of the high and low affinity binding sites were TAATGA(C/G) and CTAATTTT, respectively. A clustered binding site was identified in the 5'-upstream of the Chox-a gene, suggesting that this clustered binding site works as a cis-regulatory element for auto- and/or cross-regulation of Chox-a gene expression. PMID:1970866
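
    As a small illustration of how the two consensus sequences quoted above could be located in a DNA sequence, the sketch below scans the forward strand with regular expressions. The labels, the function name, and the example sequence are hypothetical. A real analysis would also scan the reverse complement and would not treat a consensus match as proof of binding.

```python
import re

# Consensus sites quoted in the abstract; (C/G) becomes the character class [CG].
MOTIFS = {
    "high_affinity": re.compile(r"TAATGA[CG]"),
    "low_affinity": re.compile(r"CTAATTTT"),
}


def find_sites(seq):
    """Return sorted (start, label, matched_text) tuples for the forward strand."""
    seq = seq.upper()
    hits = []
    for label, pattern in MOTIFS.items():
        for match in pattern.finditer(seq):
            hits.append((match.start(), label, match.group()))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence containing one copy of each consensus site.
    for hit in find_sites("ggTAATGACttCTAATTTTaa"):
        print(hit)
```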

  15. Analysis of drug binding pockets and repurposing opportunities for twelve essential enzymes of ESKAPE pathogens

    PubMed Central

    Naz, Sadia; Ngo, Tony; Farooq, Umar

    2017-01-01

    Background: The rapid increase in antibiotic resistance by various bacterial pathogens underlies the significance of developing new therapies and exploring different drug targets. A fraction of bacterial pathogens abbreviated as ESKAPE by the European Center for Disease Prevention and Control have been considered a major threat due to the rise in nosocomial infections. Here, we compared putative drug binding pockets of twelve essential and mostly conserved metabolic enzymes in numerous bacterial pathogens including those of the ESKAPE group and Mycobacterium tuberculosis. The comparative analysis will provide guidelines for the likelihood of transferability of the inhibitors from one species to another. Methods: Nine bacterial species, including six ESKAPE pathogens, Mycobacterium tuberculosis, and two non-pathogenic bacteria, Mycobacterium smegmatis and Escherichia coli, were selected for drug binding pocket analysis of twelve essential enzymes. The amino acid sequences were obtained from Uniprot, aligned using ICM v3.8-4a and matched against the Pocketome encyclopedia. We used known co-crystal structures of selected target enzyme orthologs to evaluate the location of their active sites and binding pockets and to calculate a matrix of pairwise sequence identities across each target enzyme across the different species. This was used to generate sequence maps. Results: High sequence identity of enzyme binding pockets, derived from experimentally determined co-crystallized structures, was observed among various species. Comparison at both full sequence level and for drug binding pockets of key metabolic enzymes showed that binding pockets are highly conserved (sequence similarity up to 100%) among various ESKAPE pathogens as well as Mycobacterium tuberculosis. Enzyme orthologs having conserved binding sites may have potential to interact with inhibitors in a similar way and might be helpful for design of a similar class of inhibitors for a particular species. The derived pocket alignments and distance-based maps provide guidelines for drug discovery and repurposing. In addition they also provide recommendations for the relevant model bacteria that may be used for initial drug testing. Discussion: Comparing ligand binding sites through sequence identity calculation could be an effective approach to identify conserved orthologs as drug binding pockets have shown a higher level of conservation among various species. By using this approach we could avoid the problems associated with full sequence comparison. We identified essential metabolic enzymes among ESKAPE pathogens that share high sequence identity in their putative drug binding pockets (up to 100%), of which known inhibitors can potentially antagonize these identical pockets in the various species in a similar manner. PMID:28948099
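
    The "matrix of pairwise sequence identities" mentioned in the methods can be sketched as follows, assuming the pocket residues have already been aligned to equal length. This is not the ICM/Pocketome workflow used by the authors. The function names, the gap convention (columns that are gaps in both sequences are skipped), and the example residues are all assumptions made for illustration.

```python
def percent_identity(a, b):
    """Percent of aligned columns with identical residues; gap-gap columns are skipped."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def identity_matrix(aligned):
    """Map every ordered pair of names to the percent identity of their sequences."""
    names = list(aligned)
    return {(p, q): percent_identity(aligned[p], aligned[q])
            for p in names for q in names}


if __name__ == "__main__":
    # Made-up aligned pocket residues for three hypothetical species.
    pockets = {"species_A": "GSTKHD", "species_B": "GSTKHD", "species_C": "GATKHE"}
    for pair, value in identity_matrix(pockets).items():
        print(pair, round(value, 1))
```

    Restricting the comparison to pocket residues rather than full-length sequences is the design choice the abstract argues for: identity over a few residues that contact the ligand can be high even when overall identity is modest.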

  16. Analysis of drug binding pockets and repurposing opportunities for twelve essential enzymes of ESKAPE pathogens.

    PubMed

    Naz, Sadia; Ngo, Tony; Farooq, Umar; Abagyan, Ruben

    2017-01-01

    The rapid increase in antibiotic resistance by various bacterial pathogens underlies the significance of developing new therapies and exploring different drug targets. A fraction of bacterial pathogens abbreviated as ESKAPE by the European Center for Disease Prevention and Control have been considered a major threat due to the rise in nosocomial infections. Here, we compared putative drug binding pockets of twelve essential and mostly conserved metabolic enzymes in numerous bacterial pathogens including those of the ESKAPE group and Mycobacterium tuberculosis. The comparative analysis will provide guidelines for the likelihood of transferability of the inhibitors from one species to another. Nine bacterial species, including six ESKAPE pathogens, Mycobacterium tuberculosis, and two non-pathogenic bacteria, Mycobacterium smegmatis and Escherichia coli, were selected for drug binding pocket analysis of twelve essential enzymes. The amino acid sequences were obtained from Uniprot, aligned using ICM v3.8-4a and matched against the Pocketome encyclopedia. We used known co-crystal structures of selected target enzyme orthologs to evaluate the location of their active sites and binding pockets and to calculate a matrix of pairwise sequence identities across each target enzyme across the different species. This was used to generate sequence maps. High sequence identity of enzyme binding pockets, derived from experimentally determined co-crystallized structures, was observed among various species. Comparison at both full sequence level and for drug binding pockets of key metabolic enzymes showed that binding pockets are highly conserved (sequence similarity up to 100%) among various ESKAPE pathogens as well as Mycobacterium tuberculosis. Enzyme orthologs having conserved binding sites may have potential to interact with inhibitors in a similar way and might be helpful for design of a similar class of inhibitors for a particular species. The derived pocket alignments and distance-based maps provide guidelines for drug discovery and repurposing. In addition they also provide recommendations for the relevant model bacteria that may be used for initial drug testing. Comparing ligand binding sites through sequence identity calculation could be an effective approach to identify conserved orthologs as drug binding pockets have shown a higher level of conservation among various species. By using this approach we could avoid the problems associated with full sequence comparison. We identified essential metabolic enzymes among ESKAPE pathogens that share high sequence identity in their putative drug binding pockets (up to 100%), of which known inhibitors can potentially antagonize these identical pockets in the various species in a similar manner.

  17. Patterns and plasticity in RNA-protein interactions enable recruitment of multiple proteins through a single site

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Valley, Cary T.; Porter, Douglas F.; Qiu, Chen

    2012-06-28

    mRNA control hinges on the specificity and affinity of proteins for their RNA binding sites. Regulatory proteins must bind their own sites and reject even closely related noncognate sites. In the PUF [Pumilio and fem-3 binding factor (FBF)] family of RNA binding proteins, individual proteins discriminate differences in the length and sequence of binding sites, allowing each PUF to bind a distinct battery of mRNAs. Here, we show that despite these differences, the pattern of RNA interactions is conserved among PUF proteins: the two ends of the PUF protein make critical contacts with the two ends of the RNA sites. Despite this conserved 'two-handed' pattern of recognition, the RNA sequence is flexible. Among the binding sites of yeast Puf4p, RNA sequence dictates the pattern in which RNA bases are flipped away from the binding surface of the protein. Small differences in RNA sequence allow new modes of control, recruiting Puf5p in addition to Puf4p to a single site. This embedded information adds a new layer of biological meaning to the connections between RNA targets and PUF proteins.

  18. De novo truncating variants in the AHDC1 gene encoding the AT-hook DNA-binding motif-containing protein 1 are associated with intellectual disability and developmental delay.

    PubMed

    Yang, Hui; Douglas, Ganka; Monaghan, Kristin G; Retterer, Kyle; Cho, Megan T; Escobar, Luis F; Tucker, Megan E; Stoler, Joan; Rodan, Lance H; Stein, Diane; Marks, Warren; Enns, Gregory M; Platt, Julia; Cox, Rachel; Wheeler, Patricia G; Crain, Carrie; Calhoun, Amy; Tryon, Rebecca; Richard, Gabriele; Vitazka, Patrik; Chung, Wendy K

    2015-10-01

    Whole-exome sequencing (WES) represents a significant breakthrough in clinical genetics, and identifies a genetic etiology in up to 30% of cases of intellectual disability (ID). Using WES, we identified seven unrelated patients with a similar clinical phenotype of severe intellectual disability or neurodevelopmental delay who were all heterozygous for de novo truncating variants in the AT-hook DNA-binding motif-containing protein 1 (AHDC1). The patients were all minimally verbal or nonverbal and had variable neurological problems including spastic quadriplegia, ataxia, nystagmus, seizures, autism, and self-injurious behaviors. Additional common clinical features include dysmorphic facial features and feeding difficulties associated with failure to thrive and short stature. The AHDC1 gene has only one coding exon, and the protein contains conserved regions including AT-hook motifs and a PDZ binding domain. We postulate that all seven variants detected in these patients result in a truncated protein missing critical functional domains, disrupting interactions with other proteins important for brain development. Our study demonstrates that truncating variants in AHDC1 are associated with ID and are primarily associated with a neurodevelopmental phenotype.

  19. G-quadruplex formation in telomeres enhances POT1/TPP1 protection against RPA binding

    PubMed Central

    Ray, Sujay; Bandaria, Jigar N.; Qureshi, Mohammad H.; Yildiz, Ahmet; Balci, Hamza

    2014-01-01

    Human telomeres terminate with a single-stranded 3′ G overhang, which can be recognized as a DNA damage site by replication protein A (RPA). The protection of telomeres (POT1)/POT1-interacting protein 1 (TPP1) heterodimer binds specifically to single-stranded telomeric DNA (ssTEL) and protects G overhangs against RPA binding. The G overhang spontaneously folds into various G-quadruplex (GQ) conformations. It remains unclear whether GQ formation affects the ability of POT1/TPP1 to compete against RPA to access ssTEL. Using single-molecule Förster resonance energy transfer, we showed that POT1 stably loads to a minimal DNA sequence adjacent to a folded GQ. At 150 mM K+, POT1 loading unfolds the antiparallel GQ, as the parallel conformation remains folded. POT1/TPP1 loading blocks RPA’s access to both folded and unfolded telomeres by two orders of magnitude. This protection is not observed at 150 mM Na+, in which ssTEL forms only a less-stable antiparallel GQ. These results suggest that GQ formation of telomeric overhangs may contribute to suppression of DNA damage signals. PMID:24516170

  20. Combining phage display with de novo protein sequencing for reverse engineering of monoclonal antibodies.

    PubMed

    Rickert, Keith W; Grinberg, Luba; Woods, Robert M; Wilson, Susan; Bowen, Michael A; Baca, Manuel

    2016-01-01

    The enormous diversity created by gene recombination and somatic hypermutation makes de novo protein sequencing of monoclonal antibodies a uniquely challenging problem. Modern mass spectrometry-based sequencing will rarely, if ever, provide a single unambiguous sequence for the variable domains. A more likely outcome is computation of an ensemble of highly similar sequences that can satisfy the experimental data. This outcome can result in the need for empirical testing of many candidate sequences, sometimes iteratively, to identify one that can replicate the activity of the parental antibody. Here we describe an improved approach to antibody protein sequencing by using phage display technology to generate a combinatorial library of sequences that satisfy the mass spectrometry data, and selecting for functional candidates that bind antigen. This approach was used to reverse engineer 2 commercially-obtained monoclonal antibodies against murine CD137. Proteomic data enabled us to assign the majority of the variable domain sequences, with the exception of 3-5% of the sequence located within or adjacent to complementarity-determining regions. To efficiently resolve the sequence in these regions, small phage-displayed libraries were generated and subjected to antigen binding selection. Following enrichment of antigen-binding clones, 2 clones were selected for each antibody and recombinantly expressed as antigen-binding fragments (Fabs). In both cases, the reverse-engineered Fabs exhibited identical antigen binding affinity, within error, as Fabs produced from the commercial IgGs. This combination of proteomic and protein engineering techniques provides a useful approach to simplifying the technically challenging process of reverse engineering monoclonal antibodies from protein material.

  1. Combining phage display with de novo protein sequencing for reverse engineering of monoclonal antibodies

    PubMed Central

    Rickert, Keith W.; Grinberg, Luba; Woods, Robert M.; Wilson, Susan; Bowen, Michael A.; Baca, Manuel

    2016-01-01

    The enormous diversity created by gene recombination and somatic hypermutation makes de novo protein sequencing of monoclonal antibodies a uniquely challenging problem. Modern mass spectrometry-based sequencing will rarely, if ever, provide a single unambiguous sequence for the variable domains. A more likely outcome is computation of an ensemble of highly similar sequences that can satisfy the experimental data. This outcome can result in the need for empirical testing of many candidate sequences, sometimes iteratively, to identify one that can replicate the activity of the parental antibody. Here we describe an improved approach to antibody protein sequencing by using phage display technology to generate a combinatorial library of sequences that satisfy the mass spectrometry data, and selecting for functional candidates that bind antigen. This approach was used to reverse engineer 2 commercially-obtained monoclonal antibodies against murine CD137. Proteomic data enabled us to assign the majority of the variable domain sequences, with the exception of 3–5% of the sequence located within or adjacent to complementarity-determining regions. To efficiently resolve the sequence in these regions, small phage-displayed libraries were generated and subjected to antigen binding selection. Following enrichment of antigen-binding clones, 2 clones were selected for each antibody and recombinantly expressed as antigen-binding fragments (Fabs). In both cases, the reverse-engineered Fabs exhibited identical antigen binding affinity, within error, as Fabs produced from the commercial IgGs. This combination of proteomic and protein engineering techniques provides a useful approach to simplifying the technically challenging process of reverse engineering monoclonal antibodies from protein material. PMID:26852694

  2. Protein design on computers. Five new proteins: Shpilka, Grendel, Fingerclasp, Leather, and Aida.

    PubMed

    Sander, C; Vriend, G; Bazan, F; Horovitz, A; Nakamura, H; Ribas, L; Finkelstein, A V; Lockhart, A; Merkl, R; Perry, L J

    1992-02-01

    What is the current state of the art in protein design? This question was approached in a recent two-week protein design workshop sponsored by EMBO and held at the EMBL in Heidelberg. The goals were to test available design tools and to explore new design strategies. Five novel proteins were designed: Shpilka, a sandwich of two four-stranded beta-sheets, a scaffold on which to explore variations in loop topology; Grendel, a four-helical membrane anchor, ready for fusion to water-soluble functional domains; Finger-clasp, a dimer of interdigitating beta-beta-alpha units, the simplest variant of the "handshake" structural class; Aida, an antibody binding surface intended to be specific for flavodoxin; Leather--a minimal NAD binding domain, extracted from a larger protein. Each design is available as a set of three-dimensional coordinates, the corresponding amino acid sequence and a set of analytical results. The designs are placed in the public domain for scrutiny, improvement, and possible experimental verification.

  3. MutaBind estimates and interprets the effects of sequence variants on protein-protein interactions.

    PubMed

    Li, Minghui; Simonetti, Franco L; Goncearenco, Alexander; Panchenko, Anna R

    2016-07-08

    Proteins engage in highly selective interactions with their macromolecular partners. Sequence variants that alter protein binding affinity may cause significant perturbations or complete abolishment of function, potentially leading to diseases. There exists a persistent need to develop a mechanistic understanding of impacts of variants on proteins. To address this need we introduce a new computational method MutaBind to evaluate the effects of sequence variants and disease mutations on protein interactions and calculate the quantitative changes in binding affinity. The MutaBind method uses molecular mechanics force fields, statistical potentials and fast side-chain optimization algorithms. The MutaBind server maps mutations on a structural protein complex, calculates the associated changes in binding affinity, determines the deleterious effect of a mutation, estimates the confidence of this prediction and produces a mutant structural model for download. MutaBind can be applied to a large number of problems, including determination of potential driver mutations in cancer and other diseases, elucidation of the effects of sequence variants on protein fitness in evolution and protein design. MutaBind is available at http://www.ncbi.nlm.nih.gov/projects/mutabind/. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  4. HMG-D is an architecture-specific protein that preferentially binds to DNA containing the dinucleotide TG.

    PubMed Central

    Churchill, M E; Jones, D N; Glaser, T; Hefner, H; Searles, M A; Travers, A A

    1995-01-01

    The high mobility group (HMG) protein HMG-D from Drosophila melanogaster is a highly abundant chromosomal protein that is closely related to the vertebrate HMG domain proteins HMG1 and HMG2. In general, chromosomal HMG domain proteins lack sequence specificity. However, using both NMR spectroscopy and standard biochemical techniques we show that binding of HMG-D to a single DNA site is sequence selective. The preferred duplex DNA binding site comprises at least 5 bp and contains the deformable dinucleotide TG embedded in A/T-rich sequences. The TG motif constitutes a common core element in the binding sites of the well-characterized sequence-specific HMG domain proteins. We show that a conserved aromatic residue in helix 1 of the HMG domain may be involved in recognition of this core sequence. In common with other HMG domain proteins HMG-D binds preferentially to DNA sites that are stably bent and underwound, therefore HMG-D can be considered an architecture-specific protein. Finally, we show that HMG-D bends DNA and may confer a superhelical DNA conformation at a natural DNA binding site in the Drosophila fushi tarazu scaffold-associated region. PMID:7720717

  5. Identification of high-specificity H-NS binding site in LEE5 promoter of enteropathogenic Escherichia coli (EPEC).

    PubMed

    Bhat, Abhay Prasad; Shin, Minsang; Choy, Hyon E

    2014-07-01

    Histone-like nucleoid structuring protein (H-NS) is a small but abundant protein present in enteric bacteria and is involved in compaction of the DNA and regulation of transcription. Recent reports have suggested that H-NS binds to a specific AT-rich DNA sequence rather than to intrinsically curved DNA in a sequence-independent manner. We detected two high-specificity H-NS binding sites in the LEE5 promoter of EPEC centered at -110 and -138, which were close to the proposed consensus H-NS binding motif. To identify the H-NS binding sequence in the LEE5 promoter, we took a random mutagenesis approach and found that mutations at around -138 were specifically defective in the regulation by H-NS. It was concluded that H-NS exerts maximum repression via the specific sequence at around -138 and subsequently contacts a subunit of RNAP through oligomerization.

  6. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

    PubMed Central

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-01-01

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363

  7. Low Pathogenic Avian Influenza Isolates from Wild Birds Replicate and Transmit via Contact in Ferrets without Prior Adaptation

    PubMed Central

    Humberd-Smith, Jennifer; Gordy, James T.; Bradley, Konrad C.; Steinhauer, David A.; Berghaus, Roy D.; Stallknecht, David E.; Howerth, Elizabeth W.; Tompkins, Stephen Mark

    2012-01-01

    Direct transmission of avian influenza viruses to mammals has become an increasingly investigated topic during the past decade; however, isolates that have been primarily investigated are typically ones originating from human or poultry outbreaks. Currently there is minimal comparative information on the behavior of the innumerable viruses that exist in the natural wild bird host. We have previously demonstrated the capacity of numerous North American avian influenza viruses isolated from wild birds to infect and induce lesions in the respiratory tract of mice. In this study, two isolates from shorebirds that were previously examined in mice (H1N9 and H6N1 subtypes) are further examined through experimental inoculations in the ferret with analysis of viral shedding, histopathology, and antigen localization via immunohistochemistry to elucidate pathogenicity and transmission of these viruses. Using sequence analysis and glycan binding analysis, we show that these avian viruses have the typical avian influenza binding pattern, with affinity for cell glycoproteins/glycolipids having terminal sialic acid (SA) residues with α 2,3 linkage [Neu5Ac(α2,3)Gal]. Despite the lack of α2,6 linked SA binding, these AIVs productively infected both the upper and lower respiratory tract of ferrets, resulting in nasal viral shedding and pulmonary lesions with minimal morbidity. Moreover, we show that one of the viruses is able to transmit to ferrets via direct contact, despite its binding affinity for α 2,3 linked SA residues. These results demonstrate that avian influenza viruses, which are endemic in aquatic birds, can potentially infect humans and other mammals without adaptation. Finally this work highlights the need for additional study of the wild bird subset of influenza viruses in regard to surveillance, transmission, and potential for reassortment, as they have zoonotic potential. PMID:22675507

  8. Improving the performance of minimizers and winnowing schemes

    PubMed Central

    Marçais, Guillaume; Pellow, David; Bork, Daniel; Orenstein, Yaron; Shamir, Ron; Kingsford, Carl

    2017-01-01

    Motivation: The minimizers scheme is a method for selecting k-mers from sequences. It is used in many bioinformatics software tools to bin comparable sequences or to sample a sequence in a deterministic fashion at approximately regular intervals, in order to reduce memory consumption and processing time. Although very useful, the minimizers selection procedure has undesirable behaviors (e.g. too many k-mers are selected when processing certain sequences). Some of these problems were already known to the authors of the minimizers technique, and the natural lexicographic ordering of k-mers used by minimizers was recognized as their origin. Many software tools using minimizers employ ad hoc variations of the lexicographic order to alleviate those issues. Results: We provide an in-depth analysis of the effect of k-mer ordering on the performance of the minimizers technique. By using small universal hitting sets (a recently defined concept), we show how to significantly improve the performance of minimizers and avoid some of its worst behaviors. Based on these results, we encourage bioinformatics software developers to use an ordering based on a universal hitting set or, if not possible, a randomized ordering, rather than the lexicographic order. This analysis also settles negatively a conjecture (by Schleimer et al.) on the expected density of minimizers in a random sequence. Availability and Implementation: The software used for this analysis is available on GitHub: https://github.com/gmarcais/minimizers.git. Contact: gmarcais@cs.cmu.edu or carlk@cs.cmu.edu PMID:28881970
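
    The selection procedure described above can be sketched in a few lines: for every window of w consecutive k-mers, keep the smallest k-mer under some ordering. This is a minimal illustration, not the authors' software. The function names are hypothetical, ties are broken toward the leftmost k-mer (one common convention), and the hash-based ordering merely stands in for the "randomized ordering" the abstract recommends. The universal-hitting-set ordering is not implemented.

```python
import hashlib


def minimizers(seq, k, w, order=None):
    """Return sorted (position, k-mer) pairs chosen by the (w, k) minimizers scheme.

    Each window of w consecutive k-mers contributes its smallest k-mer under
    `order`; ties go to the leftmost. order=None means lexicographic order.
    """
    key = order if order is not None else (lambda kmer: kmer)
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    chosen = set()
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        best = min(window, key=lambda i: (key(kmers[i]), i))
        chosen.add((best, kmers[best]))
    return sorted(chosen)


def hashed_order(kmer):
    """A pseudo-random ordering: compare k-mers by their SHA-256 digest."""
    return hashlib.sha256(kmer.encode()).hexdigest()


if __name__ == "__main__":
    print(minimizers("ACGTACGT", k=3, w=2))
    # A homopolymer run: with leftmost tie-breaking, every window picks a new position.
    print(len(minimizers("AAAAAAAA", k=3, w=3)))
    print(minimizers("ACGTTGCAACGT", k=3, w=4, order=hashed_order))
```

    Swapping the `order` argument is the only change needed to compare orderings, which mirrors the abstract's point that the choice of k-mer ordering, not the windowing itself, drives how many k-mers are selected.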

  9. Fibronectin tetrapeptide is target for syphilis spirochete cytadherence

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thomas, D.D.; Baseman, J.B.; Alderete, J.F.

    1985-11-01

    The syphilis bacterium, Treponema pallidum, parasitizes host cells through recognition of fibronectin (Fn) on cell surfaces. The active site of the Fn molecule has been identified as a four-amino acid sequence, arg-gly-asp-ser (RGDS), located on each monomer of the cell-binding domain. The synthetic heptapeptide gly-arg-gly-asp-ser-pro-cys (GRGDSPC), with the active site sequence RGDS, specifically competed with SVI-labeled cell-binding domain acquisition by T. pallidum. Additionally, the same heptapeptide with the RGDS sequence diminished treponemal attachment to HEp-2 and HT1080 cell monolayers. Related heptapeptides altered in one key amino acid within the RGDS sequence failed to inhibit Fn cell-binding domain acquisition or parasitism of host cells by T. pallidum. The data support the view that T. pallidum cytadherence of host cells is through recognition of the RGDS sequence also important for eukaryotic cell-Fn binding.

  10. Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation.

    PubMed

    Schneider, T D

    2001-12-01

    The sequence logo for DNA binding sites of the bacteriophage P1 replication protein RepA shows unusually high sequence conservation (approximately 2 bits) at a minor groove that faces RepA. However, B-form DNA can support only 1 bit of sequence conservation via contacts into the minor groove. The high conservation in RepA sites therefore implies a distorted DNA helix with direct or indirect contacts to the protein. Here I show that a high minor groove conservation signature also appears in sequence logos of sites for other replication origin binding proteins (Rts1, DnaA, P4 alpha, EBNA1, ORC) and promoter binding proteins (sigma(70), sigma(D) factors). This finding implies that DNA binding proteins generally use non-B-form DNA distortion such as base flipping to initiate replication and transcription.
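
    The "bits" of conservation mentioned in this abstract can be made concrete with a small sketch. It assumes the usual per-position measure for DNA, 2 minus the Shannon entropy of the column, and leaves out any small-sample correction. The function name `information_content` and the aligned sites are hypothetical toy data, not RepA sites.

```python
import math
from collections import Counter


def information_content(sites):
    """Per-position conservation in bits for aligned DNA sites.

    Computed as 2 - H, where H is the Shannon entropy (base 2) of the
    base frequencies in each column. No small-sample correction is applied.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    bits = []
    for column in zip(*sites):
        n = len(column)
        entropy = -sum((c / n) * math.log2(c / n)
                       for c in Counter(column).values())
        bits.append(2.0 - entropy)
    return bits


# Toy alignment: column 0 is invariant, column 1 uses two bases equally,
# column 2 uses all four bases equally.
toy_sites = ["AAA", "AAC", "AGG", "AGT"]
print(information_content(toy_sites))
```

    In this toy alignment the invariant column carries 2 bits, the column split evenly between two bases carries 1 bit, and the uniform column carries 0 bits.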

  11. Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

    2004-08-06

    Background: The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. Results: We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Conclusions: Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity.

  12. Proteolytic dissection of Zab, the Z-DNA-binding domain of human ADAR1

    NASA Technical Reports Server (NTRS)

    Schwartz, T.; Lowenhaupt, K.; Kim, Y. G.; Li, L.; Brown, B. A. 2nd; Herbert, A.; Rich, A.

    1999-01-01

    Zalpha is a peptide motif that binds to Z-DNA with high affinity. This motif binds to alternating dC-dG sequences stabilized in the Z-conformation by means of bromination or supercoiling, but not to B-DNA. Zalpha is part of the N-terminal region of double-stranded RNA adenosine deaminase (ADAR1), a candidate enzyme for nuclear pre-mRNA editing in mammals. Zalpha is conserved in ADAR1 from many species; in each case, there is a second similar motif, Zbeta, separated from Zalpha by a more divergent linker. To investigate the structure-function relationship of Zalpha, its domain structure was studied by limited proteolysis. Proteolytic profiles indicated that Zalpha is part of a domain, Zab, of 229 amino acids (residues 133-361 in human ADAR1). This domain contains both Zalpha and Zbeta as well as a tandem repeat of a 49-amino acid linker module. Prolonged proteolysis revealed a minimal core domain of 77 amino acids (positions 133-209), containing only Zalpha, which is sufficient to bind left-handed Z-DNA; however, the substrate binding is strikingly different from that of Zab. The second motif, Zbeta, retains its structural integrity only in the context of Zab and does not bind Z-DNA as a separate entity. These results suggest that Zalpha and Zbeta act as a single bipartite domain. In the presence of substrate DNA, Zab becomes more resistant to proteases, suggesting that it adopts a more rigid structure when bound to its substrate, possibly with conformational changes in parts of the protein.

  13. Specialized nucleoprotein structures at the origin of replication of bacteriophage lambda: localized unwinding of duplex DNA by a six-protein reaction.

    PubMed Central

    Dodson, M; Echols, H; Wickner, S; Alfano, C; Mensa-Wilmot, K; Gomes, B; LeBowitz, J; Roberts, J D; McMacken, R

    1986-01-01

    The O protein of bacteriophage lambda localizes the initiation of DNA replication to a unique site on the lambda genome, ori lambda. By means of electron microscopy, we infer that the binding of O to ori lambda initiates a series of protein addition and transfer reactions that culminate in localized unwinding of the origin DNA, generating a prepriming structure for the initiation of DNA replication. We can define three stages of this prepriming reaction, the first two of which we have characterized previously. First, dimeric O protein binds to multiple DNA binding sites and self-associates to form a nucleoprotein structure, the O-some. Second, lambda P and host DnaB proteins interact with the O-some to generate a larger complex that includes additional DNA from an A + T-rich region adjacent to the O binding sites. Third, the addition of the DnaJ, DnaK, and Ssb proteins and ATP results in an origin-specific unwinding reaction, probably catalyzed by the helicase activity of DnaB. The unwinding reaction is unidirectional, proceeding "rightward" from the origin. The minimal DNA sequence competent for unwinding consists of two O binding sites and the adjacent A + T-rich region to the right of the binding sites. We conclude that the lambda O protein localizes and initiates a six-protein sequential reaction responsible for but preceding the precise initiation of DNA replication. Specialized nucleoprotein structures similar to the O-some may be a general feature of DNA transactions requiring extraordinary precision in localization and control. PMID:3020552

  14. Live-cell monitoring of periodic gene expression in synchronous human cells identifies Forkhead genes involved in cell cycle control

    PubMed Central

    Grant, Gavin D.; Gamsby, Joshua; Martyanov, Viktor; Brooks, Lionel; George, Lacy K.; Mahoney, J. Matthew; Loros, Jennifer J.; Dunlap, Jay C.; Whitfield, Michael L.

    2012-01-01

    We developed a system to monitor periodic luciferase activity from cell cycle–regulated promoters in synchronous cells. Reporters were driven by a minimal human E2F1 promoter with peak expression in G1/S or a basal promoter with six Forkhead DNA-binding sites with peak expression at G2/M. After cell cycle synchronization, luciferase activity was measured in live cells at 10-min intervals across three to four synchronous cell cycles, allowing unprecedented resolution of cell cycle–regulated gene expression. We used this assay to screen Forkhead transcription factors for control of periodic gene expression. We confirmed a role for FOXM1 and identified two novel cell cycle regulators, FOXJ3 and FOXK1. Knockdown of FOXJ3 and FOXK1 eliminated cell cycle–dependent oscillations and resulted in decreased cell proliferation rates. Analysis of genes regulated by FOXJ3 and FOXK1 showed that FOXJ3 may regulate a network of zinc finger proteins and that FOXK1 binds to the promoter and regulates DHFR, TYMS, GSDMD, and the E2F binding partner TFDP1. Chromatin immunoprecipitation followed by high-throughput sequencing analysis identified 4329 genomic loci bound by FOXK1, 83% of which contained a FOXK1-binding motif. We verified that a subset of these loci are activated by wild-type FOXK1 but not by a FOXK1 (H355A) DNA-binding mutant. PMID:22740631

  15. Divalent Metal-Ion Complexes with Dipeptide Ligands Having Phe and His Side-Chain Anchors: Effects of Sequence, Metal Ion, and Anchor.

    PubMed

    Dunbar, Robert C; Berden, Giel; Martens, Jonathan K; Oomens, Jos

    2015-09-24

    Conformational preferences have been surveyed for divalent metal cation complexes with the dipeptide ligands AlaPhe, PheAla, GlyHis, and HisGly. Density functional theory results for a full set of complexes are presented, and previous experimental infrared spectra, supplemented by a number of newly recorded spectra obtained with infrared multiple photon dissociation spectroscopy, provide experimental verification of the preferred conformations in most cases. The overall structural features of these complexes are shown, and attention is given to comparisons involving peptide sequence, nature of the metal ion, and nature of the side-chain anchor. A regular progression is observed as a function of binding strength, whereby the weakly binding metal ions (Ba(2+) to Ca(2+)) transition from carboxylate zwitterion (ZW) binding to charge-solvated (CS) binding, while the stronger binding metal ions (Ca(2+) to Mg(2+) to Ni(2+)) transition from CS binding to metal-ion-backbone binding (Iminol) by direct metal-nitrogen bonds to the deprotonated amide nitrogens. Two new sequence-dependent reversals are found between ZW and CS binding modes, such that Ba(2+) and Ca(2+) prefer ZW binding in the GlyHis case but prefer CS binding in the HisGly case. The overall binding strength for a given metal ion is not strongly dependent on the sequence, but the histidine peptides are significantly more strongly bound (by 50-100 kJ mol(-1)) than the phenylalanine peptides.

  16. Immunogenic proteins of Brucella abortus to minimize cross reactions in brucellosis diagnosis.

    PubMed

    Ko, Kyung Yuk; Kim, Jong-Wan; Her, Moon; Kang, Sung-Il; Jung, Suk Chan; Cho, Dong Hee; Kim, Ji-Yeon

    2012-05-04

    To overcome the limitations of serological diagnosis, including false positive reactions caused by other pathogens, specific antigens for diagnosis of brucellosis other than LPS have been required. The present study was conducted to separate and identify immuno-dominant insoluble proteins of Brucella abortus against the antisera of cattle infected with B. abortus, or/and Yersinia enterocolitica, or the sera of non-infected cattle. After separating insoluble proteins of B. abortus by two dimensional electrophoresis (2-DE), their immuno-reactivity was determined by western blotting. A portion of the immunogenic spots against the positive antisera of B. abortus that have the potential for use as specific antigens were identified by MS/MS analysis. Overall, 18 immunogenic insoluble proteins of B. abortus 1119-3 showed immuno-reactivity against only the positive antisera of B. abortus, but failed to have immunogenicity toward both the positive sera of Y. enterocolitica and the negative sera of B. abortus. Identification of these proteins revealed the following: F0F1 ATP synthase subunit β, solute-binding family 5 protein, 28 kDa OMP, Leu/Ile/Val-binding family protein, Histidinol dehydrogenase, Hypothetical protein, Twin-arginine translocation pathway signal sequence domain-containing protein, Dihydroorotase, Serine protease family protein, β-hydroxyacyl-(acyl-carrier-protein) dehydratase FabA, Short-chain dehydrogenase-/reductase carbonic anhydrase, Ornithine carbamoyltransferase, Leucyl aminopeptidase, Cold shock DNA-binding domain-containing protein, Cu/Zn superoxide dismutase, and Methionine aminopeptidase. The 18 immunogenic proteins separated in the present study can be considered candidate antigens to minimize cross reaction in the diagnosis of brucellosis and useful sources for Brucella vaccine development. Copyright © 2011 Elsevier B.V. All rights reserved.

  17. Molecular cloning and analysis of Schizosaccharomyces pombe Reb1p: sequence-specific recognition of two sites in the far upstream rDNA intergenic spacer.

    PubMed Central

    Zhao, A; Guo, A; Liu, Z; Pape, L

    1997-01-01

    The coding sequences for a Schizosaccharomyces pombe sequence-specific DNA binding protein, Reb1p, have been cloned. The predicted S. pombe Reb1p is 24-29% identical to mouse TTF-1 (transcription termination factor-1) and Saccharomyces cerevisiae REB1 protein, both of which direct termination of RNA polymerase I catalyzed transcripts. The S. pombe Reb1 cDNA encodes a predicted polypeptide of 504 amino acids with a predicted molecular weight of 58.4 kDa. The S. pombe Reb1p is unusual in that the bipartite DNA binding motif identified originally in S. cerevisiae and Kluyveromyces lactis REB1 proteins is uninterrupted and thus S. pombe Reb1p may contain the smallest natural REB1 homologous DNA binding domain. Its genomic coding sequences were shown to be interrupted by two introns. A recombinant histidine-tagged Reb1 protein bearing the rDNA binding domain has two homologous, sequence-specific binding sites in the S. pombe rDNA intergenic spacer, located between 289 and 480 nt downstream of the end of the approximately 25S rRNA coding sequences. Each binding site is 13-14 bp downstream of two of the three proposed in vivo termination sites. The core of this 17 bp site, AGGTAAGGGTAATGCAC, is specifically protected by Reb1p in footprinting analysis. PMID:9016645

  18. Characterization of protein--DNA interactions using surface plasmon resonance spectroscopy with various assay schemes.

    PubMed

    Teh, Huey Fang; Peh, Wendy Y X; Su, Xiaodi; Thomsen, Jane S

    2007-02-27

    Specific protein-DNA interactions play a central role in transcription and other biological processes. A comprehensive characterization of protein-DNA interactions should include information about binding affinity, kinetics, sequence specificity, and binding stoichiometry. In this study, we have used surface plasmon resonance spectroscopy (SPR) to study the interactions between human estrogen receptors (ER, alpha and beta subtypes) and estrogen response elements (ERE), with four assay schemes. First, we determined the sequence-dependent receptors' binding capacity by monitoring the binding of ER to various ERE sequences immobilized on a sensor surface (assay format denoted as the direct assay). Second, we screened the relative affinity of ER for various ERE sequences using a competition assay, in which the receptors bind to an ERE-immobilized surface in the presence of competitor ERE sequences. Third, we monitored the assembly of ER-ERE complexes on an SPR surface and thereafter the removal and/or dissociation of the ER (assay scheme denoted as the dissociation assay) to determine the binding stoichiometry. Last, a sandwich assay (ER binding to ERE followed by anti-ER recognition of a specific ER subtype) was performed in an effort to understand how ERalpha and ERbeta may associate and compete when binding to the DNA. With these assay schemes, we reaffirmed that (1) ERalpha is more sensitive than ERbeta to base pair change(s) in the consensus ERE, (2) ERalpha and ERbeta form a heterodimer when they bind to the consensus ERE, and (3) the binding stoichiometry of both ERalpha- and ERbeta-ERE complexes is dependent on salt concentration. With this study, we demonstrate the versatility of the SPR analysis. With the involvement of various assay arrangements, the SPR analysis can be further extended to more than kinetics and affinity study.

  19. TIA-1 RRM23 binding and recognition of target oligonucleotides

    PubMed Central

    Waris, Saboora; García-Mauriño, Sofía M.; Sivakumaran, Andrew; Beckham, Simone A.; Loughlin, Fionna E.; Gorospe, Myriam; Díaz-Moreno, Irene; Wilce, Matthew C.J.

    2017-01-01

    TIA-1 (T-cell restricted intracellular antigen-1) is an RNA-binding protein involved in splicing and translational repression. It mainly interacts with RNA via its second and third RNA recognition motifs (RRMs), with specificity for U-rich sequences directed by RRM2. It has recently been shown that RRM3 also contributes to binding, with preferential binding for C-rich sequences. Here we designed UC-rich and CU-rich 10-nt sequences for engagement of both RRM2 and RRM3 and demonstrated that the TIA-1 RRM23 construct preferentially binds the UC-rich RNA ligand (5΄-UUUUUACUCC-3΄). Interestingly, this binding depends on the presence of Lys274 that is C-terminal to RRM3 and binding to equivalent DNA sequences occurs with similar affinity. Small-angle X-ray scattering was used to demonstrate that, upon complex formation with target RNA or DNA, TIA-1 RRM23 adopts a compact structure, showing that both RRMs engage with the target 10-nt sequences to form the complex. We also report the crystal structure of TIA-1 RRM2 in complex with DNA to 2.3 Å resolution providing the first atomic resolution structure of any TIA protein RRM in complex with oligonucleotide. Together our data support a specific mode of TIA-1 RRM23 interaction with target oligonucleotides consistent with the role of TIA-1 in binding RNA to regulate gene expression. PMID:28184449

  20. TIA-1 RRM23 binding and recognition of target oligonucleotides.

    PubMed

    Waris, Saboora; García-Mauriño, Sofía M; Sivakumaran, Andrew; Beckham, Simone A; Loughlin, Fionna E; Gorospe, Myriam; Díaz-Moreno, Irene; Wilce, Matthew C J; Wilce, Jacqueline A

    2017-05-05

    TIA-1 (T-cell restricted intracellular antigen-1) is an RNA-binding protein involved in splicing and translational repression. It mainly interacts with RNA via its second and third RNA recognition motifs (RRMs), with specificity for U-rich sequences directed by RRM2. It has recently been shown that RRM3 also contributes to binding, with preferential binding for C-rich sequences. Here we designed UC-rich and CU-rich 10-nt sequences for engagement of both RRM2 and RRM3 and demonstrated that the TIA-1 RRM23 construct preferentially binds the UC-rich RNA ligand (5΄-UUUUUACUCC-3΄). Interestingly, this binding depends on the presence of Lys274 that is C-terminal to RRM3 and binding to equivalent DNA sequences occurs with similar affinity. Small-angle X-ray scattering was used to demonstrate that, upon complex formation with target RNA or DNA, TIA-1 RRM23 adopts a compact structure, showing that both RRMs engage with the target 10-nt sequences to form the complex. We also report the crystal structure of TIA-1 RRM2 in complex with DNA to 2.3 Å resolution providing the first atomic resolution structure of any TIA protein RRM in complex with oligonucleotide. Together our data support a specific mode of TIA-1 RRM23 interaction with target oligonucleotides consistent with the role of TIA-1 in binding RNA to regulate gene expression. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Fold independent structural comparisons of protein-ligand binding sites for exploring functional relationships.

    PubMed

    Gold, Nicola D; Jackson, Richard M

    2006-02-03

    The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that given a known protein-ligand binding site allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.

  2. A TATA binding protein mutant with increased affinity for DNA directs transcription from a reversed TATA sequence in vivo.

    PubMed

    Spencer, J Vaughn; Arndt, Karen M

    2002-12-01

    The TATA-binding protein (TBP) nucleates the assembly and determines the position of the preinitiation complex at RNA polymerase II-transcribed genes. We investigated the importance of two conserved residues on the DNA binding surface of Saccharomyces cerevisiae TBP to DNA binding and sequence discrimination. Because they define a significant break in the twofold symmetry of the TBP-TATA interface, Ala100 and Pro191 have been proposed to be key determinants of TBP binding orientation and transcription directionality. In contrast to previous predictions, we found that substitution of an alanine for Pro191 did not allow recognition of a reversed TATA box in vivo; however, the reciprocal change, Ala100 to proline, resulted in efficient utilization of this and other variant TATA sequences. In vitro assays demonstrated that TBP mutants with the A100P and P191A substitutions have increased and decreased affinity for DNA, respectively. The TATA binding defect of TBP with the P191A mutation could be intragenically suppressed by the A100P substitution. Our results suggest that Ala100 and Pro191 are important for DNA binding and sequence recognition by TBP, that the naturally occurring asymmetry of Ala100 and Pro191 is not essential for function, and that a single amino acid change in TBP can lead to elevated DNA binding affinity and recognition of a reversed TATA sequence.

  3. SSMART: Sequence-structure motif identification for RNA-binding proteins.

    PubMed

    Munteanu, Alina; Mukherjee, Neelanjan; Ohler, Uwe

    2018-06-11

    RNA-binding proteins (RBPs) regulate every aspect of RNA metabolism and function. There are hundreds of RBPs encoded in the eukaryotic genomes, and each recognizes its RNA targets through a specific mixture of RNA sequence and structure properties. For most RBPs, however, only a primary sequence motif has been determined, while the structure of the binding sites is uncharacterized. We developed SSMART, an RNA motif finder that simultaneously models the primary sequence and the structural properties of the RNA target sites. The sequence-structure motifs are represented as consensus strings over a degenerate alphabet, extending the IUPAC codes for nucleotides to account for secondary structure preferences. Evaluation on synthetic data showed that SSMART is able to recover both sequence and structure motifs implanted into 3'UTR-like sequences, for various degrees of structured/unstructured binding sites. In addition, we successfully used SSMART on high-throughput in vivo and in vitro data, showing that we not only recover the known sequence motif, but also gain insight into the structural preferences of the RBP. Availability: SSMART is freely available at https://ohlerlab.mdc-berlin.de/software/SSMART_137/. Supplementary data are available at Bioinformatics online.
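
    As background for the "consensus strings over a degenerate alphabet" mentioned in this abstract, the sketch below matches a consensus written in the standard IUPAC nucleotide codes (with U in place of T) against an RNA sequence. It covers sequence only: the structure-aware extension of the alphabet used by SSMART is not described here and is not reproduced. The names `IUPAC_RNA`, `consensus_to_regex` and `find_sites` are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes, written for RNA (U instead of T).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG", "N": "ACGU",
}


def consensus_to_regex(consensus):
    """Compile a degenerate consensus into a regex that finds overlapping hits."""
    body = "".join("[" + IUPAC_RNA[code] + "]" for code in consensus)
    # A zero-width lookahead lets overlapping occurrences be reported.
    return re.compile("(?=(" + body + "))")


def find_sites(consensus, rna):
    """Return the 0-based start positions where the consensus matches."""
    return [m.start() for m in consensus_to_regex(consensus).finditer(rna)]


print(find_sites("UUY", "AUUUCGUUA"))
```

    Here `Y` stands for a pyrimidine (C or U), so the toy consensus `UUY` matches both `UUU` and `UUC`.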

  4. Sequence-selective binding of C8-conjugated pyrrolobenzodiazepines (PBDs) to DNA.

    PubMed

    Basher, Mohammad A; Rahman, Khondaker Miraz; Jackson, Paul J M; Thurston, David E; Fox, Keith R

    2017-11-01

    DNA footprinting and melting experiments have been used to examine the sequence-specific binding of C8-conjugates of pyrrolobenzodiazepines (PBDs) and benzofused rings including benzothiophene and benzofuran, which are attached using pyrrole- or imidazole-containing linkers. The conjugates modulate the covalent attachment points of the PBDs, so that they bind best to guanines flanked by A/T-rich sequences on either the 5'- or 3'-side. The linker affects the binding, and pyrrole produces larger changes than imidazole. Melting studies with 14-mer oligonucleotide duplexes confirm covalent attachment of the conjugates, which show a different selectivity to anthramycin and reveal that more than one ligand molecule can bind to each duplex. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. A minimal murine Msx-1 gene promoter. Organization of its cis-regulatory motifs and their role in transcriptional activation in cells in culture and in transgenic mice.

    PubMed

    Takahashi, T; Guron, C; Shetty, S; Matsui, H; Raghow, R

    1997-09-05

    To dissect the cis-regulatory elements of the murine Msx-1 promoter, which lacks a conventional TATA element, a putative Msx-1 promoter DNA fragment (from -1282 to +106 base pairs (bp)) or its congeners containing site-specific alterations were fused to luciferase reporter and introduced into NIH3T3 and C2C12 cells, and the expression of luciferase was assessed in transient expression assays. The functional consequences of the sequential 5' deletions of the promoter revealed that multiple positive and negative regulatory elements participate in regulating transcription of the Msx-1 gene. Surprisingly, however, the optimal expression of Msx-1 promoter in either NIH3T3 or C2C12 cells required only 165 bp of the upstream sequence to warrant detailed examination of its structure. Therefore, the functional consequences of site-specific deletions and point mutations of the cis-acting elements of the minimal Msx-1 promoter were systematically examined. Concomitantly, potential transcriptional factor(s) interacting with the cis-acting elements of the minimal promoter were also studied by gel electrophoretic mobility shift assays and DNase I footprinting. Combined analyses of the minimal promoter by DNase I footprinting, electrophoretic mobility shift assays, and super shift assays with specific antibodies revealed that 5'-flanking regions from -161 to -154 and from -26 to -13 of the Msx-1 promoter contain an authentic E box (proximal E box), capable of binding a protein immunologically related to the upstream stimulating factor 1 (USF-1) and a GC-rich sequence motif which can bind to Sp1 (proximal Sp1), respectively. Additionally, we observed that the promoter activation was seriously hampered if the proximal E box was removed or mutated, and the promoter activity was eliminated completely if the proximal Sp1 site was similarly altered. Absolute dependence of the Msx-1 minimal promoter on Sp1 could be demonstrated by transient expression assays in the Sp1-deficient Drosophila cell line cotransfected with Msx-1-luciferase and an Sp1 expression vector pPacSp1. The transgenic mice embryos containing -165/106-bp Msx-1 promoter-LacZ DNA in their genomes abundantly expressed beta-galactosidase in maxillae and mandibles and in the cellular primordia involved in the formation of the meninges and the bones of the skull. Thus, the truncated murine Msx-1 promoter can target expression of a heterologous gene in the craniofacial tissues of transgenic embryos known for high level of expression of the endogenous Msx-1 gene and found to be severely defective in the Msx-1 knock-out mice.

  6. SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.

    PubMed

    Yang, Xiaoxia; Wang, Jia; Sun, Jun; Liu, Rong

    2015-01-01

    Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.

  7. Saccharomyces cerevisiae SSB1 protein and its relationship to nucleolar RNA-binding proteins.

    PubMed Central

    Jong, A Y; Clark, M W; Gilbert, M; Oehm, A; Campbell, J L

    1987-01-01

    To better define the function of Saccharomyces cerevisiae SSB1, an abundant single-stranded nucleic acid-binding protein, we determined the nucleotide sequence of the SSB1 gene and compared it with those of other proteins of known function. The amino acid sequence contains 293 amino acid residues and has an Mr of 32,853. There are several stretches of sequence characteristic of other eucaryotic single-stranded nucleic acid-binding proteins. At the amino terminus, residues 39 to 54 are highly homologous to a peptide in calf thymus UP1 and UP2 and a human heterogeneous nuclear ribonucleoprotein. Residues 125 to 162 constitute a fivefold tandem repeat of the sequence RGGFRG, the composition of which suggests a nucleic acid-binding site. Near the C terminus, residues 233 to 245 are homologous to several RNA-binding proteins. Of 18 C-terminal residues, 10 are acidic, a characteristic of the procaryotic single-stranded DNA-binding proteins and eucaryotic DNA- and RNA-binding proteins. In addition, examination of the subcellular distribution of SSB1 by immunofluorescence microscopy indicated that SSB1 is a nuclear protein, predominantly located in the nucleolus. Sequence homologies and the nucleolar localization make it likely that SSB1 functions in RNA metabolism in vivo, although an additional role in DNA metabolism cannot be excluded. PMID:2823109

  8. Investigating intermolecular forces associated with thrombus initiation using optical tweezers

    NASA Astrophysics Data System (ADS)

    Arya, Maneesh; Lopez, Jose A.; Romo, Gabriel M.; Dong, Jing-Fei; McIntire, Larry V.; Moake, Joel L.; Anvari, Bahman

    2002-05-01

    Thrombus formation occurs when a platelet membrane receptor, the glycoprotein (GP) Ib-IX-V complex, binds to its ligand, von Willebrand factor (vWf), in the subendothelium or plasma. To determine which GP Ib-IX-V amino acid sequences are critical for bond formation, we have used optical tweezers to measure forces involved in the binding of vWf to GP Ib-IX-V variants. Because the GP Ib(alpha) subunit is the primary component of the human GP Ib-IX-V complex that binds to vWf, whereas canine GP Ib(alpha) does not bind to human vWf, we progressively replaced human GP Ib(alpha) amino acid sequences with canine GP Ib(alpha) sequences to determine the sequences essential for vWf/GP Ib(alpha) binding. After measuring the adhesive forces between optically trapped, vWf-coated beads and GP Ib(alpha) variants expressed on mammalian cells, we determined that leucine-rich repeat 2 of GP Ib(alpha) was necessary for vWf/GP Ib-IX-V bond formation. We also found that deletion of the N-terminal flanking sequence and leucine-rich repeat 1 reduced adhesion strength to vWf but did not abolish binding. While divalent cations are known to influence binding of vWf, addition of 1 mM CaCl2 had no effect on measured vWf/GP Ib(alpha) bond strengths.

  9. Understanding the mechanisms of protein-DNA interactions

    NASA Astrophysics Data System (ADS)

    Lavery, Richard

    2004-03-01

    Structural, biochemical and thermodynamic data on protein-DNA interactions show that specific recognition cannot be reduced to a simple set of binary interactions between the partners (such as hydrogen bonds, ion pairs or steric contacts). The mechanical properties of the partners also play a role and, in the case of DNA, variations in both conformation and flexibility as a function of base sequence can be a significant factor in guiding a protein to the correct binding site. All-atom molecular modeling offers a means of analyzing the role of different binding mechanisms within protein-DNA complexes of known structure. This however requires estimating the binding strengths for the full range of sequences with which a given protein can interact. Since this number grows exponentially with the length of the binding site it is necessary to find a method to accelerate the calculations. We have achieved this by using a multi-copy approach (ADAPT) which allows us to build a DNA fragment with a variable base sequence. The results obtained with this method correlate well with experimental consensus binding sequences. They enable us to show that indirect recognition mechanisms involving the sequence dependent properties of DNA play a significant role in many complexes. This approach also offers a means of predicting protein binding sites on the basis of binding energies, which is complementary to conventional lexical techniques.

  10. Comprehensive analysis of RNA-protein interactions by high-throughput sequencing-RNA affinity profiling.

    PubMed

    Tome, Jacob M; Ozer, Abdullah; Pagano, John M; Gheba, Dan; Schroth, Gary P; Lis, John T

    2014-06-01

    RNA-protein interactions play critical roles in gene regulation, but methods to quantitatively analyze these interactions at a large scale are lacking. We have developed a high-throughput sequencing-RNA affinity profiling (HiTS-RAP) assay by adapting a high-throughput DNA sequencer to quantify the binding of fluorescently labeled protein to millions of RNAs anchored to sequenced cDNA templates. Using HiTS-RAP, we measured the affinity of mutagenized libraries of GFP-binding and NELF-E-binding aptamers to their respective targets and identified critical regions of interaction. Mutations additively affected the affinity of the NELF-E-binding aptamer, whose interaction depended mainly on a single-stranded RNA motif, but not that of the GFP aptamer, whose interaction depended primarily on secondary structure.

  11. The Liverwort Contains a Lectin That Is Structurally and Evolutionary Related to the Monocot Mannose-Binding Lectins

    PubMed Central

    Peumans, Willy J.; Barre, Annick; Bras, Julien; Rougé, Pierre; Proost, Paul; Van Damme, Els J.M.

    2002-01-01

    A mannose (Man)-binding lectin has been isolated and characterized from the thallus of the liverwort Marchantia polymorpha. N-terminal sequencing indicated that the M. polymorpha agglutinin (Marpola) shares sequence similarity with the superfamily of monocot Man-binding lectins. Searches in the databases yielded expressed sequence tags encoding Marpola. Sequence analysis, molecular modeling, and docking experiments revealed striking structural similarities between Marpola and the monocot Man-binding lectins. Activity and specificity studies further indicated that Marpola is a much stronger agglutinin than the Galanthus nivalis agglutinin and exhibits a preference for methylated Man and glucose, which is unprecedented within the family of monocot Man-binding lectins. The discovery of Marpola allows us, for the first time, to corroborate the evolutionary relationship between a lectin from a lower plant and a well-established lectin family from flowering plants. In addition, the identification of Marpola sheds a new light on the molecular evolution of the superfamily of monocot Man-binding lectins. Beside evolutionary considerations, the occurrence of a G. nivalis agglutinin homolog in a lower plant necessitates the rethinking of the physiological role of the whole family of monocot Man-binding lectins. PMID:12114560

  12. TFBSshape: a motif database for DNA shape features of transcription factor binding sites.

    PubMed

    Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W; Gordân, Raluca; Rohs, Remo

    2014-01-01

    Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein-DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone.

  13. TFBSshape: a motif database for DNA shape features of transcription factor binding sites

    PubMed Central

    Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W.; Gordân, Raluca; Rohs, Remo

    2014-01-01

    Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone. PMID:24214955

  14. Predicting RNA-protein binding sites and motifs through combining local and global deep convolutional neural networks.

    PubMed

    Pan, Xiaoyong; Shen, Hong-Bin

    2018-05-02

    RNA-binding proteins (RBPs) account for 5-10% of the eukaryotic proteome and play key roles in many biological processes, e.g. gene regulation. Experimental detection of RBP binding sites is still time-intensive and costly. Computational prediction of RBP binding sites using patterns learned from existing annotations is a fast alternative. From the biological point of view, the local structure context derived from local sequences will be recognized by specific RBPs. However, in computational modeling using deep learning, to the best of our knowledge, only global representations of entire RNA sequences are employed. So far, the local sequence information is ignored in the deep model construction process. In this study, we present a computational method, iDeepE, to predict RNA-protein binding sites from RNA sequences by combining global and local convolutional neural networks (CNNs). For the global CNN, we pad the RNA sequences to the same length. For the local CNN, we split an RNA sequence into multiple overlapping fixed-length subsequences, where each subsequence is a signal channel of the whole sequence. Next, we train deep CNNs on the multiple subsequences and on the padded sequences, respectively, to learn high-level features. Finally, the outputs from local and global CNNs are combined to improve the prediction. iDeepE demonstrates better performance than state-of-the-art methods on two large-scale datasets derived from CLIP-seq. We also find that the local CNN runs 1.8 times faster than the global CNN with comparable performance when using GPUs. Our results show that iDeepE has captured experimentally verified binding motifs. Availability: https://github.com/xypan1232/iDeepE. Contact: xypan172436@gmail.com or hbshen@sjtu.edu.cn. Supplementary data are available at Bioinformatics online.
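
    The two input preparations described in this abstract (padding whole sequences to one length, and cutting a sequence into overlapping fixed-length pieces) can be illustrated with a minimal sketch. The function names, the pad character "N", the truncation of over-long sequences, and the window and overlap values are assumptions made for illustration; the abstract does not state the parameters iDeepE actually uses.

```python
def pad_sequence(seq, length, pad_char="N"):
    """Global-style input: truncate or right-pad `seq` to exactly `length`.

    Truncating over-long sequences is an assumption of this sketch.
    """
    return seq[:length].ljust(length, pad_char)


def split_overlapping(seq, window, overlap, pad_char="N"):
    """Local-style input: overlapping fixed-length subsequences of `seq`.

    Consecutive pieces share `overlap` characters; the last piece is
    right-padded so that every piece has length `window`.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    step = window - overlap
    pieces = []
    start = 0
    while True:
        pieces.append(seq[start:start + window].ljust(window, pad_char))
        if start + window >= len(seq):
            break  # this piece already reaches the end of the sequence
        start += step
    return pieces


if __name__ == "__main__":
    rna = "ACGUACGUAC"
    print(pad_sequence(rna, 12))         # ACGUACGUACNN
    print(split_overlapping(rna, 4, 2))  # ['ACGU', 'GUAC', 'ACGU', 'GUAC']
```

    In a model along these lines, each piece would typically be one-hot encoded before being passed to a network; that step is omitted here.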

  15. Biostable aptamers with antagonistic properties to the neuropeptide nociceptin/orphanin FQ

    PubMed Central

    FAULHAMMER, DIRK; ESCHGFÄLLER, BERND; STARK, SANDRA; BURGSTALLER, PETRA; ENGLBERGER, WERNER; ERFURTH, JEANNETTE; KLEINJUNG, FRANK; RUPP, JOHANNA; VULCU, SEBASTIAN DAN; SCHRÖDER, WERNER; VONHOFF, STEFAN; NAWRATH, HERMANN; GILLEN, CLEMENS; KLUSSMANN, SVEN

    2004-01-01

    The neuropeptide nociceptin/orphanin FQ (N/OFQ), the endogenous ligand of the opioid receptor-like 1 (ORL1) receptor, has been shown to play a prominent role in the regulation of several biological functions such as pain and stress. Here we describe the isolation and characterization of N/OFQ binding biostable RNA aptamers (Spiegelmers) using a mirror-image in vitro selection approach. Spiegelmers are L-enantiomeric oligonucleotide ligands that display high affinity and specificity to their targets and high resistance to enzymatic degradation compared to D-oligonucleotides. A representative Spiegelmer from the selections performed was size-minimized to two distinct sequences capable of high affinity binding to N/OFQ. The Spiegelmers were shown to antagonize binding of N/OFQ to the ORL1 receptor in a binding-competition assay. The calculated IC50 values for the Spiegelmers NOX 2149 and NOX 2137a/b were 110 nM and 330 nM, respectively. The competitive antagonistic properties of these Spiegelmers were further demonstrated by their effective and specific inhibition of G-protein activation in two additional models. The Spiegelmers antagonized the N/OFQ-induced GTPγS incorporation into cell membranes of a CHO-K1 cell line expressing the human ORL1 receptor. In oocytes from Xenopus laevis, NOX 2149 showed an antagonistic effect to the N/OFQ-ORL1 receptor system that was functionally coupled with G-protein-regulated inwardly rectifying K+ channels. PMID:14970396

  16. TALE-PvuII Fusion Proteins – Novel Tools for Gene Targeting

    PubMed Central

    Yanik, Mert; Alzubi, Jamal; Lahaye, Thomas; Cathomen, Toni; Pingoud, Alfred; Wende, Wolfgang

    2013-01-01

    Zinc finger nucleases (ZFNs) consist of zinc fingers as the DNA-binding module and the non-specific DNA-cleavage domain of the restriction endonuclease FokI as the DNA-cleavage module. This architecture is also used by TALE nucleases (TALENs), in which the DNA-binding modules of the ZFNs have been replaced by DNA-binding domains based on transcription activator-like effector (TALE) proteins. Both TALENs and ZFNs are programmable nucleases which rely on the dimerization of FokI to induce double-strand DNA cleavage at the target site after recognition of the target DNA by the respective DNA-binding module. TALENs seem to have an advantage over ZFNs, as the assembly of TALE proteins is easier than that of ZFNs. Here, we present evidence that variant TALENs can be produced by replacing the catalytic domain of FokI with the restriction endonuclease PvuII. These fusion proteins recognize only the composite recognition site consisting of the target site of the TALE protein and the PvuII recognition sequence (addressed site), but not isolated TALE or PvuII recognition sites (unaddressed sites), even at high excess of protein over DNA and long incubation times. In vitro, their preference for an addressed over an unaddressed site is >34,000-fold. Moreover, TALE-PvuII fusion proteins are active in cellula with minimal cytotoxicity. PMID:24349308

  17. Identification of Human Lineage-Specific Transcriptional Coregulators Enabled by a Glossary of Binding Modules and Tunable Genomic Backgrounds.

    PubMed

    Mariani, Luca; Weinand, Kathryn; Vedenko, Anastasia; Barrera, Luis A; Bulyk, Martha L

    2017-09-27

    Transcription factors (TFs) control cellular processes by binding specific DNA motifs to modulate gene expression. Motif enrichment analysis of regulatory regions can identify direct and indirect TF binding sites. Here, we created a glossary of 108 non-redundant TF-8mer "modules" of shared specificity for 671 metazoan TFs from publicly available and new universal protein binding microarray data. Analysis of 239 ENCODE TF chromatin immunoprecipitation sequencing datasets and associated RNA sequencing profiles suggests that the 8mer modules are more precise than position weight matrices in identifying indirect binding motifs and their associated tethering TFs. We also developed GENRE (genomically equivalent negative regions), a tunable tool for construction of matched genomic background sequences for analysis of regulatory regions. GENRE outperformed four state-of-the-art approaches to background sequence construction. We used our TF-8mer glossary and GENRE in the analysis of the indirect binding motifs for the co-occurrence of tethering factors, suggesting novel TF-TF interactions. We anticipate that these tools will aid in elucidating tissue-specific gene-regulatory programs. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Neurospora tryptophan synthase: N-terminal analysis and the sequence of the pyridoxal phosphate active site peptide

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pratt, M.L.; Hsu, P.Y.; DeMoss, J.A.

    1986-05-01

    Tryptophan synthase (TS), which catalyzes the final step of tryptophan biosynthesis, is a multifunctional protein requiring pyridoxal phosphate (B6P) for two of its three distinct enzyme activities. TS from Neurospora has a blocked N-terminus, is a homodimer of 150 kDa and binds one mole of B6P per mole of subunit. The authors have shown the N-terminal residue to be acyl-serine. The B6P active site of the holoenzyme was labelled by reduction of the B6P-Schiff base with [3H]-NaBH4, which resulted in a proportionate loss of activity in the two B6P-requiring reactions. SDS-polyacrylamide gel electrophoresis of CNBr-generated peptides showed the labelled active site peptide to be 6 kDa. The sequence of this peptide, purified to apparent homogeneity by a combination of C-18 reversed phase and TSK gel filtration HPLC, is: gly-arg-pro-gly-gln-leu-his-lys-ala-glu-arg-leu-thr-glu-tyr-ala-gly-gly-ala-gln-ile-xxx-leu-lys-arg-glu-asp-leu-asn-his-xxx-gly-xxx-his-***-ile-asn-asn-ala-leu. Although four residues (xxx, ***) are unidentified, this peptide is minimally 78% homologous with the corresponding peptide from yeast TS, in which residue (***) is the lysine that binds B6P.

  19. Structure and DNA-Binding Sites of the SWI1 AT-rich Interaction Domain (ARID) Suggest Determinants for Sequence-Specific DNA Recognition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Suhkmann; Zhang, Ziming; Upchurch, Sean

    2004-04-16

    ARID is a homologous family of DNA-binding domains that occur in DNA binding proteins from a wide variety of species, ranging from yeast to nematodes, insects, mammals and plants. SWI1, a member of the SWI/SNF protein complex that is involved in chromatin remodeling during transcription, contains the ARID motif. The ARID domain of human SWI1 (also known as p270) does not select for a specific DNA sequence from a random sequence pool. The lack of sequence specificity shown by the SWI1 ARID domain stands in contrast to the other characterized ARID domains, which recognize specific AT-rich sequences. We have solved the three-dimensional structure of human SWI1 ARID using solution NMR methods. In addition, we have characterized non-specific DNA-binding by the SWI1 ARID domain. Results from this study indicate that a flexible long internal loop in ARID motif is likely to be important for sequence specific DNA-recognition. The structure of human SWI1 ARID domain also represents a distinct structural subfamily. Studies of ARID indicate that boundary of the DNA binding structural and functional domains can extend beyond the sequence homologous region in a homologous family of proteins. Structural studies of homologous domains such as ARID family of DNA-binding domains should provide information to better predict the boundary of structural and functional domains in structural genomic studies. Key Words: ARID, SWI1, NMR, structural genomics, protein-DNA interaction.

  20. Improving the performance of minimizers and winnowing schemes.

    PubMed

    Marçais, Guillaume; Pellow, David; Bork, Daniel; Orenstein, Yaron; Shamir, Ron; Kingsford, Carl

    2017-07-15

    The minimizers scheme is a method for selecting k-mers from sequences. It is used in many bioinformatics software tools to bin comparable sequences or to sample a sequence in a deterministic fashion at approximately regular intervals, in order to reduce memory consumption and processing time. Although very useful, the minimizers selection procedure has undesirable behaviors (e.g. too many k-mers are selected when processing certain sequences). Some of these problems were already known to the authors of the minimizers technique, and the natural lexicographic ordering of k-mers used by minimizers was recognized as their origin. Many software tools using minimizers employ ad hoc variations of the lexicographic order to alleviate those issues. We provide an in-depth analysis of the effect of k-mer ordering on the performance of the minimizers technique. By using small universal hitting sets (a recently defined concept), we show how to significantly improve the performance of minimizers and avoid some of its worse behaviors. Based on these results, we encourage bioinformatics software developers to use an ordering based on a universal hitting set or, if not possible, a randomized ordering, rather than the lexicographic order. This analysis also settles negatively a conjecture (by Schleimer et al.) on the expected density of minimizers in a random sequence. The software used for this analysis is available on GitHub: https://github.com/gmarcais/minimizers.git. Contact: gmarcais@cs.cmu.edu or carlk@cs.cmu.edu. © The Author 2017. Published by Oxford University Press. All rights reserved.
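
    The selection procedure summarized in this abstract can be sketched in a few lines: slide a window over w consecutive k-mers and keep the smallest k-mer in each window under a chosen ordering. The sketch below is illustrative and is not the code from the repository named above. The function names, the leftmost tie-break, and the use of a fixed SHA-256 hash as a stand-in for a randomized ordering are all assumptions; the universal-hitting-set ordering recommended by the authors is not implemented here.

```python
import hashlib


def lexicographic(kmer):
    """Natural lexicographic ordering of k-mers."""
    return kmer


def pseudo_random(kmer):
    """Stand-in for a randomized ordering: rank k-mers by a fixed hash."""
    return hashlib.sha256(kmer.encode("ascii")).digest()


def minimizers(seq, k, w, order=lexicographic):
    """Return sorted (position, k-mer) pairs selected as minimizers.

    Each window of w consecutive k-mers contributes its smallest k-mer
    under `order`; ties are broken by taking the leftmost position.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        best = min(range(start, start + w),
                   key=lambda i: (order(kmers[i]), i))
        selected.add(best)
    return [(i, kmers[i]) for i in sorted(selected)]


def density(seq, k, w, order=lexicographic):
    """Fraction of k-mer positions in `seq` selected as minimizers."""
    n_kmers = len(seq) - k + 1
    if n_kmers <= 0:
        raise ValueError("sequence is shorter than k")
    return len(minimizers(seq, k, w, order)) / n_kmers


if __name__ == "__main__":
    s = "ACGTACGTTGCA"
    print(minimizers(s, 3, 4))  # [(0, 'ACG'), (4, 'ACG'), (5, 'CGT'), (9, 'GCA')]
    print(density(s, 3, 4))     # 0.4
```

    Passing order=pseudo_random swaps the ordering without changing the selection logic, which makes it easy to compare densities of the two orderings on the same input.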

  1. Sequence walkers: a graphical method to display how binding proteins interact with DNA or RNA sequences | Center for Cancer Research

    Cancer.gov

    A graphical method is presented for displaying how binding proteins and other macromolecules interact with individual bases of nucleotide sequences. Characters representing the sequence are either oriented normally and placed above a line indicating favorable contact, or upside-down and placed below the line indicating unfavorable contact. The positive or negative height of

  2. MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data.

    PubMed

    Ozaki, Haruka; Iwasaki, Wataru

    2016-08-01

    As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif. Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs. By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

    PubMed

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-11-16

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Four distinct types of E.C. 1.2.1.30 enzymes can catalyze the reduction of carboxylic acids to aldehydes.

    PubMed

    Stolterfoht, Holly; Schwendenwein, Daniel; Sensen, Christoph W; Rudroff, Florian; Winkler, Margit

    2017-09-10

    Increasing demand for chemicals from renewable resources calls for the development of new biotechnological methods for the reduction of oxidized bio-based compounds. Enzymatic carboxylate reduction is highly selective, both in terms of chemo- and product selectivity, but not many carboxylate reductase enzymes (CARs) have been identified on the sequence level to date. Thus far, their phylogeny is unexplored and very little is known about their structure-function-relationship. CARs minimally contain an adenylation domain, a phosphopantetheinylation domain and a reductase domain. We have recently identified new enzymes of fungal origin, using similarity searches against genomic sequences from organisms in which aldehydes were detected upon incubation with carboxylic acids. Analysis of sequences with known CAR functionality and CAR enzymes recently identified in our laboratory suggests that the three-domain architecture mentioned above is modular. The construction of a distance tree with a subsequent 1000-replicate bootstrap analysis showed that the CAR sequences included in our study fall into four distinct subgroups (one of bacterial origin and three of fungal origin, respectively), each with a bootstrap value of 100%. The multiple sequence alignment of all experimentally confirmed CAR protein sequences revealed fingerprint sequences of residues which are likely to be involved in substrate and co-substrate binding and one of the three catalytic substeps, respectively. The fingerprint sequences broaden our understanding of the amino acids that might be essential for the reduction of organic acids to the corresponding aldehydes in CAR proteins. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Difference in receptor usage between severe acute respiratory syndrome (SARS) coronavirus and SARS-like coronavirus of bat origin.

    PubMed

    Ren, Wuze; Qu, Xiuxia; Li, Wendong; Han, Zhenggang; Yu, Meng; Zhou, Peng; Zhang, Shu-Yi; Wang, Lin-Fa; Deng, Hongkui; Shi, Zhengli

    2008-02-01

    Severe acute respiratory syndrome (SARS) is caused by the SARS-associated coronavirus (SARS-CoV), which uses angiotensin-converting enzyme 2 (ACE2) as its receptor for cell entry. A group of SARS-like CoVs (SL-CoVs) has been identified in horseshoe bats. SL-CoVs and SARS-CoVs share identical genome organizations and high sequence identities, with the main exception of the N terminus of the spike protein (S), known to be responsible for receptor binding in CoVs. In this study, we investigated the receptor usage of the SL-CoV S by combining a human immunodeficiency virus-based pseudovirus system with cell lines expressing the ACE2 molecules of human, civet, or horseshoe bat. In addition to full-length S of SL-CoV and SARS-CoV, a series of S chimeras was constructed by inserting different sequences of the SARS-CoV S into the SL-CoV S backbone. Several important observations were made from this study. First, the SL-CoV S was unable to use any of the three ACE2 molecules as its receptor. Second, the SARS-CoV S failed to enter cells expressing the bat ACE2. Third, the chimeric S covering the previously defined receptor-binding domain gained its ability to enter cells via human ACE2, albeit with different efficiencies for different constructs. Fourth, a minimal insert region (amino acids 310 to 518) was found to be sufficient to convert the SL-CoV S from non-ACE2 binding to human ACE2 binding, indicating that the SL-CoV S is largely compatible with SARS-CoV S protein both in structure and in function. The significance of these findings in relation to virus origin, virus recombination, and host switching is discussed.

  6. First somatic mutation of E2F1 in a critical DNA binding residue discovered in well-differentiated papillary mesothelioma of the peritoneum

    PubMed Central

    2011-01-01

    Background: Well differentiated papillary mesothelioma of the peritoneum (WDPMP) is a rare variant of epithelial mesothelioma of low malignancy potential, usually found in women with no history of asbestos exposure. In this study, we perform the first exome sequencing of WDPMP. Results: WDPMP exome sequencing reveals the first somatic mutation of E2F1, R166H, to be identified in human cancer. The location is in the evolutionarily conserved DNA binding domain and computationally predicted to be mutated in the critical contact point between E2F1 and its DNA target. We show that the R166H mutation abrogates E2F1's DNA binding ability and is associated with reduced activation of E2F1 downstream target genes. Mutant E2F1 proteins are also observed in higher quantities when compared with wild-type E2F1 protein levels and the mutant protein's resistance to degradation was found to be the cause of its accumulation within mutant over-expressing cells. Cells over-expressing wild-type E2F1 show decreased proliferation compared to mutant over-expressing cells, but cell proliferation rates of mutant over-expressing cells were comparable to cells over-expressing the empty vector. Conclusions: The R166H mutation in E2F1 is shown to have a deleterious effect on its DNA binding ability as well as increasing its stability and subsequent accumulation in R166H mutant cells. Based on the results, two compatible theories can be formed: R166H mutation appears to allow for protein over-expression while minimizing the apoptotic consequence and the R166H mutation may behave similarly to SV40 large T antigen, inhibiting tumor suppressive functions of retinoblastoma protein 1. PMID:21955916

  7. Recruitment of CRISPR-Cas systems by Tn7-like transposons.

    PubMed

    Peters, Joseph E; Makarova, Kira S; Shmakov, Sergey; Koonin, Eugene V

    2017-08-29

    A survey of bacterial and archaeal genomes shows that many Tn7-like transposons contain minimal type I-F CRISPR-Cas systems that consist of fused cas8f and cas5f, cas7f, and cas6f genes and a short CRISPR array. Several small groups of Tn7-like transposons encompass similarly truncated type I-B CRISPR-Cas. This minimal gene complement of the transposon-associated CRISPR-Cas systems implies that they are competent for pre-CRISPR RNA (precrRNA) processing yielding mature crRNAs and target binding but not target cleavage that is required for interference. Phylogenetic analysis demonstrates that evolution of the CRISPR-Cas-containing transposons included a single, ancestral capture of a type I-F locus and two independent instances of type I-B loci capture. We show that the transposon-associated CRISPR arrays contain spacers homologous to plasmid and temperate phage sequences and, in some cases, chromosomal sequences adjacent to the transposon. We hypothesize that the transposon-encoded CRISPR-Cas systems generate displacement (R-loops) in the cognate DNA sites, targeting the transposon to these sites and thus facilitating their spread via plasmids and phages. These findings suggest the existence of RNA-guided transposition and fit the guns-for-hire concept whereby mobile genetic elements capture host defense systems and repurpose them for different stages in the life cycle of the element.

  8. Identification and Structural Characterization of the ALIX-Binding Late Domains of Simian Immunodeficiency Virus SIV mac239 and SIV agmTan-1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Q Zhai; M Landesman; H Robinson

    2011-12-31

    Retroviral Gag proteins contain short late-domain motifs that recruit cellular ESCRT pathway proteins to facilitate virus budding. ALIX-binding late domains often contain the core consensus sequence YPX{sub n}L (where X{sub n} can vary in sequence and length). However, some simian immunodeficiency virus (SIV) Gag proteins lack this consensus sequence, yet still bind ALIX. We mapped divergent, ALIX-binding late domains within the p6{sup Gag} proteins of SIV{sub MAC239} ({sub 40}SREK{und P}YKE{und VT}ED{und L}LHLNSLF{sub 59}) and SIV{sub agmTan-1} ({sub 24}AAG{und A}YDP{und AR}KL{und L}EQYAKK{sub 41}). Crystal structures revealed that anchoring tyrosines (in lightface) and nearby hydrophobic residues (underlined) contact the ALIX V domain, revealing how lentiviruses employ a diverse family of late-domain sequences to bind ALIX and promote virus budding.

  9. Identification and Structural Characterization of the ALIX-Binding Late Domains of Simian Immunodeficiency Virus SIVmac239 and SIVagmTan-1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhai, Q.; Robinson, H.; Landesman, M. B.

    2011-01-01

    Retroviral Gag proteins contain short late-domain motifs that recruit cellular ESCRT pathway proteins to facilitate virus budding. ALIX-binding late domains often contain the core consensus sequence YPX{sub n}L (where X{sub n} can vary in sequence and length). However, some simian immunodeficiency virus (SIV) Gag proteins lack this consensus sequence, yet still bind ALIX. We mapped divergent, ALIX-binding late domains within the p6{sup Gag} proteins of SIV{sub mac239} ({sub 40}SREK{und P}YKE{und VT}ED{und L}LHLNSLF{sub 59}) and SIV{sub agmTan-1} ({sub 24}AAG{und A}YDP{und AR}KL{und L}EQYAKK{sub 41}). Crystal structures revealed that anchoring tyrosines (in lightface) and nearby hydrophobic residues (underlined) contact the ALIX V domain, revealing how lentiviruses employ a diverse family of late-domain sequences to bind ALIX and promote virus budding.

  10. Transcriptional activation of the Escherichia coli adaptive response gene aidB is mediated by binding of methylated Ada protein. Evidence for a new consensus sequence for Ada-binding sites.

    PubMed

    Landini, P; Volkert, M R

    1995-04-07

    The Escherichia coli aidB gene is part of the adaptive response to DNA methylation damage. Genes belonging to the adaptive response are positively regulated by the ada gene; the Ada protein acts as a transcriptional activator when methylated in one of its cysteine residues at position 69. Through DNaseI protection assays, we show that methylated Ada (meAda) is able to bind a DNA sequence between 40 and 60 base pairs upstream of the aidB transcriptional startpoint. Binding of meAda is necessary to activate transcription of the adaptive response genes; accordingly, in vitro transcription of aidB is dependent on the presence of meAda. Unmethylated Ada protein shows no protection against DNaseI digestion in the aidB promoter region nor does it promote aidB in vitro transcription. The aidB Ada-binding site shows only weak homology to the proposed consensus sequences for Ada-binding sites in E. coli (AAANNAA and AAAGCGCA) but shares a higher degree of similarity with the Ada-binding regions from other bacterial species, such as Salmonella typhimurium and Bacillus subtilis. Based on the comparison of five different Ada-dependent promoter regions, we suggest that a possible recognition sequence for meAda might be AATnnnnnnG-CAA. Higher concentrations of Ada are required for the binding of aidB than for the ada promoter, suggesting lower affinity of the protein for the aidB Ada-binding site. Common features in the Ada-binding regions of ada and aidB are a high A/T content, the presence of an inverted repeat structure, and their position relative to the transcriptional start site. We propose that these elements, in addition to the proposed recognition sequence, are important for binding of the Ada protein.

  11. The amino acid motif L/IIxxFE defines a novel actin-binding sequence in PDZ-RhoGEF

    PubMed Central

    Banerjee, Jayashree; Fischer, Christopher C.; Wedegaertner, Philip B.

    2009-01-01

    PDZ-RhoGEF is a member of the regulator of G protein signaling (RGS) domain-containing RhoGEFs (RGS-RhoGEFs) that link activated heterotrimeric G protein α subunits of the G12 family to activation of the small GTPase RhoA. Unique among the RGS-RhoGEFs, PDZ-RhoGEF contains a short sequence that localizes the protein to the actin cytoskeleton. In this report, we demonstrate that the actin-binding domain, located between amino acids 561–585, directly binds to F-actin in vitro. Extensive mutagenesis identifies isoleucine 568, isoleucine 569, phenylalanine 572, and glutamic acid 573 as necessary for binding to actin and for co-localization with the actin cytoskeleton in cells. These results define a novel actin-binding sequence in PDZ-RhoGEF with a critical amino acid motif of IIxxFE. Moreover, sequence analysis identifies a similar actin-binding motif in the N-terminus of the RhoGEF frabin, and, as with PDZ-RhoGEF, mutagenesis and actin interaction experiments demonstrate a motif of LIxxFE, consisting of the key amino acids leucine 23, isoleucine 24, phenylalanine 27, and glutamic acid 28. Taken together, results with PDZ-RhoGEF and frabin identify a novel actin binding sequence. Lastly, inducible dimerization of the actin-binding region of PDZ-RhoGEF revealed a dimerization-dependent actin bundling activity in vitro. PDZ-RhoGEF exists in cells as a dimer, raising the possibility that PDZ-RhoGEF could influence actin structure independent of its ability to activate RhoA. PMID:19618964

  12. Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing

    PubMed Central

    Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2016-01-01

    Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039

  13. Characterization of the soluble allergenic proteins of cashew nut (Anacardium occidentale L.).

    PubMed

    Teuber, Suzanne S; Sathe, Shridhar K; Peterson, W Rich; Roux, Kenneth H

    2002-10-23

    The allergens associated with cashew food allergy have not been well-characterized. We sought to identify the major allergens in cashew nut by performing IgE immunoblots to dissociated and reduced or nonreduced cashew protein extracts, followed by sequencing of the peptides of interest. Sera from 15 subjects with life-threatening reactions to cashews and 8 subjects who tolerate cashews but have life-threatening reactions to other tree nuts were compared. An aqueous cashew protein extract containing albumin/globulin was separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and subjected to IgE immunoblotting using patient sera. Selected IgE reactive bands were subjected to N-terminal amino acid sequencing. Each of the 15 sera from cashew-allergic subjects showed IgE binding to the cashew protein extract. The dominant IgE-binding antigens in the reduced preparations included peptides in the 31-35 kD range, consistent with the large subunits of the major storage 13S globulin (legumin-like protein). Low-molecular-weight polypeptides of the 2S albumin family, with similarity to the major walnut allergen Jug r 1, also bound IgE. The sera from eight patients who tolerate cashew but displayed allergies to other tree nuts showed only minimal or no IgE binding to cashew. Cashew food allergy is associated with the presence of IgE directed against the major seed storage proteins in cashew, including the 13S globulin (legumin group) and 2S albumins, both of which represent major allergen classes in several plant seeds. Thus, the legumin-group proteins and 2S albumins are again identified as major food allergens, which will help further research into seed protein allergenicity.

  14. Characterization of IntA, a Bidirectional Site-Specific Recombinase Required for Conjugative Transfer of the Symbiotic Plasmid of Rhizobium etli CFN42

    PubMed Central

    Hernández-Tamayo, Rogelio; Sohlenkamp, Christian; Puente, José Luis; Brom, Susana

    2013-01-01

    Site-specific recombination occurs at short specific sequences, mediated by the cognate recombinases. IntA is a recombinase from Rhizobium etli CFN42 and belongs to the tyrosine recombinase family. It allows cointegration of plasmid p42a and the symbiotic plasmid via site-specific recombination between attachment regions (attA and attD) located in each replicon. Cointegration is needed for conjugative transfer of the symbiotic plasmid. To characterize this system, two plasmids harboring the corresponding attachment sites and intA were constructed. Introduction of these plasmids into R. etli revealed IntA-dependent recombination events occurring at high frequency. Interestingly, IntA promotes not only integration, but also excision events, albeit at a lower frequency. Thus, R. etli IntA appears to be a bidirectional recombinase. IntA was purified and used to set up electrophoretic mobility shift assays with linear fragments containing attA and attD. IntA-dependent retarded complexes were observed only with fragments containing either attA or attD. Specific retarded complexes, as well as normal in vivo recombination abilities, were seen even in derivatives harboring only a minimal attachment region (comprising the 5-bp central region flanked by 9- to 11-bp inverted repeats). DNase I-footprinting assays with IntA revealed specific protection of these zones. Mutations that disrupt the integrity of the 9- to 11-bp inverted repeats abolish both specific binding and recombination ability, while mutations in the 5-bp central region severely reduce both binding and recombination. These results show that IntA is a bidirectional recombinase that binds to att regions without requiring neighboring sequences as enhancers of recombination. PMID:23935046

  15. BiPPred: Combined sequence- and structure-based prediction of peptide binding to the Hsp70 chaperone BiP.

    PubMed

    Schneider, Markus; Rosam, Mathias; Glaser, Manuel; Patronov, Atanas; Shah, Harpreet; Back, Katrin Christiane; Daake, Marina Angelika; Buchner, Johannes; Antes, Iris

    2016-10-01

    Substrate binding to Hsp70 chaperones is involved in many biological processes, and the identification of potential substrates is important for a comprehensive understanding of these events. We present a multi-scale pipeline for an accurate, yet efficient prediction of peptides binding to the Hsp70 chaperone BiP by combining sequence-based prediction with molecular docking and MMPBSA calculations. First, we measured the binding of 15mer peptides from known substrate proteins of BiP by peptide array (PA) experiments and performed an accuracy assessment of the PA data by fluorescence anisotropy studies. Several sequence-based prediction models were fitted using this and other peptide binding data. A structure-based position-specific scoring matrix (SB-PSSM) derived solely from structural modeling data forms the core of all models. The matrix elements are based on a combination of binding energy estimations, molecular dynamics simulations, and analysis of the BiP binding site, which led to new insights into the peptide binding specificities of the chaperone. Using this SB-PSSM, peptide binders could be predicted with high selectivity even without training of the model on experimental data. Additional training further increased the prediction accuracies. Subsequent molecular docking (DynaDock) and MMGBSA/MMPBSA-based binding affinity estimations for predicted binders allowed the identification of the correct binding mode of the peptides as well as the calculation of nearly quantitative binding affinities. The general concept behind the developed multi-scale pipeline can readily be applied to other protein-peptide complexes with linearly bound peptides, for which sufficient experimental binding data for the training of classical sequence-based prediction models is not available. Proteins 2016; 84:1390-1407. © 2016 Wiley Periodicals, Inc.
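
    The abstract above describes scoring peptides with a position-specific scoring matrix but gives no matrix values or window rules. The following is only a generic sketch of how such a matrix can be applied: the matrix `TOY_PSSM`, the default score, the sliding-window strategy, and the function names are all assumptions made for illustration and are not taken from BiPPred.

```python
from typing import Dict, List, Tuple

# Toy position-specific scoring matrix: one dict per position, residue -> score.
# These numbers are invented for illustration; they are NOT the published SB-PSSM.
TOY_PSSM: List[Dict[str, float]] = [
    {"L": 1.0, "F": 0.8, "D": -1.0},
    {"L": 0.5, "W": 1.2, "K": -0.5},
    {"A": 0.2, "L": 0.9, "E": -0.8},
]
DEFAULT_SCORE = 0.0  # assumed score for residues absent from a column


def score_window(window: str, pssm: List[Dict[str, float]]) -> float:
    """Sum per-position scores for a window as wide as the matrix."""
    if len(window) != len(pssm):
        raise ValueError("window length must match PSSM width")
    return sum(col.get(res, DEFAULT_SCORE) for res, col in zip(window, pssm))


def best_window(peptide: str, pssm: List[Dict[str, float]]) -> Tuple[int, float]:
    """Slide the matrix along the peptide; return (offset, score) of the top window."""
    width = len(pssm)
    if len(peptide) < width:
        raise ValueError("peptide is shorter than the PSSM")
    scored = [
        (offset, score_window(peptide[offset:offset + width], pssm))
        for offset in range(len(peptide) - width + 1)
    ]
    # max() keeps the first of any tied windows
    return max(scored, key=lambda item: item[1])


if __name__ == "__main__":
    offset, score = best_window("DKLWLE", TOY_PSSM)
    print(f"best window starts at {offset} with score {score:.2f}")
```

    A real predictor would use a full 20-residue matrix and a calibrated score threshold; the sketch only shows the additive scoring step.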

  16. Correlation of Local Effects of DNA Sequence and Position of Beta-Alanine Inserts with Polyamide-DNA Complex Binding Affinities and Kinetics

    PubMed Central

    Wang, Shuo; Nanjunda, Rupesh; Aston, Karl; Bashkin, James K.; Wilson, W. David

    2012-01-01

    In order to better understand the effects of β-alanine (β) substitution and the number of heterocycles on DNA binding affinity and selectivity, the interactions of an eight-ring hairpin polyamide (PA) and two β derivatives as well as a six-heterocycle analog have been investigated with their cognate DNA sequence, 5′-TGGCTT-3′. Binding selectivity and the effects of β have been investigated with the cognate and five mutant DNAs. A set of powerful and complementary methods have been employed for both energetic and structural evaluations: UV-melting, biosensor-surface plasmon resonance, isothermal titration calorimetry, circular dichroism and a DNA ligation ladder global structure assay. The reduced number of heterocycles in the six-ring PA weakens the binding affinity; however, the smaller PA aggregates significantly less than the larger PAs, and allows us to obtain the binding thermodynamics. The PA-DNA binding enthalpy is large and negative with a large negative ΔCp, and is the primary driving component of the Gibbs free energy. The complete SPR binding results clearly show that β substitutions can substantially weaken the binding affinity of hairpin PAs in a position-dependent manner. More importantly, the changes in PA binding to the mutant DNAs further confirm the position-dependent effects on PA-DNA interaction affinity. Comparison of mutant DNA sequences also shows a different effect in recognition of T•A versus A•T base pairs. The effects of DNA mutations on binding of a single PA as well as the effects of the position of β substitution on binding tell a clear and very important story about sequence dependent binding of PAs to DNA. PMID:23167504

  17. Molecular cloning of MSSP-2, a c-myc gene single-strand binding protein: characterization of binding specificity and DNA replication activity.

    PubMed Central

    Takai, T; Nishita, Y; Iguchi-Ariga, S M; Ariga, H

    1994-01-01

    We have previously reported the human cDNA encoding MSSP-1, a sequence-specific double- and single-stranded DNA binding protein [Negishi, Nishita, Saëgusa, Kakizaki, Galli, Kihara, Tamai, Miyajima, Iguchi-Ariga and Ariga (1994) Oncogene, 9, 1133-1143]. MSSP-1 binds to a DNA replication origin/transcriptional enhancer of the human c-myc gene and has turned out to be identical with Scr2, a human protein which complements the defect of cdc2 kinase in S.pombe [Kataoka and Nojima (1994) Nucleic Acid Res., 22, 2687-2693]. We have cloned the cDNA for MSSP-2, another member of the MSSP family of proteins. The MSSP-2 cDNA shares highly homologous sequences with MSSP-1 cDNA, except for the insertion of 48 bp coding 16 amino acids near the C-terminus. Like MSSP-1, MSSP-2 has RNP-1 consensus sequences. The results of the experiments using bacterially expressed MSSP-2, and its deletion mutants, as histidine fusion proteins suggested that the binding specificity of MSSP-2 to double- and single-stranded DNA is the same as that of MSSP-1, and that the RNP consensus sequences are required for the DNA binding of the protein. MSSP-2 stimulated the DNA replication of an SV40-derived plasmid containing the binding sequence for MSSP-1 or -2. MSSP-2 is hence suggested to play an important role in regulation of DNA replication. PMID:7838710

  18. Binding Site Turnover Produces Pervasive Quantitative Changes in Transcription Factor Binding between Closely Related Drosophila Species

    PubMed Central

    Trapnell, Cole; Davidson, Stuart; Pachter, Lior; Chu, Hou Cheng; Tonkin, Leath A.; Biggin, Mark D.; Eisen, Michael B.

    2010-01-01

    Changes in gene expression play an important role in evolution, yet the molecular mechanisms underlying regulatory evolution are poorly understood. Here we compare genome-wide binding of the six transcription factors that initiate segmentation along the anterior-posterior axis in embryos of two closely related species: Drosophila melanogaster and Drosophila yakuba. Where we observe binding by a factor in one species, we almost always observe binding by that factor to the orthologous sequence in the other species. Levels of binding, however, vary considerably. The magnitude and direction of the interspecies differences in binding levels of all six factors are strongly correlated, suggesting a role for chromatin or other factor-independent forces in mediating the divergence of transcription factor binding. Nonetheless, factor-specific quantitative variation in binding is common, and we show that it is driven to a large extent by the gain and loss of cognate recognition sequences for the given factor. We find only a weak correlation between binding variation and regulatory function. These data provide the first genome-wide picture of how modest levels of sequence divergence between highly morphologically similar species affect a system of coordinately acting transcription factors during animal development, and highlight the dominant role of quantitative variation in transcription factor binding over short evolutionary distances. PMID:20351773

  19. Dynamic basis for dG•dT misincorporation via tautomerization and ionization

    NASA Astrophysics Data System (ADS)

    Kimsey, Isaac J.; Szymanski, Eric S.; Zahurancik, Walter J.; Shakya, Anisha; Xue, Yi; Chu, Chia-Chieh; Sathyamoorthy, Bharathwaj; Suo, Zucai; Al-Hashimi, Hashim M.

    2018-02-01

    Tautomeric and anionic Watson-Crick-like mismatches have important roles in replication and translation errors through mechanisms that are not fully understood. Here, using NMR relaxation dispersion, we resolve a sequence-dependent kinetic network connecting G•T/U wobbles with three distinct Watson-Crick mismatches: two rapidly exchanging tautomeric species (G(enol)•T/U ⇌ G•T(enol)/U(enol); population less than 0.4%) and one anionic species (G•T(-)/U(-); population around 0.001% at neutral pH). The sequence-dependent tautomerization or ionization step was inserted into a minimal kinetic mechanism for correct incorporation during replication after the initial binding of the nucleotide, leading to accurate predictions of the probability of dG•dT misincorporation across different polymerases and pH conditions and for a chemically modified nucleotide, and providing mechanisms for sequence-dependent misincorporation. Our results indicate that the energetic penalty for tautomerization and/or ionization accounts for an approximately 10(-2) to 10(-3)-fold discrimination against misincorporation, which proceeds primarily via tautomeric dG(enol)•dT and dG•dT(enol), with contributions from anionic dG•dT(-) dominant at pH 8.4 and above or for some mutagenic nucleotides.

  20. In vitro selection of high temperature Zn(2+)-dependent DNAzymes.

    PubMed

    Nelson, Kevin E; Bruesehoff, Peter J; Lu, Yi

    2005-08-01

    In vitro selection of Zn(2+)-dependent RNA-cleaving DNAzymes with activity at 90 degrees C has yielded a diverse pool of selected sequences. The RNA cleavage efficiency was found in all cases to be specific for Zn(2+) over Pb(2+), Ca(2+), Cd(2+), Co(2+), Hg(2+), and Mg(2+). The Zn(2+)-dependent activity assay of the most active sequence showed that the DNAzyme possesses an apparent Zn(2+)-binding dissociation constant of 234 µM and that its activity increases with increasing temperatures from 50-90 degrees C. A fit of the Arrhenius plot data gave E(a) = 15.3 kcal mol(-1). Surprisingly, the selected Zn(2+)-dependent DNAzymes showed only a modest (approximately 3-fold) activity enhancement over the background rate of cleavage of random sequences containing a single embedded ribonucleotide within an otherwise DNA oligonucleotide. The result is attributable to the ability of DNA to sustain cleavage activity at high temperature with minimal secondary structure when Zn(2+) is present. Since this effect is highly specific for Zn(2+), this metal ion may play a special role in molecular evolution of nucleic acids at high temperature.
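
    To put the reported activation energy in perspective, the Arrhenius relation k = A·exp(-E(a)/RT) can be used to estimate how much faster cleavage would run at 90 degrees C than at 50 degrees C. This is a back-of-the-envelope sketch, not a result from the study: it assumes ideal Arrhenius behaviour with a constant pre-exponential factor over the whole range, and the function name `arrhenius_rate_ratio` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def arrhenius_rate_ratio(ea_kcal_per_mol: float, t1_celsius: float, t2_celsius: float) -> float:
    """Return k(T2)/k(T1) for a single activation energy, assuming k = A*exp(-Ea/RT)."""
    t1 = t1_celsius + 273.15  # convert to kelvin
    t2 = t2_celsius + 273.15
    # The pre-exponential factor A cancels in the ratio.
    return math.exp(-ea_kcal_per_mol / R_KCAL * (1.0 / t2 - 1.0 / t1))


if __name__ == "__main__":
    # E(a) = 15.3 kcal/mol is the value quoted in the abstract.
    ratio = arrhenius_rate_ratio(15.3, 50.0, 90.0)
    print(f"Estimated k(90 C) / k(50 C): {ratio:.1f}")
```

    Under these assumptions the estimate comes out at roughly an order of magnitude; the abstract itself reports only the fitted E(a), not this ratio.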

  1. MUSI: an integrated system for identifying multiple specificity from very large peptide or nucleic acid data sets.

    PubMed

    Kim, Taehyung; Tyndel, Marc S; Huang, Haiming; Sidhu, Sachdev S; Bader, Gary D; Gfeller, David; Kim, Philip M

    2012-03-01

    Peptide recognition domains and transcription factors play crucial roles in cellular signaling. They bind linear stretches of amino acids or nucleotides, respectively, with high specificity. Experimental techniques that assess the binding specificity of these domains, such as microarrays or phage display, can retrieve thousands of distinct ligands, providing detailed insight into binding specificity. In particular, the advent of next-generation sequencing has recently increased the throughput of such methods by several orders of magnitude. These advances have helped reveal the presence of distinct binding specificity classes that co-exist within a set of ligands interacting with the same target. Here, we introduce a software system called MUSI that can rapidly analyze very large data sets of binding sequences to determine the relevant binding specificity patterns. Our pipeline provides two major advances. First, it can detect previously unrecognized multiple specificity patterns in any data set. Second, it offers integrated processing of very large data sets from next-generation sequencing machines. The results are visualized as multiple sequence logos describing the different binding preferences of the protein under investigation. We demonstrate the performance of MUSI by analyzing recent phage display data for human SH3 domains as well as microarray data for mouse transcription factors.

  2. Hepatitis Delta Antigen Requires a Flexible Quasi-Double-Stranded RNA Structure To Bind and Condense Hepatitis Delta Virus RNA in a Ribonucleoprotein Complex

    PubMed Central

    Griffin, Brittany L.; Chasovskikh, Sergey; Dritschilo, Anatoly

    2014-01-01

    ABSTRACT The circular genome and antigenome RNAs of hepatitis delta virus (HDV) form characteristic unbranched, quasi-double-stranded RNA secondary structures in which short double-stranded helical segments are interspersed with internal loops and bulges. The ribonucleoprotein complexes (RNPs) formed by these RNAs with the virus-encoded protein hepatitis delta antigen (HDAg) perform essential roles in the viral life cycle, including viral replication and virion formation. Little is understood about the formation and structure of these complexes and how they function in these key processes. Here, the specific RNA features required for HDAg binding and the topology of the complexes formed were investigated. Selective 2′OH acylation analyzed by primer extension (SHAPE) applied to free and HDAg-bound HDV RNAs indicated that the characteristic secondary structure of the RNA is preserved when bound to HDAg. Notably, the analysis indicated that predicted unpaired positions in the RNA remained dynamic in the RNP. Analysis of the in vitro binding activity of RNAs in which internal loops and bulges were mutated and of synthetically designed RNAs demonstrated that the distinctive secondary structure, not the primary RNA sequence, is the major determinant of HDAg RNA binding specificity. Atomic force microscopy analysis of RNPs formed in vitro revealed complexes in which the HDV RNA is substantially condensed by bending or wrapping. Our results support a model in which the internal loops and bulges in HDV RNA contribute flexibility to the quasi-double-stranded structure that allows RNA bending and condensing by HDAg. IMPORTANCE RNA-protein complexes (RNPs) formed by the hepatitis delta virus RNAs and protein, HDAg, perform critical roles in virus replication. Neither the structures of these RNPs nor the RNA features required to form them have been characterized. HDV RNA is unusual in that it forms an unbranched quasi-double-stranded structure in which short base-paired segments are interspersed with internal loops and bulges. We analyzed the role of the HDV RNA sequence and secondary structure in the formation of a minimal RNP and visualized the structure of this RNP using atomic force microscopy. Our results indicate that HDAg does not recognize the primary sequence of the RNA; rather, the principal contribution of unpaired bases in HDV RNA to HDAg binding is to allow flexibility in the unbranched quasi-double-stranded RNA structure. Visualization of RNPs by atomic force microscopy indicated that the RNA is significantly bent or condensed in the complex. PMID:24741096

  3. The zinc fingers of YY1 bind single-stranded RNA with low sequence specificity.

    PubMed

    Wai, Dorothy C C; Shihab, Manar; Low, Jason K K; Mackay, Joel P

    2016-11-02

    Classical zinc fingers (ZFs) are traditionally considered to act as sequence-specific DNA-binding domains. More recently, classical ZFs have been recognised as potential RNA-binding modules, raising the intriguing possibility that classical-ZF transcription factors are involved in post-transcriptional gene regulation via direct RNA binding. To date, however, only one classical ZF-RNA complex, that involving TFIIIA, has been structurally characterised. Yin Yang-1 (YY1) is a multi-functional transcription factor involved in many regulatory processes, and binds DNA via four classical ZFs. Recent evidence suggests that YY1 also interacts with RNA, but the molecular nature of the interaction remains unknown. In the present work, we directly assess the ability of YY1 to bind RNA using in vitro assays. Systematic Evolution of Ligands by EXponential enrichment (SELEX) was used to identify preferred RNA sequences bound by the YY1 ZFs from a randomised library over multiple rounds of selection. However, a strong motif was not consistently recovered, suggesting that the RNA sequence selectivity of these domains is modest. YY1 ZF residues involved in binding to single-stranded RNA were identified by NMR spectroscopy and found to be largely distinct from the set of residues involved in DNA binding, suggesting that interactions between YY1 and ssRNA constitute a separate mode of nucleic acid binding. Our data are consistent with recent reports that YY1 can bind to RNA in a low-specificity, yet physiologically relevant manner. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Human T-cell leukemia virus type 1 Tax requires direct access to DNA for recruitment of CREB binding protein to the viral promoter.

    PubMed

    Lenzmeier, B A; Giebler, H A; Nyborg, J K

    1998-02-01

    Efficient human T-cell leukemia virus type 1 (HTLV-1) replication and viral gene expression are dependent upon the virally encoded oncoprotein Tax. To activate HTLV-1 transcription, Tax interacts with the cellular DNA binding protein cyclic AMP-responsive element binding protein (CREB) and recruits the coactivator CREB binding protein (CBP), forming a nucleoprotein complex on the three viral cyclic AMP-responsive elements (CREs) in the HTLV-1 promoter. Short stretches of dG-dC-rich (GC-rich) DNA, immediately flanking each of the viral CREs, are essential for Tax recruitment of CBP in vitro and Tax transactivation in vivo. Although the importance of the viral CRE-flanking sequences is well established, several studies have failed to identify an interaction between Tax and the DNA. The mechanistic role of the viral CRE-flanking sequences has therefore remained enigmatic. In this study, we used high resolution methidiumpropyl-EDTA iron(II) footprinting to show that Tax extended the CREB footprint into the GC-rich DNA flanking sequences of the viral CRE. The Tax-CREB footprint was enhanced but not extended by the KIX domain of CBP, suggesting that the coactivator increased the stability of the nucleoprotein complex. Conversely, the footprint pattern of CREB on a cellular CRE lacking GC-rich flanking sequences did not change in the presence of Tax or Tax plus KIX. The minor-groove DNA binding drug chromomycin A3 bound to the GC-rich flanking sequences and inhibited the association of Tax and the Tax-CBP complex without affecting CREB binding. Tax specifically cross-linked to the viral CRE in the 5'-flanking sequence, and this cross-link was blocked by chromomycin A3. Together, these data support a model where Tax interacts directly with both CREB and the minor-groove viral CRE-flanking sequences to form a high-affinity binding site for the recruitment of CBP to the HTLV-1 promoter.

  5. Intercalation of XR5944 with the estrogen response element is modulated by the tri-nucleotide spacer sequence between half-sites

    PubMed Central

    Sidell, Neil; Mathad, Raveendra I.; Shu, Feng-jue; Zhang, Zhenjiang; Kallen, Caleb B.; Yang, Danzhou

    2011-01-01

    DNA-intercalating molecules can impair DNA replication, DNA repair, and gene transcription. We previously demonstrated that XR5944, a DNA bis-intercalator, specifically blocks binding of estrogen receptor-α (ERα) to the consensus estrogen response element (ERE). The consensus ERE sequence is AGGTCAnnnTGACCT, where nnn is known as the tri-nucleotide spacer. Recent work has shown that the tri-nucleotide spacer can modulate ERα-ERE binding affinity and ligand-mediated transcriptional responses. To further understand the mechanism by which XR5944 inhibits ERα-ERE binding, we tested its ability to interact with consensus EREs with variable tri-nucleotide spacer sequences and with natural but non-consensus ERE sequences using one dimensional nuclear magnetic resonance (1D 1H NMR) titration studies. We found that the tri-nucleotide spacer sequence significantly modulates the binding of XR5944 to EREs. Of the sequences that were tested, EREs with CGG and AGG spacers showed the best binding specificity with XR5944, while those spaced with TTT demonstrated the least specific binding. The binding stoichiometry of XR5944 with EREs was 2:1, which can explain why the spacer influences the drug-DNA interaction; each XR5944 spans four nucleotides (including portions of the spacer) when intercalating with DNA. To validate our NMR results, we conducted functional studies using reporter constructs containing consensus EREs with tri-nucleotide spacers CGG, CTG, and TTT. Results of reporter assays in MCF-7 cells indicated that XR5944 was significantly more potent in inhibiting the activity of CGG- than TTT-spaced EREs, consistent with our NMR results. Taken together, these findings predict that the anti-estrogenic effects of XR5944 will depend not only on ERE half-site composition but also on the tri-nucleotide spacer sequence of EREs located in the promoters of estrogen-responsive genes. PMID:21333738
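
    The consensus ERE pattern quoted above (AGGTCAnnnTGACCT) maps directly onto a regular expression, which makes it easy to pull out the tri-nucleotide spacer from candidate sequences. The snippet below is an illustrative sketch only: the function name `find_ere_spacers` and the example sequence are hypothetical, and it matches only the exact consensus half-sites, not the natural non-consensus EREs also examined in the study.

```python
import re

# Consensus ERE from the abstract: AGGTCA + any 3-nt spacer + TGACCT.
# A lookahead is used so that overlapping occurrences are also reported.
ERE_PATTERN = re.compile(r"(?=AGGTCA([ACGT]{3})TGACCT)")


def find_ere_spacers(dna: str) -> list:
    """Return (start_index, spacer) for every consensus ERE in a DNA string."""
    return [(m.start(), m.group(1)) for m in ERE_PATTERN.finditer(dna.upper())]


if __name__ == "__main__":
    # Hypothetical sequence carrying one CGG-spaced and one TTT-spaced ERE.
    example = "ttAGGTCAcggTGACCTaaAGGTCATTTTGACCT"
    for start, spacer in find_ere_spacers(example):
        print(start, spacer)
```

    Grouping hits by spacer (for example CGG versus TTT) would then allow sequences to be sorted by the spacer classes the abstract compares.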

  6. Exploiting sequence similarity to validate the sensitivity of SNP arrays in detecting fine-scaled copy number variations.

    PubMed

    Wong, Gerard; Leckie, Christopher; Gorringe, Kylie L; Haviv, Izhak; Campbell, Ian G; Kowalczyk, Adam

    2010-04-15

    High-density single nucleotide polymorphism (SNP) genotyping arrays are efficient and cost effective platforms for the detection of copy number variation (CNV). To ensure accuracy in probe synthesis and to minimize production costs, short oligonucleotide probe sequences are used. The use of short probe sequences limits the specificity of binding targets in the human genome. The specificity of these short probeset sequences has yet to be fully analysed against a normal reference human genome. Sequence similarity can artificially elevate or suppress copy number measurements, and hence reduce the reliability of affected probe readings. For the purpose of detecting narrow CNVs reliably down to the width of a single probeset, sequence similarity is an important issue that needs to be addressed. We surveyed the Affymetrix Human Mapping SNP arrays for probeset sequence similarity against the reference human genome. Utilizing sequence similarity results, we identified a collection of fine-scaled putative CNVs between gender from autosomal probesets whose sequence matches various loci on the sex chromosomes. To detect these variations, we utilized our statistical approach, Detecting REcurrent Copy number change using rank-order Statistics (DRECS), and showed that its performance was superior and more stable than the t-test in detecting CNVs. Through the application of DRECS on the HapMap population datasets with multi-matching probesets filtered, we identified biologically relevant SNPs in aberrant regions across populations with known association to physical traits, such as height, covered by the span of a single probe. This provided empirical confirmation of the existence of naturally occurring narrow CNVs as well as the sensitivity of the Affymetrix SNP array technology in detecting them. The MATLAB implementation of DRECS is available at http://ww2.cs.mu.oz.au/~gwong/DRECS/index.html.

  7. Identification of DNA-binding proteins by combining auto-cross covariance transformation and ensemble learning.

    PubMed

    Liu, Bin; Wang, Shanyi; Dong, Qiwen; Li, Shumin; Liu, Xuan

    2016-04-20

    DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. With the rapid development of next-generation sequencing techniques, the number of protein sequences is increasing at an unprecedented rate. Thus it is necessary to develop computational methods to identify DNA-binding proteins based only on protein sequence information. In this study, a novel method called iDNA-KACC is presented, which combines the Support Vector Machine (SVM) and the auto-cross covariance transformation. The protein sequences are first converted into a profile-based protein representation, and then converted into a series of fixed-length vectors by the auto-cross covariance transformation with Kmer composition. The sequence order effect can be effectively captured by this scheme. These vectors are then fed into an SVM to discriminate the DNA-binding proteins from the non-DNA-binding ones. iDNA-KACC achieves an overall accuracy of 75.16% and a Matthews correlation coefficient of 0.5 by a rigorous jackknife test. Its performance is further improved by employing an ensemble learning approach, and the improved predictor is called iDNA-KACC-EL. Experimental results on an independent dataset show that iDNA-KACC-EL outperforms all the other state-of-the-art predictors, indicating that it would be a useful computational tool for DNA-binding protein identification.
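
    To illustrate how an auto-cross covariance transformation can turn a variable-length sequence into a fixed-length vector, here is a minimal sketch operating on a generic L x N matrix of per-residue scores. It is not the iDNA-KACC implementation: the Kmer composition step, the profile construction, and the SVM are omitted, and the normalization by (L - lag) is an assumed, commonly used convention. The function name is hypothetical.

```python
def acc_transform(profile, max_lag):
    """Auto-cross covariance features for an L x N per-residue score matrix.

    profile: list of L rows, each a list of N numbers.
    Returns a flat list of N * N * max_lag values, independent of L.
    For each lag, entry (i, j) is the covariance between column i at
    position p and column j at position p + lag (i == j: auto covariance,
    i != j: cross covariance).
    """
    length = len(profile)
    n_cols = len(profile[0])
    if not 1 <= max_lag < length:
        raise ValueError("max_lag must be in [1, sequence length - 1]")

    # Center every column on its mean over the whole sequence.
    means = [sum(row[i] for row in profile) / length for i in range(n_cols)]
    centered = [[row[i] - means[i] for i in range(n_cols)] for row in profile]

    features = []
    for lag in range(1, max_lag + 1):
        for i in range(n_cols):
            for j in range(n_cols):
                total = sum(
                    centered[p][i] * centered[p + lag][j]
                    for p in range(length - lag)
                )
                features.append(total / (length - lag))
    return features
```

    Because the output length depends only on the number of columns and the maximum lag, sequences of different lengths map to vectors of the same size, which is what a fixed-input classifier such as an SVM requires.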

  8. Molecular dynamics studies on the DNA-binding process of ERG.

    PubMed

    Beuerle, Matthias G; Dufton, Neil P; Randi, Anna M; Gould, Ian R

    2016-11-15

    The ETS family of transcription factors regulate gene targets by binding to a core GGAA DNA-sequence. The ETS factor ERG is required for homeostasis and lineage-specific functions in endothelial cells, some subset of haemopoietic cells and chondrocytes; its ectopic expression is linked to oncogenesis in multiple tissues. To date details of the DNA-binding process of ERG including DNA-sequence recognition outside the core GGAA-sequence are largely unknown. We combined available structural and experimental data to perform molecular dynamics simulations to study the DNA-binding process of ERG. In particular we were able to reproduce the ERG DNA-complex with a DNA-binding simulation starting in an unbound configuration with a final root-mean-square-deviation (RMSD) of 2.1 Å to the core ETS domain DNA-complex crystal structure. This allowed us to elucidate the relevance of amino acids involved in the formation of the ERG DNA-complex and to identify Arg385 as a novel key residue in the DNA-binding process. Moreover we were able to show that water-mediated hydrogen bonds are present between ERG and DNA in our simulations and that those interactions have the potential to achieve sequence recognition outside the GGAA core DNA-sequence. The methodology employed in this study shows the promising capabilities of modern molecular dynamics simulations in the field of protein DNA-interactions.
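
    The root-mean-square deviation quoted above can be sketched as follows for two coordinate sets that are already superimposed and paired atom by atom. The abstract does not state which atoms were compared or how the structures were aligned, so both are assumptions here, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed and that the i-th
    entry of each list refers to the same atom; no fitting is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

    In practice an optimal rigid-body superposition is usually applied before this calculation; that step is left out of the sketch.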

  9. Unexpected DNA affinity and sequence selectivity through core rigidity in guanidinium-based minor groove binders.

    PubMed

    Nagle, Padraic S; McKeever, Caitriona; Rodriguez, Fernando; Nguyen, Binh; Wilson, W David; Rozas, Isabel

    2014-09-25

    In this paper we report the design and biophysical evaluation of novel rigid-core symmetric and asymmetric dicationic DNA binders containing 9H-fluorene and 9,10-dihydroanthracene cores, as well as the synthesis of one of these fluorene derivatives. First, the affinity of these compounds and of flexible-core derivatives toward particular DNA sequences was evaluated by means of surface plasmon resonance and thermal denaturation experiments, finding that the position of the cations significantly influences the binding strength. Then their affinity and mode of binding were further studied by performing circular dichroism and UV studies, and the results obtained were rationalized by means of DFT calculations. We found that the fluorene derivatives prepared have the ability to bind to the minor groove of certain DNA sequences and intercalate into others, whereas the dihydroanthracene compounds bind via intercalation to all the DNA sequences studied here.

  10. Bioinformatic Analysis of the Contribution of Primer Sequences to Aptamer Structures

    PubMed Central

    Ellington, Andrew D.

    2009-01-01

    Aptamers are nucleic acid molecules selected in vitro to bind a particular ligand. While numerous experimental studies have examined the sequences, structures, and functions of individual aptamers, considerably fewer studies have applied bioinformatics approaches to try to infer more general principles from these individual studies. We have used a large Aptamer Database to parse the contributions of both random and constant regions to the secondary structures of more than 2000 aptamers. We find that the constant, primer-binding regions do not, in general, contribute significantly to aptamer structures. These results suggest that (a) binding function is not contributed to nor constrained by constant regions; (b) in consequence, the landscape of functional binding sequences is sparse but robust, favoring scenarios for short, functional nucleic acid sequences near origins; and (c) many pool designs for the selection of aptamers are likely to prove robust. PMID:18594898

  11. A compact, in vivo screen of all 6-mers reveals drivers of tissue-specific expression and guides synthetic regulatory element design.

    PubMed

    Smith, Robin P; Riesenfeld, Samantha J; Holloway, Alisha K; Li, Qiang; Murphy, Karl K; Feliciano, Natalie M; Orecchia, Lorenzo; Oksenberg, Nir; Pollard, Katherine S; Ahituv, Nadav

    2013-07-18

    Large-scale annotation efforts have improved our ability to coarsely predict regulatory elements throughout vertebrate genomes. However, it is unclear how complex spatiotemporal patterns of gene expression driven by these elements emerge from the activity of short, transcription factor binding sequences. We describe a comprehensive promoter extension assay in which the regulatory potential of all 6 base-pair (bp) sequences was tested in the context of a minimal promoter. To enable this large-scale screen, we developed algorithms that use a reverse-complement aware decomposition of the de Bruijn graph to design a library of DNA oligomers incorporating every 6-bp sequence exactly once. Our library multiplexes all 4,096 unique 6-mers into 184 double-stranded 15-bp oligomers, which is sufficiently compact for in vivo testing. We injected each multiplexed construct into zebrafish embryos and scored GFP expression in 15 tissues at two developmental time points. Twenty-seven constructs produced consistent expression patterns, with the majority doing so in only one tissue. Functional sequences are enriched near biologically relevant genes, match motifs for developmental transcription factors, and are required for enhancer activity. By concatenating tissue-specific functional sequences, we generated completely synthetic enhancers for the notochord, epidermis, spinal cord, forebrain and otic lateral line, and show that short regulatory sequences do not always function modularly. This work introduces a unique in vivo catalog of short, functional regulatory sequences and demonstrates several important principles of regulatory element organization. Furthermore, we provide resources for designing compact, reverse-complement aware k-mer libraries.
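
    The phrase "reverse-complement aware" can be made concrete with a short sketch that treats a 6-mer and its reverse complement as the same element, which is the natural equivalence for double-stranded oligomers. This is not the authors' de Bruijn graph decomposition; it only enumerates the k-mer space and lists the k-mers covered by one oligomer. All function names and the example oligomer are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Pick one representative for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def canonical_kmers_in(oligo, k=6):
    """Set of canonical k-mers covered by a double-stranded oligomer."""
    return {canonical(oligo[i:i + k]) for i in range(len(oligo) - k + 1)}


# All 4^6 = 4,096 6-mers, and the number of distinct classes once each
# 6-mer is merged with its reverse complement.
all_6mers = ["".join(p) for p in product("ACGT", repeat=6)]
canonical_6mers = {canonical(kmer) for kmer in all_6mers}

# A hypothetical 15-bp oligomer has 15 - 6 + 1 = 10 sliding windows.
covered = canonical_kmers_in("ACGTTGCAAGGCTTA")
```

    Of the 4,096 6-mers, 64 are their own reverse complement, so merging strands leaves 2,080 distinct classes; a library design has to account for this when deciding which sequences a double-stranded construct actually presents.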

  12. The Adaptor Molecule Nck Localizes the WAVE Complex to Promote Actin Polymerization during CEACAM3-Mediated Phagocytosis of Bacteria

    PubMed Central

    Delgado Tascón, Julia; Nyffenegger-Jann, Naja J.; Hauck, Christof R.

    2012-01-01

    Background: CEACAM3 is a granulocyte receptor mediating the opsonin-independent recognition and phagocytosis of human-restricted CEACAM-binding bacteria. CEACAM3 function depends on an intracellular immunoreceptor tyrosine-based activation motif (ITAM)-like sequence that is tyrosine phosphorylated by Src family kinases upon receptor engagement. The phosphorylated ITAM-like sequence triggers GTP-loading of Rac by directly associating with the guanine nucleotide exchange factor (GEF) Vav. Rac stimulation in turn is critical for actin cytoskeleton rearrangements that generate lamellipodial protrusions and lead to bacterial uptake. Principal Findings: In our present study we provide biochemical and microscopic evidence that the adaptor proteins Nck1 and Nck2, but not CrkL, Grb2 or SLP-76, bind to tyrosine phosphorylated CEACAM3. The association is phosphorylation-dependent and requires the Nck SH2 domain. Overexpression of the isolated Nck1 SH2 domain, RNAi-mediated knock-down of Nck1, or genetic deletion of Nck1 and Nck2 interfere with CEACAM3-mediated bacterial internalization and with the formation of lamellipodial protrusions. Nck is constitutively associated with WAVE2 and directs the actin nucleation promoting WAVE complex to tyrosine phosphorylated CEACAM3. In turn, dominant-negative WAVE2 as well as shRNA-mediated knock-down of WAVE2 or the WAVE-complex component Nap1 reduce internalization of bacteria. Conclusions: Our results provide novel mechanistic insight into CEACAM3-initiated phagocytosis. We suggest that the CEACAM3 ITAM-like sequence is optimized to co-ordinate a minimal set of cellular factors needed to efficiently trigger actin-based lamellipodial protrusions and rapid pathogen engulfment. PMID:22448228

  13. The adaptor molecule Nck localizes the WAVE complex to promote actin polymerization during CEACAM3-mediated phagocytosis of bacteria.

    PubMed

    Pils, Stefan; Kopp, Kathrin; Peterson, Lisa; Delgado Tascón, Julia; Nyffenegger-Jann, Naja J; Hauck, Christof R

    2012-01-01

    CEACAM3 is a granulocyte receptor mediating the opsonin-independent recognition and phagocytosis of human-restricted CEACAM-binding bacteria. CEACAM3 function depends on an intracellular immunoreceptor tyrosine-based activation motif (ITAM)-like sequence that is tyrosine phosphorylated by Src family kinases upon receptor engagement. The phosphorylated ITAM-like sequence triggers GTP-loading of Rac by directly associating with the guanine nucleotide exchange factor (GEF) Vav. Rac stimulation in turn is critical for actin cytoskeleton rearrangements that generate lamellipodial protrusions and lead to bacterial uptake. In our present study we provide biochemical and microscopic evidence that the adaptor proteins Nck1 and Nck2, but not CrkL, Grb2 or SLP-76, bind to tyrosine phosphorylated CEACAM3. The association is phosphorylation-dependent and requires the Nck SH2 domain. Overexpression of the isolated Nck1 SH2 domain, RNAi-mediated knock-down of Nck1, or genetic deletion of Nck1 and Nck2 interfere with CEACAM3-mediated bacterial internalization and with the formation of lamellipodial protrusions. Nck is constitutively associated with WAVE2 and directs the actin nucleation promoting WAVE complex to tyrosine phosphorylated CEACAM3. In turn, dominant-negative WAVE2 as well as shRNA-mediated knock-down of WAVE2 or the WAVE-complex component Nap1 reduce internalization of bacteria. Our results provide novel mechanistic insight into CEACAM3-initiated phagocytosis. We suggest that the CEACAM3 ITAM-like sequence is optimized to co-ordinate a minimal set of cellular factors needed to efficiently trigger actin-based lamellipodial protrusions and rapid pathogen engulfment.

  14. Specific binding of the Xanthomonas campestris pv. vesicatoria AraC-type transcriptional activator HrpX to plant-inducible promoter boxes.

    PubMed

    Koebnik, Ralf; Krüger, Antje; Thieme, Frank; Urban, Alexander; Bonas, Ulla

    2006-11-01

    The pathogenicity of the plant-pathogenic bacterium Xanthomonas campestris pv. vesicatoria depends on a type III secretion system which is encoded by the 23-kb hrp (hypersensitive response and pathogenicity) gene cluster. Expression of the hrp operons is strongly induced in planta and in a special minimal medium and depends on two regulatory proteins, HrpG and HrpX. In this study, DNA affinity enrichment was used to demonstrate that the AraC-type transcriptional activator HrpX binds to a conserved cis-regulatory element, the plant-inducible promoter (PIP) box (TTCGC-N(15)-TTCGC), present in the promoter regions of four hrp operons. No binding of HrpX was observed when DNA fragments lacking a PIP box were used. HrpX also bound to a DNA fragment containing an imperfect PIP box (TTCGC-N(8)-TTCGT). Dinucleotide replacements in each half-site of the PIP box strongly decreased binding of HrpX, while simultaneous dinucleotide replacements in both half-sites completely abolished binding. Based on the complete genome sequence of Xanthomonas campestris pv. vesicatoria, putative plant-inducible promoters consisting of a PIP box and a -10 promoter motif were identified in the promoter regions of almost all HrpX-activated genes. Bioinformatic analyses and reverse transcription-PCR experiments revealed novel HrpX-dependent genes, among them a NUDIX hydrolase gene and several genes with a predicted role in the degradation of the plant cell wall. We conclude that HrpX is the most downstream component of the hrp regulatory cascade, which is proposed to directly activate most genes of the hrpX regulon via binding to corresponding PIP boxes.
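
    The PIP box definitions quoted above (TTCGC-N(15)-TTCGC, and the imperfect TTCGC-N(8)-TTCGT) lend themselves to a small scanning sketch with a configurable spacer length and a tolerated number of mismatches in the second half-site. The function names and parameters are hypothetical, and the simplifications (perfect first half-site, single-strand scan) are assumptions made for brevity, not part of the published analysis.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_pip_boxes(sequence, half_site="TTCGC", spacers=(15,), max_mismatch=0):
    """Return (start, spacer_length, second_half_site) for PIP-box-like hits.

    The first half-site must match exactly; the second may differ from
    half_site in up to max_mismatch positions. Only the given strand is
    scanned.
    """
    seq = sequence.upper()
    h = len(half_site)
    hits = []
    for start in range(len(seq) - h + 1):
        if seq[start:start + h] != half_site:
            continue
        for n in spacers:
            second = seq[start + h + n:start + 2 * h + n]
            if len(second) == h and hamming(second, half_site) <= max_mismatch:
                hits.append((start, n, second))
    return hits


# Hypothetical test sequences with poly-A spacers.
perfect = "GG" + "TTCGC" + "A" * 15 + "TTCGC" + "GG"
imperfect = "TTCGC" + "A" * 8 + "TTCGT"
```

    With the defaults only the canonical box is reported; passing spacers=(8, 15) and max_mismatch=1 also picks up the imperfect box described in the abstract.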

  15. Topology-based modeling of intrinsically disordered proteins: balancing intrinsic folding and intermolecular interactions.

    PubMed

    Ganguly, Debabani; Chen, Jianhan

    2011-04-01

    Coupled binding and folding is frequently involved in specific recognition of so-called intrinsically disordered proteins (IDPs), a newly recognized class of proteins that rely on a lack of stable tertiary fold for function. Here, we exploit topology-based Gō-like modeling as an effective tool for the mechanism of IDP recognition within the theoretical framework of minimally frustrated energy landscape. Importantly, substantial differences exist between IDPs and globular proteins in both amino acid sequence and binding interface characteristics. We demonstrate that established Gō-like models designed for folded proteins tend to over-estimate the level of residual structures in unbound IDPs, whereas under-estimating the strength of intermolecular interactions. Such systematic biases have important consequences in the predicted mechanism of interaction. A strategy is proposed to recalibrate topology-derived models to balance intrinsic folding propensities and intermolecular interactions, based on experimental knowledge of the overall residual structure level and binding affinity. Applied to pKID/KIX, the calibrated Gō-like model predicts a dominant multistep sequential pathway for binding-induced folding of pKID that is initiated by KIX binding via the C-terminus in disordered conformations, followed by binding and folding of the rest of C-terminal helix and finally the N-terminal helix. This novel mechanism is consistent with key observations derived from a recent NMR titration and relaxation dispersion study and provides a molecular-level interpretation of kinetic rates derived from dispersion curve analysis. These case studies provide important insight into the applicability and potential pitfalls of topology-based modeling for studying IDP folding and interaction in general. Copyright © 2011 Wiley-Liss, Inc.

  16. Control of DEMETER DNA demethylase gene transcription in male and female gamete companion cells in Arabidopsis thaliana.

    PubMed

    Park, Jin-Sup; Frost, Jennifer M; Park, Kyunghyuk; Ohr, Hyonhwa; Park, Guen Tae; Kim, Seohyun; Eom, Hyunjoo; Lee, Ilha; Brooks, Janie S; Fischer, Robert L; Choi, Yeonhee

    2017-02-21

    The DEMETER (DME) DNA glycosylase initiates active DNA demethylation via the base-excision repair pathway and is vital for reproduction in Arabidopsis thaliana. DME-mediated DNA demethylation is preferentially targeted to small, AT-rich, and nucleosome-depleted euchromatic transposable elements, influencing expression of adjacent genes and leading to imprinting in the endosperm. In the female gametophyte, DME expression and subsequent genome-wide DNA demethylation are confined to the companion cell of the egg, the central cell. Here, we show that, in the male gametophyte, DME expression is limited to the companion cell of sperm, the vegetative cell, and to a narrow window of time: immediately after separation of the companion cell lineage from the germline. We define transcriptional regulatory elements of DME using reporter genes, showing that a small region, which surprisingly lies within the DME gene, controls its expression in male and female companion cells. DME expression from this minimal promoter is sufficient to rescue seed abortion and the aberrant DNA methylome associated with the null dme-2 mutation. Within this minimal promoter, we found short, conserved enhancer sequences necessary for the transcriptional activities of DME and combined predicted binding motifs with published transcription factor binding coordinates to produce a list of candidate upstream pathway members in the genetic circuitry controlling DNA demethylation in gamete companion cells. These data show how DNA demethylation is regulated to facilitate endosperm gene imprinting and potential transgenerational epigenetic regulation, without subjecting the germline to potentially deleterious transposable element demethylation.

  17. Mapping of the immunophilin-immunosuppressant site of interaction on calcineurin.

    PubMed

    Husi, H; Luyten, M A; Zurini, M G

    1994-05-13

    The interaction of the immunosuppressive complexes cyclosporin A-cyclophilin A and FK506 binding protein-FK506 with the Ca(2+)- and calmodulin-dependent protein phosphatase calcineurin has been investigated by means of photoaffinity labeling and chemical cross-linking. Photolabeling of purified bovine brain calcineurin with the affinity label [O-[4-[4-(1-diazo-2,2,2-trifluoroethyl)benzoyl]aminobutanoyl]-D-serine8]cyclosporin in the presence of cyclophilin A results, in addition to the labeling of cyclophilin itself, in the transfer of some of the chemical probe to both the catalytic subunit A and the regulatory subunit B of calcineurin. Chemical cross-linking studies with disuccinimidyl suberate in the presence of either cyclophilin A, B, or C in complex with cyclosporin A or FK506 binding protein-FK506 result on the other hand in the apparently exclusive and strictly immunosuppressant-dependent formation of covalent immunophilin-calcineurin B subunit products. Cross-linking of immunophilins to calcineurin B subunit requires the presence of subunit A. In the present study, using a set of recombinant maltose-binding protein fusion products representing different stretches of the catalytic subunit A, we were able to map the minimal calcineurin A sequence necessary for immunophilin-ligand-calcineurin B interaction to occur.

  18. Ultra-sensitive detection of kanamycin for food safety using a reduced graphene oxide-based fluorescent aptasensor

    NASA Astrophysics Data System (ADS)

    Ha, Na-Reum; Jung, In-Pil; La, Im-Joung; Jung, Ho-Sup; Yoon, Moon-Young

    2017-01-01

    Overuse of antibiotics has caused serious problems, such as the appearance of super bacteria, whose accumulation in the human body through the food chain is a concern. Kanamycin is a common antibiotic used to treat diverse infections; however, residual kanamycin can cause many side effects in humans. Thus, development of an ultra-sensitive, precise, and simple detection system for residual kanamycin in food products is urgently needed for food safety. In this study, we identified kanamycin-binding aptamers via a new screening method, and truncated variants were analyzed to optimize the minimal sequence required for target binding. We found various aptamers with high binding affinity, with Kdapp values from 34.7 to 669 nM, and good specificity for kanamycin. Furthermore, we developed a reduced graphene oxide (RGO)-based fluorescent aptasensor for kanamycin detection. In this system, kanamycin was detected at a concentration as low as 1 pM (582.6 fg/mL). In addition, this method could detect kanamycin accurately in kanamycin-spiked blood serum and milk samples. Consequently, this simple, rapid, and sensitive kanamycin detection system, built on newly structurally and functionally analyzed aptamers, exhibits outstanding detection compared to previous methods and provides a new possibility for point-of-care testing and food safety.
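
    The detection limit above is quoted in two units, and the conversion between them is a one-line calculation. The sketch below assumes a molar mass of 582.6 g/mol, which is simply the value implied by the abstract's own pair of numbers (1 pM and 582.6 fg/mL); the function name is hypothetical.

```python
def molar_to_g_per_ml(conc_mol_per_l, molar_mass_g_per_mol):
    """Convert a molar concentration (mol/L) to a mass concentration (g/mL)."""
    grams_per_litre = conc_mol_per_l * molar_mass_g_per_mol
    return grams_per_litre / 1000.0  # 1 L = 1000 mL


# 1 pM = 1e-12 mol/L; assumed molar mass 582.6 g/mol (see lead-in).
limit_g_per_ml = molar_to_g_per_ml(1e-12, 582.6)
limit_fg_per_ml = limit_g_per_ml * 1e15  # 1 g = 1e15 fg
```

    Under that assumption the result is 582.6 fg/mL, matching the figure in the abstract.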

  19. Nuclear scaffold attachment stimulates, but is not essential for ARS activity in Saccharomyces cerevisiae: analysis of the Drosophila ftz SAR.

    PubMed Central

    Amati, B; Pick, L; Laroche, T; Gasser, S M

    1990-01-01

    Nuclei isolated from eukaryotic cells can be depleted of histones and most soluble nuclear proteins to isolate a structural framework called the nuclear scaffold. This structure maintains specific interactions with genomic DNA at sites known as scaffold attached regions (SARs), which are thought to be the bases of DNA loops. In both Saccharomyces cerevisiae and Schizosaccharomyces pombe, genomic ARS elements are recovered as SARs. In addition, SARs from Drosophila melanogaster bind to yeast nuclear scaffolds in vitro and a subclass of these promotes autonomous replication of plasmids in yeast. In the present report, we present fine mapping studies of the Drosophila ftz SAR, which has both SAR and ARS activities in yeast. The data establish a close relationship between the sequences involved in ARS activity and scaffold binding: ARS elements that can bind the nuclear scaffold in vitro promote more efficient plasmid replication in vivo, but scaffold association is not a strict prerequisite for ARS function. Efficient interaction with nuclear scaffolds from both yeast and Drosophila requires a minimal length of SAR DNA that contains reiteration of a narrow minor groove structure of the double helix. PMID:2123454

  20. Localization and characterization of an alpha-thrombin-binding site on platelet glycoprotein Ib alpha.

    PubMed

    De Marco, L; Mazzucato, M; Masotti, A; Ruggeri, Z M

    1994-03-04

    Glycoprotein (GP) Ib alpha is required for expression of the highest affinity alpha-thrombin-binding site on platelets, possibly contributing to platelet activation through a pathway involving cleavage of a specific receptor. This function may be important for the initiation of hemostasis and may also play a role in the development of pathological vascular occlusion. We have now identified a discrete sequence in the extracytoplasmic domain of GP Ib alpha, including residues 271-284 of the mature protein, which appears to be part of the high affinity alpha-thrombin-binding site. Synthetic peptidyl mimetics of this sequence inhibit alpha-thrombin binding to GP Ib as well as platelet activation and aggregation induced by subnanomolar concentrations of the agonist; they also inhibit alpha-thrombin binding to purified glycocalicin, the isolated extracytoplasmic portion of GP Ib alpha. The inhibitory peptides interfere with the clotting of fibrinogen by alpha-thrombin but not with the amidolytic activity of the enzyme on a small synthetic substrate, a finding compatible with the concept that the identified GP Ib alpha sequence interacts with the anion-binding exosite of alpha-thrombin but not with its active proteolytic site. The crucial structural elements of this sequence necessary for thrombin binding appear to be a cluster of negatively charged residues as well as three tyrosine residues that, in the native protein, may be sulfated. GP Ib alpha has no significant overall sequence homology with the thrombin inhibitor, hirudin, nor with the specific thrombin receptor on platelets; all three molecules, however, possess a distinct region rich in negatively charged residues that appear to be involved in thrombin binding. This may represent a case of convergent evolution of unrelated proteins for high affinity interaction with the same ligand.

  1. DNA binding specificity of the basic-helix-loop-helix protein MASH-1.

    PubMed

    Meierhan, D; el-Ariss, C; Neuenschwander, M; Sieber, M; Stackhouse, J F; Allemann, R K

    1995-09-05

    Despite the high degree of sequence similarity in their basic-helix-loop-helix (BHLH) domains, MASH-1 and MyoD are involved in different biological processes. In order to define possible differences between the DNA binding specificities of these two proteins, we investigated the DNA binding properties of MASH-1 by circular dichroism spectroscopy and by electrophoretic mobility shift assays (EMSA). Upon binding to DNA, the BHLH domain of MASH-1 underwent a conformational change from a mainly unfolded to a largely alpha-helical form, and surprisingly, this change was independent of the specific DNA sequence. The same conformational transition could be induced by the addition of 20% 2,2,2-trifluoroethanol. The apparent dissociation constants (KD) of the complexes of full-length MASH-1 with various oligonucleotides were determined from half-saturation points in EMSAs. MASH-1 bound as a dimer to DNA sequences containing an E-box with high affinity (KD = 1.4-4.1 x 10(-14) M2). However, the specificity of DNA binding was low. The dissociation constant for the complex between MASH-1 and the highest affinity E-box sequence (KD = 1.4 x 10(-14) M2) was only a factor of 10 smaller than for completely unrelated DNA sequences (KD = approximately 1 x 10(-13) M2). The DNA binding specificity of MASH-1 was not significantly increased by the formation of a heterodimer with the ubiquitous E12 protein. MASH-1 and MyoD displayed similar binding site preferences, suggesting that their different target gene specificities cannot be explained solely by differential DNA binding. An explanation for these findings is provided on the basis of the known crystal structure of the BHLH domain of MyoD.
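
    The dissociation constants above carry units of M2 because two protein monomers bind one DNA site. As a hedged sketch of how such a constant relates to a half-saturation point, assume a single cooperative step 2 P + D <-> P2D with KD = [P]^2 [D] / [P2D], and free protein approximately equal to total protein. Under those assumptions (which are not stated in the abstract) the fraction of DNA bound is [P]^2 / (KD + [P]^2), and half saturation occurs at [P] = sqrt(KD). Function names are hypothetical.

```python
import math


def fraction_bound(protein_molar, kd_m2):
    """Fraction of DNA bound by a protein dimer, for KD in units of M^2.

    Assumes one-step cooperative binding of two monomers and that the
    free protein concentration equals the total protein concentration.
    """
    p_squared = protein_molar ** 2
    return p_squared / (kd_m2 + p_squared)


def half_saturation_conc(kd_m2):
    """Protein concentration (M) at which half of the DNA is bound."""
    return math.sqrt(kd_m2)


# Values quoted in the abstract.
kd_best_ebox = 1.4e-14   # M^2, highest-affinity E-box
kd_unrelated = 1.0e-13   # M^2, unrelated DNA (approximate)

p_half_ebox = half_saturation_conc(kd_best_ebox)
p_half_unrelated = half_saturation_conc(kd_unrelated)
```

    With these numbers the half-saturation concentrations come out at roughly 0.12 and 0.32 micromolar, so the roughly tenfold difference in KD corresponds to less than a threefold difference in the protein concentration needed for half saturation under this model.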

  2. DNA breathing dynamics distinguish binding from nonbinding consensus sites for transcription factor YY1 in cells.

    PubMed

    Alexandrov, Boian S; Fukuyo, Yayoi; Lange, Martin; Horikoshi, Nobuo; Gelev, Vladimir; Rasmussen, Kim Ø; Bishop, Alan R; Usheva, Anny

    2012-11-01

    The genome-wide mapping of the major gene expression regulators, the transcription factors (TFs) and their DNA binding sites, is of great importance for describing cellular behavior and phenotypic diversity. Presently, the methods for prediction of genomic TF binding produce a large number of false positives, most likely due to insufficient description of the physicochemical mechanisms of protein-DNA binding. Growing evidence suggests that, in the cell, the double-stranded DNA (dsDNA) is subject to local transient strand separations (breathing) that contribute to genomic functions. By using site-specific chromatin immunoprecipitations, gel shifts, BIOBASE data, and our model that accurately describes the melting behavior and breathing dynamics of dsDNA, we report a specific DNA breathing profile found at YY1 binding sites in cells. We find that the genomic flanking sequence variations and SNPs may exert long-range effects on DNA dynamics and predetermine YY1 binding. The ubiquitous TF YY1 has a fundamental role in essential biological processes by activating, initiating or repressing transcription depending upon the sequence context it binds. We anticipate that consensus binding sequences together with the related DNA dynamics profile may significantly improve the accuracy of predicting genomic TF binding sites and TF binding-related functional SNPs.

  3. Functional genetic selection of Helix 66 in Escherichia coli 23S rRNA identified the eukaryotic-binding sequence for ribosomal protein L2

    PubMed Central

    Kitahara, Kei; Kajiura, Akimasa; Sato, Neuza Satomi; Suzuki, Tsutomu

    2007-01-01

    Ribosomal protein L2 is a highly conserved primary 23S rRNA-binding protein. L2 specifically recognizes the internal bulge sequence in Helix 66 (H66) of 23S rRNA and is localized to the intersubunit space through formation of bridge B7b with 16S rRNA. The L2-binding site in H66 is highly conserved in prokaryotic ribosomes, whereas the corresponding site in eukaryotic ribosomes has evolved into distinct classes of sequences. We performed a systematic genetic selection of randomized rRNA sequences in Escherichia coli, and isolated 20 functional variants of the L2-binding site. The isolated variants consisted of eukaryotic sequences, in addition to prokaryotic sequences. These results suggest that L2/L8e does not recognize a specific base sequence of H66, but rather a characteristic architecture of H66. The growth phenotype of the isolated variants correlated well with their ability of subunit association. Upon continuous cultivation of a deleterious variant, we isolated two spontaneous mutations within domain IV of 23S rRNA that compensated for its weak subunit association, and alleviated its growth defect, implying that functional interactions between intersubunit bridges compensate ribosomal function. PMID:17553838

  4. A rapid, generally applicable method to engineer zinc fingers illustrated by targeting the HIV-1 promoter.

    PubMed

    Isalan, M; Klug, A; Choo, Y

    2001-07-01

    DNA-binding domains with predetermined sequence specificity are engineered by selection of zinc finger modules using phage display, allowing the construction of customized transcription factors. Despite remarkable progress in this field, the available protein-engineering methods are deficient in many respects, thus hampering the applicability of the technique. Here we present a rapid and convenient method that can be used to design zinc finger proteins against a variety of DNA-binding sites. This is based on a pair of pre-made zinc finger phage-display libraries, which are used in parallel to select two DNA-binding domains each of which recognizes given 5 base pair sequences, and whose products are recombined to produce a single protein that recognizes a composite (9 base pair) site of predefined sequence. Engineering using this system can be completed in less than two weeks and yields proteins that bind sequence-specifically to DNA with Kd values in the nanomolar range. To illustrate the technique, we have selected seven different proteins to bind various regions of the human immunodeficiency virus 1 (HIV-1) promoter.

  5. SELMAP - SELEX affinity landscape MAPping of transcription factor binding sites using integrated microfluidics

    PubMed Central

    Chen, Dana; Orenstein, Yaron; Golodnitsky, Rada; Pellach, Michal; Avrahami, Dorit; Wachtel, Chaim; Ovadia-Shochat, Avital; Shir-Shapira, Hila; Kedmi, Adi; Juven-Gershon, Tamar; Shamir, Ron; Gerber, Doron

    2016-01-01

    Transcription factors (TFs) alter gene expression in response to changes in the environment through sequence-specific interactions with the DNA. These interactions are best portrayed as a landscape of TF binding affinities. Current methods to study sequence-specific binding preferences suffer from limited dynamic range, sequence bias, lack of specificity and limited throughput. We have developed a microfluidic-based device for SELEX Affinity Landscape MAPping (SELMAP) of TF binding, which allows high-throughput measurement of 16 proteins in parallel. We used it to measure the relative affinities of Pho4, AtERF2 and Btd full-length proteins to millions of different DNA binding sites, and detected both high and low-affinity interactions in equilibrium conditions, generating a comprehensive landscape of the relative TF affinities to all possible DNA 6-mers, and even DNA 10-mers with increased sequencing depth. Low quantities of both the TFs and DNA oligomers were sufficient for obtaining high-quality results, significantly reducing experimental costs. SELMAP allows in-depth screening of hundreds of TFs, and provides a means for better understanding of the regulatory processes that govern gene expression. PMID:27628341

  6. Molecular coevolution of mammalian ribosomal gene terminator sequences and the transcription termination factor TTF-I.

    PubMed Central

    Evers, R; Grummt, I

    1995-01-01

    Both the DNA elements and the nuclear factors that direct termination of ribosomal gene transcription exhibit species-specific differences. Even between mammals--e.g., human and mouse--the termination signals are not identical and the respective transcription termination factors (TTFs) which bind to the terminator sequence are not fully interchangeable. To elucidate the molecular basis for this species-specificity, we have cloned TTF-I from human and mouse cells and compared their structural and functional properties. Recombinant TTF-I exhibits species-specific DNA binding and terminates transcription both in cell-free transcription assays and in transfection experiments. Chimeric constructs of mouse TTF-I and human TTF-I reveal that the major determinant for species-specific DNA binding resides within the C terminus of TTF-I. Replacing 31 C-terminal amino acids of mouse TTF-I with the homologous human sequences relaxes the DNA-binding specificity and, as a consequence, allows the chimeric factor to bind the human terminator sequence and to specifically stop rDNA transcription. PMID:7597036

  7. The PRC2-binding long non-coding RNAs in human and mouse genomes are associated with predictive sequence features

    NASA Astrophysics Data System (ADS)

    Tu, Shiqi; Yuan, Guo-Cheng; Shao, Zhen

    2017-01-01

    Recently, long non-coding RNAs (lncRNAs) have emerged as an important class of molecules involved in many cellular processes. One of their primary functions is to shape the epigenetic landscape through interactions with chromatin-modifying proteins. However, mechanisms contributing to the specificity of such interactions remain poorly understood. Here we took the human and mouse lncRNAs that were experimentally determined to have physical interactions with Polycomb repressive complex 2 (PRC2), and systematically investigated the sequence features of these lncRNAs by developing a new computational pipeline for sequence composition analysis, in which each sequence is considered as a series of transitions between adjacent nucleotides. Through that, PRC2-binding lncRNAs were found to be associated with a set of distinctive and evolutionarily conserved sequence features, which can be utilized to distinguish them from the others with considerable accuracy. We further identified fragments of PRC2-binding lncRNAs that are enriched with these sequence features, and found they show strong PRC2-binding signals and are more highly conserved across species than the other parts, implying their functional importance.
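
    The idea of treating a sequence as a series of transitions between adjacent nucleotides can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `transition_frequencies` and the choice to normalize counts to relative frequencies are assumptions made here for illustration only.

```python
from collections import Counter
from itertools import product


def transition_frequencies(seq, alphabet="ACGU"):
    """Relative frequency of each adjacent-nucleotide transition (hypothetical helper)."""
    # Treat DNA and RNA input alike by mapping T to U.
    seq = seq.upper().replace("T", "U")
    # Count every overlapping pair of neighbours, skipping ambiguous symbols.
    counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in alphabet and seq[i + 1] in alphabet
    )
    total = sum(counts.values())
    # Report all 16 possible transitions, including those never observed.
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(alphabet, repeat=2)
    }


freqs = transition_frequencies("GGAUCCGGAU")
```

    The resulting 16-element vector could serve as a feature vector for a classifier; which features and which classifier the original study used is not stated in the abstract.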

  8. Genomic Heat Shock Element Sequences Drive Cooperative Human Heat Shock Factor 1 DNA Binding and Selectivity*

    PubMed Central

    Jaeger, Alex M.; Makley, Leah N.; Gestwicki, Jason E.; Thiele, Dennis J.

    2014-01-01

    The heat shock transcription factor 1 (HSF1) activates expression of a variety of genes involved in cell survival, including protein chaperones, the protein degradation machinery, anti-apoptotic proteins, and transcription factors. Although HSF1 activation has been linked to amelioration of neurodegenerative disease, cancer cells exhibit a dependence on HSF1 for survival. Indeed, HSF1 drives a program of gene expression in cancer cells that is distinct from that activated in response to proteotoxic stress, and HSF1 DNA binding activity is elevated in cycling cells as compared with arrested cells. Active HSF1 homotrimerizes and binds to a DNA sequence consisting of inverted repeats of the pentameric sequence nGAAn, known as heat shock elements (HSEs). Recent comprehensive ChIP-seq experiments demonstrated that the architecture of HSEs is very diverse in the human genome, with deviations from the consensus sequence in the spacing, orientation, and extent of HSE repeats that could influence HSF1 DNA binding efficacy and the kinetics and magnitude of target gene expression. To understand the mechanisms that dictate binding specificity, HSF1 was purified as either a monomer or trimer and used to evaluate DNA-binding site preferences in vitro using fluorescence polarization and thermal denaturation profiling. These results were compared with quantitative chromatin immunoprecipitation assays in vivo. We demonstrate a role for specific orientations of extended HSE sequences in driving preferential HSF1 DNA binding to target loci in vivo. These studies provide a biochemical basis for understanding differential HSF1 target gene recognition and transcription in neurodegenerative disease and in cancer. PMID:25204655
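
    The consensus described above, inverted repeats of the pentamer nGAAn, can be searched for with a short Python sketch. The requirement of three contiguous alternating units (one per subunit of the trimer) is an assumption made here; as the abstract notes, genomic HSEs deviate from the consensus in spacing, orientation, and extent, so a scan of this kind would miss many real sites. The function name `find_consensus_hse` is hypothetical.

```python
import re

# Three contiguous pentamers alternating between nGAAn and its inverted
# repeat nTTCn, in either starting orientation. The lookahead allows
# overlapping matches to be reported.
HSE_PATTERN = re.compile(
    r"(?=([ACGT]GAA[ACGT]{2}TTC[ACGT]{2}GAA[ACGT]"
    r"|[ACGT]TTC[ACGT]{2}GAA[ACGT]{2}TTC[ACGT]))"
)


def find_consensus_hse(dna):
    """Return (0-based start, 15-nt match) for each consensus-like HSE."""
    dna = dna.upper()
    return [(m.start(), m.group(1)) for m in HSE_PATTERN.finditer(dna)]


hits = find_consensus_hse("ccAGAACGTTCTAGAACgg")
```

    Relaxing the pattern, for example by allowing gaps between pentamers or scoring partial matches, would be one way to explore the non-consensus architectures the study focuses on.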

  9. Interactions of DNA binding proteins with G-Quadruplex structures at the single molecule level

    NASA Astrophysics Data System (ADS)

    Ray, Sujay

    Guanine-rich nucleic acid (DNA/RNA) sequences can form non-canonical secondary structures, known as G-quadruplex (GQ). Numerous in vivo and in vitro studies have demonstrated formation of these structures in telomeric and non-telomeric regions of the genome. Telomeric GQs protect the chromosome ends whereas non-telomeric GQs either act as road blocks or recognition sites for DNA metabolic machinery. These observations suggest the significance of these structures in regulation of different metabolic processes, such as replication and repair. GQs are typically thermodynamically more stable than the corresponding Watson-Crick base pairing formed by G-rich and C-rich strands, making protein activity a crucial factor for their destabilization. Inside the cell, GQs interact with different proteins and their enzymatic activity is the determining factor for their stability. We studied interactions of several proteins with GQs to understand the underlying principles of protein-GQ interactions using single-molecule FRET and other biophysical techniques. Replication Protein-A (RPA), a single stranded DNA (ssDNA) binding protein, is known to possess GQ unfolding activity. First, we compared the thermal stability of three potentially GQ-forming DNA sequences (PQS) to their stability against RPA-mediated unfolding. One of these sequences is the human telomeric repeat and the other two, located in the promoter region of tyrosine hydroxylase gene, are highly heterogeneous sequences that better represent PQS in the genome. The thermal stability of these structures does not necessarily correlate with their stability against protein-mediated unfolding. We conclude that thermal stability is not necessarily an adequate criterion for predicting the physiological viability of GQ structures. To determine the critical structural factors that influence protein-GQ interactions we studied two groups of GQ structures that have systematically varying loop lengths and number of G-tetrad layers.
We observed a linear increase in the steady-state stability of the GQ against RPA-mediated unfolding with increasing number of layers or decreasing loop length. The stability demonstrated by different GQ structures varied by at least three orders of magnitude. Finally, we studied another protein-GQ system where a protein complex works synergistically with a GQ to suppress DNA damage signals by preventing RPA from binding to telomeric DNA. Human telomeres that terminate with a single-stranded 3' G-overhang can be recognized as a DNA damage site by RPA. The protection of telomere-1 (POT1) and POT1-interacting protein (TPP1) heterodimer binds specifically to telomeric DNA and protects it against RPA binding. Using model telomeric DNA, we studied the competition between POT1/TPP1 and RPA to access telomeric GQs in vitro. Under physiological salt and pH conditions, POT1/TPP1 stably loads onto a minimal DNA sequence adjacent to a folded GQ and unfolds the anti-parallel GQ, whereas the parallel conformation remains folded. We showed that GQ formation of telomeres enhances the ability of POT1/TPP1 to block RPA's access to telomeres by two orders of magnitude and contributes to suppressing DNA damage signals.

  10. Deciphering the molecular mechanisms underlying the binding of the TWIST1/E12 complex to regulatory E-box sequences

    PubMed Central

    Bouard, Charlotte; Terreux, Raphael; Honorat, Mylène; Manship, Brigitte; Ansieau, Stéphane; Vigneron, Arnaud M.; Puisieux, Alain; Payen, Léa

    2016-01-01

    The TWIST1 bHLH transcription factor controls embryonic development and cancer processes. Although molecular and genetic analyses have provided a wealth of data on the role of bHLH transcription factors, very little is known about the molecular mechanisms underlying their binding affinity to the E-box sequence of the promoter. Here, we used an in silico model of the TWIST1/E12 (TE) heterocomplex and performed molecular dynamics (MD) simulations of its binding to specific (TE-box) and modified E-box sequences. We focused on (i) active E-box and inactive E-box sequences, on (ii) modified active E-box sequences, as well as on (iii) two box sequences with modified adjacent bases, the AT- and TA-boxes. Our in silico models were supported by functional in vitro binding assays. This exploration highlighted the predominant role of protein side-chain residues, close to the heart of the complex, at anchoring the dimer to DNA sequences, and unveiled a shift towards adjacent ((-1) and (-1*)) bases and conserved bases of modified E-box sequences. In conclusion, our study provides proof of the predictive value of these MD simulations, which may contribute to the characterization of specific inhibitors by docking approaches, and their use in pharmacological therapies by blocking the tumoral TWIST1/E12 function in cancers. PMID:27151200

  11. Functional Analysis of the Gene Cluster Involved in Production of the Bacteriocin Circularin A by Clostridium beijerinckii ATCC 25752

    PubMed Central

    Kemperman, Robèr; Jonker, Marnix; Nauta, Arjen; Kuipers, Oscar P.; Kok, Jan

    2003-01-01

    A region of 12 kb flanking the structural gene of the cyclic antibacterial peptide circularin A of Clostridium beijerinckii ATCC 25752 was sequenced, and the putative proteins involved in the production and secretion of circularin A were identified. The genes are tightly organized in overlapping open reading frames. Heterologous expression of circularin A in Enterococcus faecalis was achieved, and five genes were identified as minimally required for bacteriocin production and secretion. Two of the putative proteins, CirB and CirC, are predicted to contain membrane-spanning domains, while CirD contains a highly conserved ATP-binding domain. Together with CirB and CirC, this ATP-binding protein is involved in the production of circularin A. The fifth gene, cirE, confers immunity towards circularin A when expressed in either Lactococcus lactis or E. faecalis and is needed in order to allow the bacteria to produce bacteriocin. Additional resistance against circularin A is conferred by the activity of the putative transporter consisting of CirB and CirD. PMID:14532033

  12. Amyloid fibril formation in vitro from halophilic metal binding protein: Its high solubility and reversibility minimized formation of amorphous protein aggregations

    PubMed Central

    Tokunaga, Yuhei; Matsumoto, Mitsuharu; Tokunaga, Masao; Arakawa, Tsutomu; Sugimoto, Yasushi

    2013-01-01

    Halophilic proteins are characterized by high net negative charges and relatively small fraction of hydrophobic amino acids, rendering them aggregation resistant. These properties are also shared by histidine-rich metal binding protein (HP) from moderate halophile, Chromohalobacter salexigens, used in this study. Here, we examined how halophilic proteins form amyloid fibrils in vitro. His-tagged HP, incubated at pH 2.0 and 58°C, readily formed amyloid fibrils, as observed by thioflavin fluorescence, CD spectra, and transmission or atomic force microscopies. Under these low-pH harsh conditions, however, His-HP was promptly hydrolyzed to smaller peptides most likely responsible for rapid formation of amyloid fibril. Three major acid-hydrolyzed peptides were isolated from fibrils and turned out to readily form fibrils. The synthetic peptides predicted to form fibrils in these peptide sequences by Waltz software also formed fibrils. Amyloid fibril was also readily formed from full-length His-HP when incubated with 10–20% 2,2,2-trifluoroethanol at pH 7.8 and 25°C without peptide bond cleavage. PMID:24038709

  13. Analysis and modeling of heat-labile enterotoxins of Escherichia coli suggests a novel space with insights into receptor preference.

    PubMed

    Krishna Raja, M; Ghosh, Asit Ranjan; Vino, S; Sajitha Lulu, S

    2015-01-01

    Features of heat-labile enterotoxins of Escherichia coli which make them fit to use as novel receptors for antidiarrheals are not completely explored. A data set of 14 different serovars of enterotoxigenic Escherichia coli producing heat-labile toxins was taken from the NCBI GenBank database and used in the study. Sequence analysis showed mutations in different subunits and also at their interface residues. As these toxins lack crystallographic structures, homology modeling using Modeller 9.11 led to the structural approximation for the E. coli producing heat-labile toxins. Interaction of modeled toxin subunits with proanthocyanidin, an antidiarrheal, showed several strong hydrogen bonding interactions at the cost of minimized energy. The hits were subsequently characterized by molecular dynamics simulation studies to monitor their binding stabilities. This study looks into novel space where the ligand can choose the receptor preference not as a whole but as an individual subunit. Mutation at interface residues and interaction among subunits along with the binding of ligand to individual subunits would help to design a non-toxic labile toxin and also to improve the therapeutics.

  14. Integrating computational and chemical biology tools in the discovery of antiangiogenic small molecule ligands of FGF2 derived from endogenous inhibitors

    PubMed Central

    Foglieni, Chiara; Pagano, Katiuscia; Lessi, Marco; Bugatti, Antonella; Moroni, Elisabetta; Pinessi, Denise; Resovi, Andrea; Ribatti, Domenico; Bertini, Sabrina; Ragona, Laura; Bellina, Fabio; Rusnati, Marco; Colombo, Giorgio; Taraboletti, Giulia

    2016-01-01

    The FGFs/FGFRs system is a recognized actionable target for therapeutic approaches aimed at inhibiting tumor growth, angiogenesis, metastasis, and resistance to therapy. We previously identified a non-peptidic compound (SM27) that retains the structural and functional properties of the FGF2-binding sequence of thrombospondin-1 (TSP-1), a major endogenous inhibitor of angiogenesis. Here we identified new small molecule inhibitors of FGF2 based on the initial lead. A similarity-based screening of small molecule libraries, followed by docking calculations and experimental studies, allowed selecting 7 bi-naphthalenic compounds that bound FGF2 inhibiting its binding to both heparan sulfate proteoglycans and FGFR-1. The compounds inhibit FGF2 activity in in vitro and ex vivo models of angiogenesis, with improved potency over SM27. Comparative analysis of the selected hits, complemented by NMR and biochemical analysis of 4 newly synthesized functionalized phenylamino-substituted naphthalenes, allowed identifying the minimal stereochemical requirements to improve the design of naphthalene sulfonates as FGF2 inhibitors. PMID:27000667

  15. Binding of Signal Recognition Particle Gives Ribosome/Nascent Chain Complexes a Competitive Advantage in Endoplasmic Reticulum Membrane Interaction

    PubMed Central

    Neuhof, Andrea; Rolls, Melissa M.; Jungnickel, Berit; Kalies, Kai-Uwe; Rapoport, Tom A.

    1998-01-01

    Most secretory and membrane proteins are sorted by signal sequences to the endoplasmic reticulum (ER) membrane early during their synthesis. Targeting of the ribosome-nascent chain complex (RNC) involves the binding of the signal sequence to the signal recognition particle (SRP), followed by an interaction of ribosome-bound SRP with the SRP receptor. However, ribosomes can also independently bind to the ER translocation channel formed by the Sec61p complex. To explain the specificity of membrane targeting, it has therefore been proposed that nascent polypeptide-associated complex functions as a cytosolic inhibitor of signal sequence- and SRP-independent ribosome binding to the ER membrane. We report here that SRP-independent binding of RNCs to the ER membrane can occur in the presence of all cytosolic factors, including nascent polypeptide-associated complex. Nontranslating ribosomes competitively inhibit SRP-independent membrane binding of RNCs but have no effect when SRP is bound to the RNCs. The protective effect of SRP against ribosome competition depends on a functional signal sequence in the nascent chain and is also observed with reconstituted proteoliposomes containing only the Sec61p complex and the SRP receptor. We conclude that cytosolic factors do not prevent the membrane binding of ribosomes. Instead, specific ribosome targeting to the Sec61p complex is provided by the binding of SRP to RNCs, followed by an interaction with the SRP receptor, which gives RNC–SRP complexes a selective advantage in membrane targeting over nontranslating ribosomes. PMID:9436994

  16. The binding of TIA-1 to RNA C-rich sequences is driven by its C-terminal RRM domain.

    PubMed

    Cruz-Gallardo, Isabel; Aroca, Ángeles; Gunzburg, Menachem J; Sivakumaran, Andrew; Yoon, Je-Hyun; Angulo, Jesús; Persson, Cecilia; Gorospe, Myriam; Karlsson, B Göran; Wilce, Jacqueline A; Díaz-Moreno, Irene

    2014-01-01

    T-cell intracellular antigen-1 (TIA-1) is a key DNA/RNA binding protein that regulates translation by sequestering target mRNAs in stress granules (SG) in response to stress conditions. TIA-1 possesses three RNA recognition motifs (RRM) along with a glutamine-rich domain, with the central domains (RRM2 and RRM3) acting as RNA binding platforms. While the RRM2 domain, which displays high affinity for U-rich RNA sequences, is primarily responsible for interaction with RNA, the contribution of RRM3 to bind RNA as well as the target RNA sequences that it binds preferentially are still unknown. Here we combined nuclear magnetic resonance (NMR) and surface plasmon resonance (SPR) techniques to elucidate the sequence specificity of TIA-1 RRM3. With a novel approach using saturation transfer difference NMR (STD-NMR) to quantify protein-nucleic acids interactions, we demonstrate that isolated RRM3 binds to both C- and U-rich stretches with micromolar affinity. In combination with RRM2 and in the context of full-length TIA-1, RRM3 significantly enhanced the binding to RNA, particularly to cytosine-rich RNA oligos, as assessed by biotinylated RNA pull-down analysis. Our findings provide new insight into the role of RRM3 in regulating TIA-1 binding to C-rich stretches, that are abundant at the 5' TOPs (5' terminal oligopyrimidine tracts) of mRNAs whose translation is repressed under stress situations.

  17. The binding of TIA-1 to RNA C-rich sequences is driven by its C-terminal RRM domain

    PubMed Central

    Cruz-Gallardo, Isabel; Aroca, Ángeles; Gunzburg, Menachem J; Sivakumaran, Andrew; Yoon, Je-Hyun; Angulo, Jesús; Persson, Cecilia; Gorospe, Myriam; Karlsson, B Göran; Wilce, Jacqueline A; Díaz-Moreno, Irene

    2014-01-01

    T-cell intracellular antigen-1 (TIA-1) is a key DNA/RNA binding protein that regulates translation by sequestering target mRNAs in stress granules (SG) in response to stress conditions. TIA-1 possesses three RNA recognition motifs (RRM) along with a glutamine-rich domain, with the central domains (RRM2 and RRM3) acting as RNA binding platforms. While the RRM2 domain, which displays high affinity for U-rich RNA sequences, is primarily responsible for interaction with RNA, the contribution of RRM3 to bind RNA as well as the target RNA sequences that it binds preferentially are still unknown. Here we combined nuclear magnetic resonance (NMR) and surface plasmon resonance (SPR) techniques to elucidate the sequence specificity of TIA-1 RRM3. With a novel approach using saturation transfer difference NMR (STD-NMR) to quantify protein–nucleic acids interactions, we demonstrate that isolated RRM3 binds to both C- and U-rich stretches with micromolar affinity. In combination with RRM2 and in the context of full-length TIA-1, RRM3 significantly enhanced the binding to RNA, particularly to cytosine-rich RNA oligos, as assessed by biotinylated RNA pull-down analysis. Our findings provide new insight into the role of RRM3 in regulating TIA-1 binding to C-rich stretches, that are abundant at the 5′ TOPs (5′ terminal oligopyrimidine tracts) of mRNAs whose translation is repressed under stress situations. PMID:24824036

  18. Sequences Flanking the Gephyrin-Binding Site of GlyRβ Tune Receptor Stabilization at Synapses

    PubMed Central

    Grünewald, Nora; Salvatico, Charlotte; Kress, Vanessa

    2018-01-01

    The efficacy of synaptic transmission is determined by the number of neurotransmitter receptors at synapses. Their recruitment depends upon the availability of postsynaptic scaffolding molecules that interact with specific binding sequences of the receptor. At inhibitory synapses, gephyrin is the major scaffold protein that mediates the accumulation of heteromeric glycine receptors (GlyRs) via the cytoplasmic loop in the β-subunit (β-loop). This binding involves high- and low-affinity interactions, but the molecular mechanism of this bimodal binding and its implication in GlyR stabilization at synapses remain unknown. We have approached this question using a combination of quantitative biochemical tools and high-density single molecule tracking in cultured rat spinal cord neurons. The high-affinity binding site could be identified and was shown to rely on the formation of a 3₁₀-helix C-terminal to the β-loop core gephyrin-binding motif. This site plays a structural role in shaping the core motif and represents the major contributor to the synaptic confinement of GlyRs by gephyrin. The N-terminal flanking sequence promotes lower affinity interactions by occupying newly identified binding sites on gephyrin. Despite its low affinity, this binding site plays a modulatory role in tuning the mobility of the receptor. Together, the GlyR β-loop sequences flanking the core-binding site differentially regulate the affinity of the receptor for gephyrin and its trapping at synapses. Our experimental approach thus bridges the gap between thermodynamic aspects of receptor-scaffold interactions and functional receptor stabilization at synapses in living cells. PMID:29464196

  19. TmiRUSite and TmiROSite scripts: searching for mRNA fragments with miRNA binding sites with encoded amino acid residues.

    PubMed

    Berillo, Olga; Régnier, Mireille; Ivashchenko, Anatoly

    2014-01-01

    microRNAs are small RNA molecules that inhibit the translation of target genes. microRNA binding sites are located in the untranslated regions as well as in the coding domains. We describe TmiRUSite and TmiROSite scripts developed using Python as tools for the extraction of nucleotide sequences for miRNA binding sites with their encoded amino acid residue sequences. The scripts allow for retrieving a set of additional sequences to the left and to the right of the binding site. The scripts present all received data in table formats that are easy to analyse further. The predicted data find utility in molecular and evolutionary biology studies. They find use in studying miRNA binding sites in animals and plants. TmiRUSite and TmiROSite scripts are available for free from the authors upon request and for download at https://sites.google.com/site/malaheenee/downloads.

  20. Non-B-DNA structures on the interferon-beta promoter?

    PubMed

    Robbe, K; Bonnefoy, E

    1998-01-01

    The high mobility group (HMG) I protein intervenes as an essential factor during the virus-induced expression of the interferon-beta (IFN-beta) gene. It is a non-histone chromatin-associated protein that has the dual capacity of binding to a non-B-DNA structure such as cruciform-DNA as well as to AT-rich B-DNA sequences. In this work we compare the binding affinity of HMGI for a synthetic cruciform-DNA to its binding affinity for the HMGI-binding-site present in the positive regulatory domain II (PRDII) of the IFN-beta promoter. Using gel retardation experiments, we show that HMGI protein binds with at least ten times more affinity to the synthetic cruciform-DNA structure than to the PRDII B-DNA sequence. DNA hairpin sequences are present in both the human and the murine PRDII-DNAs. We discuss in this work the presence of as yet putative non-B-DNA structures in the IFN-beta promoter.

  1. An improved SELEX technique for selection of DNA aptamers binding to M-type 11 of Streptococcus pyogenes.

    PubMed

    Hamula, Camille L A; Peng, Hanyong; Wang, Zhixin; Tyrrell, Gregory J; Li, Xing-Fang; Le, X Chris

    2016-03-15

    Streptococcus pyogenes is a clinically important pathogen consisting of various serotypes determined by different M proteins expressed on the cell surface. The M type is therefore a useful marker to monitor the spread of invasive S. pyogenes in a population. Serotyping and nucleic acid amplification/sequencing methods for the identification of M types are laborious, inconsistent, and usually confined to reference laboratories. The primary objective of this work is to develop a technique that enables generation of aptamers binding to specific M-types of S. pyogenes. We describe here an in vitro technique that directly used live bacterial cells and the Systematic Evolution of Ligands by Exponential Enrichment (SELEX) strategy. Live S. pyogenes cells were incubated with DNA libraries consisting of 40-nucleotide randomized sequences. Those sequences that bound to the cells were separated, amplified using polymerase chain reaction (PCR), purified using gel electrophoresis, and served as the input DNA pool for the next round of SELEX selection. A specially designed forward primer containing extended polyA20/5Sp9 facilitated gel electrophoresis purification of ssDNA after PCR amplification. A counter-selection step using non-target cells was introduced to improve selectivity. DNA libraries of different starting sequence diversity (10^16 and 10^14) were compared. Aptamer pools from each round of selection were tested for their binding to the target and non-target cells using flow cytometry. Selected aptamer pools were then cloned and sequenced. Individual aptamer sequences were screened on the basis of their binding to the 10 M-types that were used as targets. Aptamer pools obtained from SELEX rounds 5-8 showed high affinity to the target S. pyogenes cells. Tests against non-target Streptococcus bovis, Streptococcus pneumoniae, and Enterococcus species demonstrated selectivity of these aptamers for binding to S. pyogenes.
Several aptamer sequences were found to bind preferentially to the M11 M-type of S. pyogenes. Estimated binding dissociation constants (Kd) were in the low nanomolar range for the M11 specific sequences; for example, sequence E-CA20 had a Kd of 7±1 nM. These affinities are comparable to those of a monoclonal antibody. The improved bacterial cell-SELEX technique is successful in generating aptamers selective for S. pyogenes and some of its M-types. These aptamers are potentially useful for detecting S. pyogenes, achieving binding profiles of the various M-types, and developing new M-typing technologies for non-specialized laboratories or point-of-care testing.

  2. Human H3N2 Influenza Viruses Isolated from 1968 To 2012 Show Varying Preference for Receptor Substructures with No Apparent Consequences for Disease or Spread

    PubMed Central

    Gulati, Shelly; Smith, David F.; Cummings, Richard D.; Couch, Robert B.; Griesemer, Sara B.; St. George, Kirsten; Webster, Robert G.; Air, Gillian M.

    2013-01-01

    It is generally accepted that human influenza viruses bind glycans containing sialic acid linked α2–6 to the next sugar, that avian influenza viruses bind glycans containing the α2–3 linkage, and that mutations that change the binding specificity might change the host tropism. We noted that human H3N2 viruses showed dramatic differences in their binding specificity, and so we embarked on a study of representative human H3N2 influenza viruses, isolated from 1968 to 2012, that had been isolated and minimally passaged only in mammalian cells, never in eggs. The 45 viruses were grown in MDCK cells, purified, fluorescently labeled and screened on the Consortium for Functional Glycomics Glycan Array. Viruses isolated in the same season have similar binding specificity profiles but the profiles show marked year-to-year variation. None of the 610 glycans on the array (166 sialylated glycans) bound to all viruses; the closest was Neu5Acα2–6(Galβ1–4GlcNAc)3 in either a linear or biantennary form, that bound 42 of the 45 viruses. The earliest human H3N2 viruses preferentially bound short, branched sialylated glycans while recent viruses bind better to long polylactosamine chains terminating in sialic acid. Viruses isolated in 1996, 2006, 2010 and 2012 bind glycans with α2–3 linked sialic acid; for 2006, 2010 and 2012 viruses this binding was inhibited by oseltamivir, indicating binding of α2–3 sialylated glycans by neuraminidase. More significantly, oseltamivir inhibited virus entry of 2010 and 2012 viruses into MDCK cells. All of these viruses were representative of epidemic strains that spread around the world, so all could infect and transmit between humans with high efficiency. We conclude that the year-to-year variation in receptor binding specificity is a consequence of amino acid sequence changes driven by antigenic drift, and that viruses with quite different binding specificity and avidity are equally fit to infect and transmit in the human population. 
PMID:23805213

  3. Theory on the mechanism of site-specific DNA-protein interactions in the presence of traps

    NASA Astrophysics Data System (ADS)

    Niranjani, G.; Murugan, R.

    2016-08-01

    The speed of site-specific binding of transcription factor (TF) proteins to genomic DNA seems to be strongly retarded by randomly occurring sequence traps. Traps are DNA sequences that share significant similarity with the original specific binding sites (SBSs). It is an intriguing question how naturally occurring TFs and their SBSs are designed to manage the retarding effects of such randomly occurring traps. We develop a simple random walk model of the site-specific binding of TFs to genomic DNA in the presence of sequence traps. Our dynamical model predicts that (a) the retarding effects of traps will be minimal when the traps are arranged around the SBS such that there is a negative correlation between the binding strength of TFs with traps and the distance of traps from the SBS and (b) the retarding effects of sequence traps can be mitigated by the condensed conformational state of DNA. Our computational analysis of the distribution of sequence traps around the putative binding sites of various TFs in the mouse and human genomes agrees well with the theoretical predictions. We propose that the distribution of traps can be used as an additional metric to efficiently identify the SBSs of TFs on genomic DNA.
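
    The effect of traps on a one-dimensional search can be illustrated with a toy lattice walk. This is only a sketch under stated assumptions, not the authors' model: the walker starts at site 0 (a reflecting end), the SBS sits at site `target`, and a trap is represented by a hypothetical per-step probability of staying put. The mean first-passage time is computed exactly from the standard recursion for an unbiased nearest-neighbour walk.

```python
def mean_search_time(stay_prob, target):
    """Exact mean number of steps for an unbiased 1D walker to go from
    site 0 (reflecting boundary) to site `target` (the binding site).

    stay_prob maps a site index to the probability that the walker remains
    on that site for one step (a crude stand-in for trap strength; sites
    not listed are trap-free). Illustrative assumption, not the paper's model.
    """
    total = 0.0
    diff = 0.0  # T(i-1) - T(i), difference of mean passage times
    for site in range(target):
        wait = 1.0 / (1.0 - stay_prob.get(site, 0.0))  # mean dwell per visit
        # Recursion: D(0) = w(0); D(i) = D(i-1) + 2 * w(i)
        diff = wait if site == 0 else diff + 2.0 * wait
        total += diff
    return total


if __name__ == "__main__":
    print(mean_search_time({}, 10))         # no traps
    print(mean_search_time({8: 0.5}, 10))   # trap close to the target
    print(mean_search_time({1: 0.5}, 10))   # same trap far from the target
```

    In this toy geometry a trap of fixed strength costs fewer steps when it sits close to the target than when it sits far from it, which is in the same spirit as prediction (a); the quantitative behaviour of the published model may differ.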

  4. Xenopus origin recognition complex (ORC) initiates DNA replication preferentially at sequences targeted by Schizosaccharomyces pombe ORC

    PubMed Central

    Kong, Daochun; Coleman, Thomas R.; DePamphilis, Melvin L.

    2003-01-01

    Budding yeast (Saccharomyces cerevisiae) origin recognition complex (ORC) requires ATP to bind specific DNA sequences, whereas fission yeast (Schizosaccharomyces pombe) ORC binds to specific, asymmetric A:T-rich sites within replication origins, independently of ATP, and frog (Xenopus laevis) ORC seems to bind DNA non-specifically. Here we show that despite these differences, ORCs are functionally conserved. Firstly, SpOrc1, SpOrc4 and SpOrc5, like those from other eukaryotes, bound ATP and exhibited ATPase activity, suggesting that ATP is required for pre-replication complex (pre-RC) assembly rather than origin specificity. Secondly, SpOrc4, which is solely responsible for binding SpORC to DNA, inhibited up to 70% of XlORC-dependent DNA replication in Xenopus egg extract by preventing XlORC from binding to chromatin and assembling pre-RCs. Chromatin-bound SpOrc4 was located at AT-rich sequences. XlORC in egg extract bound preferentially to asymmetric A:T-sequences in either bare DNA or in sperm chromatin, and it recruited XlCdc6 and XlMcm proteins to these sequences. These results reveal that XlORC initiates DNA replication preferentially at the same or similar sites to those targeted in S.pombe. PMID:12840006

  5. GuiTope: an application for mapping random-sequence peptides to protein sequences.

    PubMed

    Halperin, Rebecca F; Stafford, Phillip; Emery, Jack S; Navalkar, Krupa Arun; Johnston, Stephen Albert

    2012-01-03

    Random-sequence peptide libraries are a commonly used tool to identify novel ligands for binding antibodies, other proteins, and small molecules. It is often of interest to compare the selected peptide sequences to the natural protein binding partners to infer the exact binding site or the importance of particular residues. The ability to search a set of sequences for similarity to a set of peptides may sometimes enable the prediction of an antibody epitope or a novel binding partner. We have developed a software application designed specifically for this task. GuiTope provides a graphical user interface for aligning peptide sequences to protein sequences. All alignment parameters are accessible to the user including the ability to specify the amino acid frequency in the peptide library; these frequencies often differ significantly from those assumed by popular alignment programs. It also includes a novel feature to align di-peptide inversions, which we have found improves the accuracy of antibody epitope prediction from peptide microarray data and shows utility in analyzing phage display datasets. Finally, GuiTope can randomly select peptides from a given library to estimate a null distribution of scores and calculate statistical significance. GuiTope provides a convenient method for comparing selected peptide sequences to protein sequences, including flexible alignment parameters, novel alignment features, ability to search a database, and statistical significance of results. The software is available as an executable (for PC) at http://www.immunosignature.com/software and ongoing updates and source code will be available at sourceforge.net.
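
    The idea of estimating a null distribution from randomly drawn library peptides can be sketched as follows. This is a minimal illustration, not GuiTope's implementation: the scoring function (count of identical residues in the best ungapped placement) and all function names are assumptions made for the example, and GuiTope's actual alignment parameters are not reproduced.

```python
import random


def best_ungapped_score(peptide, protein):
    """Highest number of identical residues over all ungapped placements
    of `peptide` along `protein` (a deliberately simple stand-in score)."""
    best = 0
    for offset in range(len(protein) - len(peptide) + 1):
        window = protein[offset:offset + len(peptide)]
        matches = sum(a == b for a, b in zip(peptide, window))
        best = max(best, matches)
    return best


def null_scores_from_library(library, protein, n_draws, seed=0):
    """Score peptides drawn at random (with replacement) from the library."""
    rng = random.Random(seed)
    return [best_ungapped_score(rng.choice(library), protein)
            for _ in range(n_draws)]


def empirical_p_value(observed, null_scores):
    """Fraction of null scores at least as large as the observed score,
    with a +1 pseudocount so the estimate is never exactly zero."""
    at_least = sum(score >= observed for score in null_scores)
    return (1 + at_least) / (1 + len(null_scores))
```

    A typical use would score the selected peptides against the protein, draw a few thousand random library peptides for the null, and report `empirical_p_value` for each observed score.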

  6. Wld S protein requires Nmnat activity and a short N-terminal sequence to protect axons in mice.

    PubMed

    Conforti, Laura; Wilbrey, Anna; Morreale, Giacomo; Janeckova, Lucie; Beirowski, Bogdan; Adalbert, Robert; Mazzola, Francesca; Di Stefano, Michele; Hartley, Robert; Babetto, Elisabetta; Smith, Trevor; Gilley, Jonathan; Billington, Richard A; Genazzani, Armando A; Ribchester, Richard R; Magni, Giulio; Coleman, Michael

    2009-02-23

    The slow Wallerian degeneration (Wld(S)) protein protects injured axons from degeneration. This unusual chimeric protein fuses a 70-amino acid N-terminal sequence from the Ube4b multiubiquitination factor with the nicotinamide adenine dinucleotide-synthesizing enzyme nicotinamide mononucleotide adenylyl transferase 1. The requirement for these components and the mechanism of Wld(S)-mediated neuroprotection remain highly controversial. The Ube4b domain is necessary for the protective phenotype in mice, but precisely which sequence is essential and why are unclear. Binding to the AAA adenosine triphosphatase valosin-containing protein (VCP)/p97 is the only known biochemical property of the Ube4b domain. Using an in vivo approach, we show that removing the VCP-binding sequence abolishes axon protection. Replacing the Wld(S) VCP-binding domain with an alternative ataxin-3-derived VCP-binding sequence restores its protective function. Enzyme-dead Wld(S) is unable to delay Wallerian degeneration in mice. Thus, neither domain is effective without the function of the other. Wld(S) requires both of its components to protect axons from degeneration.

  7. Nucleotide Interdependency in Transcription Factor Binding Sites in the Drosophila Genome.

    PubMed

    Dresch, Jacqueline M; Zellers, Rowan G; Bork, Daniel K; Drewell, Robert A

    2016-01-01

    A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development.
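
    Why dependencies between positions matter can be seen in a toy comparison. The sketch below is not MARZ and uses made-up two-base sites; it contrasts a position-independent model (a PWM without pseudocounts) with a model that stores the joint frequency of two positions.

```python
from collections import Counter


def independent_model(sites):
    """Return a scoring function that multiplies per-position frequencies."""
    length = len(sites[0])
    columns = [Counter(site[i] for site in sites) for i in range(length)]

    def prob(seq):
        p = 1.0
        for i, base in enumerate(seq):
            p *= columns[i][base] / len(sites)
        return p

    return prob


def pairwise_model(sites, i, j):
    """Return a scoring function based on the joint frequency of the bases
    at positions i and j (the two positions need not be adjacent)."""
    pairs = Counter((site[i], site[j]) for site in sites)

    def prob(seq):
        return pairs[(seq[i], seq[j])] / len(sites)

    return prob


if __name__ == "__main__":
    sites = ["AT", "AT", "GC", "GC"]  # hypothetical training sites
    pwm = independent_model(sites)
    joint = pairwise_model(sites, 0, 1)
    # "AC" never occurs among the sites, yet the independent model
    # still assigns it the same probability as a real site.
    print(pwm("AC"), joint("AC"), joint("AT"))
```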

  8. Nucleotide Interdependency in Transcription Factor Binding Sites in the Drosophila Genome

    PubMed Central

    Dresch, Jacqueline M.; Zellers, Rowan G.; Bork, Daniel K.; Drewell, Robert A.

    2016-01-01

    A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development. PMID:27330274

  9. Selection of a platinum-binding sequence in a loop of a four-helix bundle protein.

    PubMed

    Yagi, Sota; Akanuma, Satoshi; Kaji, Asumi; Niiro, Hiroya; Akiyama, Hayato; Uchida, Tatsuya; Yamagishi, Akihiko

    2018-02-01

    Protein-metal hybrids are functional materials with various industrial applications. For example, a redox enzyme immobilized on a platinum electrode is a key component of some biofuel cells and biosensors. To create these hybrid materials, protein molecules are bound to metal surfaces. Here, we report the selection of a novel platinum-binding sequence in a loop of a four-helix bundle protein, the Lac repressor four-helix protein (LARFH), an artificial protein in which four identical α-helices are connected via three identical loops. We created a genetic library in which the Ser-Gly-Gln-Gly-Gly-Ser sequence within the first inter-helical loop of LARFH was semi-randomly mutated. The library was then subjected to selection for platinum-binding affinity by using the T7 phage display method. The majority of the selected variants contained the Tyr-Lys-Arg-Gly-Tyr-Lys (YKRGYK) sequence in their randomized segment. We characterized the platinum-binding properties of mutant LARFH by using quartz crystal microbalance analysis. Mutant LARFH seemed to interact with platinum through its loop containing the YKRGYK sequence, as judged by the estimated exclusive area occupied by a single molecule. Furthermore, a 10-residue peptide containing the YKRGYK sequence bound to platinum with reasonably high affinity and basic side chains in the peptide were crucial in mediating this interaction. In conclusion, we have identified an amino acid sequence, YKRGYK, in the loop of a helix-loop-helix motif that shows high platinum-binding affinity. This sequence could be grafted into loops of other polypeptides as an approach to immobilize proteins on platinum electrodes for use as biosensors among other applications. Copyright © 2017 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  10. Free energy minimization to predict RNA secondary structures and computational RNA design.

    PubMed

    Churkin, Alexander; Weinbrand, Lina; Barash, Danny

    2015-01-01

    Determining the RNA secondary structure from sequence data by computational predictions is a long-standing problem. Its solution has been approached in two distinctive ways. If a multiple sequence alignment of a collection of homologous sequences is available, the comparative method uses phylogeny to determine conserved base pairs that are more likely to form as a result of billions of years of evolution than by chance. In the case of single sequences, recursive algorithms that compute free energy structures by using empirically derived energy parameters have been developed. This latter approach of RNA folding prediction by energy minimization is widely used to predict RNA secondary structure from sequence. For a significant number of RNA molecules, the secondary structure of the RNA molecule is indicative of its function and its computational prediction by minimizing its free energy is important for its functional analysis. A general method for free energy minimization to predict RNA secondary structures is dynamic programming, although other optimization methods have been developed as well along with empirically derived energy parameters. In this chapter, we introduce and illustrate by examples the approach of free energy minimization to predict RNA secondary structures.
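
    The dynamic-programming idea can be shown with a much simpler objective than a real energy model. The sketch below uses base-pair maximization (the Nussinov recursion), which is equivalent to minimizing an energy in which every allowed pair contributes the same fixed bonus; it is a stand-in for illustration only and does not use the empirically derived energy parameters mentioned in the abstract. The minimum hairpin loop length of 3 is an assumption.

```python
# Watson-Crick pairs plus the G-U wobble pair
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in `seq`, requiring at least
    `min_loop` unpaired bases inside every hairpin."""
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = max(best[i + 1][j], best[i][j - 1])  # i or j unpaired
            if (seq[i], seq[j]) in PAIRS:
                score = max(score, best[i + 1][j - 1] + 1)  # i pairs with j
            for k in range(i + 1, j):  # split into two independent parts
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score
    return best[0][n - 1]
```

    Production tools replace the "+1 per pair" score with loop-based free energies, but the table-filling order over increasing subsequence length is the same.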

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kwon, Deug-Nam; Park, Mi-Ryung; Park, Jong-Yi

    Highlights: (1) The sequence from -604 to -84 bp of the pUPII promoter contained a putative negative cis-regulatory element. (2) The core promoter was located in the 5F-1 region. (3) Transcription factor HNF4 can directly bind in the pUPII core promoter region, which plays a critical role in controlling promoter activity. (4) These features of the pUPII promoter are fundamental to development of a target-specific vector. -- Abstract: Uroplakin II (UPII) is one of the integral membrane proteins synthesized as a major differentiation product of mammalian urothelium. UPII gene expression is bladder specific and differentiation dependent, but little is known about its transcription response elements and molecular mechanism. To identify the cis-regulatory elements in the pig UPII (pUPII) gene promoter region, we constructed pUPII 5' upstream region deletion mutants and demonstrated that each of the deletion mutants participates in controlling the expression of the pUPII gene in human bladder carcinoma RT4 cells. We also identified a new core promoter region and putative negative cis-regulatory element within a minimal promoter region. In addition, we showed that hepatocyte nuclear factor 4 (HNF4) can directly bind in the pUPII core promoter (5F-1) region, which plays a critical role in controlling promoter activity. Transient cotransfection experiments showed that HNF4 positively regulates pUPII gene promoter activity. Thus, the binding element and its binding protein, HNF4 transcription factor, may be involved in the mechanism that specifically regulates pUPII gene transcription.

  12. A nonpolymorphic major histocompatibility complex class Ib molecule binds a large array of diverse self-peptides

    PubMed Central

    1994-01-01

    Unlike the highly polymorphic major histocompatibility complex (MHC) class Ia molecules, which present a wide variety of peptides to T cells, it is generally assumed that the nonpolymorphic MHC class Ib molecules may have evolved to function as highly specialized receptors for the presentation of structurally unique peptides. However, a thorough biochemical analysis of one class Ib molecule, the soluble isoform of Qa-2 antigen (H-2SQ7b), has revealed that it binds a diverse array of structurally similar peptides derived from intracellular proteins in much the same manner as the classical antigen-presenting molecules. Specifically, we find that SQ7b molecules are heterodimers of heavy and light chains complexed with nonameric peptides in a 1:1:1 ratio. These peptides contain a conserved hydrophobic residue at the COOH terminus and a combination of one or more conserved residue(s) at P7 (histidine), P2 (glutamine/leucine), and/or P3 (leucine/asparagine) as anchors for binding SQ7b. 2 of 18 sequenced peptides matched cytosolic proteins (cofilin and L19 ribosomal protein), suggesting an intracellular source of the SQ7b ligands. Minimal estimates of the peptide repertoire revealed that at least 200 different naturally processed self-peptides can bind SQ7b molecules. Since Qa-2 molecules associate with a diverse array of peptides, we suggest that they function as effective presenting molecules of endogenously synthesized proteins like the class Ia molecules. PMID:8294869

  13. Role of the chromatin landscape and sequence in determining cell type-specific genomic glucocorticoid receptor binding and gene regulation

    PubMed Central

    Huska, Matthew R.; Jurk, Marcel; Schöpflin, Robert; Starick, Stephan R.; Schwahn, Kevin; Cooper, Samantha B.; Yamamoto, Keith R.; Thomas-Chollier, Morgane; Vingron, Martin

    2017-01-01

    The genomic loci bound by the glucocorticoid receptor (GR), a hormone-activated transcription factor, show little overlap between cell types. To study the role of chromatin and sequence in specifying where GR binds, we used Bayesian modeling within the universe of accessible chromatin. Taken together, our results uncovered that although GR preferentially binds accessible chromatin, its binding is biased against accessible chromatin located at promoter regions. This bias can only be explained partially by the presence of fewer GR recognition sequences, arguing for the existence of additional mechanisms that interfere with GR binding at promoters. Therefore, we tested the role of H3K9ac, the chromatin feature with the strongest negative association with GR binding, but found that this correlation does not reflect a causative link. Finally, we find a higher percentage of promoter–proximal GR binding for genes regulated by GR across cell types than for cell type-specific target genes. Given that GR almost exclusively binds accessible chromatin, we propose that cell type-specific regulation by GR preferentially occurs via distal enhancers, whose chromatin accessibility is typically cell type-specific, whereas ubiquitous target gene regulation is more likely to result from binding to promoter regions, which are often accessible regardless of cell type examined. PMID:27903902

  14. The 1.3 A resolution structure of the RNA tridecamer r(GCGUUUGAAACGC): metal ion binding correlates with base unstacking and groove contraction.

    PubMed

    Timsit, Youri; Bombard, Sophie

    2007-12-01

    Metal ions play a key role in RNA folding and activity. Elucidating the rules that govern the binding of metal ions is therefore an essential step toward a better understanding of RNA function. High-resolution data are a prerequisite for a detailed structural analysis of ion binding to RNA and, in particular, for the observation of monovalent cations. Here, the high-resolution crystal structures of the tridecamer duplex r(GCGUUUGAAACGC), crystallized under different conditions, provide new structural insights into ion binding to GAAA/UUU sequences, which exhibit unusual structural and functional properties in RNA. The present study extends the repertoire of RNA ion binding sites by showing that the first two bases of UUU triplets constitute a specific site for sodium ions. A striking asymmetric pattern of metal ion binding in the two equivalent halves of the palindromic sequence demonstrates that the sequence and its environment act together to bind metal ions. A highly ionophilic half that binds six metal ions allows, for the first time, the observation of a disodium cluster in RNA. The comparison of the equivalent halves of the duplex provides experimental evidence that ion binding correlates with structural alterations and groove contraction.

  15. Sequence- and Interactome-Based Prediction of Viral Protein Hotspots Targeting Host Proteins: A Case Study for HIV Nef

    PubMed Central

    Sarmady, Mahdi; Dampier, William; Tozeren, Aydin

    2011-01-01

    Virus proteins alter protein pathways of the host toward the synthesis of viral particles by breaking and making edges via binding to host proteins. In this study, we developed a computational approach to predict viral sequence hotspots for binding to host proteins based on sequences of viral and host proteins and literature-curated virus-host protein interactome data. We use a motif discovery algorithm repeatedly on collections of sequences of viral proteins and immediate binding partners of their host targets and choose only those motifs that are conserved on viral sequences and highly statistically enriched among binding partners of virus protein targeted host proteins. Our results match experimental data on binding sites of Nef to host proteins such as MAPK1, VAV1, LCK, HCK, HLA-A, CD4, FYN, and GNB2L1 with high statistical significance, but the approach is a poor predictor of Nef binding sites on highly flexible, hoop-like regions. Predicted hotspots recapture CD8 cell epitopes of HIV Nef, highlighting their importance in modulating virus-host interactions. Host proteins potentially targeted or outcompeted by Nef appear to crowd the T cell receptor, natural killer cell mediated cytotoxicity, and neurotrophin signaling pathways. Scanning of HIV Nef motifs on multiple alignments of hepatitis C protein NS5A produces results consistent with the literature, indicating the potential value of the hotspot discovery in advancing our understanding of virus-host crosstalk. PMID:21738584

  16. Microfluidic affinity and ChIP-seq analyses converge on a conserved FOXP2-binding motif in chimp and human, which enables the detection of evolutionarily novel targets.

    PubMed

    Nelson, Christopher S; Fuller, Chris K; Fordyce, Polly M; Greninger, Alexander L; Li, Hao; DeRisi, Joseph L

    2013-07-01

    The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein's DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2's binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved.

  17. Microfluidic affinity and ChIP-seq analyses converge on a conserved FOXP2-binding motif in chimp and human, which enables the detection of evolutionarily novel targets

    PubMed Central

    Nelson, Christopher S.; Fuller, Chris K.; Fordyce, Polly M.; Greninger, Alexander L.; Li, Hao; DeRisi, Joseph L.

    2013-01-01

    The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein’s DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2’s binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved. PMID:23625967

  18. DNABP: Identification of DNA-Binding Proteins Based on Feature Selection Using a Random Forest and Predicting Binding Residues.

    PubMed

    Ma, Xin; Guo, Jing; Sun, Xiao

    2016-01-01

    DNA-binding proteins are fundamentally important in cellular processes. Several computational-based methods have been developed to improve the prediction of DNA-binding proteins in previous years. However, insufficient work has been done on the prediction of DNA-binding proteins from protein sequence information. In this paper, a novel predictor, DNABP (DNA-binding proteins), was designed to predict DNA-binding proteins using the random forest (RF) classifier with a hybrid feature. The hybrid feature contains two types of novel sequence features, which reflect information about the conservation of physicochemical properties of the amino acids, and the binding propensity of DNA-binding residues and non-binding propensities of non-binding residues. The comparisons with each feature demonstrated that these two novel features contributed most to the improvement in predictive ability. Furthermore, to improve the prediction performance of the DNABP model, feature selection using the minimum redundancy maximum relevance (mRMR) method combined with incremental feature selection (IFS) was carried out during the model construction. The results showed that the DNABP model could achieve 86.90% accuracy, 83.76% sensitivity, 90.03% specificity and a Matthews correlation coefficient of 0.727. High prediction accuracy and performance comparisons with previous research suggested that DNABP could be a useful approach to identify DNA-binding proteins from sequence information. The DNABP web server system is freely available at http://www.cbi.seu.edu.cn/DNABP/.

  19. Contacts between the factor TUF and RPG sequences.

    PubMed

    Vignais, M L; Huet, J; Buhler, J M; Sentenac, A

    1990-08-25

    The yeast TUF factor binds specifically to RPG-like sequences involved in multiple functions at enhancers, silencers, and telomeres. We have characterized the interaction of TUF with its optimal binding sequence, rpg-1 (1-ACACCCATACATTT-14), using a gel DNA-binding assay in combination with methylation protection and mutagenesis experiments. As many as 10 base pairs appear to be engaged in factor binding. Analysis of a collection of 30 different RPG mutants demonstrated the importance of 8 base pairs at positions 2, 3, 4, 5, 6, 7, 10, and 12 and the critical role of the central GC pair at position 5. Methylation protection data on four different natural sites confirmed a close contact at positions 4, 5, 6, and 10 and suggested additional contacts at base pairs 8, 12, and 13. The derived consensus sequence was RCAAYCCRYNCAYY. A quantitative band shift analysis was used to determine the equilibrium dissociation constant for the complex of TUF and its optimal binding site rpg-1. The specific dissociation constant (Ks) was found to be 1.3 x 10(-11) M. The comparison of the Ks value with the dissociation constant obtained for nonspecific DNA sites (Kns = 8.7 x 10(-6) M) shows the high binding selectivity of TUF for its specific RPG target.
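
    The selectivity implied by the two dissociation constants quoted above can be made explicit with a short calculation. The ratio uses only the reported values; the conversion to a free-energy difference additionally assumes a temperature of 298.15 K, which is not stated in the abstract, and the variable names are chosen for this example.

```python
import math

K_SPECIFIC = 1.3e-11     # M, dissociation constant for the rpg-1 site (from the abstract)
K_NONSPECIFIC = 8.7e-6   # M, dissociation constant for nonspecific DNA (from the abstract)

GAS_CONSTANT = 1.987204e-3  # kcal / (mol K)
TEMPERATURE = 298.15        # K, assumed for illustration


def selectivity_ratio(k_nonspecific, k_specific):
    """How many times tighter the specific site is bound."""
    return k_nonspecific / k_specific


def binding_energy_gap(k_nonspecific, k_specific, temperature=TEMPERATURE):
    """Free-energy difference RT * ln(Kns / Ks) in kcal/mol."""
    return GAS_CONSTANT * temperature * math.log(k_nonspecific / k_specific)


if __name__ == "__main__":
    print(f"selectivity ratio: {selectivity_ratio(K_NONSPECIFIC, K_SPECIFIC):.3g}")
    print(f"energy gap: {binding_energy_gap(K_NONSPECIFIC, K_SPECIFIC):.2f} kcal/mol")
```

    The ratio comes out near 7 x 10^5, corresponding to a gap of roughly 8 kcal/mol at the assumed temperature.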

  20. High-Mobility Group Chromatin Proteins 1 and 2 Functionally Interact with Steroid Hormone Receptors To Enhance Their DNA Binding In Vitro and Transcriptional Activity in Mammalian Cells

    PubMed Central

    Boonyaratanakornkit, Viroj; Melvin, Vida; Prendergast, Paul; Altmann, Magda; Ronfani, Lorenza; Bianchi, Marco E.; Taraseviciene, Laima; Nordeen, Steven K.; Allegretto, Elizabeth A.; Edwards, Dean P.

    1998-01-01

    We previously reported that the chromatin high-mobility group protein 1 (HMG-1) enhances the sequence-specific DNA binding activity of progesterone receptor (PR) in vitro, thus providing the first evidence that HMG-1 may have a coregulatory role in steroid receptor-mediated gene transcription. Here we show that HMG-1 and the highly related HMG-2 stimulate DNA binding by other steroid receptors, including estrogen, androgen, and glucocorticoid receptors, but have no effect on DNA binding by several nonsteroid nuclear receptors, including retinoic acid receptor (RAR), retinoid X receptor (RXR), and vitamin D receptor (VDR). As highly purified recombinant full-length proteins, all steroid receptors tested exhibited weak binding affinity for their optimal palindromic hormone response elements (HREs), and the addition of purified HMG-1 or -2 substantially increased their affinity for HREs. Purified RAR, RXR, and VDR also exhibited little to no detectable binding to their cognate direct repeat HREs but, in contrast to results with steroid receptors, the addition of HMG-1 or HMG-2 had no stimulatory effect. Instead, the addition of purified RXR enhanced RAR and VDR DNA binding through a heterodimerization mechanism and HMG-1 or HMG-2 had no further effect on DNA binding by RXR-RAR or RXR-VDR heterodimers. HMG-1 and HMG-2 (HMG-1/-2) themselves do not bind to progesterone response elements, but in the presence of PR they were detected as part of an HMG-PR-DNA ternary complex. HMG-1/-2 can also interact transiently in vitro with PR in the absence of DNA; however, no direct protein interaction was detected with VDR. These results, taken together with the fact that PR can bend its target DNA and that HMG-1/-2 are non-sequence-specific DNA binding proteins that recognize DNA structure, suggest that HMG-1/-2 are recruited to the PR-DNA complex by the combined effect of transient protein interaction and DNA bending. In transient-transfection assays, coexpression of HMG-1 or HMG-2 increased PR-mediated transcription in mammalian cells by as much as 7- to 10-fold without altering the basal promoter activity of target reporter genes. This increase in PR-mediated gene activation by coexpression of HMG-1/-2 was observed in different cell types and with different target promoters, suggesting a generality to the functional interaction between HMG-1/-2 and PR in vivo. Cotransfection of HMG-1 also increased reporter gene activation mediated by other steroid receptors, including glucocorticoid and androgen receptors, but it had a minimal influence on VDR-dependent transcription in vivo. These results support the conclusion that HMG-1/-2 are coregulatory proteins that increase the DNA binding and transcriptional activity of the steroid hormone class of receptors but that do not functionally interact with certain nonsteroid classes of nuclear receptors. PMID:9671457

  1. Structural and Thermodynamic Signatures of DNA Recognition by Mycobacterium tuberculosis DnaA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tsodikov, Oleg V.; Biswas, Tapan

    An essential protein, DnaA, binds to 9-bp DNA sites within the origin of replication oriC. These binding events are prerequisite to forming an enigmatic nucleoprotein scaffold that initiates replication. The number, sequences, positions, and orientations of these short DNA sites, or DnaA boxes, within the oriCs of different bacteria vary considerably. To investigate features of DnaA boxes that are important for binding Mycobacterium tuberculosis DnaA (MtDnaA), we have determined the crystal structures of the DNA binding domain (DBD) of MtDnaA bound to a cognate MtDnaA-box (at 2.0 Å resolution) and to a consensus Escherichia coli DnaA-box (at 2.3 Å). These structures, complemented by calorimetric equilibrium binding studies of MtDnaA DBD in a series of DnaA-box variants, reveal the main determinants of DNA recognition and establish the [T/C][T/A][G/A]TCCACA sequence as a high-affinity MtDnaA-box. Bioinformatic and calorimetric analyses indicate that DnaA-box sequences in mycobacterial oriCs generally differ from the optimal binding sequence. This sequence variation occurs commonly at the first 2 bp, making an in vivo mycobacterial DnaA-box effectively a 7-mer and not a 9-mer. We demonstrate that the decrease in the affinity of these MtDnaA-box variants for MtDnaA DBD relative to that of the highest-affinity box TTGTCCACA is less than 10-fold. The understanding of DnaA-box recognition by MtDnaA and E. coli DnaA enables one to map DnaA-box sequences in the genomes of M. tuberculosis and other eubacteria.
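
    The mapping step mentioned at the end of this abstract can be illustrated with a short sketch. It assumes only the consensus quoted above, [T/C][T/A][G/A]TCCACA, written as a regular expression; the function names and the toy input are hypothetical, and this is not the authors' own procedure.

```python
import re

# Consensus quoted in the abstract: [T/C][T/A][G/A]TCCACA.
# The lookahead lets overlapping candidate sites be reported as well.
DNAA_BOX = re.compile(r"(?=([TC][TA][GA]TCCACA))")
BOX_LENGTH = 9

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_dnaa_boxes(seq):
    """Return sorted (start, strand, site) tuples for consensus matches.

    start is 0-based on the forward strand; site is always written
    5'->3' on the strand where it matched.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in DNAA_BOX.finditer(seq)]
    rc = reverse_complement(seq)
    for m in DNAA_BOX.finditer(rc):
        # Convert the reverse-strand offset back to forward coordinates.
        start = len(seq) - (m.start() + BOX_LENGTH)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence containing the highest-affinity box TTGTCCACA.
    print(find_dnaa_boxes("GGTTGTCCACAGG"))
```

    Because the abstract notes that in vivo boxes often vary at the first 2 bp, a looser scan could relax those two positions; that choice is left to the reader.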

  2. Structure and Function of Lipopolysaccharide Binding Protein

    NASA Astrophysics Data System (ADS)

    Schumann, Ralf R.; Leong, Steven R.; Flaggs, Gail W.; Gray, Patrick W.; Wright, Samuel D.; Mathison, John C.; Tobias, Peter S.; Ulevitch, Richard J.

    1990-09-01

    The primary structure of lipopolysaccharide binding protein (LBP), a trace plasma protein that binds to the lipid A moiety of bacterial lipopolysaccharides (LPSs), was deduced by sequencing cloned complementary DNA. LBP shares sequence identity with another LPS binding protein found in granulocytes, bactericidal/permeability-increasing protein, and with cholesterol ester transport protein of the plasma. LBP may control the response to LPS under physiologic conditions by forming high-affinity complexes with LPS that bind to monocytes and macrophages, which then secrete tumor necrosis factor. The identification of this pathway for LPS-induced monocyte stimulation may aid in the development of treatments for diseases in which Gram-negative sepsis or endotoxemia are involved.

  3. Accumulation, internalization and therapeutic efficacy of neuropilin-1-targeted liposomes

    PubMed Central

    Paoli, Eric E.; Ingham, Elizabeth S.; Zhang, Hua; Mahakian, Lisa M.; Fite, Brett Z.; Gagnon, M. Karen; Tam, Sarah; Kheirolomoom, Azadeh; Cardiff, Robert D.; Ferrara, Katherine W.

    2014-01-01

    Advancements in liposomal drug delivery have produced long circulating and very stable drug formulations. These formulations minimize systemic exposure; however, unfortunately, therapeutic efficacy has remained limited due to the slow diffusion of liposomal particles within the tumor and limited release or uptake of the encapsulated drug. Here, the carboxyl-terminated CRPPR peptide, with affinity for the receptor neuropilin-1 (NRP), which is expressed on both endothelial and cancer cells, was conjugated to liposomes to enhance the tumor accumulation. Using a pH sensitive probe, liposomes were optimized for specific NRP binding and subsequent cellular internalization using in vitro cellular assays. Liposomes conjugated with the carboxyl-terminated CRPPR peptide (termed C-LPP liposomes) bound to the NRP-positive primary prostatic carcinoma cell line (PPC-1) but did not bind to the NRP-negative PC-3 cell line, and binding was observed with liposomal peptide concentrations as low as 0.16 mol%. Binding of the C-LPP liposomes was receptor-limited, with saturation observed at high liposome concentrations. The identical peptide sequence bearing an amide terminus did not bind specifically, accumulating only with a high (2.5 mol%) peptide concentration and adhering equally to NRP positive and negative cell lines. The binding of C-LPP liposomes conjugated with 0.63 mol% of the peptide was 83-fold greater than liposomes conjugated with the amide version of the peptide. Cellular internalization was also enhanced with C-LPP liposomes, with 80% internalized following 3hr incubation. Additionally, fluorescence in the blood pool (~40% of the injected dose) was similar for liposomes conjugated with 0.63 mol% of carboxyl-terminated peptide and non-targeted liposomes at 24 hr after injection, indicating stable circulation. 
Prior to doxorubicin treatment, in vivo tumor accumulation and vascular targeting were increased for peptide-conjugated liposomes compared to non-targeted liposomes based on confocal imaging of a fluorescent cargo, and the availability of the vascular receptor was confirmed with ultrasound molecular imaging. Finally, over a 4-week course of therapy, tumor knockdown resulting from doxorubicin-loaded, C-LPP liposomes was similar to non-targeted liposomes in syngeneic tumor-bearing FVB mice and C-LPP liposomes reduced doxorubicin accumulation in the skin and heart and eliminated skin toxicity. Taken together, our results demonstrate that a carboxyl-terminated RXXR peptide sequence, conjugated to liposomes at a concentration of 0.63 mol%, retains long circulation but enhances binding and internalization, and reduces toxicity. PMID:24434424

  4. RCK: accurate and efficient inference of sequence- and structure-based protein-RNA binding models from RNAcompete data.

    PubMed

    Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie

    2016-06-15

    Protein-RNA interactions, which play vital roles in many processes, are mediated through both RNA sequence and structure. CLIP-based methods, which measure protein-RNA binding in vivo, suffer from experimental noise and systematic biases, whereas in vitro experiments capture a clearer signal of protein-RNA binding. Among them, RNAcompete provides binding affinities of a specific protein to more than 240 000 unstructured RNA probes in one experiment. The computational challenge is to infer RNA structure- and sequence-based binding models from these data. The state-of-the-art in sequence models, Deepbind, does not model structural preferences. RNAcontext models both sequence and structure preferences, but is outperformed by GraphProt. Unfortunately, GraphProt cannot detect structural preferences from RNAcompete data due to the unstructured nature of the data, as noted by its developers, nor can it be tractably run on the full RNAcompete dataset. We develop RCK, an efficient, scalable algorithm that infers both sequence and structure preferences based on a new k-mer based model. Remarkably, even though RNAcompete data is designed to be unstructured, RCK can still learn structural preferences from it. RCK significantly outperforms both RNAcontext and Deepbind in in vitro binding prediction for 244 RNAcompete experiments. Moreover, RCK is also faster and uses less memory, which enables scalability. While currently on par with existing methods in in vivo binding prediction on a small scale test, we demonstrate that RCK will increasingly benefit from experimentally measured RNA structure profiles as compared to computationally predicted ones. By running RCK on the entire RNAcompete dataset, we generate and provide as a resource a set of protein-RNA structure-based models on an unprecedented scale. Software and models are freely available at http://rck.csail.mit.edu/. Contact: bab@mit.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  5. p53 Specifically Binds Triplex DNA In Vitro and in Cells

    PubMed Central

    Brázdová, Marie; Tichý, Vlastimil; Helma, Robert; Bažantová, Pavla; Polášková, Alena; Krejčí, Aneta; Petr, Marek; Navrátilová, Lucie; Tichá, Olga; Nejedlý, Karel; Bennink, Martin L.; Subramaniam, Vinod; Bábková, Zuzana; Martínek, Tomáš; Lexa, Matej; Adámik, Matej

    2016-01-01

    Triplex DNA is implicated in a wide range of biological activities, including regulation of gene expression and genomic instability leading to cancer. The tumor suppressor p53 is a central regulator of cell fate in response to different types of insults. Sequence and structure specific modes of DNA recognition are core attributes of the p53 protein. The focus of this work is the structure-specific binding of p53 to DNA containing triplex-forming sequences in vitro and in cells and the effect on p53-driven transcription. This is the first DNA binding study of full-length p53 and its deletion variants to both intermolecular and intramolecular T.A.T triplexes. We demonstrate that the interaction of p53 with intermolecular T.A.T triplex is comparable to the recognition of CTG-hairpin non-B DNA structure. Using deletion mutants we determined the C-terminal DNA binding domain of p53 to be crucial for triplex recognition. Furthermore, strong p53 recognition of intramolecular T.A.T triplexes (H-DNA), stabilized by negative superhelicity in plasmid DNA, was detected by competition and immunoprecipitation experiments, and visualized by AFM. Moreover, chromatin immunoprecipitation revealed p53 binding to a T.A.T-forming sequence in vivo. Enhanced reporter transactivation by p53 on insertion of a triplex-forming sequence into a plasmid with the p53 consensus sequence was observed by luciferase reporter assays. In-silico scan of human regulatory regions for the simultaneous presence of both consensus sequence and T.A.T motifs identified a set of candidate p53 target genes and p53-dependent activation of several of them (ABCG5, ENOX1, INSR, MCC, NFAT5) was confirmed by RT-qPCR. Our results show that T.A.T triplex comprises a new class of p53 binding sites targeted by p53 in a DNA structure-dependent mode in vitro and in cells. The contribution of p53 DNA structure-dependent binding to the regulation of transcription is discussed. PMID:27907175

  6. Characterization of the Organic Component of Low-Molecular-Weight Chromium-Binding Substance and Its Binding of Chromium

    PubMed Central

    Chen, Yuan; Watson, Heather M.; Gao, Junjie; Sinha, Sarmistha Halder; Cassady, Carolyn J.; Vincent, John B.

    2011-01-01

    Chromium was proposed to be an essential element over 50 y ago and was shown to have therapeutic potential in treating the symptoms of type 2 diabetes; however, its mechanism of action at a molecular level is unknown. One chromium-binding biomolecule, low-molecular weight chromium-binding substance (LMWCr or chromodulin), has been found to be biologically active in in vitro assays and proposed as a potential candidate for the in vivo biologically active form of chromium. Characterization of the organic component of LMWCr has proven difficult. Treating bovine LMWCr with trifluoroacetic acid followed by purification on a graphite powder micro-column generates a heptapeptide fragment of LMWCr. The peptide sequence of the fragment was analyzed by MS and tandem MS (MS/MS and MS/MS/MS) using collision-induced dissociation and post-source decay. Two candidate sequences, pEEEEGDD and pEEEGEDD (where pE is pyroglutamate), were identified from the MS/MS experiments; additional tandem MS suggests the sequence is pEEEEGDD. The N-terminal glutamate residues explain the inability to sequence LMWCr by the Edman method. Langmuir isotherms and Hill plots were used to analyze the binding constants of chromic ions to synthetic peptides similar in composition to apoLMWCr. The sequence pEEEEGDD was found to bind 4 chromic ions per peptide with nearly identical cooperativity and binding constants to those of apoLMWCr. This work should lead to further studies elucidating or eliminating a potential role for LMWCr in treating the symptoms of type 2 diabetes and other conditions resulting from improper carbohydrate and lipid metabolism. PMID:21593351
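
    The Hill-plot analysis named in this abstract can be sketched as follows. The sketch assumes the standard Hill form, theta = c^n / (K^n + c^n), so that log10(theta / (1 - theta)) is linear in log10(c) with slope n. The function names are hypothetical and the data in the usage example are synthetic, generated from the equation itself; they are not measurements from the study.

```python
import math


def hill_fraction(conc, k_half, n):
    """Fractional saturation under the Hill equation."""
    return conc ** n / (k_half ** n + conc ** n)


def hill_plot_fit(concs, thetas):
    """Fit a line to (log10 c, log10(theta / (1 - theta))).

    Returns (n, k_half): the slope is the Hill coefficient n, and the
    intercept equals -n * log10(K), which gives the half-saturation
    concentration K. Requires 0 < theta < 1 for every point.
    """
    xs = [math.log10(c) for c in concs]
    ys = [math.log10(t / (1.0 - t)) for t in thetas]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k_half = 10 ** (-intercept / slope)
    return slope, k_half


if __name__ == "__main__":
    # Synthetic, noise-free points (illustrative values only).
    concs = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
    thetas = [hill_fraction(c, k_half=2.0, n=1.5) for c in concs]
    print(hill_plot_fit(concs, thetas))
```

    With real titration data the points near theta = 0 or 1 are noisy on this plot, so a nonlinear fit of the Hill equation is often preferred; the linear form is shown only because the abstract refers to Hill plots.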

  7. Regulation of Bacteria-Induced Intercellular Adhesion Molecule-1 by CCAAT/Enhancer Binding Proteins

    PubMed Central

    Manzel, Lori J.; Chin, Cecilia L.; Behlke, Mark A.; Look, Dwight C.

    2009-01-01

    Direct interaction between bacteria and epithelial cells may initiate or amplify the airway response through induction of epithelial defense gene expression by nuclear factor-κB (NF-κB). However, multiple signaling pathways modify NF-κB effects to modulate gene expression. In this study, the effects of CCAAT/enhancer binding protein (C/EBP) family members on induction of the leukocyte adhesion glycoprotein intercellular adhesion molecule-1 (ICAM-1) were examined in primary cultures of human tracheobronchial epithelial cells incubated with nontypeable Haemophilus influenzae. Increased ICAM-1 gene transcription in response to H. influenzae required gene sequences located at −200 to −135 in the 5′-flanking region that contain a C/EBP-binding sequence immediately upstream of the NF-κB enhancer site. Constitutive C/EBPβ was found to have an important role in epithelial cell ICAM-1 regulation, while the adjacent NF-κB sequence binds the RelA/p65 and NF-κB1/p50 members of the NF-κB family to induce ICAM-1 expression in response to H. influenzae. The expression of C/EBP proteins is not regulated by p38 mitogen-activated protein kinase activation, but p38 affects gene transcription by increasing the binding of TATA-binding protein to TATA-box–containing gene sequences. Epithelial cell ICAM-1 expression in response to H. influenzae was decreased by expressing dominant-negative protein or RNA interference against C/EBPβ, confirming its role in ICAM-1 regulation. Although airway epithelial cells express multiple constitutive and inducible C/EBP family members that bind C/EBP sequences, the results indicate that C/EBPβ plays a central role in modulation of NF-κB–dependent defense gene expression in human airway epithelial cells after exposure to H. influenzae. PMID:18703796

  8. Genome-wide identification and characterization of Notch transcription complex-binding sequence paired sites in leukemia cells

    PubMed Central

    Severson, Eric; Arnett, Kelly L.; Wang, Hongfang; Zang, Chongzhi; Taing, Len; Liu, Hudan; Pear, Warren S.; Liu, X. Shirley; Blacklow, Stephen C.; Aster, Jon C.

    2018-01-01

    Notch transcription complexes (NTCs) drive target gene expression by binding to two distinct types of genomic response elements, NTC monomer-binding sites and sequence-paired sites (SPSs) that bind NTC dimers. SPSs are conserved and are linked to the Notch-responsiveness of a few genes, but their overall contribution to Notch-dependent gene regulation is unknown. To address this issue, we determined the DNA sequence requirements for NTC dimerization using a fluorescence resonance energy transfer (FRET) assay, and applied insights from these in vitro studies to Notch-“addicted” leukemia cells. We find that SPSs contribute to the regulation of approximately a third of direct Notch target genes. While originally described in promoters, SPSs are present mainly in long-range enhancers, including an enhancer containing a newly described SPS that regulates HES5. Our work provides a general method for identifying sequence-paired sites in genome-wide data sets and highlights the widespread role of NTC dimerization in Notch-transformed leukemia cells. PMID:28465412

  9. Recombinant soluble adenovirus receptor

    DOEpatents

    Freimuth, Paul I.

    2002-01-01

    Disclosed are isolated polypeptides from human CAR (coxsackievirus and adenovirus receptor) protein which bind adenovirus. Specifically disclosed are amino acid sequences which correspond to adenovirus binding domain D1 and the entire extracellular domain of human CAR protein comprising D1 and D2. In other aspects, the disclosure relates to nucleic acid sequences encoding these domains as well as expression vectors which encode the domains and bacterial cells containing such vectors. Also disclosed is an isolated fusion protein comprised of the D1 polypeptide sequence fused to a polypeptide sequence which facilitates folding of D1 into a functional, soluble domain when expressed in bacteria. The functional D1 domain finds application, for example, in a therapeutic method for treating a patient infected with a virus which binds to D1, and also in a method for identifying an antiviral compound which interferes with viral attachment. Also included is a method for specifically targeting a cell for infection by a virus which binds to D1.

  10. [Screening specific recognition motif of RNA-binding proteins by SELEX in combination with next-generation sequencing technique].

    PubMed

    Zhang, Lu; Xu, Jinhao; Ma, Jinbiao

    2016-07-25

    RNA-binding proteins exert important biological functions by specifically recognizing RNA motifs. SELEX (systematic evolution of ligands by exponential enrichment), an in vitro selection method, can obtain consensus motifs with high affinity and specificity for many target molecules from DNA or RNA libraries. Here, we combined SELEX with next-generation sequencing to study protein-RNA interactions in vitro. A pool of RNAs with 20 bp random sequences was transcribed from a T7 promoter, and the target protein was cloned into a plasmid containing an SBP tag, which can be captured by streptavidin beads. With only one cycle, the specific RNA motif could be obtained, which dramatically improved the selection efficiency. Using this method, we found that the human hnRNP A1 RRMs domain (UP1 domain) bound RNA motifs containing AGG and AG sequences. The EMSA experiment indicated that hnRNP A1 RRMs could bind the obtained RNA motif. Taken together, this approach provides a rapid and effective way to study the RNA binding specificity of proteins.
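
    One common way to read a motif out of SELEX sequencing data is to compare k-mer frequencies between the selected pool and the input pool. The abstract does not describe its analysis, so the sketch below is only an assumed, generic approach; the function names, the pseudocount, and the toy reads are all hypothetical.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every overlapping k-mer across a list of reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_enrichment(selected, background, k, pseudocount=1.0):
    """Return {kmer: frequency ratio of selected over background}.

    A pseudocount is added to every k-mer seen in either pool so that
    k-mers absent from one pool do not cause division by zero.
    """
    sel = kmer_counts(selected, k)
    bg = kmer_counts(background, k)
    kmers = set(sel) | set(bg)
    sel_total = sum(sel.values()) + pseudocount * len(kmers)
    bg_total = sum(bg.values()) + pseudocount * len(kmers)
    return {
        kmer: ((sel[kmer] + pseudocount) / sel_total)
        / ((bg[kmer] + pseudocount) / bg_total)
        for kmer in kmers
    }


if __name__ == "__main__":
    # Toy reads, invented for illustration; not data from the study.
    selected = ["UAGGGA", "CAGGUU", "AAGGCA"]
    background = ["UCUCUA", "CAUUGU", "ACGUCA"]
    scores = kmer_enrichment(selected, background, k=3)
    print(max(scores, key=scores.get))
```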

  11. Changes in tau phosphorylation in hibernating rodents.

    PubMed

    León-Espinosa, Gonzalo; García, Esther; García-Escudero, Vega; Hernández, Félix; Defelipe, Javier; Avila, Jesús

    2013-07-01

    Tau is a cytoskeletal protein present mainly in the neurons of vertebrates. By comparing the sequence of tau molecule among different vertebrates, it was found that the variability of the N-terminal sequence in tau protein is higher than that of the C-terminal region. The N-terminal region is involved mainly in the binding of tau to cellular membranes, whereas the C-terminal region of the tau molecule contains the microtubule-binding sites. We have compared the sequence of Syrian hamster tau with the sequences of other hibernating and nonhibernating rodents and investigated how differences in the N-terminal region of tau could affect the phosphorylation level and tau binding to cell membranes. We also describe a change, in tau phosphorylation, on a casein kinase 1 (ck1)-dependent site that is found only in hibernating rodents. This ck1 site seems to play an important role in the regulation of tau binding to membranes. Copyright © 2013 Wiley Periodicals, Inc.

  12. MicroRNAs Form Triplexes with Double Stranded DNA at Sequence-Specific Binding Sites; a Eukaryotic Mechanism via which microRNAs Could Directly Alter Gene Expression

    PubMed Central

    Grace, Christy R.; Ferreira, Antonio M.; Waddell, M. Brett; Ridout, Granger; Naeve, Deanna; Leuze, Michael; LoCascio, Philip F.; Panetta, John C.; Wilkinson, Mark R.; Pui, Ching-Hon; Naeve, Clayton W.; Uberbacher, Edward C.; Bonten, Erik J.; Evans, William E.

    2016-01-01

    MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA) and typically down-regulating their stability or translation. Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence (i.e., NMR, FRET, SPR) that purine or pyrimidine-rich microRNAs of appropriate length and sequence form triple-helical structures with purine-rich sequences of duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3 fold, p < 2.2 × 10^-16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs could interact with gene promoter regions to modify gene transcription. PMID:26844769

  13. Impact of cadmium, cobalt and nickel on sequence-specific DNA binding of p63 and p73 in vitro and in cells

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Adámik, Matej; Bažantová, Pavla; Department of Biology and Ecology, Faculty of Science, University of Ostrava, Chittussiho 10, 701 03 Ostrava

    Highlights: • DNA binding of p53 family core domains is inhibited by cadmium, cobalt and nickel. • Binding to DNA protects p53 family core domains from metal induced inhibition. • Cadmium, cobalt and nickel induced inhibition was reverted by EDTA in vitro. - Abstract: Site-specific DNA recognition and binding activity belong to common attributes of all three members of tumor suppressor p53 family proteins: p53, p63 and p73. It was previously shown that heavy metals can affect p53 conformation, sequence-specific binding and suppress p53 response to DNA damage. Here we report for the first time that cadmium, nickel and cobalt, which have already been shown to disturb various DNA repair mechanisms, can also influence p63 and p73 sequence-specific DNA binding activity and transactivation of p53 family target genes. Based on results of electrophoretic mobility shift assay and luciferase reporter assay, we conclude that cadmium inhibits sequence-specific binding of all three core domains to p53 consensus sequences and abolishes transactivation of several promoters (e.g. BAX and MDM2) by 50 μM concentrations. In the presence of specific DNA, all p53 family core domains were partially protected against loss of DNA binding activity due to cadmium treatment. Effective cadmium concentration to abolish DNA–protein interactions was about two times higher for p63 and p73 proteins than for p53. Furthermore, we detected partial reversibility of cadmium inhibition for all p53 family members by EDTA. DTT was able to reverse cadmium inhibition only for p53 and p73. Nickel and cobalt abolished DNA–p53 interaction at sub-millimolar concentrations while inhibition of p63 and p73 DNA binding was observed at millimolar concentrations. In summary, cadmium strongly inhibits p53, p63 and p73 DNA binding in vitro and in cells in comparison to nickel and cobalt. The role of cadmium inhibition of p53 tumor suppressor family in carcinogenesis is discussed.

  14. Selection of peptides binding to metallic borides by screening M13 phage display libraries.

    PubMed

    Ploss, Martin; Facey, Sandra J; Bruhn, Carina; Zemel, Limor; Hofmann, Kathrin; Stark, Robert W; Albert, Barbara; Hauer, Bernhard

    2014-02-10

    Metal borides are a class of inorganic solids that is much less known and investigated than, for example, metal oxides or intermetallics. At the same time it is a highly versatile and interesting class of compounds in terms of physical and chemical properties, like semiconductivity, ferromagnetism, or catalytic activity. This makes these substances attractive for the generation of new materials. Very little is known about the interaction between organic materials and borides. To generate nanostructured and composite materials which consist of metal borides and organic modifiers it is necessary to develop new synthetic strategies. Phage peptide display libraries are commonly used to select peptides that bind specifically to metals, metal oxides, and semiconductors. Further, these binding peptides can serve as templates to control the nucleation and growth of inorganic nanoparticles. Additionally, the combination of two different binding motifs into a single bifunctional phage could be useful for the generation of new composite materials. In this study, we have identified a unique set of sequences that bind to amorphous and crystalline nickel boride (Ni3B) nanoparticles, from a random peptide library using the phage display technique. Using this technique, strong binders were identified that are selective for nickel boride. Sequence analysis of the peptides revealed that the sequences exhibit similar, yet subtly different patterns of amino acid usage. Although a predominant binding motif was not observed, certain charged amino acids emerged as essential in specific binding to both substrates. The 7-mer peptide sequence LGFREKE, isolated on amorphous Ni3B, emerged as the best binder for both substrates. Fluorescence microscopy and atomic force microscopy confirmed the specific binding affinity of LGFREKE-expressing phage to amorphous and crystalline Ni3B nanoparticles. 
This study is, to our knowledge, the first to identify peptides that bind specifically to amorphous and to crystalline Ni3B nanoparticles. We think that the identified strong binding sequences described here could potentially serve for the utilisation of M13 phage as a viable alternative to other methods to create tailor-made boride composite materials or new catalytic surfaces by a biologically driven nano-assembly synthesis and structuring.

  15. Location analysis for the estrogen receptor-α reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements

    PubMed Central

    Mason, Christopher E.; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M.; Kallen, Roland G.; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B.

    2010-01-01

    Location analysis for estrogen receptor-α (ERα)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERα-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: <10% and 10–20% nucleotide deviation from the canonical ERE sequence. We demonstrate that ∼50% of all ERα-bound loci do not have a discernable ERE and show that most ERα-bound EREs are not perfect consensus EREs. Approximately one-third of all ERα-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERα-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERα binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers. PMID:20047966
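
    The two detection stringencies described above can be illustrated with a small scoring sketch. It assumes the widely cited canonical ERE, GGTCAnnnTGACC, and scores deviation as the fraction of mismatches over the 10 specified half-site positions; how the authors computed percent deviation is not stated in the abstract, so both the scoring rule and the function names are assumptions.

```python
# Canonical palindromic ERE; N marks the 3-bp spacer, which is not scored.
CANONICAL_ERE = "GGTCANNNTGACC"


def ere_deviation(site):
    """Fraction of half-site positions that differ from the canonical ERE."""
    site = site.upper()
    if len(site) != len(CANONICAL_ERE):
        raise ValueError("site must be 13 nt long")
    scored = [(c, s) for c, s in zip(CANONICAL_ERE, site) if c != "N"]
    mismatches = sum(1 for c, s in scored if c != s)
    return mismatches / len(scored)


def stringency_class(site):
    """Bin a site into the two stringencies named in the abstract."""
    deviation = ere_deviation(site)
    if deviation < 0.10:
        return "<10%"
    if deviation <= 0.20:
        return "10-20%"
    return ">20%"


def has_cwg_spacer(site):
    """True if the 3-bp spacer is C(A/T)G, the enriched pattern reported."""
    return site.upper()[5:8] in ("CAG", "CTG")


if __name__ == "__main__":
    for candidate in ("GGTCACAGTGACC", "AGTCACTGTGACA", "AATCAGGGTGACA"):
        print(candidate, stringency_class(candidate), has_cwg_spacer(candidate))
```

    Under this scoring, a 13-nt site with a perfect pair of half-sites falls in the "<10%" bin and one or two half-site mismatches fall in the "10-20%" bin.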

  16. Location analysis for the estrogen receptor-alpha reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements.

    PubMed

    Mason, Christopher E; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M; Kallen, Roland G; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B

    2010-04-01

    Location analysis for estrogen receptor-alpha (ERalpha)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERalpha-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: <10% and 10-20% nucleotide deviation from the canonical ERE sequence. We demonstrate that approximately 50% of all ERalpha-bound loci do not have a discernable ERE and show that most ERalpha-bound EREs are not perfect consensus EREs. Approximately one-third of all ERalpha-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERalpha-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERalpha binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers.

  17. Improved pan-specific MHC class I peptide-binding predictions using a novel representation of the MHC-binding cleft environment.

    PubMed

    Carrasco Pro, S; Zimic, M; Nielsen, M

    2014-02-01

    Major histocompatibility complex (MHC) molecules play a key role in cell-mediated immune responses by presenting bound peptides for recognition by immune system cells. Several in silico methods have been developed to predict the binding affinity of a given peptide to a specific MHC molecule. One of the current state-of-the-art methods for MHC class I is NetMHCpan, which has as a core ingredient the representation of the MHC class I molecule using a pseudo-sequence representation of the binding cleft amino acid environment. New and large MHC-peptide-binding data sets are constantly being made available, and also new structures of MHC class I molecules with a bound peptide have been published. In order to test if the NetMHCpan method can be improved by integrating this novel information, we created new pseudo-sequence definitions for the MHC-binding cleft environment from sequence and structural analyses of different MHC data sets including human leukocyte antigen (HLA), non-human primates (chimpanzee, macaque and gorilla) and other animal alleles (cattle, mouse and swine). From these constructs, we showed that by focusing on MHC sequence positions found to be polymorphic across the MHC molecules used to train the method, the NetMHCpan method achieved a significant increase in the predictive performance, in particular, of non-human MHCs. This study hence showed that an improved performance of MHC-binding methods can be achieved not only by the accumulation of more MHC-peptide-binding data but also by a refined definition of the MHC-binding environment including information from non-human species. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  18. Cysteine-containing peptide tag for site-specific conjugation of proteins

    DOEpatents

    Backer, Marina V.; Backer, Joseph M.

    2008-04-08

    The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety bound to the targeting moiety; the biological conjugate having a covalent bond between the thiol group of SEQ ID NO:2 and a functional group in the binding moiety. The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety that comprises an adapter protein, the adapter protein having a thiol group; the biological conjugate having a disulfide bond between the thiol group of SEQ ID NO:2 and the thiol group of the adapter protein. The present invention is also directed to biological sequences employed in the above biological conjugates, as well as pharmaceutical preparations and methods using the above biological conjugates.

  19. Cysteine-containing peptide tag for site-specific conjugation of proteins

    DOEpatents

    Backer, Marina V.; Backer, Joseph M.

    2010-10-05

    The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety bound to the targeting moiety; the biological conjugate having a covalent bond between the thiol group of SEQ ID NO:2 and a functional group in the binding moiety. The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety that comprises an adapter protein, the adapter protein having a thiol group; the biological conjugate having a disulfide bond between the thiol group of SEQ ID NO:2 and the thiol group of the adapter protein. The present invention is also directed to biological sequences employed in the above biological conjugates, as well as pharmaceutical preparations and methods using the above biological conjugates.

  20. Sequence-specific DNA binding Pyrrole-imidazole polyamides and their applications.

    PubMed

    Kawamoto, Yusuke; Bando, Toshikazu; Sugiyama, Hiroshi

    2018-05-01

    Pyrrole-imidazole polyamides (Py-Im polyamides) are cell-permeable compounds that bind to the minor groove of double-stranded DNA in a sequence-specific manner without causing denaturation of the DNA. These compounds can be used to control gene expression and to stain specific sequences in cells. Here, we review the history, structural variations, and functional investigations of Py-Im polyamides. Copyright © 2018 Elsevier Ltd. All rights reserved.

  1. Endogenous Hot Spots of De Novo Telomere Addition in the Yeast Genome Contain Proximal Enhancers That Bind Cdc13

    PubMed Central

    Obodo, Udochukwu C.; Epum, Esther A.; Platts, Margaret H.; Seloff, Jacob; Dahlson, Nicole A.; Velkovsky, Stoycho M.; Paul, Shira R.

    2016-01-01

    DNA double-strand breaks (DSBs) pose a threat to genome stability and are repaired through multiple mechanisms. Rarely, telomerase, the enzyme that maintains telomeres, acts upon a DSB in a mutagenic process termed telomere healing. The probability of telomere addition is increased at specific genomic sequences termed sites of repair-associated telomere addition (SiRTAs). By monitoring repair of an induced DSB, we show that SiRTAs on chromosomes V and IX share a bipartite structure in which a core sequence (Core) is directly targeted by telomerase, while a proximal sequence (Stim) enhances the probability of de novo telomere formation. The Stim and Core sequences are sufficient to confer a high frequency of telomere addition to an ectopic site. Cdc13, a single-stranded DNA binding protein that recruits telomerase to endogenous telomeres, is known to stimulate de novo telomere addition when artificially recruited to an induced DSB. Here we show that the ability of the Stim sequence to enhance de novo telomere addition correlates with its ability to bind Cdc13, indicating that natural sites at which telomere addition occurs at high frequency require binding by Cdc13 to a sequence 20 to 100 bp internal from the site at which telomerase acts to initiate de novo telomere addition. PMID:27044869

  2. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.

  3. Human Lineage-Specific Transcriptional Regulation through GA-Binding Protein Transcription Factor Alpha (GABPa)

    PubMed Central

    Perdomo-Sabogal, Alvaro; Nowick, Katja; Piccini, Ilaria; Sudbrak, Ralf; Lehrach, Hans; Yaspo, Marie-Laure; Warnatz, Hans-Jörg; Querfurth, Robert

    2016-01-01

    A substantial fraction of phenotypic differences between closely related species are likely caused by differences in gene regulation. While this has already been postulated over 30 years ago, only few examples of evolutionary changes in gene regulation have been verified. Here, we identified and investigated binding sites of the transcription factor GA-binding protein alpha (GABPa) aiming to discover cis-regulatory adaptations on the human lineage. By performing chromatin immunoprecipitation-sequencing experiments in a human cell line, we found 11,619 putative GABPa binding sites. Through sequence comparisons of the human GABPa binding regions with orthologous sequences from 34 mammals, we identified substitutions that have resulted in 224 putative human-specific GABPa binding sites. To experimentally assess the transcriptional impact of those substitutions, we selected four promoters for promoter-reporter gene assays using human and African green monkey cells. We compared the activities of wild-type promoters to mutated forms, where we have introduced one or more substitutions to mimic the ancestral state devoid of the GABPa consensus binding sequence. Similarly, we introduced the human-specific substitutions into chimpanzee and macaque promoter backgrounds. Our results demonstrate that the identified substitutions are functional, both in human and nonhuman promoters. In addition, we performed GABPa knock-down experiments and found 1,215 genes as strong candidates for primary targets. Further analyses of our data sets link GABPa to cognitive disorders, diabetes, KRAB zinc finger (KRAB-ZNF), and human-specific genes. Thus, we propose that differences in GABPa binding sites played important roles in the evolution of human-specific phenotypes. PMID:26814189

  4. RNABindRPlus: a predictor that combines machine learning and sequence homology-based methods to improve the reliability of predicted RNA-binding residues in proteins.

    PubMed

    Walia, Rasna R; Xue, Li C; Wilkins, Katherine; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant

    2014-01-01

    Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. 
    An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence-derived features of RNA-binding proteins. A webserver implementation of both methods is freely available at http://einstein.cs.iastate.edu/RNABindRPlus/.
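
    The abstract above reports predictor quality as a Matthews correlation coefficient (MCC) and refers to the tradeoff between precision and recall. As a minimal sketch of how these metrics are computed from per-residue confusion-matrix counts (the function names and the convention of returning 0.0 for an undefined value are assumptions for illustration, not taken from the RNABindRPlus paper):

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (an assumed convention
    for the undefined case, e.g. when no residue is predicted as binding).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def precision_recall(tp, fp, fn):
    """Precision and recall; each is 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts for one protein: binding residues are the positive class.
    print(matthews_corrcoef(tp=30, tn=150, fp=10, fn=10))
    print(precision_recall(tp=30, fp=10, fn=10))
```

    Because RNA-binding residues are usually a small minority of all residues, MCC is often preferred over plain accuracy in this setting: it stays near zero for a predictor that labels every residue as non-binding.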

  5. Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain

    PubMed Central

    de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

    2014-01-01

    The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. PMID:24792163

  6. Degenerate Pax2 and Senseless binding motifs improve detection of low-affinity sites required for enhancer specificity

    PubMed Central

    Zandvakili, Arya; Campbell, Ian; Weirauch, Matthew T.

    2018-01-01

    Cells use thousands of regulatory sequences to recruit transcription factors (TFs) and produce specific transcriptional outcomes. Since TFs bind degenerate DNA sequences, discriminating functional TF binding sites (TFBSs) from background sequences represents a significant challenge. Here, we show that a Drosophila regulatory element that activates Epidermal Growth Factor signaling requires overlapping, low-affinity TFBSs for competing TFs (Pax2 and Senseless) to ensure cell- and segment-specific activity. Testing available TF binding models for Pax2 and Senseless, however, revealed variable accuracy in predicting such low-affinity TFBSs. To better define parameters that increase accuracy, we developed a method that systematically selects subsets of TFBSs based on predicted affinity to generate hundreds of position-weight matrices (PWMs). Counterintuitively, we found that degenerate PWMs produced from datasets depleted of high-affinity sequences were more accurate in identifying both low- and high-affinity TFBSs for the Pax2 and Senseless TFs. Taken together, these findings reveal how TFBS arrangement can be constrained by competition rather than cooperativity and that degenerate models of TF binding preferences can improve identification of biologically relevant low affinity TFBSs. PMID:29617378
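
    The abstract above relies on position-weight matrices (PWMs) to score candidate binding sites. The following is a generic sketch of building a log-odds PWM from aligned sites and scanning a sequence with it. It is not the authors' subset-selection method; the toy sites, the pseudocount of 1.0 and the uniform background of 0.25 are assumptions chosen only for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM (list of per-position dicts) from equal-length sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(width):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window on the given strand."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the PWM")
    return max(
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


if __name__ == "__main__":
    toy_sites = ["ACGT", "ACGT", "ACGA"]  # hypothetical aligned sites
    pwm = build_pwm(toy_sites)
    print(best_hit(pwm, "TTACGTTT"))
```

    A PWM built from a more varied set of sites has flatter columns, so near-matches lose less score relative to the consensus; this is the sense in which a degenerate model can rank low-affinity sites above background.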

  7. Scientific Communication and the Unified Laboratory Sequence

    NASA Astrophysics Data System (ADS)

    Silverstein, Todd P.; Hudak, Norman J.; Chapple, Frances H.; Goodney, David E.; Brink, Christina P.; Whitehead, Joyce P.

    1997-02-01

    The "Temperature Dependent Relaxation Kinetics" lab was first implemented in 1987; it uses stopped-flow pH jump techniques to determine rate constants and activation parameters (ΔH‡, ΔS‡, ΔG‡) for a reaction mechanism. Two new experiments (Monoamine Oxidase, and Molecular Modeling) will be implemented in the fall of 1997. The "Monoamine Oxidase" project uses chromatography and spectrophotometry to purify and characterize the enzyme. Subsequent photometric assays explore the enzyme's substrate specificity, activation energy, and denaturation. Finally, in the "Molecular Modeling" project, students characterize enzyme-substrate and drug-receptor interactions. Energy minimization protocols are used to make predictions about protein structure and ligand binding, and to explore pharmacological and biomedical implications. With these additions, the twelve Unified Laboratory projects introduce our chemistry majors to nearly all of the instrumental methods commonly encountered in modern chemistry.

  8. Identification of distant drug off-targets by direct superposition of binding pocket surfaces.

    PubMed

    Schumann, Marcel; Armen, Roger S

    2013-01-01

    Correctly predicting off-targets for a given molecular structure, which would have the ability to bind a large range of ligands, is both particularly difficult and important if they share no significant sequence or fold similarity with the respective molecular target ("distant off-targets"). A novel approach for identification of off-targets by direct superposition of protein binding pocket surfaces is presented and applied to a set of well-studied and highly relevant drug targets, including representative kinases and nuclear hormone receptors. The entire Protein Data Bank is searched for similar binding pockets and convincing distant off-target candidates were identified that share no significant sequence or fold similarity with the respective target structure. These putative target off-target pairs are further supported by the existence of compounds that bind strongly to both with high topological similarity, and in some cases, literature examples of individual compounds that bind to both. Also, our results clearly show that it is possible for binding pockets to exhibit a striking surface similarity, while the respective off-target shares neither significant sequence nor significant fold similarity with the respective molecular target ("distant off-target").

  9. Identification of Distant Drug Off-Targets by Direct Superposition of Binding Pocket Surfaces

    PubMed Central

    Schumann, Marcel; Armen, Roger S.

    2013-01-01

    Correctly predicting off-targets for a given molecular structure, which would have the ability to bind a large range of ligands, is both particularly difficult and important if they share no significant sequence or fold similarity with the respective molecular target (“distant off-targets”). A novel approach for identification of off-targets by direct superposition of protein binding pocket surfaces is presented and applied to a set of well-studied and highly relevant drug targets, including representative kinases and nuclear hormone receptors. The entire Protein Data Bank is searched for similar binding pockets and convincing distant off-target candidates were identified that share no significant sequence or fold similarity with the respective target structure. These putative target off-target pairs are further supported by the existence of compounds that bind strongly to both with high topological similarity, and in some cases, literature examples of individual compounds that bind to both. Also, our results clearly show that it is possible for binding pockets to exhibit a striking surface similarity, while the respective off-target shares neither significant sequence nor significant fold similarity with the respective molecular target (“distant off-target”). PMID:24391782

  10. The pig CYP2E1 promoter is activated by COUP-TF1 and HNF-1 and is inhibited by androstenone.

    PubMed

    Tambyrajah, Winston S; Doran, Elena; Wood, Jeffrey D; McGivan, John D

    2004-11-15

    Functional analysis of the pig cytochrome P4502E1 (CYP2E1) promoter identified two major activating elements. One corresponded to the hepatic nuclear factor 1 (HNF-1) consensus binding sequence at nucleotides -128/-98 and the other was located in the region -292/-266. The binding of proteins in pig liver nuclear extracts to a synthetic double-stranded oligonucleotide corresponding to this more distal activating sequence was studied by electrophoretic mobility shift assay. The minimum protein binding sequence was identified as TGTTCTGACCTCTGGG. Gel super-shift assays identified the protein binding to this site as chick ovalbumin upstream promoter transcription factor 1 (COUP-TF1). Androstenone inhibited promoter activity in transfection experiments only with constructs which included the COUP-TF1 binding site. Androstenone inhibited COUP-TF1 binding to synthetic oligonucleotides but did not affect HNF-1 binding. The results offer an explanation for the inhibition of CYP2E1 protein expression by androstenone in isolated pig hepatocytes and may be relevant to the low expression of hepatic CYP2E1 in those pigs which accumulate high levels of androstenone in vivo.

  11. Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes.

    PubMed

    Srinivasulu, Yerukala Sathipati; Wang, Jyun-Rong; Hsu, Kai-Ti; Tsai, Ming-Ju; Charoenkwan, Phasit; Huang, Wen-Lin; Huang, Hui-Ling; Ho, Shinn-Ying

    2015-01-01

    Protein-protein interactions (PPIs) are involved in various biological processes, and underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded the training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (Pkd) of 200 heterodimeric protein complexes. Prediction performance of a Jackknife test was the correlation coefficient of 0.34 and mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn. The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. 
    The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes.

  12. Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes

    PubMed Central

    2015-01-01

    Background: Protein-protein interactions (PPIs) are involved in various biological processes, and underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. Results: This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded the training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (Pkd) of 200 heterodimeric protein complexes. Prediction performance of a Jackknife test was the correlation coefficient of 0.34 and mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn. Conclusions: The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes. PMID:26681483

  13. Isolation from genomic DNA of sequences binding specific regulatory proteins by the acceleration of protein electrophoretic mobility upon DNA binding.

    PubMed

    Subrahmanyam, S; Cronan, J E

    1999-01-21

    We report an efficient and flexible in vitro method for the isolation of genomic DNA sequences that are the binding targets of a given DNA binding protein. This method takes advantage of the fact that binding of a protein to a DNA molecule generally increases the rate of migration of the protein in nondenaturing gel electrophoresis. By the use of a radioactively labeled DNA-binding protein and nonradioactive DNA coupled with PCR amplification from gel slices, we show that specific binding sites can be isolated from Escherichia coli genomic DNA. We have applied this method to isolate a binding site for FadR, a global regulator of fatty acid metabolism in E. coli. We have also isolated a second binding site for BirA, the biotin operon repressor/biotin ligase, from the E. coli genome that has a very low binding efficiency compared with the bio operator region.

  14. Effects of nucleoside analog incorporation on DNA binding to the DNA binding domain of the GATA-1 erythroid transcription factor.

    PubMed

    Foti, M; Omichinski, J G; Stahl, S; Maloney, D; West, J; Schweitzer, B I

    1999-02-05

    We investigate here the effects of the incorporation of the nucleoside analogs araC (1-beta-D-arabinofuranosylcytosine) and ganciclovir (9-[(1,3-dihydroxy-2-propoxy)methyl] guanine) into the DNA binding recognition sequence for the GATA-1 erythroid transcription factor. A 10-fold decrease in binding affinity was observed for the ganciclovir-substituted DNA complex in comparison to an unmodified DNA of the same sequence composition. AraC substitution did not result in any changes in binding affinity. 1H-15N HSQC and NOESY NMR experiments revealed a number of chemical shift changes in both DNA and protein in the ganciclovir-modified DNA-protein complex when compared to the unmodified DNA-protein complex. These changes in chemical shift and binding affinity suggest a change in the binding mode of the complex when ganciclovir is incorporated into the GATA DNA binding site.

  15. Sequence-specific binding of counterions to B-DNA

    PubMed Central

    Denisov, Vladimir P.; Halle, Bertil

    2000-01-01

    Recent studies by x-ray crystallography, NMR, and molecular simulations have suggested that monovalent counterions can penetrate deeply into the minor groove of B form DNA. Such groove-bound ions potentially could play an important role in AT-tract bending and groove narrowing, thereby modulating DNA function in vivo. To address this issue, we report here 23Na magnetic relaxation dispersion measurements on oligonucleotides, including difference experiments with the groove-binding drug netropsin. The exquisite sensitivity of this method to ions in long-lived and intimate association with DNA allows us to detect sequence-specific sodium ion binding in the minor groove AT tract of three B-DNA dodecamers. The sodium ion occupancy is only a few percent, however, and therefore is not likely to contribute importantly to the ensemble of B-DNA structures. We also report results of ion competition experiments, indicating that potassium, rubidium, and cesium ions bind to the minor groove with similarly weak affinity as sodium ions, whereas ammonium ion binding is somewhat stronger. The present findings are discussed in the light of previous NMR and diffraction studies of sequence-specific counterion binding to DNA. PMID:10639130

  16. Identification of a p53-response element in the promoter of the proline oxidase gene

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maxwell, Steve A.; Kochevar, Gerald J.

    2008-05-02

    Proline oxidase (POX) is a p53-induced proapoptotic gene. We investigated whether p53 could bind directly to the POX gene promoter. Chromatin immunoprecipitation (ChIP) assays detected p53 bound to POX upstream gene sequences. In support of the ChIP results, sequence analysis of the POX gene and its 5' flanking sequences revealed a potential p53-binding site, GGGCTTGTCTTCGTGTGACTTCTGTCT, located at 1161 base pairs (bp) upstream of the transcriptional start site. A 711-bp DNA fragment containing the candidate p53-binding site exhibited reporter gene activity that was induced by p53. In contrast, the same DNA region lacking the candidate p53-binding site did not show significant p53-response activity. Electrophoretic mobility shift assay (EMSA) in ACHN renal carcinoma cell nuclear lysates confirmed that p53 could bind to the 711-bp POX DNA fragment. We concluded from these experiments that a p53-binding site is positioned at -1161 to -1188 bp upstream of the POX transcriptional start site.

  17. Frequency of the first feature in action sequences influences feature binding.

    PubMed

    Mattson, Paul S; Fournier, Lisa R; Behmer, Lawrence P

    2012-10-01

    We investigated whether binding among perception and action feature codes is a preliminary step toward creating a more durable memory trace of an action event. If so, increasing the frequency of a particular event (e.g., a stimulus requiring a movement with the left or right hand in an up or down direction) should increase the strength and speed of feature binding for this event. The results from two experiments, using a partial-repetition paradigm, confirmed that feature binding increased in strength and/or occurred earlier for a high-frequency (e.g., left hand moving up) than for a low-frequency (e.g., right hand moving down) event. Moreover, increasing the frequency of the first-specified feature in the action sequence alone (e.g., "left" hand) increased the strength and/or speed of action feature binding (e.g., between the "left" hand and movement in an "up" or "down" direction). The latter finding suggests an update to the theory of event coding, as not all features in the action sequence equally determine binding strength. We conclude that action planning involves serial binding of features in the order of action feature execution (i.e., associations among features are not bidirectional but are directional), which can lead to a more durable memory trace. This is consistent with physiological evidence suggesting that serial order is preserved in an action plan executed from memory and that the first feature in the action sequence may be critical in preserving this serial order.

  18. Mass Spectrometric Determination of ILPR G-quadruplex Binding Sites in Insulin and IGF-2

    PubMed Central

    Xiao, JunFeng

    2009-01-01

    The insulin-linked polymorphic region (ILPR) of the human insulin gene promoter region forms G-quadruplex structures in vitro. Previous studies show that insulin and insulin-like growth factor-2 (IGF-2) exhibit high affinity binding in vitro to 2-repeat sequences of ILPR variants a and h, but negligible binding to variant i. Two-repeat sequences of variants a and h form intramolecular G-quadruplex structures that are not evidenced for variant i. Here we report on the use of protein digestion combined with affinity capture and MALDI-MS detection to pinpoint ILPR binding sites in insulin and IGF-2. Peptides captured by ILPR variants a and h were sequenced by MALDI-MS/MS, LC-MS and in silico digestion. On-bead digestion of insulin-ILPR variant a complexes supported the conclusions. The results indicate that the sequence VCG(N)RGF is generally present in the captured peptides and is likely involved in the affinity binding interactions of the proteins with the ILPR G-quadruplexes. The significance of arginine in the interactions was studied by comparing the affinities of synthesized peptides VCGERGF and VCGEAGF with ILPR variant a. Peptides from other regions of the proteins that are connected through disulfide linkages were also detected in some capture experiments. Identification of binding sites could facilitate design of DNA binding ligands for capture and detection of insulin and IGF-2. The interactions may have biological significance as well. PMID:19747845
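
    The abstract above identifies a shared motif, VCG(N)RGF, in the captured peptides and compares the synthesized peptides VCGERGF and VCGEAGF. A minimal sketch of scanning peptide strings for that motif follows; reading "(N)" as "any single residue" is an assumption made here, and the helper name is hypothetical.

```python
import re

# Assumption: "(N)" in VCG(N)RGF stands for any single amino acid residue.
MOTIF = re.compile(r"VCG.RGF")


def find_motif(peptide):
    """Return (start, matched_text) for every motif occurrence in a peptide."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(peptide.upper())]


if __name__ == "__main__":
    # The two synthesized peptides named in the abstract.
    for peptide in ("VCGERGF", "VCGEAGF"):
        print(peptide, find_motif(peptide))
```

    Under this reading, VCGERGF matches the motif while VCGEAGF, in which the arginine is replaced by alanine, does not, which mirrors the comparison the abstract describes for probing the role of arginine.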

  19. Zinc-binding Domain of the Bacteriophage T7 DNA Primase Modulates Binding to the DNA Template*

    PubMed Central

    Lee, Seung-Joo; Zhu, Bin; Akabayov, Barak; Richardson, Charles C.

    2012-01-01

    The zinc-binding domain (ZBD) of prokaryotic DNA primases has been postulated to be crucial for recognition of specific sequences in the single-stranded DNA template. To determine the molecular basis for this role in recognition, we carried out homolog-scanning mutagenesis of the zinc-binding domain of DNA primase of bacteriophage T7 using a bacterial homolog from Geobacillus stearothermophilus. The ability of T7 DNA primase to catalyze template-directed oligoribonucleotide synthesis is eliminated by substitution of any five-amino acid residue-long segment within the ZBD. The most significant defect occurs upon substitution of a region (Pro-16 to Cys-20) spanning two cysteines that coordinate the zinc ion. The role of this region in primase function was further investigated by generating a protein library composed of multiple amino acid substitutions for Pro-16, Asp-18, and Asn-19 followed by genetic screening for functional proteins. Examination of proteins selected from the screening reveals no change in sequence-specific recognition. However, the more positively charged residues in the region facilitate DNA binding, leading to more efficient oligoribonucleotide synthesis on short templates. The results suggest that the zinc-binding mode alone is not responsible for sequence recognition, but rather its interaction with the RNA polymerase domain is critical for DNA binding and for sequence recognition. Consequently, any alteration in the ZBD that disturbs its conformation leads to loss of DNA-dependent oligoribonucleotide synthesis. PMID:23024359

  20. Motif discovery with data mining in 3D protein structure databases: discovery, validation and prediction of the U-shape zinc binding ("Huf-Zinc") motif.

    PubMed

    Maurer-Stroh, Sebastian; Gao, He; Han, Hao; Baeten, Lies; Schymkowitz, Joost; Rousseau, Frederic; Zhang, Louxin; Eisenhaber, Frank

    2013-02-01

    Data mining in protein databases, derivatives from more fundamental protein 3D structure and sequence databases, has considerable unearthed potential for the discovery of sequence motif--structural motif--function relationships as the finding of the U-shape (Huf-Zinc) motif, originally a small student's project, exemplifies. The metal ion zinc is critically involved in universal biological processes, ranging from protein-DNA complexes and transcription regulation to enzymatic catalysis and metabolic pathways. Proteins have evolved a series of motifs to specifically recognize and bind zinc ions. Many of these, so called zinc fingers, are structurally independent globular domains with discontinuous binding motifs made up of residues mostly far apart in sequence. Through a systematic approach starting from the BRIX structure fragment database, we discovered that there exists another predictable subset of zinc-binding motifs that not only have a conserved continuous sequence pattern but also share a characteristic local conformation, despite being included in totally different overall folds. While this does not allow general prediction of all Zn binding motifs, a HMM-based web server, Huf-Zinc, is available for prediction of these novel, as well as conventional, zinc finger motifs in protein sequences. The Huf-Zinc webserver can be freely accessed through this URL (http://mendel.bii.a-star.edu.sg/METHODS/hufzinc/).

  1. PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context

    PubMed Central

    Zhou, Jiyun; Xu, Ruifeng; He, Yulan; Lu, Qin; Wang, Hongpeng; Kong, Bing

    2016-01-01

    Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machines (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context and that combining the two gives a further performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web server for our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely accessible to the biological research community. PMID:27282833

  2. Selection and Screening of DNA Aptamers for Inorganic Nanomaterials.

    PubMed

    Zhou, Yibo; Huang, Zhicheng; Yang, Ronghua; Liu, Juewen

    2018-02-21

    Searching for DNA sequences that can strongly and selectively bind to inorganic surfaces is a long-standing topic in bionanotechnology, analytical chemistry and biointerface research. This can be achieved either by aptamer selection starting with a very large library of ≈10^14 random DNA sequences, or by careful screening of a much smaller library (usually from a few to a few hundred) with rationally designed sequences. Unlike typical molecular targets, inorganic surfaces often have quite strong DNA adsorption affinities due to polyvalent binding and even chemical interactions. This leads to a very high background binding making aptamer selection difficult. Screening, on the other hand, can be designed to compare relative binding affinities of different DNA sequences and could be more appropriate for inorganic surfaces. The resulting sequences have been used for DNA-directed assembly, sorting of carbon nanotubes, and DNA-controlled growth of inorganic nanomaterials. It was recently discovered that poly-cytosine (C) DNA can strongly bind to a diverse range of nanomaterials including nanocarbons (graphene oxide and carbon nanotubes), various metal oxides and transition-metal dichalcogenides. In this Concept article, we articulate the need for screening and potential artifacts associated with traditional aptamer selection methods for inorganic surfaces. Representative examples of application are discussed, and a few future research opportunities are proposed towards the end of this article. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. A single amino-acid substitution in the Ets domain alters core DNA binding specificity of Ets1 to that of the related transcription factors Elf1 and E74.

    PubMed

    Bosselut, R; Levin, J; Adjadj, E; Ghysdael, J

    1993-11-11

    Ets proteins form a family of sequence-specific DNA binding proteins which bind DNA through an 85-amino-acid conserved domain, the Ets domain, whose sequence is unrelated to any other characterized DNA binding domain. Unlike all other known Ets proteins, which bind specific DNA sequences centered over either GGAA or GGAT core motifs, E74 and Elf1 selectively bind to GGAA core-containing sites. Elf1 and E74 differ from other Ets proteins in three residues located in an otherwise highly conserved region of the Ets domain, referred to as conserved region III (CRIII). We show that a restricted selectivity for GGAA core-containing sites could be conferred to Ets1 upon changing a single lysine residue within CRIII to the threonine found in Elf1 and E74 at this position. Conversely, the reciprocal mutation in Elf1 confers to this protein the ability to bind to GGAT core-containing EBS. This, together with the fact that mutation of two invariant arginine residues in CRIII abolishes DNA binding, indicates that CRIII plays a key role in Ets domain recognition of the GGAA/T core motif and leads us to discuss a model of Ets proteins--core motif interaction.

  4. Structural analysis of DNA binding by C.Csp231I, a member of a novel class of R-M controller proteins regulating gene expression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shevtsov, M. B.; Streeter, S. D.; Thresh, S.-J.

    2015-02-01

    The structure of the new class of controller proteins (exemplified by C.Csp231I) in complex with its 21 bp DNA-recognition sequence is presented, and the molecular basis of sequence recognition in this class of proteins is discussed. An unusual extended spacer between the dimer binding sites suggests a novel interaction between the two C-protein dimers. In a wide variety of bacterial restriction–modification systems, a regulatory ‘controller’ protein (or C-protein) is required for effective transcription of its own gene and for transcription of the endonuclease gene found on the same operon. We have recently turned our attention to a new class of controller proteins (exemplified by C.Csp231I) that have quite novel features, including a much larger DNA-binding site with an 18 bp (∼60 Å) spacer between the two palindromic DNA-binding sequences and a very different recognition sequence from the canonical GACT/AGTC. Using X-ray crystallography, the structure of the protein in complex with its 21 bp DNA-recognition sequence was solved to 1.8 Å resolution, and the molecular basis of sequence recognition in this class of proteins was elucidated. An unusual aspect of the promoter sequence is the extended spacer between the dimer binding sites, suggesting a novel interaction between the two C-protein dimers when bound to both recognition sites correctly spaced on the DNA. A U-bend model is proposed for this tetrameric complex, based on the results of gel-mobility assays, hydrodynamic analysis and the observation of key contacts at the interface between dimers in the crystal.

  5. A serum response factor-dependent transcriptional regulatory program identifies distinct smooth muscle cell sublineages.

    PubMed Central

    Kim, S; Ip, H S; Lu, M M; Clendenin, C; Parmacek, M S

    1997-01-01

    The SM22alpha promoter has been used as a model system to define the molecular mechanisms that regulate smooth muscle cell (SMC) specific gene expression during mammalian development. The SM22alpha gene is expressed exclusively in vascular and visceral SMCs during postnatal development and is transiently expressed in the heart and somites during embryogenesis. Analysis of the SM22alpha promoter in transgenic mice revealed that 280 bp of 5' flanking sequence is sufficient to restrict expression of the lacZ reporter gene to arterial SMCs and the myotomal component of the somites. DNase I footprint and electrophoretic mobility shift analyses revealed that the SM22alpha promoter contains six nuclear protein binding sites (designated smooth muscle elements [SMEs] -1 to -6, respectively), two of which bind serum response factor (SRF) (SME-1 and SME-4). Mutational analyses demonstrated that a two-nucleotide substitution that selectively eliminates SRF binding to SME-4 decreases SM22alpha promoter activity in arterial SMCs by approximately 90%. Moreover, mutations that abolish binding of SRF to SME-1 and SME-4 or mutations that eliminate each SME-3 binding activity totally abolished SM22alpha promoter activity in the arterial SMCs and somites of transgenic mice. Finally, we have shown that a multimerized copy of SME-4 (bp -190 to -110) when linked to the minimal SM22alpha promoter (bp -90 to +41) is necessary and sufficient to direct high-level transcription in an SMC lineage-restricted fashion. Taken together, these data demonstrate that distinct transcriptional regulatory programs control SM22alpha gene expression in arterial versus visceral SMCs. Moreover, these data are consistent with a model in which combinatorial interactions between SRF and other transcription factors that bind to SME-4 (and that bind directly to SRF) activate transcription of the SM22alpha gene in arterial SMCs. PMID:9121477

  6. SARNAclust: Semi-automatic detection of RNA protein binding motifs from immunoprecipitation data

    PubMed Central

    Dotu, Ivan; Adamson, Scott I.; Coleman, Benjamin; Fournier, Cyril; Ricart-Altimiras, Emma; Eyras, Eduardo

    2018-01-01

    RNA-protein binding is critical to gene regulation, controlling fundamental processes including splicing, translation, localization and stability, and aberrant RNA-protein interactions are known to play a role in a wide variety of diseases. However, molecular understanding of RNA-protein interactions remains limited; in particular, identification of RNA motifs that bind proteins has long been challenging, especially when such motifs depend on both sequence and structure. Moreover, although RNA binding proteins (RBPs) often contain more than one binding domain, algorithms capable of identifying more than one binding motif simultaneously have not been developed. In this paper we present a novel pipeline to determine binding peaks in crosslinking immunoprecipitation (CLIP) data, to discover multiple possible RNA sequence/structure motifs among them, and to experimentally validate such motifs. At the core is a new semi-automatic algorithm SARNAclust, the first unsupervised method to identify and deconvolve multiple sequence/structure motifs simultaneously. SARNAclust computes similarity between sequence/structure objects using a graph kernel, providing the ability to isolate the impact of specific features through the bulge graph formalism. Application of SARNAclust to synthetic data shows its capability of clustering 5 motifs at once with a V-measure value of over 0.95, while GraphClust achieves only a V-measure of 0.083 and RNAcontext cannot detect any of the motifs. When applied to existing eCLIP sets, SARNAclust finds known motifs for SLBP and HNRNPC and novel motifs for several other RBPs such as AGGF1, AKAP8L and ILF3. We demonstrate an experimental validation protocol, a targeted Bind-n-Seq-like high-throughput sequencing approach that relies on RNA inverse folding for oligo pool design, that can validate the components within the SLBP motif. Finally, we use this protocol to experimentally interrogate the SARNAclust motif predictions for protein ILF3. Our results support a newly identified partially double-stranded UUUUUGAGA motif similar to that known for the splicing factor HNRNPC. PMID:29596423

  7. DNA sequencing using polymerase substrate-binding kinetics

    PubMed Central

    Previte, Michael John Robert; Zhou, Chunhong; Kellinger, Matthew; Pantoja, Rigo; Chen, Cheng-Yao; Shi, Jin; Wang, BeiBei; Kia, Amirali; Etchin, Sergey; Vieceli, John; Nikoomanzar, Ali; Bomati, Erin; Gloeckner, Christian; Ronaghi, Mostafa; He, Molly Min

    2015-01-01

    Next-generation sequencing (NGS) has transformed genomic research by decreasing the cost of sequencing. However, whole-genome sequencing is still costly and complex for diagnostics purposes. In the clinical space, targeted sequencing has the advantage of allowing researchers to focus on specific genes of interest. Routine clinical use of targeted NGS mandates inexpensive instruments, fast turnaround time and an integrated and robust workflow. Here we demonstrate a version of the Sequencing by Synthesis (SBS) chemistry that potentially can become a preferred targeted sequencing method in the clinical space. This sequencing chemistry uses natural nucleotides and is based on real-time recording of the differential polymerase/DNA-binding kinetics in the presence of correct or mismatch nucleotides. This ensemble SBS chemistry has been implemented on an existing Illumina sequencing platform with integrated cluster amplification. We discuss the advantages of this sequencing chemistry for targeted sequencing as well as its limitations for other applications. PMID:25612848

  8. Identification of functional features of synthetic SINEUPs, antisense lncRNAs that specifically enhance protein translation

    PubMed Central

    Kozhuharova, Ana; Sharma, Harshita; Ohyama, Takako; Fasolo, Francesca; Yamazaki, Toshio; Cotella, Diego; Santoro, Claudio; Zucchelli, Silvia; Gustincich, Stefano; Carninci, Piero

    2018-01-01

    SINEUPs are antisense long noncoding RNAs, in which an embedded SINE B2 element UP-regulates translation of partially overlapping target sense mRNAs. SINEUPs contain two functional domains. First, the binding domain (BD) is located in the region antisense to the target, providing specific targeting to the overlapping mRNA. Second, the inverted SINE B2 represents the effector domain (ED) and enhances translation. To adapt SINEUP technology to a broader number of targets, we took advantage of a high-throughput, semi-automated imaging system to optimize synthetic SINEUP BD and ED design in HEK293T cell lines. Using SINEUP-GFP as a model SINEUP, we extensively screened variants of the BD to map features needed for optimal design. We found that most active SINEUPs overlap an AUG-Kozak sequence. Moreover, we report our screening of the inverted SINE B2 sequence to identify active sub-domains and map the length of the minimal active ED. Our synthetic SINEUP-GFP screening of both BDs and EDs constitutes a broad test with flexible applications to any target gene of interest. PMID:29414979

  9. The human luteinizing hormone receptor gene promoter: activation by Sp1 and Sp3 and inhibitory regulation.

    PubMed

    Geng, Y; Tsai-Morris, C H; Zhang, Y; Dufau, M L

    1999-09-24

    To understand the transcriptional mechanism(s) of human LH receptor (LHR) gene expression, we have identified the dominant functional cis-elements that regulate the activity of the promoter domain (-1 to -176 bp from ATG). Mutagenesis demonstrated that the promoter activity was dependent on two Sp1 domains (-79 bp, -120 bp) in a transformed normal placental cell (PLC) and the choriocarcinoma JAR cell. Both elements interacted with endogenous Sp1 and Sp3 factors but not with Sp2 or Sp4. In Drosophila SL2 cells, the promoter was activated by either Sp1 or Sp3. An ERE half-site (EREhs) at -174 bp was inhibitory (by 100%), but was unresponsive to estradiol and did not bind the estrogen receptor or orphan receptors ERR1 and SF-1. The 5' upstream sequence (-177 to -2056 bp) inhibited promoter activity in PLC by 60%, but only minimally in JAR cells. Activation of the human LHR promoter through Sp1/3 factors is negatively regulated through EREhs and upstream sequences to exert control of gene expression. Copyright 1999 Academic Press.

  10. A pH-sensitive heparin-binding sequence from Baculovirus gp64 protein is important for binding to mammalian cells but not to Sf9 insect cells.

    PubMed

    Wu, Chunxiao; Wang, Shu

    2012-01-01

    Binding to heparan sulfate is essential for baculovirus transduction of mammalian cells. Our previous study shows that gp64, the major glycoprotein on the virus surface, binds to heparin in a pH-dependent way, with a stronger binding at pH 6.2 than at 7.4. Using fluorescently labeled peptides, we mapped the pH-dependent heparin-binding sequence of gp64 to a 22-amino-acid region between residues 271 and 292. Binding of this region to the cell surface was also pH dependent, and peptides containing this sequence could efficiently inhibit baculovirus transduction of mammalian cells at pH 6.2. When the heparin-binding peptide was immobilized onto the bead surface to mimic the high local concentration of gp64 on the virus surface, the peptide-coated magnetic beads could efficiently pull down cells expressing heparan sulfate but not cells pretreated with heparinase or cells not expressing heparan sulfate. Interestingly, although this heparin-binding function is essential for baculovirus transduction of mammalian cells, it is dispensable for infection of Sf9 insect cells. Virus infectivity on Sf9 cells was not reduced by the presence of heparin or the identified heparin-binding peptide, even though the peptide could bind to Sf9 cell surface and be efficiently internalized. Thus, our data suggest that, depending on the availability of the target molecules on the cell surface, baculoviruses can use two different methods, electrostatic interaction with heparan sulfate and more specific receptor binding, for cell attachment.

  11. Rapid comparison of protein binding site surfaces with Property Encoded Shape Distributions (PESD)

    PubMed Central

    Das, Sourav; Kokardekar, Arshad

    2009-01-01

    Patterns in shape and property distributions on the surface of binding sites are often conserved across functional proteins without significant conservation of the underlying amino-acid residues. To explore similarities of these sites from the viewpoint of a ligand, a sequence and fold-independent method was created to rapidly and accurately compare binding sites of proteins represented by property-mapped triangulated Gauss-Connolly surfaces. Within this paradigm, signatures for each binding site surface are produced by calculating their property-encoded shape distributions (PESD), a measure of the probability that a particular property will be at a specific distance to another on the molecular surface. Similarity between the signatures can then be treated as a measure of similarity between binding sites. As postulated, the PESD method rapidly detected high levels of similarity in binding site surface characteristics even in cases where there was very low similarity at the sequence level. In a screening experiment involving each member of the PDBBind 2005 dataset as a query against the rest of the set, PESD was able to retrieve a binding site with identical E.C. (Enzyme Commission) numbers as the top match in 79.5% of cases. The ability of the method to detect similarity in binding sites with low sequence conservation was compared with state-of-the-art binding site comparison methods. PMID:19919089

  12. Role of the chromatin landscape and sequence in determining cell type-specific genomic glucocorticoid receptor binding and gene regulation.

    PubMed

    Love, Michael I; Huska, Matthew R; Jurk, Marcel; Schöpflin, Robert; Starick, Stephan R; Schwahn, Kevin; Cooper, Samantha B; Yamamoto, Keith R; Thomas-Chollier, Morgane; Vingron, Martin; Meijsing, Sebastiaan H

    2017-02-28

    The genomic loci bound by the glucocorticoid receptor (GR), a hormone-activated transcription factor, show little overlap between cell types. To study the role of chromatin and sequence in specifying where GR binds, we used Bayesian modeling within the universe of accessible chromatin. Taken together, our results uncovered that although GR preferentially binds accessible chromatin, its binding is biased against accessible chromatin located at promoter regions. This bias can only be explained partially by the presence of fewer GR recognition sequences, arguing for the existence of additional mechanisms that interfere with GR binding at promoters. Therefore, we tested the role of H3K9ac, the chromatin feature with the strongest negative association with GR binding, but found that this correlation does not reflect a causative link. Finally, we find a higher percentage of promoter-proximal GR binding for genes regulated by GR across cell types than for cell type-specific target genes. Given that GR almost exclusively binds accessible chromatin, we propose that cell type-specific regulation by GR preferentially occurs via distal enhancers, whose chromatin accessibility is typically cell type-specific, whereas ubiquitous target gene regulation is more likely to result from binding to promoter regions, which are often accessible regardless of cell type examined. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. UniPROBE, update 2011: expanded content and search tools in the online database of protein-binding microarray data on protein-DNA interactions.

    PubMed

    Robasky, Kimberly; Bulyk, Martha L

    2011-01-01

    The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org.
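
    The 'words' of length k mentioned above can be made concrete with a short sketch. It enumerates every DNA k-mer and counts k-mer occurrences in a sequence with a sliding window. The function names (all_kmers, kmer_counts) are hypothetical and are not part of UniPROBE or its tools; nothing here reproduces the database's scoring.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"

def all_kmers(k: int) -> list[str]:
    """Enumerate every DNA word of length k (4**k words in total)."""
    return ["".join(word) for word in product(ALPHABET, repeat=k)]

def kmer_counts(seq: str, k: int) -> Counter:
    """Count each overlapping k-mer in seq using a sliding window."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

if __name__ == "__main__":
    words = all_kmers(3)
    print(len(words))                # number of distinct 3-mers
    print(kmer_counts("ACGTACG", 3))  # overlapping 3-mer counts
```

    Because the number of words grows as 4^k, exhaustive tables of this kind are practical only for short k.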

  14. Hardware Acceleration Of Multi-Deme Genetic Algorithm for DNA Codeword Searching

    DTIC Science & Technology

    2008-01-01

    C and G are complementary to each other. A Watson-Crick complement of a DNA sequence is another DNA sequence which replaces all the A with T or vice...versa and replaces all the T with A or vice versa, and also switches the 5’ and 3’ ends. A DNA sequence binds most stably with its Watson-Crick ...bind with 5 Watson-Crick pairs. The length of the longest complementary sequence between two flexible DNA strands, A and B, is the same as the
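
    The Watson-Crick complement described in this excerpt (swap A with T and C with G, then switch the 5' and 3' ends) can be sketched as follows. The function names are hypothetical, and count_wc_pairs assumes a rigid, full-length antiparallel alignment of two equal-length strands; it is not the flexible-strand measure that the truncated excerpt begins to define.

```python
# Base-pairing rules: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def watson_crick_complement(seq: str) -> str:
    """Return the Watson-Crick complement of a 5'->3' strand, written 5'->3'.

    Reversing the string switches the 5' and 3' ends; the lookup swaps bases.
    """
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def count_wc_pairs(a: str, b: str) -> int:
    """Count Watson-Crick pairs when b is aligned antiparallel to a.

    Assumption: both strands have equal length and are fully overlapped.
    """
    if len(a) != len(b):
        raise ValueError("strands must have equal length")
    return sum(COMPLEMENT[x] == y for x, y in zip(a.upper(), reversed(b.upper())))

if __name__ == "__main__":
    strand = "AACG"
    partner = watson_crick_complement(strand)
    print(partner, count_wc_pairs(strand, partner))
```

    Taking the complement twice returns the original strand, which is a convenient sanity check for any codeword set.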

  15. Activation of erythropoietin receptor in the absence of hormone by a peptide that binds to a domain different from the hormone binding site

    PubMed Central

    Naranda, Tatjana; Wong, Kenneth; Kaufman, R. Ilene; Goldstein, Avram; Olsson, Lennart

    1999-01-01

    Applying a homology search method previously described, we identified a sequence in the extracellular dimerization site of the erythropoietin receptor, distant from the hormone binding site. A peptide identical to that sequence was synthesized. Remarkably, it activated receptor signaling in the absence of erythropoietin. Neither the peptide nor the hormone altered the affinity of the other for the receptor; thus, the peptide does not bind to the hormone binding site. The combined activation of signal transduction by hormone and peptide was strongly synergistic. In mice, the peptide acted like the hormone, protecting against the decrease in hematocrit caused by carboplatin. PMID:10377456

  16. Sequence-Based Prediction of RNA-Binding Proteins Using Random Forest with Minimum Redundancy Maximum Relevance Feature Selection.

    PubMed

    Ma, Xin; Guo, Jing; Sun, Xiao

    2015-01-01

    The prediction of RNA-binding proteins is one of the most challenging problems in computational biology. Although some studies have investigated this problem, the accuracy of prediction is still not sufficient. In this study, a highly accurate method was developed to predict RNA-binding proteins from amino acid sequences using random forests with the minimum redundancy maximum relevance (mRMR) method, followed by incremental feature selection (IFS). We incorporated conjoint triad features and three novel features: binding propensity (BP), nonbinding propensity (NBP), and evolutionary information combined with physicochemical properties (EIPP). The results showed that these novel features have important roles in improving the performance of the predictor. Using the mRMR-IFS method, our predictor achieved the best performance (86.62% accuracy and 0.737 Matthews correlation coefficient). High prediction accuracy and successful prediction performance suggested that our method can be a useful approach to identify RNA-binding proteins from sequence information.
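
    For readers unfamiliar with the two metrics quoted above, the sketch below computes accuracy and the Matthews correlation coefficient from the four cells of a binary confusion matrix, using the standard definitions. The function names and the example counts are illustrative assumptions and are not taken from the study.

```python
import math

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    print(accuracy(50, 40, 10, 0), matthews_corrcoef(50, 40, 10, 0))
```

    Unlike accuracy, the coefficient stays near zero for a classifier that simply predicts the majority class, which is why the two are often reported together.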

  17. Deformability in the cleavage site of primary microRNA is not sensed by the double-stranded RNA binding domains in the microprocessor component DGCR8.

    PubMed

    Quarles, Kaycee A; Chadalavada, Durga; Showalter, Scott A

    2015-06-01

    The prevalence of double-stranded RNA (dsRNA) in eukaryotic cells has only recently been appreciated. Of interest here, RNA silencing begins with dsRNA substrates that are bound by the dsRNA-binding domains (dsRBDs) of their processing proteins. Specifically, processing of microRNA (miRNA) in the nucleus minimally requires the enzyme Drosha and its dsRBD-containing cofactor protein, DGCR8. The smallest recombinant construct of DGCR8 that is sufficient for in vitro dsRNA binding, referred to as DGCR8-Core, consists of its two dsRBDs and a C-terminal tail. As dsRBDs rarely recognize the nucleotide sequence of dsRNA, it is reasonable to hypothesize that DGCR8 function is dependent on the recognition of specific structural features in the miRNA precursor. Previously, we demonstrated that noncanonical structural elements that promote RNA flexibility within the stem of miRNA precursors are necessary for efficient in vitro cleavage by reconstituted Microprocessor complexes. Here, we combine gel shift assays with in vitro processing assays to demonstrate that neither the N-terminal dsRBD of DGCR8 in isolation nor the DGCR8-Core construct is sensitive to the presence of noncanonical structural elements within the stem of miRNA precursors, or to single-stranded segments flanking the stem. Extending DGCR8-Core to include an N-terminal heme-binding region does not change our conclusions. Thus, our data suggest that although the DGCR8-Core region is necessary for dsRNA binding and recruitment to the Microprocessor, it is not sufficient to establish the previously observed connection between RNA flexibility and processing efficiency. © 2015 Wiley Periodicals, Inc.

  18. Biological Nanoplatforms for Self-Assembled Electronics

    DTIC Science & Technology

    2015-03-24

    as M13, a virus that infects Escherichia coli. Approximately one billion different amino acid sequences are displayed on different viruses in the...sequence when contained within a phage M13 coat protein sequence, not chemically linked to the surface of phage MS2 VLPs. Thus, binding properties may...gallium arsenide in a bacteriophage M13 phage display library, MS2 VLPs modified with the metal binding peptides do not display the same activity

  19. Nonparametric Combinatorial Sequence Models

    NASA Astrophysics Data System (ADS)

    Wauthier, Fabian L.; Jordan, Michael I.; Jojic, Nebojsa

    This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This paper presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three sequence datasets which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution induced by the prior. By integrating out the posterior our method compares favorably to leading binding predictors.

  20. Screening the sequence selectivity of DNA-binding molecules using a gold nanoparticle-based colorimetric approach.

    PubMed

    Hurst, Sarah J; Han, Min Su; Lytton-Jean, Abigail K R; Mirkin, Chad A

    2007-09-15

    We have developed a novel competition assay that uses a gold nanoparticle (Au NP)-based, high-throughput colorimetric approach to screen the sequence selectivity of DNA-binding molecules. This assay hinges on the observation that the melting behavior of DNA-functionalized Au NP aggregates is sensitive to the concentration of the DNA-binding molecule in solution. When short, oligomeric hairpin DNA sequences are added to a reaction solution consisting of DNA-functionalized Au NP aggregates and DNA-binding molecules, these molecules may bind either to the Au NP aggregate interconnects or to the hairpin stems, depending on their relative affinity for each. This relative affinity can be measured as a change in the melting temperature (Tm) of the DNA-modified Au NP aggregates in solution. As a proof of concept, we evaluated the selectivity of 4',6-diamidino-2-phenylindole (an AT-specific binder), ethidium bromide (a nonspecific binder), and chromomycin A (a GC-specific binder) for six sequences of hairpin DNA having different numbers of AT pairs in a five-base pair variable stem region. Our assay accurately and easily confirmed the known trends in selectivity for the DNA binders in question without the use of complicated instrumentation. This novel assay will be useful in assessing large libraries of potential drug candidates that work by binding DNA to form a drug/DNA complex.

  1. Genetic dissection of the consensus sequence for the class 2 and class 3 flagellar promoters

    PubMed Central

    Wozniak, Christopher E.; Hughes, Kelly T.

    2008-01-01

    Summary Computational searches for DNA binding sites often utilize consensus sequences. These search models make assumptions that the frequency of a base pair in an alignment relates to the base pair’s importance in binding and presume that base pairs contribute independently to the overall interaction with the DNA binding protein. These two assumptions have generally been found to be accurate for DNA binding sites. However, these assumptions are often not satisfied for promoters, which are involved in additional steps in transcription initiation after RNA polymerase has bound to the DNA. To test these assumptions for the flagellar regulatory hierarchy, class 2 and class 3 flagellar promoters were randomly mutagenized in Salmonella. Important positions were then saturated for mutagenesis and compared to scores calculated from the consensus sequence. Double mutants were constructed to determine how mutations combined for each promoter type. Mutations in the binding site for FlhD4C2, the activator of class 2 promoters, better satisfied the assumptions for the binding model than did mutations in the class 3 promoter, which is recognized by the σ28 transcription factor. These in vivo results indicate that the activator sites within flagellar promoters can be modeled using simple assumptions but that the DNA sequences recognized by the flagellar sigma factor require more complex models. PMID:18486950
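
    The two assumptions named in this summary (base frequency in an alignment reflects importance, and positions contribute independently) are what a simple additive scoring model encodes. The sketch below builds per-position base frequencies from aligned sites and scores a candidate as a sum of log-odds terms. The aligned sites, the pseudocount, the uniform background and all function names are hypothetical choices for illustration, not the authors' scoring scheme or real promoter sequences.

```python
import math

BASES = "ACGT"

def position_frequencies(sites: list[str], pseudocount: float = 1.0) -> list[dict]:
    """Per-position base frequencies from equal-length aligned sites."""
    freqs = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        total = len(column) + len(BASES) * pseudocount
        freqs.append({b: (column.count(b) + pseudocount) / total for b in BASES})
    return freqs

def additive_score(seq: str, freqs: list[dict], background: float = 0.25) -> float:
    """Sum of per-position log-odds; positions are treated as independent."""
    return sum(math.log2(freqs[i][b] / background) for i, b in enumerate(seq))

if __name__ == "__main__":
    # Made-up aligned sites, used only to exercise the model.
    toy_sites = ["TAAA", "TAAT", "TAAA", "CAAA"]
    model = position_frequencies(toy_sites)
    for candidate in ("TAAA", "GAAA", "TCAA", "GCAA"):
        print(candidate, round(additive_score(candidate, model), 3))
```

    Under this model the score change of a double mutant always equals the sum of the two single-mutant changes, so measured double mutants that deviate from that sum indicate that the independence assumption fails.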

  2. Conserved RNA binding activity of a Yin-Yang 1 homologue in the ova of the purple sea urchin Strongylocentrotus purpuratus.

    PubMed

    Belak, Zachery R; Ovsenek, Nicholas; Eskiw, Christopher H

    2018-05-23

    Yin-Yang 1 (YY1) is a highly conserved transcription factor possessing RNA-binding activity. A putative YY1 homologue was previously identified in the developmental model organism Strongylocentrotus purpuratus (the purple sea urchin) by genomic sequencing. We identified a high degree of sequence similarity with YY1 homologues of vertebrate origin which shared 100% protein sequence identity over the DNA- and RNA-binding zinc-finger region with high similarity in the N-terminal transcriptional activation domain. SpYY1 demonstrated identical DNA- and RNA-binding characteristics between Xenopus laevis and S. purpuratus indicating that it maintains similar functional and biochemical properties across widely divergent deuterostome species. SpYY1 binds to the consensus YY1 DNA element, and also to U-rich RNA sequences. Although we detected SpYY1 RNA-binding activity in ova lysates and observed cytoplasmic localization, SpYY1 was not associated with maternal mRNA in ova. SpYY1 expressed in Xenopus oocytes was excluded from the nucleus and associated with maternally expressed cytoplasmic mRNA molecules. These data demonstrate the existence of a YY1 homologue in S. purpuratus with similar structural and biochemical features to those of the well-studied vertebrate YY1; however, the data reveal major differences in the biological role of YY1 in the regulation of maternally expressed mRNA in the two species.

  3. A purified truncated form of yeast Gal4 expressed in Escherichia coli and used to functionalize poly(lactic acid) nanoparticle surface is transcriptionally active in cellulo.

    PubMed

    Legaz, Sophie; Exposito, Jean-Yves; Borel, Agnès; Candusso, Marie-Pierre; Megy, Simon; Montserret, Roland; Lahaye, Vincent; Terzian, Christophe; Verrier, Bernard

    2015-09-01

    The Gal4/UAS system is a powerful tool for the analysis of numerous biological processes. Gal4 is a large yeast transcription factor that activates genes including UAS sequences in their promoter. Here, we have synthesized a minimal form of Gal4 DNA sequence coding for the binding and dimerization regions, but also part of the transcriptional activation domain. This truncated Gal4 protein was expressed as inclusion bodies in Escherichia coli. A structured and active form of this recombinant protein was purified and used to cover poly(lactic acid) (PLA) nanoparticles. In cellulo, these Gal4-vehicles were able to activate the expression of a Green Fluorescent Protein (GFP) gene under the control of UAS sequences, demonstrating that the decorated Gal4 variant can be delivered into cells where it still retains its transcription factor capacities. Thus, we have produced in E. coli and purified a short active form of Gal4 that retains its functions at the surface of PLA-nanoparticles in cellular assay. These decorated Gal4-nanoparticles will be useful to decipher their tissue distribution and their potential after ingestion or injection in UAS-GFP recombinant animal models. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. Computational analysis of protein-protein interfaces involving an alpha helix: insights for terphenyl-like molecules binding.

    PubMed

    Isvoran, Adriana; Craciun, Dana; Martiny, Virginie; Sperandio, Olivier; Miteva, Maria A

    2013-06-14

    Protein-Protein Interactions (PPIs) are key for many cellular processes. The characterization of PPI interfaces and the prediction of putative ligand binding sites and hot spot residues are essential to design efficient small-molecule modulators of PPI. Terphenyl and its derivatives are small organic molecules known to mimic one face of protein-binding alpha-helical peptides. In this work we focus on several PPIs mediated by alpha-helical peptides. We performed computational sequence- and structure-based analyses in order to evaluate several key physicochemical and surface properties of proteins known to interact with alpha-helical peptides and/or terphenyl and its derivatives. Sequence-based analysis revealed low sequence identity between some of the analyzed proteins binding alpha-helical peptides. Structure-based analysis was performed to calculate the volume, the fractal dimension roughness and the hydrophobicity of the binding regions. Besides the overall hydrophobic character of the binding pockets, some specificities were detected. We showed that the hydrophobicity is not uniformly distributed in different alpha-helix binding pockets, which can help to identify key hydrophobic hot spots. The presence of hydrophobic cavities at the protein surface with a more complex shape than the entire protein surface seems to be an important property related to the ability of proteins to bind alpha-helical peptides and low molecular weight mimetics. Characterization of similarities and specificities of PPI binding sites can be helpful for further development of small molecules targeting alpha-helix binding proteins.

  5. Control of DEMETER DNA demethylase gene transcription in male and female gamete companion cells in Arabidopsis thaliana

    PubMed Central

    Park, Jin-Sup; Frost, Jennifer M.; Park, Kyunghyuk; Ohr, Hyonhwa; Park, Guen Tae; Kim, Seohyun; Eom, Hyunjoo; Lee, Ilha; Brooks, Janie S.; Fischer, Robert L.; Choi, Yeonhee

    2017-01-01

    The DEMETER (DME) DNA glycosylase initiates active DNA demethylation via the base-excision repair pathway and is vital for reproduction in Arabidopsis thaliana. DME-mediated DNA demethylation is preferentially targeted to small, AT-rich, and nucleosome-depleted euchromatic transposable elements, influencing expression of adjacent genes and leading to imprinting in the endosperm. In the female gametophyte, DME expression and subsequent genome-wide DNA demethylation are confined to the companion cell of the egg, the central cell. Here, we show that, in the male gametophyte, DME expression is limited to the companion cell of sperm, the vegetative cell, and to a narrow window of time: immediately after separation of the companion cell lineage from the germline. We define transcriptional regulatory elements of DME using reporter genes, showing that a small region, which surprisingly lies within the DME gene, controls its expression in male and female companion cells. DME expression from this minimal promoter is sufficient to rescue seed abortion and the aberrant DNA methylome associated with the null dme-2 mutation. Within this minimal promoter, we found short, conserved enhancer sequences necessary for the transcriptional activities of DME and combined predicted binding motifs with published transcription factor binding coordinates to produce a list of candidate upstream pathway members in the genetic circuitry controlling DNA demethylation in gamete companion cells. These data show how DNA demethylation is regulated to facilitate endosperm gene imprinting and potential transgenerational epigenetic regulation, without subjecting the germline to potentially deleterious transposable element demethylation. PMID:28130550

  6. Regulation of C. elegans L4 cuticle collagen genes by the heterochronic protein LIN-29.

    PubMed

    Abete-Luzi, Patricia; Eisenmann, David M

    2018-05-01

    The cuticle, the outer covering of the nematode C. elegans, is synthesized five times during the worm's life by the underlying hypodermis. Cuticle collagens, the major cuticle component, are encoded by a large family of col genes and, interestingly, many of these genes express predominantly at a single developmental stage. This temporal preference motivated us to investigate the mechanisms underlying col gene expression and here we focus on a subset of col genes expressed in the L4 stage. We identified minimal promoter regions of <300 bp for col-38, col-49, and col-63. In these regions, we predicted cis-regulatory sequences and evaluated their function in vivo via mutagenesis of a col-38p::yfp reporter. We used RNAi to study the requirement for candidate transcription regulators ELT-1 and ELT-3, LIN-29, and the LIN-29 co-factor MAB-10, and found LIN-29 to be necessary for the expression of four L4-specific genes (col-38, col-49, col-63, and col-138). Temporal misexpression of LIN-29 was also sufficient to activate these genes at a different developmental stage. The LIN-29 DNA-binding domain bound the col-38, col-49, and col-63 minimal promoters in vitro. For col-38 we showed that the LIN-29 sites necessary for reporter expression in vivo are also bound in vitro: this is the first identification of specific binding sites for LIN-29 necessary for in vivo target gene expression. © 2018 Wiley Periodicals, Inc.

  7. Binding of mitochondrial leader sequences to Tom20 assessed using a bacterial two-hybrid system shows that hydrophobic interactions are essential and that some mutated leaders that do not bind Tom20 can still be imported.

    PubMed

    Mukhopadhyay, Abhijit; Yang, Chun-Song; Weiner, Henry

    2006-12-01

    Previous studies pointed to the importance of leucine residues in the binding of mitochondrial leader sequences to Tom20, an outer membrane protein translocator that initially binds the leader during import. A bacterial two-hybrid assay was employed here to determine if this could be an alternative way to investigate the binding of leader to the receptor. Leucine to alanine and arginine to glutamine mutations were made in the leader sequence from rat liver aldehyde dehydrogenase (pALDH). The leucine residues in the C-terminus of the pALDH leader were found to be essential for Tom20 binding. The hydrophobic residues of another mitochondrial leader, F1beta-ATPase, that were important for Tom20 binding were found at the C-terminus of the leader. In contrast, it was the leucines in the N-terminus of the leader of ornithine transcarbamylase that were essential for binding. Modeling the peptides to the structure of Tom20 showed that the hydrophobic residues from the three proteins could all fit into the hydrophobic binding pocket. The mutants of pALDH that did not bind to Tom20 were still imported in vivo in transformed HeLa cells or in vitro into isolated mitochondria. In contrast, the mutant from pOTC was imported less well (approximately 50%) while the mutant from F1beta-ATPase was not imported to any measurable extent. Binding to Tom20 might not be a prerequisite for import; however, it also is possible that import can occur even if binding to a receptor component is poor, so long as the leader binds tightly to another component of the translocator.

  8. Comparison between TRF2 and TRF1 of their telomeric DNA-bound structures and DNA-binding activities

    PubMed Central

    Hanaoka, Shingo; Nagadoi, Aritaka; Nishimura, Yoshifumi

    2005-01-01

    Mammalian telomeres consist of long tandem arrays of double-stranded telomeric TTAGGG repeats packaged by the telomeric DNA-binding proteins TRF1 and TRF2. Both contain a similar C-terminal Myb domain that mediates sequence-specific binding to telomeric DNA. In a DNA complex of TRF1, only the single Myb-like domain consisting of three helices can bind specifically to double-stranded telomeric DNA. TRF2 also binds to double-stranded telomeric DNA. Although the DNA binding mode of TRF2 is likely identical to that of TRF1, TRF2 plays an important role in the t-loop formation that protects the ends of telomeres. Here, to clarify the details of the double-stranded telomeric DNA-binding modes of TRF1 and TRF2, we determined the solution structure of the DNA-binding domain of human TRF2 bound to telomeric DNA; it consists of three helices, and like TRF1, the third helix recognizes TAGGG sequence in the major groove of DNA with the N-terminal arm locating in the minor groove. However, small but significant differences are observed; in contrast to the minor groove recognition of TRF1, in which an arginine residue recognizes the TT sequence, a lysine residue of TRF2 interacts with the TT part. We examined the telomeric DNA-binding activities of both DNA-binding domains of TRF1 and TRF2 and found that TRF1 binds more strongly than TRF2. Based on the structural differences of both domains, we created several mutants of the DNA-binding domain of TRF2 with stronger binding activities compared to the wild-type TRF2. PMID:15608118

  9. Fungal-type carbohydrate binding modules from the coccolithophore Emiliania huxleyi show binding affinity to cellulose and chitin.

    PubMed

    Rooijakkers, Bart J M; Ikonen, Martina S; Linder, Markus B

    2018-01-01

    Six fungal-type cellulose binding domains were found in the genome of the coccolithophore Emiliania huxleyi and cloned and expressed in Escherichia coli. Sequence comparison indicates high similarity to fungal cellulose binding domains, raising the question of why these domains exist in coccolithophores. The proteins were tested for binding with cellulose and chitin as ligands, which resulted in the identification of two functional carbohydrate binding modules: EHUX2 and EHUX4. Compared to the benchmark fungal cellulose binding domain Cel7A-CBM1 from Trichoderma reesei, these proteins showed slightly lower binding to birch and bacterial cellulose, but were more efficient chitin binders. Finally, a set of cellulose binding domains was created based on the shuffling of one well-functioning and one non-functional domain. These were characterized in order to get more information on the binding domain's sequence-function relationship, indicating characteristic differences between the molecular basis of cellulose versus chitin recognition. As previous reports have shown the presence of cellulose in coccoliths and here we find functional cellulose binding modules, a possible connection is discussed.

  10. Identification of the cAMP response element that controls transcriptional activation of the insulin-like growth factor-I gene by prostaglandin E2 in osteoblasts

    NASA Technical Reports Server (NTRS)

    Thomas, M. J.; Umayahara, Y.; Shu, H.; Centrella, M.; Rotwein, P.; McCarthy, T. L.

    1996-01-01

    Insulin-like growth factor-I (IGF-I), a multifunctional growth factor, plays a key role in skeletal growth and can enhance bone cell replication and differentiation. We previously showed that prostaglandin E2 (PGE2) and other agents that increase cAMP activated IGF-I gene transcription in primary rat osteoblast cultures through promoter 1 (P1), the major IGF-I promoter, and found that transcriptional induction was mediated by protein kinase A. We now have identified a short segment of P1 that is essential for full hormonal regulation and have characterized inducible DNA-protein interactions involving this site. Transient transfections of IGF-I P1 reporter genes into primary rat osteoblasts showed that the 328-base pair untranslated region of exon 1 was required for a full 5.3-fold response to PGE2; mutation in a previously footprinted site, HS3D (base pairs +193 to +215), reduced induction by 65%. PGE2 stimulated nuclear protein binding to HS3D. Binding, as determined by gel mobility shift assay, was not seen in nuclear extracts from untreated osteoblast cultures, was detected within 2 h of PGE2 treatment, and was maximal by 4 h. This DNA-protein interaction was not observed in cytoplasmic extracts from PGE2-treated cultures, indicating nuclear localization of the protein kinase A-activated factor(s). Activation of this factor was not blocked by cycloheximide (Chx), and Chx did not impair stimulation of IGF-I gene expression by PGE2. In contrast, binding to a consensus cAMP response element (CRE; 5'-TGACGTCA-3') from the rat somatostatin gene was not modulated by PGE2 or Chx. Competition gel mobility shift analysis using mutated DNA probes identified 5'-CGCAATCG-3' as the minimal sequence needed for inducible binding. All modified IGF-I P1 promoter-reporter genes with mutations within this CRE sequence also showed a diminished functional response to PGE2. These results identify the CRE within the 5'-untranslated region of IGF-I exon 1 that is required for hormonal activation of IGF-I gene transcription by cAMP in osteoblasts.
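
    The two 8-bp elements quoted in this abstract can be located in a DNA sequence by a simple exact-match scan of both strands. The Python sketch below is only an illustration of that idea: the function names and the toy sequence are hypothetical, and the study itself identified the element by gel mobility shift competition, not by sequence scanning.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(sequence, motif):
    """Return (0-based start, strand) for each exact motif match on either strand."""
    hits = []
    rc = reverse_complement(motif)
    width = len(motif)
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits

MINIMAL_SITE = "CGCAATCG"   # minimal sequence quoted in the abstract
CONSENSUS_CRE = "TGACGTCA"  # consensus CRE quoted in the abstract

# Hypothetical toy sequence carrying one copy of the minimal site on each strand.
toy = "AAACGCAATCGTTTCGATTGCGAAA"
hits = find_sites(toy, MINIMAL_SITE)  # [(3, '+'), (14, '-')]
```

    The consensus CRE is its own reverse complement, so a match on one strand is also a match on the other; the minimal IGF-I element is not palindromic, which is why the scan reports strand.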

  11. CCAAT/enhancer-binding protein delta is a critical regulator of insulin-like growth factor-I gene transcription in osteoblasts

    NASA Technical Reports Server (NTRS)

    Umayahara, Y.; Billiard, J.; Ji, C.; Centrella, M.; McCarthy, T. L.; Rotwein, P.

    1999-01-01

    Insulin-like growth factor-I (IGF-I) plays a major role in promoting skeletal growth by stimulating bone cell replication and differentiation. Prostaglandin E2 and other agents that induce cAMP production enhance IGF-I gene transcription in cultured rat osteoblasts through a DNA element termed HS3D, located in the proximal part of the major rat IGF-I promoter. We previously determined that CCAAT/enhancer-binding protein delta (C/EBPdelta) is the key cAMP-stimulated regulator of IGF-I transcription in these cells and showed that it transactivates the rat IGF-I promoter through the HS3D site. We now have defined the physical-chemical properties and functional consequences of the interactions between C/EBPdelta and HS3D. C/EBPdelta, expressed in COS-7 cells or purified as a recombinant protein from Escherichia coli, bound to HS3D with an affinity at least equivalent to that of the albumin D-site, a known high affinity C/EBP binding sequence, and both DNA elements competed equally for C/EBPdelta. C/EBPdelta bound to HS3D as a dimer, with protein-DNA contact points located on guanine residues on both DNA strands within and just adjacent to the core C/EBP half-site, GCAAT, as determined by methylation interference footprinting. C/EBPdelta also formed protein-protein dimers in the absence of interactions with its DNA binding site, as indicated by results of glutaraldehyde cross-linking studies. As established by competition gel-mobility shift experiments, the conserved HS3D sequence from rat, human, and chicken also bound C/EBPdelta with similar affinity. We also found that prostaglandin E2 induced expression of reporter genes containing human IGF-I promoter 1 or four tandem copies of the human HS3D element fused to a minimal promoter and showed that these effects were enhanced by a co-transfected C/EBPdelta expression plasmid. Taken together, our results provide evidence that C/EBPdelta is a critical activator of IGF-I gene transcription in osteoblasts and potentially in other cell types and species.

  12. The highly conserved MraZ protein is a transcriptional regulator in Escherichia coli

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eraso, Jesus M.; Markillie, Lye Meng; Mitchell, Hugh D.

    2014-05-05

    The mraZ and mraW genes are highly conserved in bacteria, both in sequence and location at the head of the division and cell wall (dcw) gene cluster. Although MraZ has structural similarity to the AbrB transition state regulator and the MazE antitoxin, and MraW is known to methylate ribosomal RNA, mraZ and mraW null mutants have no detectable growth phenotype in any species tested to date, hampering progress in understanding their physiological role. Here we show that overproduction of Escherichia coli MraZ perturbs cell division and the cell envelope, is more lethal at high levels or in minimal growth medium, and that MraW antagonizes these effects. MraZ-GFP localizes to the nucleoid, suggesting that it binds DNA. Indeed, purified MraZ directly binds a region upstream from its own promoter containing three direct repeats to regulate its own expression and that of downstream cell division and cell wall genes. MraZ-LacZ fusions are repressed by excess MraZ but not when DNA binding by MraZ is inhibited. RNAseq analysis indicates that MraZ is a global transcriptional regulator with numerous targets in addition to dcw genes. One of these targets, mioC, is directly bound by MraZ in a region with three direct repeats.

  13. Locating Sequence on FPC Maps and Selecting a Minimal Tiling Path

    PubMed Central

    Engler, Friedrich W.; Hatfield, James; Nelson, William; Soderlund, Carol A.

    2003-01-01

    This study discusses three software tools: the first two aid in integrating sequence with an FPC physical map, and the third automatically selects a minimal tiling path given genomic draft sequence and BAC end sequences. The first tool, FSD (FPC Simulated Digest), takes a sequenced clone and adds it back to the map based on a fingerprint generated by an in silico digest of the clone. This allows verification of sequenced clone positions and the integration of sequenced clones that were not originally part of the FPC map. The second tool, BSS (Blast Some Sequence), takes a query sequence and positions it on the map based on sequence associated with the clones in the map. BSS has multiple uses as follows: (1) When the query is a file of marker sequences, they can be added as electronic markers. (2) When the query is draft sequence, the results of BSS can be used to close gaps in a sequenced clone or the physical map. (3) When the query is a sequenced clone and the target is BAC end sequences, one may select the next clone for sequencing using both sequence comparison results and map location. (4) When the query is whole-genome draft sequence and the target is BAC end sequences, the results can be used to select many clones for a minimal tiling path at once. The third tool, pickMTP, automates the majority of this last usage of BSS. Results are presented using the rice FPC map, BAC end sequences, and whole-genome shotgun from Syngenta. PMID:12915486
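
    The notion of a minimal tiling path can be illustrated with the classic greedy interval-cover algorithm: among clones that start within the region already covered, pick the one that reaches furthest. The Python sketch below assumes each clone is already placed on the map as a (name, start, end) tuple; this representation and the function name are assumptions, and the abstract does not describe the algorithm that pickMTP actually uses.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Greedy minimum set of clones covering [region_start, region_end].

    clones: iterable of (name, start, end) tuples in map coordinates.
    Raises ValueError if the clones leave a gap in the region.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path, covered, i = [], region_start, 0
    while covered < region_end:
        best = None
        # Consider every clone that starts inside the covered stretch.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in clone coverage at position {covered}")
        path.append(best[0])
        covered = best[2]
    return path

# Hypothetical clone placements, for illustration only.
clones = [("a", 0, 100), ("b", 50, 180), ("c", 90, 200),
          ("d", 150, 300), ("e", 190, 320)]
path = minimal_tiling_path(clones, 0, 320)  # ['a', 'c', 'e']
```

    A real selection would also have to weigh how reliable each clone placement is and how much overlap adjacent clones need, which this sketch ignores.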

  14. Deep sequencing of cardiac microRNA-mRNA interactomes in clinical and experimental cardiomyopathy

    PubMed Central

    Matkovich, Scot J.; Dorn, Gerald W.

    2018-01-01

    Summary MicroRNAs are a family of short (~21 nucleotide) noncoding RNAs that serve key roles in cellular growth and differentiation and the response of the heart to stress stimuli. As the sequence-specific recognition element of RNA-induced silencing complexes (RISCs), microRNAs bind mRNAs and prevent their translation via mechanisms that may include transcript degradation and/or prevention of ribosome binding. Short microRNA sequences and the ability of microRNAs to bind to mRNA sites having only partial/imperfect sequence complementarity complicate purely computational analyses of microRNA-mRNA interactomes. Furthermore, computational microRNA target prediction programs typically ignore biological context, and therefore the principal determinants of microRNA-mRNA binding: the presence and quantity of each. To address these deficiencies we describe an empirical method, developed via studies of stressed and failing hearts, to determine disease-induced changes in microRNAs, mRNAs, and the mRNAs targeted to the RISC, without cross-linking mRNAs to RISC proteins. Deep sequencing methods are used to determine RNA abundances, delivering unbiased, quantitative RNA data limited only by their annotation in the genome of interest. We describe the laboratory bench steps required to perform these experiments, experimental design strategies to achieve an appropriate number of sequencing reads per biological replicate, and computer-based processing tools and procedures to convert large raw sequencing data files into gene expression measures useful for differential expression analyses. PMID:25836573

  15. Deep sequencing of cardiac microRNA-mRNA interactomes in clinical and experimental cardiomyopathy.

    PubMed

    Matkovich, Scot J; Dorn, Gerald W

    2015-01-01

    MicroRNAs are a family of short (~21 nucleotide) noncoding RNAs that serve key roles in cellular growth and differentiation and the response of the heart to stress stimuli. As the sequence-specific recognition element of RNA-induced silencing complexes (RISCs), microRNAs bind mRNAs and prevent their translation via mechanisms that may include transcript degradation and/or prevention of ribosome binding. Short microRNA sequences and the ability of microRNAs to bind to mRNA sites having only partial/imperfect sequence complementarity complicate purely computational analyses of microRNA-mRNA interactomes. Furthermore, computational microRNA target prediction programs typically ignore biological context, and therefore the principal determinants of microRNA-mRNA binding: the presence and quantity of each. To address these deficiencies we describe an empirical method, developed via studies of stressed and failing hearts, to determine disease-induced changes in microRNAs, mRNAs, and the mRNAs targeted to the RISC, without cross-linking mRNAs to RISC proteins. Deep sequencing methods are used to determine RNA abundances, delivering unbiased, quantitative RNA data limited only by their annotation in the genome of interest. We describe the laboratory bench steps required to perform these experiments, experimental design strategies to achieve an appropriate number of sequencing reads per biological replicate, and computer-based processing tools and procedures to convert large raw sequencing data files into gene expression measures useful for differential expression analyses.

  16. Minimal and Contributing Sequence Determinants of the cis-Acting Locus of Transfer (clt) of Streptomycete Plasmid pIJ101 Occur within an Intrinsically Curved Plasmid Region

    PubMed Central

    Ducote, Matthew J.; Prakash, Shubha; Pettis, Gregg S.

    2000-01-01

    Efficient interbacterial transfer of streptomycete plasmid pIJ101 requires the pIJ101 tra gene, as well as a cis-acting plasmid function known as clt. Here we show that the minimal pIJ101 clt locus consists of a sequence no greater than 54 bp in size that includes essential inverted-repeat and direct-repeat sequences and is located in close proximity to the 3′ end of the korB regulatory gene. Evidence that sequences extending beyond the minimal locus and into the korB open reading frame influence clt transfer function and demonstration that clt-korB sequences are intrinsically curved raise the possibility that higher-order structuring of DNA and protein within this plasmid region may be an inherent feature of efficient pIJ101 transfer. PMID:11073933

  17. Minimal and contributing sequence determinants of the cis-acting locus of transfer (clt) of streptomycete plasmid pIJ101 occur within an intrinsically curved plasmid region.

    PubMed

    Ducote, M J; Prakash, S; Pettis, G S

    2000-12-01

    Efficient interbacterial transfer of streptomycete plasmid pIJ101 requires the pIJ101 tra gene, as well as a cis-acting plasmid function known as clt. Here we show that the minimal pIJ101 clt locus consists of a sequence no greater than 54 bp in size that includes essential inverted-repeat and direct-repeat sequences and is located in close proximity to the 3' end of the korB regulatory gene. Evidence that sequences extending beyond the minimal locus and into the korB open reading frame influence clt transfer function and demonstration that clt-korB sequences are intrinsically curved raise the possibility that higher-order structuring of DNA and protein within this plasmid region may be an inherent feature of efficient pIJ101 transfer.

  18. Linking network topology to function. Comment on "Drivers of structural features in gene regulatory networks: From biophysical constraints to biological function" by O.C. Martin, A. Krzywicki and M. Zagorski

    NASA Astrophysics Data System (ADS)

    di Bernardo, Diego

    2016-07-01

    The review by Martin et al. deals with a long-standing problem at the interface of complex systems and molecular biology, that is, the relationship between the topology of a complex network and its function. In biological terms the problem translates to relating the topology of gene regulatory networks (GRNs) to specific cellular functions. GRNs control the spatial and temporal activity of the genes encoded in the cell's genome by means of specialised proteins called Transcription Factors (TFs). A TF is able to recognise and bind specifically to a sequence (TF binding site) of variable length (on the order of 10 base pairs) found upstream of the sequence encoding one or more genes (at least in prokaryotes), thus activating or repressing their transcription. TFs can thus be classified as activators and repressors. The picture can become more complex since some classes of TFs can form heterodimers consisting of a protein complex whose subunits are the individual TFs. Heterodimers can have completely different binding sites and activity compared to their individual parts. In this review the authors limit their attention to prokaryotes, where the complexity of GRNs is somewhat reduced. Moreover, they exploit a unique feature of living systems, i.e. evolution, to understand whether function can shape network topology. Indeed, prokaryotes such as bacteria are among the oldest living systems that have become perfectly adapted to their environment over geological scales and thus have reached an evolutionary steady-state where the fitness of the population has reached a plateau. By integrating in silico analysis and comparative evolution, the authors show that function does indeed tend to shape the structure of a GRN; however, this trend is not always present and depends on the properties of the network being examined. Interestingly, the trend is more apparent for sparse networks, i.e. where the density of edges is very low. Sparsity is indeed one of the most prominent features of naturally occurring GRNs, and more specifically GRNs have been found to approximate a power-law "scale-free" degree distribution by Barabasi and Albert [2]. Why sparsity arises is still under debate, but Price in 1976 proposed a model [1], later renamed "preferential attachment" by Barabasi and Albert [2], able to give rise to sparse scale-free networks. In this model, a network grows over time (such as a GRN during evolution) by sequential addition of new nodes (caused by genome duplications) that attach with higher probability to nodes with higher degree. In this review, Martin et al. propose that sparsity could also be caused by phenotypic constraints even in the absence of genome duplications, in order for the network to be robust against random mutations in the genome sequence, which in turn affect the specificity of TF binding sites. The authors also found that network motifs, i.e. subnetworks consisting of 3 or 4 nodes with a specific topology that are over-represented in the network, are also shaped by phenotypic constraints. Theoretical and computational approaches to understand the forces that shape network topology are of extreme interest in biology, although at this stage their impact has been limited. Nevertheless, these approaches may soon have important practical applications. The era of synthetic biology is upon us; novel organisms with "minimal genomes" are being built with the dual aim of simplifying the engineering of new functions useful to humans and understanding what minimal set of genes is needed to support life [3]. The first minimal organism has just been created [3] by randomly deleting genes and genomic regions until a minimal set supporting cell growth and replication was found. The GRN of this minimal organism has not been investigated yet, but it will be of limited complexity. What is the GRN structure in this organism? Will the cell phenotypes be robust to mutations? Is it possible to re-engineer the GRN in order to find an optimal structure that confers phenotypic robustness to the cell? All of these questions can be tackled only by understanding the guiding principles linking network topology to network function.
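
    The growth rule described in this comment (new nodes attach preferentially to nodes of higher degree) can be sketched in a few lines of Python. The version below is a simplified undirected variant written for illustration; it is not the exact formulation of Price or of Barabasi and Albert, and the parameter names and the two-node seed graph are assumptions.

```python
import random

def preferential_attachment(n_nodes, m=1, seed=0):
    """Grow an undirected graph; each new node links to up to m existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]       # assumed seed graph: two connected nodes
    endpoints = [0, 1]     # every node appears once per incident edge
    for new in range(2, n_nodes):
        targets = set()
        while len(targets) < min(m, new):
            # Sampling from the endpoint list is sampling proportional to degree.
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges

edges = preferential_attachment(200, m=1)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
```

    With m = 1 the graph has n - 1 edges for n nodes, so it is sparse by construction, and early nodes tend to accumulate the most links.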

  19. An RNA motif that binds ATP

    NASA Technical Reports Server (NTRS)

    Sassanfar, M.; Szostak, J. W.

    1993-01-01

    RNAs that contain specific high-affinity binding sites for small molecule ligands immobilized on a solid support are present at a frequency of roughly one in 10^10 to 10^11 in pools of random sequence RNA molecules. Here we describe a new in vitro selection procedure designed to ensure the isolation of RNAs that bind the ligand of interest in solution as well as on a solid support. We have used this method to isolate a remarkably small RNA motif that binds ATP, a substrate in numerous biological reactions and the universal biological high-energy intermediate. The selected ATP-binding RNAs contain a consensus sequence, embedded in a common secondary structure. The binding properties of ATP analogues and modified RNAs show that the binding interaction is characterized by a large number of close contacts between the ATP and RNA, and by a change in the conformation of the RNA.
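
    The quoted frequency of binders translates directly into how many functional molecules a starting pool can be expected to contain. The short Python sketch below works through that arithmetic. The pool size of 10^14 molecules is a hypothetical value chosen only for illustration (the abstract does not state one), and the Poisson approximation is an assumption that is reasonable for rare events in large pools.

```python
import math

def expected_binders(pool_size, frequency):
    """Expected number of binding sequences in a random pool."""
    return pool_size * frequency

def prob_at_least_one(pool_size, frequency):
    """Chance the pool holds at least one binder (Poisson approximation)."""
    return -math.expm1(-pool_size * frequency)

POOL_SIZE = 1e14  # hypothetical pool size, not taken from the abstract
low = expected_binders(POOL_SIZE, 1e-11)   # about 1,000 binders
high = expected_binders(POOL_SIZE, 1e-10)  # about 10,000 binders
```

    Under these assumptions a pool much smaller than about 10^10 molecules would be unlikely to contain even one binder, which is why selections of this kind start from very large pools.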

  20. Exposure to Nickel, Chromium, or Cadmium Causes Distinct Changes in the Gene Expression Patterns of Rat Liver-Derived Cell Lines

    DTIC Science & Technology

    2010-05-22

    member B8 Blue 1370939_at Acsl1 acyl-CoA synthetase long-chain family member 1 Yellow 1372006_at --- --- Blue 1372101_at Ppap2b phosphatidic acid ...Stress L-ascorbic Acid Binding Cation Binding Identical Protein Binding Protein Dimerization Activity Dioxygenase Activity Oxidoreductase...Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts, and proteins. Nucleic Acid Research. 35: D61-65. Ryter SW

  1. Identification and verification of hybridoma-derived monoclonal antibody variable region sequences using recombinant DNA technology and mass spectrometry

    USDA-ARS's Scientific Manuscript database

    Antibody engineering requires the identification of antigen binding domains or variable regions (VR) unique to each antibody. It is the VR that define the unique antigen binding properties and proper sequence identification is essential for functional evaluation and performance of recombinant antibo...

  2. Trichoderma genes

    DOEpatents

    Foreman, Pamela [Los Altos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Van Solingen, Pieter [Naaldwijk, NL]; Ward, Michael [San Francisco, CA]

    2012-06-19

    Described herein are novel gene sequences isolated from Trichoderma reesei. Two genes encoding proteins comprising a cellulose binding domain, one encoding an arabinofuranosidase and one encoding an acetylxylan esterase are described. The sequences, CIP1 and CIP2, contain a cellulose binding domain. These proteins are especially useful in the textile and detergent industry and in the pulp and paper industry.

  3. From synthetic coiled coils to functional proteins: automated design of a receptor for the calmodulin-binding domain of calcineurin.

    PubMed

    Ghirlanda, G; Lear, J D; Lombardi, A; DeGrado, W F

    1998-08-14

    A series of synthetic receptors capable of binding to the calmodulin-binding domain of calcineurin (CN393-414) was designed, synthesized and characterized. The design was accomplished by docking CN393-414 against a two-helix receptor, using an idealized three-stranded coiled coil as a starting geometry. The sequence of the receptor was chosen using a side-chain re-packing program, which employed a genetic algorithm to select potential binders from a total of 7.5 × 10^6 possible sequences. A total of 25 receptors were prepared, representing 13 sequences predicted by the algorithm as well as 12 related sequences that were not predicted. The receptors were characterized by CD spectroscopy, analytical ultracentrifugation, and binding assays. The receptors predicted by the algorithm bound CN393-414 with apparent dissociation constants ranging from 0.2 µM to >50 µM. Many of the receptors that were not predicted by the algorithm also bound to CN393-414. Methods to circumvent this problem and to improve the automated design of functional proteins are discussed. Copyright 1998 Academic Press

  4. Predicting the binding preference of transcription factors to individual DNA k-mers.

    PubMed

    Alleyne, Trevis M; Peña-Castillo, Lourdes; Badis, Gwenael; Talukder, Shaheynoor; Berger, Michael F; Gehrke, Andrew R; Philippakis, Anthony A; Bulyk, Martha L; Morris, Quaid D; Hughes, Timothy R

    2009-04-15

    Recognition of specific DNA sequences is a central mechanism by which transcription factors (TFs) control gene expression. Many TF-binding preferences, however, are unknown or poorly characterized, in part due to the difficulty associated with determining their specificity experimentally, and an incomplete understanding of the mechanisms governing sequence specificity. New techniques that estimate the affinity of TFs to all possible k-mers provide a new opportunity to study DNA-protein interaction mechanisms, and may facilitate inference of binding preferences for members of a given TF family when such information is available for other family members. We employed a new dataset consisting of the relative preferences of mouse homeodomains for all eight-base DNA sequences in order to ask how well we can predict the binding profiles of homeodomains when only their protein sequences are given. We evaluated a panel of standard statistical inference techniques, as well as variations of the protein features considered. Nearest neighbour among functionally important residues emerged among the most effective methods. Our results underscore the complexity of TF-DNA recognition, and suggest a rational approach for future analyses of TF families.
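
    The nearest-neighbour idea mentioned in this abstract can be sketched as follows: a homeodomain with an unknown binding profile borrows the profile of the characterized homeodomain whose functionally important residues differ at the fewest positions. This is a minimal illustration only. The residue strings, protein names, and k-mer scores below are hypothetical toy values, not data from the study, and Hamming distance is an assumed similarity measure.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("residue strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

def predict_profile(query_residues, known):
    """Return (name, profile) of the characterized protein whose key
    residues are closest to the query.
    `known` maps name -> (key_residues, {kmer: relative_preference})."""
    name = min(known, key=lambda n: hamming(query_residues, known[n][0]))
    return name, known[name][1]

# Hypothetical toy data: key DNA-contacting residues and 8-mer preferences.
known = {
    "hd_A": ("RKQN", {"TAATTAAA": 0.9, "TGATTGAC": 0.1}),
    "hd_B": ("RKKN", {"TAATTAAA": 0.2, "TGATTGAC": 0.8}),
}

nearest, profile = predict_profile("RKQA", known)
print(nearest, profile)
```

    Restricting the comparison to a handful of functionally important positions, rather than the full protein sequence, is the design choice the abstract highlights as effective.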

  5. Two distinct DNA sequences recognized by transcription factors represent enthalpy and entropy optima

    PubMed Central

    Yin, Yimeng; Das, Pratyush K; Jolma, Arttu; Zhu, Fangjie; Popov, Alexander; Xu, You; Nilsson, Lennart

    2018-01-01

    Most transcription factors (TFs) can bind to a population of sequences closely related to a single optimal site. However, some TFs can bind to two distinct sequences that represent two local optima in the Gibbs free energy of binding (ΔG). To determine the molecular mechanism behind this effect, we solved the structures of human HOXB13 and CDX2 bound to their two optimal DNA sequences, CAATAAA and TCGTAAA. Thermodynamic analyses by isothermal titration calorimetry revealed that both sites were bound with similar ΔG. However, the interaction with the CAA sequence was driven by change in enthalpy (ΔH), whereas the TCG site was bound with similar affinity due to smaller loss of entropy (ΔS). This thermodynamic mechanism that leads to at least two local optima likely affects many macromolecular interactions, as ΔG depends on two partially independent variables ΔH and ΔS according to the central equation of thermodynamics, ΔG = ΔH - TΔS. PMID:29638214
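
    The relation ΔG = ΔH - TΔS quoted above shows how two sites can reach a similar ΔG by different routes. The sketch below uses hypothetical values, not measurements from the study: one site has a large favourable enthalpy offset by a large entropy loss, and the other has a smaller enthalpy change and a smaller entropy loss. The function name, units, and temperature are assumptions chosen for illustration.

```python
def delta_g(delta_h_kj, delta_s_j_per_k, temperature_k=298.15):
    """Gibbs free energy change in kJ/mol from
    delta_h in kJ/mol and delta_s in J/(mol*K): dG = dH - T*dS."""
    return delta_h_kj - temperature_k * delta_s_j_per_k / 1000.0

# Hypothetical numbers for two binding sites (not from the paper).
enthalpy_driven = delta_g(delta_h_kj=-60.0, delta_s_j_per_k=-100.0)
entropy_favoured = delta_g(delta_h_kj=-40.0, delta_s_j_per_k=-33.0)

print(round(enthalpy_driven, 2), round(entropy_favoured, 2))
```

    With these assumed inputs the two ΔG values differ by well under 0.1 kJ/mol even though ΔH differs by 20 kJ/mol, which is the kind of compensation the abstract describes.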

  6. Role of indirect readout mechanism in TATA box binding protein-DNA interaction.

    PubMed

    Mondal, Manas; Choudhury, Devapriya; Chakrabarti, Jaydeb; Bhattacharyya, Dhananjay

    2015-03-01

    Gene expression generally initiates from recognition of TATA-box binding protein (TBP) to the minor groove of DNA of TATA box sequence where the DNA structure is significantly different from B-DNA. We have carried out molecular dynamics simulation studies of TBP-DNA system to understand how the DNA structure alters for efficient binding. We observed rigid nature of the protein while the DNA of TATA box sequence has an inherent flexibility in terms of bending and minor groove widening. The bending analysis of the free DNA and the TBP bound DNA systems indicate presence of some similar structures. Principal coordinate ordination analysis also indicates some structural features of the protein bound and free DNA are similar. Thus we suggest that the DNA of TATA box sequence regularly oscillates between several alternate structures and the one suitable for TBP binding is induced further by the protein for proper complex formation.

  7. Specificity determinants for the abscisic acid response element.

    PubMed

    Sarkar, Aditya Kumar; Lahiri, Ansuman

    2013-01-01

    Abscisic acid (ABA) response elements (ABREs) are a group of cis-acting DNA elements that have been identified from promoter analysis of many ABA-regulated genes in plants. We are interested in understanding the mechanism of binding specificity between ABREs and a class of bZIP transcription factors known as ABRE binding factors (ABFs). In this work, we have modeled the homodimeric structure of the bZIP domain of ABRE binding factor 1 from Arabidopsis thaliana (AtABF1) and studied its interaction with ACGT core motif-containing ABRE sequences. We have also examined the variation in the stability of the protein-DNA complex upon mutating ABRE sequences using the protein design algorithm FoldX. The high throughput free energy calculations successfully predicted the ability of ABF1 to bind to alternative core motifs like GCGT or AAGT and also rationalized the role of the flanking sequences in determining the specificity of the protein-DNA interaction.

  8. Direct inhibition of the DNA-binding activity of POU transcription factors Pit-1 and Brn-3 by selective binding of a phenyl-furan-benzimidazole dication.

    PubMed

    Peixoto, Paul; Liu, Yang; Depauw, Sabine; Hildebrand, Marie-Paule; Boykin, David W; Bailly, Christian; Wilson, W David; David-Cordonnier, Marie-Hélène

    2008-06-01

    The development of small molecules to control gene expression could be the spearhead of future-targeted therapeutic approaches in multiple pathologies. Among heterocyclic dications developed with this aim, a phenyl-furan-benzimidazole dication DB293 binds AT-rich sites as a monomer and 5'-ATGA sequence as a stacked dimer, both in the minor groove. Here, we used a protein/DNA array approach to evaluate the ability of DB293 to specifically inhibit transcription factors DNA-binding in a single-step, competitive mode. DB293 inhibits two POU-domain transcription factors Pit-1 and Brn-3 but not IRF-1, despite the presence of an ATGA and AT-rich sites within all three consensus sequences. EMSA, DNase I footprinting and surface-plasmon-resonance experiments determined the precise binding site, affinity and stoichiometry of DB293 interaction to the consensus targets. Binding of DB293 occurred as a cooperative dimer on the ATGA part of Brn-3 site but as two monomers on AT-rich sites of IRF-1 sequence. For Pit-1 site, ATGA or AT-rich mutated sequences identified the contribution of both sites for DB293 recognition. In conclusion, DB293 is a strong inhibitor of two POU-domain transcription factors through a cooperative binding to ATGA. These findings are the first to show that heterocyclic dications can inhibit major groove transcription factors and they open the door to the control of transcription factors activity by those compounds.

  9. Sequence-specific DNA binding by MYC/MAX to low-affinity non-E-box motifs.

    PubMed

    Allevato, Michael; Bolotin, Eugene; Grossman, Mark; Mane-Padros, Daniel; Sladek, Frances M; Martinez, Ernest

    2017-01-01

    The MYC oncoprotein regulates transcription of a large fraction of the genome as an obligatory heterodimer with the transcription factor MAX. The MYC:MAX heterodimer and MAX:MAX homodimer (hereafter MYC/MAX) bind Enhancer box (E-box) DNA elements (CANNTG) and have the greatest affinity for the canonical MYC E-box (CME) CACGTG. However, MYC:MAX also recognizes E-box variants and was reported to bind DNA in a "non-specific" fashion in vitro and in vivo. Here, in order to identify potential additional non-canonical binding sites for MYC/MAX, we employed high throughput in vitro protein-binding microarrays, along with electrophoretic mobility-shift assays and bioinformatic analyses of MYC-bound genomic loci in vivo. We identified all hexameric motifs preferentially bound by MYC/MAX in vitro, which include the low-affinity non-E-box sequence AACGTT, and found that the vast majority (87%) of MYC-bound genomic sites in a human B cell line contain at least one of the top 21 motifs bound by MYC:MAX in vitro. We further show that high MYC/MAX concentrations are needed for specific binding to the low-affinity sequence AACGTT in vitro and that elevated MYC levels in vivo more markedly increase the occupancy of AACGTT sites relative to CME sites, especially at distal intergenic and intragenic loci. Hence, MYC binds diverse DNA motifs with a broad range of affinities in a sequence-specific and dose-dependent manner, suggesting that MYC overexpression has more selective effects on the tumor transcriptome than previously thought.
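
    The motif definitions in this abstract translate directly into a pattern search: the E-box is CANNTG, the canonical MYC E-box (CME) is CACGTG, and AACGTT is a non-E-box hexamer. The following sketch scans one strand of a sequence for E-boxes and labels a hexamer by class. The function names and the example sequence are hypothetical, and reverse-strand scanning is deliberately left out for brevity.

```python
import re

# Lookahead so that overlapping matches are also reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_eboxes(seq):
    """Return (position, hexamer) for every CANNTG match on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]

def classify_hexamer(hexamer):
    """Label a hexamer as the canonical MYC E-box, another E-box, or non-E-box."""
    hexamer = hexamer.upper()
    if hexamer == "CACGTG":
        return "CME"
    if re.fullmatch(r"CA[ACGT]{2}TG", hexamer):
        return "E-box variant"
    return "non-E-box"

sites = find_eboxes("ttCACGTGaaCAGCTGcc")   # hypothetical test sequence
print(sites)
print(classify_hexamer("AACGTT"))
```

    A pattern match only says where a motif occurs. It carries no affinity information, which is why the study relied on protein-binding microarrays and mobility-shift assays to rank motifs.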

  10. The gene for stinging nettle lectin (Urtica dioica agglutinin) encodes both a lectin and a chitinase.

    PubMed

    Lerner, D R; Raikhel, N V

    1992-06-05

    Chitin-binding proteins are present in a wide range of plant species, including both monocots and dicots, even though these plants contain no chitin. To investigate the relationship between in vitro antifungal and insecticidal activities of chitin-binding proteins and their unknown endogenous functions, the stinging nettle lectin (Urtica dioica agglutinin, UDA) cDNA was cloned using a synthetic gene as the probe. The nettle lectin cDNA clone contained an open reading frame encoding 374 amino acids. Analysis of the deduced amino acid sequence revealed a 21-amino acid putative signal sequence and the 86 amino acids encoding the two chitin-binding domains of nettle lectin. These domains were fused to a 19-amino acid "spacer" domain and a 244-amino acid carboxyl extension with partial identity to a chitinase catalytic domain. The authenticity of the cDNA clone was confirmed by deduced amino acid sequence identity with sequence data obtained from tryptic digests, RNA gel blot, and polymerase chain reaction analyses. RNA gel blot analysis also showed the nettle lectin message was present primarily in rhizomes and inflorescence (with immature seeds) but not in leaves or stems. Chitinase enzymatic activity was found when the chitinase-like domain alone or the chitinase-like domain with the chitin-binding domains were expressed in Escherichia coli. This is the first example of a chitin-binding protein with both a duplication of the 43-amino acid chitin-binding domain and a fusion of the chitin-binding domains to a structurally unrelated domain, the chitinase domain.

  11. Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain.

    PubMed

    de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

    2014-06-01

    The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Improved bioactivity of G-rich triplex-forming oligonucleotides containing modified guanine bases

    PubMed Central

    Rogers, Faye A; Lloyd, Janice A; Tiwari, Meetu Kaushik

    2014-01-01

    Triplex structures generated by sequence-specific triplex-forming oligonucleotides (TFOs) have proven to be promising tools for gene targeting strategies. In addition, triplex technology has been highly utilized to study the molecular mechanisms of DNA repair, recombination and mutagenesis. However, triplex formation utilizing guanine-rich oligonucleotides as third strands can be inhibited by potassium-induced self-association resulting in G-quadruplex formation. We report here that guanine-rich TFOs partially substituted with 8-aza-7-deaza-guanine (PPG) have improved target site binding in potassium compared with TFOs containing the natural guanine base. We designed PPG-substituted TFOs to bind to a polypurine sequence in the supFG1 reporter gene. The binding efficiency of PPG-substituted TFOs to the target sequence was analyzed using electrophoresis mobility gel shift assays. We have determined that in the presence of potassium, the non-substituted TFO, AG30 did not bind to its target sequence, however binding was observed with the PPG-substituted AG30 under conditions with up to 140 mM KCl. The PPG-TFOs were able to maintain their ability to induce genomic modifications as measured by an assay for gene-targeted mutagenesis. In addition, these compounds were capable of triplex-induced DNA double strand breaks, which resulted in activation of apoptosis. PMID:25483840

  13. Characterization of monocarboxylate transporter 1 (MCT1) binding affinity for Basigin gene products and L1cam.

    PubMed

    Howard, John; Finch, Nicole A; Ochrietor, Judith D

    2010-07-01

    The purpose of this study was to determine the binding affinities of Basigin gene products and neural cell adhesion molecule L1cam for monocarboxylate transporter-1 (MCT1). ELISA binding assays were performed in which recombinant proteins of the transmembrane domains of Basigin gene products and L1cam were incubated with MCT1 captured from mouse brain. It was determined that Basigin gene products bind MCT1 with moderate affinity, but L1cam does not bind MCT1. Despite a high degree of sequence conservation between Basigin gene products and L1cam, the sequences are different enough to prevent L1cam from interacting with MCT1.

  14. Characterization of a protein that binds multiple sequences in mammalian type C retrovirus enhancers.

    PubMed Central

    Sun, W; O'Connell, M; Speck, N A

    1993-01-01

    Mammalian type C retrovirus enhancer factor 1 (MCREF-1) is a nuclear protein that binds several directly repeated sequences (CNGGN6CNGG) in the Moloney and Friend murine leukemia virus (MLV) enhancers (N. R. Manley, M. O'Connell, W. Sun, N. A. Speck, and N. Hopkins, J. Virol. 67:1967-1975, 1993). In this paper, we describe the partial purification of MCREF-1 from calf thymus nuclei and further characterize the binding properties of MCREF-1. MCREF-1 binds four sites in the Moloney MLV enhancer and three sites in the Friend MLV enhancer. Ethylation interference analysis suggests that the MCREF-1 binding site spans two adjacent minor grooves of DNA. PMID:8445719

  15. A Feature-Based Approach to Modeling Protein–DNA Interactions

    PubMed Central

    Segal, Eran

    2008-01-01

    Transcription factor (TF) binding to its DNA target site is a fundamental regulatory interaction. The most common model used to represent TF binding specificities is a position specific scoring matrix (PSSM), which assumes independence between binding positions. However, in many cases, this simplifying assumption does not hold. Here, we present feature motif models (FMMs), a novel probabilistic method for modeling TF–DNA interactions, based on log-linear models. Our approach uses sequence features to represent TF binding specificities, where each feature may span multiple positions. We develop the mathematical formulation of our model and devise an algorithm for learning its structural features from binding site data. We also developed a discriminative motif finder, which discovers de novo FMMs that are enriched in target sets of sequences compared to background sets. We evaluate our approach on synthetic data and on the widely used TF chromatin immunoprecipitation (ChIP) dataset of Harbison et al. We then apply our algorithm to high-throughput TF ChIP data from mouse and human, reveal sequence features that are present in the binding specificities of mouse and human TFs, and show that FMMs explain TF binding significantly better than PSSMs. Our FMM learning and motif finder software are available at http://genie.weizmann.ac.il/. PMID:18725950
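
    The independence assumption of a PSSM, which the abstract contrasts with feature motif models, is visible in how a PSSM scores a site: the total is a plain sum of per-position terms, so no term can depend on the base at another position. The sketch below builds a log-odds PSSM from a few aligned sites and scores two hexamers. The aligned sites, pseudocount, and uniform background are assumptions for illustration. This is not the FMM method or the authors' software.

```python
import math

BASES = "ACGT"

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds matrix (one dict per position) from equal-length sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pssm = []
    for i in range(length):
        column = [s[i] for s in sites]
        pssm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pssm

def score(pssm, seq):
    """Sum of independent per-position log-odds terms."""
    if len(seq) != len(pssm):
        raise ValueError("sequence length must match the matrix length")
    return sum(column[base] for column, base in zip(pssm, seq))

# Hypothetical aligned binding sites.
sites = ["CACGTG", "CACGTG", "CACATG", "CAGCTG"]
pssm = build_pssm(sites)
print(score(pssm, "CACGTG"), score(pssm, "AACGTT"))
```

    A model with features spanning several positions would add terms that depend on pairs or longer combinations of bases, which this additive score cannot represent.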

  16. Discovery of 12-mer peptides that bind to wood lignin

    PubMed Central

    Yamaguchi, Asako; Isozaki, Katsuhiro; Nakamura, Masaharu; Takaya, Hikaru; Watanabe, Takashi

    2016-01-01

    Lignin, an abundant terrestrial polymer, is the only large-volume renewable feedstock composed of an aromatic skeleton. Lignin has been used mostly as an energy source during paper production; however, recent interest in replacing fossil fuels with renewable resources has highlighted its potential value in providing aromatic chemicals. Highly selective degradation of lignin is pivotal for industrial production of paper, biofuels, chemicals, and materials. However, few studies have examined natural and synthetic molecular components recognizing the heterogeneous aromatic polymer. Here, we report the first identification of lignin-binding peptides possessing characteristic sequences using a phage display technique. The consensus sequence HFPSP was found in several lignin-binding peptides, and the outer amino acid sequence affected the binding affinity of the peptides. Substitution of phenylalanine7 with Ile in the lignin-binding peptide C416 (HFPSPIFQRHSH) decreased the affinity of the peptide for softwood lignin without changing its affinity for hardwood lignin, indicating that C416 recognised structural differences between the lignins. Circular dichroism spectroscopy demonstrated that this peptide adopted a highly flexible random coil structure, allowing key residues to be appropriately arranged in relation to the binding site in lignin. These results provide a useful platform for designing synthetic and biological catalysts that selectively bind to lignin. PMID:26903196

  17. Crystal Structures of the Scaffolding Protein LGN Reveal the General Mechanism by Which GoLoco Binding Motifs Inhibit the Release of GDP from Gαi

    PubMed Central

    Jia, Min; Li, Jianchao; Zhu, Jinwei; Wen, Wenyu; Zhang, Mingjie; Wang, Wenning

    2012-01-01

    GoLoco (GL) motif-containing proteins regulate G protein signaling by binding to Gα subunit and acting as guanine nucleotide dissociation inhibitors. GLs of LGN are also known to bind the GDP form of Gαi/o during asymmetric cell division. Here, we show that the C-terminal GL domain of LGN binds four molecules of Gαi·GDP. The crystal structures of Gαi·GDP in complex with LGN GL3 and GL4, respectively, reveal distinct GL/Gαi interaction features when compared with the only high resolution structure known with GL/Gαi interaction between RGS14 and Gαi1. Only a few residues C-terminal to the conserved GL sequence are required for LGN GLs to bind to Gαi·GDP. A highly conserved “double Arg finger” sequence (RΨ(D/E)(D/E)QR) is responsible for LGN GL to bind to GDP bound to Gαi. Together with the sequence alignment, we suggest that the LGN GL/Gαi interaction represents a general binding mode between GL motifs and Gαi. We also show that LGN GLs are potent guanine nucleotide dissociation inhibitors. PMID:22952234

  18. Inverted repeats in the promoter as an autoregulatory sequence for TcrX in Mycobacterium tuberculosis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bhattacharya, Monolekha; Das, Amit Kumar, E-mail: amitk@hijli.iitkgp.ernet.in

    Highlights: The regulatory sequences recognized by TcrX have been identified. The regulatory region comprises inverted repeats separated by a 30 bp region. The mode of binding of TcrX with the regulatory sequence is unique. The in silico TcrX-DNA docked model binds one of the inverted repeats. Both phosphorylated and unphosphorylated TcrX bind the regulatory sequence in vitro. -- Abstract: TcrY, a histidine kinase, and TcrX, a response regulator, constitute a two-component system in Mycobacterium tuberculosis. tcrX, which is expressed during iron scarcity, is instrumental in the survival of iron-dependent M. tuberculosis. However, the regulator of tcrX/Y has not been fully characterized. Crosslinking studies of TcrX reveal that it can form oligomers in vitro. Electrophoretic mobility shift assays (EMSAs) show that TcrX recognizes two regions in the promoter that are comprised of inverted repeats separated by ~30 bp. The dimeric in silico model of TcrX predicts binding to one of these inverted repeat regions. Site-directed mutagenesis and radioactive phosphorylation indicate that D54 of TcrX is phosphorylated by H256 of TcrY. However, phosphorylated and unphosphorylated TcrX bind the regulatory sequence with equal efficiency, which was shown with an EMSA using the D54A TcrX mutant.

  19. NMR studies of DNA oligomers and their interactions with minor groove binding ligands

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fagan, Patricia A.

    1996-05-01

    The cationic peptide ligands distamycin and netropsin bind noncovalently to the minor groove of DNA. The binding site, orientation, stoichiometry, and qualitative affinity of distamycin binding to several short DNA oligomers were investigated by NMR spectroscopy. The oligomers studied contain A,T-rich or I,C-rich binding sites, where I = 2-desaminodeoxyguanosine. I•C base pairs are functional analogs of A•T base pairs in the minor groove. The different behaviors exhibited by distamycin and netropsin binding to various DNA sequences suggested that these ligands are sensitive probes of DNA structure. For sites of five or more base pairs, distamycin can form 1:1 or 2:1 ligand:DNA complexes. Cooperativity in distamycin binding is low in sites such as AAAAA, which has narrow minor grooves, and is higher in sites with wider minor grooves such as ATATAT. The distamycin binding and base pair opening lifetimes of I,C-containing DNA oligomers suggest that the I,C minor groove is structurally different from the A,T minor groove. Molecules which direct chemistry to a specific DNA sequence could be used as antiviral compounds, diagnostic probes, or molecular biology tools. The author studied two ligands in which reactive groups were tethered to a distamycin to increase the sequence specificity of the reactive agent.

  20. Characterization of minimal sequences associated with self-similar interval exchange maps

    NASA Astrophysics Data System (ADS)

    Cobo, Milton; Gutiérrez-Romo, Rodolfo; Maass, Alejandro

    2018-04-01

    The construction of affine interval exchange maps (IEMs) with wandering intervals that are semi-conjugate to a given self-similar IEM is strongly related to the existence of the so-called minimal sequences associated with local potentials, which are certain elements of the substitution subshift arising from the given IEM. In this article, under the condition called unique representation property, we characterize such minimal sequences for potentials coming from non-real eigenvalues of the substitution matrix. We also give conditions on the slopes of the affine extensions of a self-similar IEM that determine whether it exhibits a wandering interval or not.

  1. Architecture of the Yeast RNA Polymerase II Open Complex and Regulation of Activity by TFIIF

    PubMed Central

    Fishburn, James

    2012-01-01

    To investigate the function and architecture of the open complex state of RNA polymerase II (Pol II), Saccharomyces cerevisiae minimal open complexes were assembled by using a series of heteroduplex HIS4 promoters, TATA binding protein (TBP), TFIIB, and Pol II. The yeast system demonstrates great flexibility in the position of active open complexes, spanning 30 to 80 bp downstream from TATA, consistent with the transcription start site scanning behavior of yeast Pol II. TFIIF unexpectedly modulates the activity of the open complexes, either repressing or stimulating initiation. The response to TFIIF was dependent on the sequence of the template strand within the single-stranded bubble. Mutations in the TFIIB reader and linker region, which were inactive on duplex DNA, were suppressed by the heteroduplex templates, showing that a major function of the TFIIB reader and linker is in the initiation or stabilization of single-stranded DNA. Probing of the architecture of the minimal open complexes with TFIIB-FeBABE [TFIIB–p-bromoacetamidobenzyl–EDTA-iron(III)] derivatives showed that the TFIIB core domain is surprisingly positioned away from Pol II, and the addition of TFIIF repositions the TFIIB core domain to the Pol II wall domain. Together, our results show an unexpected architecture of minimal open complexes and the regulation of activity by TFIIF and the TFIIB core domain. PMID:22025674

  2. Design and development of a field applicable gold nanosensor for the detection of luteinizing hormone.

    PubMed

    Zambre, Ajit; Chanda, Nripen; Prayaga, Sudhirdas; Almudhafar, Rosana; Afrasiabi, Zahra; Upendran, Anandhi; Kannan, Raghuraman

    2012-11-06

    In this paper, we describe a novel strategy for the fabrication of a nanosensor for detecting luteinizing hormone (LH) of sheep using a gold nanoparticle-peptide conjugate. A new peptide sequence "CDHPPLPDILFL" (luteinizing hormone peptide, LHP) has been identified, using BLAST and Clustal W analysis, to detect antibody of LH (sheep). LHP has been synthesized and characterized, and its affinity toward anti-LH was established using the enzyme-linked immunosorbent assay (ELISA) technique. The thiol group in LHP directly binds with gold nanoparticles (AuNPs) to yield the AuNP-LHP construct. Detailed physicochemical analysis of the AuNP-LHP construct was carried out using various analytical techniques. A nanosensor using the gold nanoparticle-peptide conjugate was developed on the basis of competitive binding of AuNP-LHP and LH toward anti-LH. Nitrocellulose membrane, precoated with anti-LH, was soaked in the mixture of AuNP-LHP and the sample under analysis (LH). In the absence of LH (sheep), anti-LH coated on the membrane binds with AuNP-LHP, leading to a distinctive red color, while in the presence of LH, no color appeared in the membrane due to the interaction of anti-LH with LH, thereby preventing the binding of AuNP-LHP with membrane-bound anti-LH. The sensor assay developed in this study can detect LH (sheep) down to a minimal concentration of ∼50 ppm with a high degree of reproducibility and selectivity. The gold-nanoparticle-peptide based nanosensor would be a simple, portable, effective, and low-cost technique for in-field applications.

  3. A novel progesterone receptor membrane component (PGRMC) in the human and swine parasite Taenia solium: implications to the host-parasite relationship.

    PubMed

    Aguilar-Díaz, Hugo; Nava-Castro, Karen E; Escobedo, Galileo; Domínguez-Ramírez, Lenin; García-Varela, Martín; Del Río-Araiza, Víctor H; Palacios-Arreola, Margarita I; Morales-Montor, Jorge

    2018-03-09

    We have previously reported that progesterone (P4) has a direct in vitro effect on the scolex evagination and growth of Taenia solium cysticerci. Here, we explored the hypothesis that the direct effect of P4 on T. solium might be mediated by a novel steroid-binding parasite protein. By way of using immunofluorescent confocal microscopy, flow cytometry analysis, double-dimension electrophoresis analysis, and sequencing the corresponding protein spot, we detected a novel PGRMC in T. solium. Molecular modeling studies accompanied by computer docking using the sequenced protein, together with phylogenetic analysis and sequence alignment, clearly demonstrated that T. solium PGRMC is of parasite origin. Our results show that P4 in vitro increases parasite evagination and scolex size. Using immunofluorescent confocal microscopy, we detected that parasite cells showed expression of a P4-binding like protein exclusively located at the cysticercus subtegumental tissue. Presence of the P4-binding protein in cyst cells was also confirmed by flow cytometry. Double-dimension electrophoresis analysis, followed by sequencing the corresponding protein spot, revealed a protein that was previously reported in the T. solium genome belonging to a membrane-associated progesterone receptor component (PGRMC). Molecular modeling studies accompanied by computer docking using the sequenced protein showed that PGRMC is potentially able to bind steroid hormones such as progesterone, estradiol, testosterone and dihydrotestosterone with different affinities. Phylogenetic analysis and sequence alignment clearly demonstrated that T. solium PGRMC is related to a steroid-binding protein of Echinococcus granulosus, both of them being nested within a cluster including similar proteins present in platyhelminths such as Schistocephalus solidus and Schistosoma haematobium. Progesterone may directly act upon T. solium cysticerci, probably by binding to PGRMC. This research has implications in the field of host-parasite co-evolution as well as the sex-associated susceptibility to this infection. On a more practical level, the present results may contribute to the molecular design of new drugs with anti-parasite actions.

  4. Phyloscan: locating transcription-regulating binding sites in mixed aligned and unaligned sequence data.

    PubMed

    Palumbo, Michael J; Newberg, Lee A

    2010-07-01

    The transcription of a gene from its DNA template into an mRNA molecule is the first, and most heavily regulated, step in gene expression. Especially in bacteria, regulation is typically achieved via the binding of a transcription factor (protein) or small RNA molecule to the chromosomal region upstream of a regulated gene. The protein or RNA molecule recognizes a short, approximately conserved sequence within a gene's promoter region and, by binding to it, either enhances or represses expression of the nearby gene. Since the sought-for motif (pattern) is short and accommodating to variation, computational approaches that scan for binding sites have trouble distinguishing functional sites from look-alikes. Many computational approaches are unable to find the majority of experimentally verified binding sites without also finding many false positives. Phyloscan overcomes this difficulty by exploiting two key features of functional binding sites: (i) these sites are typically more conserved evolutionarily than are non-functional DNA sequences; and (ii) these sites often occur two or more times in the promoter region of a regulated gene. The website is free and open to all users, and there is no login requirement. Address: (http://bayesweb.wadsworth.org/phyloscan/).

  5. Expression cloning and characterization of a novel gene that encodes the RNA-binding protein FAU-1 from Pyrococcus furiosus.

    PubMed Central

    Kanai, Akio; Oida, Hanako; Matsuura, Nana; Doi, Hirofumi

    2003-01-01

    We systematically screened a genomic DNA library to identify proteins of the hyperthermophilic archaeon Pyrococcus furiosus using an expression cloning method. One gene product, which we named FAU-1 (P. furiosus AU-binding), demonstrated the strongest binding activity of all the genomic library-derived proteins tested against an AU-rich RNA sequence. The protein was purified to near homogeneity as a 54 kDa single polypeptide, and the gene locus corresponding to this FAU-1 activity was also sequenced. The FAU-1 gene encoded a 472-amino-acid protein that was characterized by highly charged domains consisting of both acidic and basic amino acids. The N-terminal half of the gene had a degree of similarity (25%) with RNase E from Escherichia coli. Five rounds of RNA-binding-site selection and footprinting analysis showed that the FAU-1 protein binds specifically to the AU-rich sequence in a loop region of a possible RNA ligand. Moreover, we demonstrated that the FAU-1 protein acts as an oligomer, and mainly as a trimer. These results showed that the FAU-1 protein is a novel heat-stable protein with an RNA loop-binding characteristic. PMID:12614195

  6. Definition of IgG- and albumin-binding regions of streptococcal protein G.

    PubMed

    Akerström, B; Nielsen, E; Björck, L

    1987-10-05

    Protein G, the immunoglobulin G-binding surface protein of group C and G streptococci, also binds serum albumin. The albumin-binding site on protein G is distinct from the immunoglobulin G-binding site. By mild acid hydrolysis of the papain-liberated protein G fragment (35 kDa), a 28-kDa fragment was produced which retained full immunoglobulin G-binding activity (determined by Scatchard plotting) but had lost all albumin-binding capacity. A protein G (65 kDa), isolated after cloning and expression of the protein G gene in Escherichia coli, had comparable affinity to immunoglobulin G (5-10 × 10^10 M^-1), but much higher affinity to albumin than the 35- and 28-kDa protein G fragments (31, 2.6, and 0 × 10^9 M^-1, respectively). The amino-terminal amino acid sequences of the 65-, 35-, and 28-kDa fragments allowed us to exactly locate the three fragments in an overall sequence map of protein G, based on the partial gene sequences published by Guss et al. (Guss, B., Eliasson, M., Olsson, A., Uhlen, M., Frej, A.-K., Jörnvall, H., Flock, J.-I., and Lindberg, M. (1986) EMBO J. 5, 1567-1575) and Fahnestock et al. (Fahnestock, S. R., Alexander, P., Nagle, J., and Filpula, D. (1986) J. Bacteriol. 167, 870-880). From this map, the locations of three homologous albumin-binding regions and three homologous immunoglobulin G-binding regions could then be deduced.

  7. The primary structure of fatty-acid-binding protein from nurse shark liver. Structural and evolutionary relationship to the mammalian fatty-acid-binding protein family.

    PubMed

    Medzihradszky, K F; Gibson, B W; Kaur, S; Yu, Z H; Medzihradszky, D; Burlingame, A L; Bass, N M

    1992-02-01

    The primary structure of a fatty-acid-binding protein (FABP) isolated from the liver of the nurse shark (Ginglymostoma cirratum) was determined by high-performance tandem mass spectrometry (employing multichannel array detection) and Edman degradation. Shark liver FABP consists of 132 amino acids with an acetylated N-terminal valine. The chemical molecular mass of the intact protein determined by electrospray ionization mass spectrometry (Mr = 15124 +/- 2.5) was in good agreement with that calculated from the amino acid sequence (Mr = 15121.3). The amino acid sequence of shark liver FABP displays significantly greater similarity to the FABP expressed in mammalian heart, peripheral nerve myelin and adipose tissue (53-61% sequence similarity) than to the FABP expressed in mammalian liver (22% similarity). Phylogenetic trees derived from the comparison of the shark liver FABP amino acid sequence with the members of the mammalian fatty-acid/retinoid-binding protein gene family indicate the initial divergence of an ancestral gene into two major subfamilies: one comprising the genes for mammalian liver FABP and gastrotropin, the other comprising the genes for mammalian cellular retinol-binding proteins I and II, cellular retinoic-acid-binding protein, myelin P2 protein, adipocyte FABP, heart FABP and shark liver FABP, the latter having diverged from the ancestral gene that ultimately gave rise to the present day mammalian heart-FABP, adipocyte FABP and myelin P2 protein sequences. The sequence for intestinal FABP from the rat could be assigned to either subfamily, depending on the approach used for phylogenetic tree construction, but clearly diverged at a relatively early evolutionary time point. Indeed, sequences proximately ancestral or closely related to mammalian intestinal FABP, liver FABP, gastrotropin and the retinoid-binding group of proteins appear to have arisen prior to the divergence of shark liver FABP and should therefore also be present in elasmobranchs. The presence in shark liver of an FABP which differs substantially in primary structure from mammalian liver FABP, while being closely related to the FABP expressed in mammalian heart muscle, peripheral nerve myelin and adipocytes, opens a further dimension regarding the question of the existence of structure-dependent and tissue-specific specialization of FABP function in lipid metabolism.

  8. Transcription Factor Information System (TFIS): A Tool for Detection of Transcription Factor Binding Sites.

    PubMed

    Narad, Priyanka; Kumar, Abhishek; Chakraborty, Amlan; Patni, Pranav; Sengupta, Abhishek; Wadhwa, Gulshan; Upadhyaya, K C

    2017-09-01

    Transcription factors are trans-acting proteins that interact with specific nucleotide sequences known as transcription factor binding sites (TFBSs), and these interactions are implicated in the regulation of gene expression. Regulation of transcriptional activation of a gene often involves multiple interactions of transcription factors with various sequence elements. Identification of these sequence elements is the first step in understanding the underlying molecular mechanism(s) that regulate gene expression. For in silico identification of these sequence elements, we have developed an online computational tool named transcription factor information system (TFIS) for detecting TFBSs, for the first time using a collection of JAVA programs; it is mainly based on TFBS detection using a position weight matrix (PWM). The databases used for obtaining position frequency matrices (PFMs) are JASPAR and HOCOMOCO, which are open-access databases of transcription factor binding profiles. Pseudo-counts are used while converting a PFM to a PWM, and TFBS detection is carried out on the basis of a percent score taken as the threshold value. TFIS is equipped with advanced features such as direct sequence retrieval from the NCBI database using a gene identification number or accession number, detection of binding sites for a common TF in a batch of gene sequences, and TFBS detection after generating a PWM from known raw binding sequences, in addition to general detection methods. TFIS can detect the presence of potential TFBSs in both directions at the same time, which increases its efficiency. The results of this dual detection are presented in different colors specific to the orientation of the binding site. Results obtained by TFIS are more detailed and specific to the detected TFs because the result pages integrate informative links to related web servers such as Gene Ontology, the PAZAR database and the Transcription Factor Encyclopedia, in addition to NCBI and UniProt. Common TFs such as SP1, AP1 and NF-KB of the amyloid beta precursor gene are easily detected using TFIS, along with multiple binding sites. In another scenario, the embryonic developmental process, TFs of the FOX family (FOXL1 and FOXC1) were also identified. TFIS is platform-independent and publicly available, along with its support and documentation, at http://tfistool.appspot.com and http://www.bioinfoplus.com/tfis/ . TFIS is licensed under the GNU General Public License, version 3 (GPL-3.0).
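
The detection scheme this abstract describes (a PFM converted to a PWM with pseudo-counts, then a percent-score threshold applied on both strands) can be illustrated with a short sketch. This is not the TFIS code, which is written in JAVA and is not shown in the record. The function names, the pseudo-count of 0.8, the uniform background of 0.25, the log-odds scoring and the min-max definition of the percent score are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"


def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert a position frequency matrix (base -> list of counts) into a
    log2-odds position weight matrix. Pseudocount and background values are
    illustrative assumptions, not values taken from TFIS."""
    width = len(pfm["A"])
    pwm = []
    for i in range(width):
        total = sum(pfm[b][i] for b in BASES) + pseudocount
        column = {}
        for b in BASES:
            # The pseudocount keeps zero counts from producing log(0).
            freq = (pfm[b][i] + pseudocount * background) / total
            column[b] = math.log2(freq / background)
        pwm.append(column)
    return pwm


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq, pwm, percent_threshold=80.0):
    """Score every window on both strands and keep windows whose score,
    rescaled to 0-100 between the worst and best possible PWM scores,
    reaches the threshold (an assumed definition of 'percent score')."""
    seq = seq.upper()
    width = len(pwm)
    min_score = sum(min(col.values()) for col in pwm)
    max_score = sum(max(col.values()) for col in pwm)
    if max_score == min_score:
        raise ValueError("PWM carries no information; percent score undefined")
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            window = s[i:i + width]
            if any(base not in BASES for base in window):
                continue  # skip windows containing N or other ambiguity codes
            score = sum(pwm[j][base] for j, base in enumerate(window))
            percent = 100.0 * (score - min_score) / (max_score - min_score)
            if percent >= percent_threshold:
                # Report the start coordinate on the forward strand.
                start = i if strand == "+" else len(seq) - i - width
                hits.append((start, strand, window, round(percent, 1)))
    return hits


# Hypothetical toy matrix with consensus TATA (not a JASPAR or HOCOMOCO profile).
toy_pfm = {"A": [0, 10, 0, 10], "C": [0, 0, 0, 0],
           "G": [0, 0, 0, 0], "T": [10, 0, 10, 0]}
for hit in scan("GGTATAGG", pfm_to_pwm(toy_pfm)):
    print(hit)
```

Each hit carries its strand, which parallels the orientation-specific reporting the abstract mentions. A real profile would be loaded from a database file rather than typed in by hand.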

  9. Elucidating the 16S rRNA 3' boundaries and defining optimal SD/aSD pairing in Escherichia coli and Bacillus subtilis using RNA-Seq data.

    PubMed

    Wei, Yulong; Silke, Jordan R; Xia, Xuhua

    2017-12-15

    Bacterial translation initiation is influenced by base pairing between the Shine-Dalgarno (SD) sequence in the 5' UTR of mRNA and the anti-SD (aSD) sequence at the free 3' end of the 16S rRNA (3' TAIL) due to: 1) the SD/aSD sequence binding location and 2) SD/aSD binding affinity. In order to understand what makes an SD/aSD interaction optimal, we must define: 1) the terminus of the 3' TAIL and 2) the extent of the core aSD sequence within the 3' TAIL. Our approach to characterizing these components in Escherichia coli and Bacillus subtilis involves 1) mapping the 3' boundary of the mature 16S rRNA using high-throughput RNA sequencing (RNA-Seq), and 2) identifying the segment within the 3' TAIL that is strongly preferred in SD/aSD pairing. Using RNA-Seq data, we resolve previous discrepancies in the reported 3' TAIL in B. subtilis and recover the established 3' TAIL in E. coli. Furthermore, we extend previous studies to suggest that both highly and lowly expressed genes favor SD sequences with intermediate binding affinity, but this trend is exclusive to SD sequences that complement the core aSD sequences defined herein.

  10. Accurate and sensitive quantification of protein-DNA binding affinity.

    PubMed

    Rastogi, Chaitanya; Rube, H Tomas; Kribelbauer, Judith F; Crocker, Justin; Loker, Ryan E; Martini, Gabriella D; Laptenko, Oleg; Freed-Pastor, William A; Prives, Carol; Stern, David L; Mann, Richard S; Bussemaker, Harmen J

    2018-04-17

    Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. Copyright © 2018 the Author(s). Published by PNAS.

  11. Accurate and sensitive quantification of protein-DNA binding affinity

    PubMed Central

    Rastogi, Chaitanya; Rube, H. Tomas; Kribelbauer, Judith F.; Crocker, Justin; Loker, Ryan E.; Martini, Gabriella D.; Laptenko, Oleg; Freed-Pastor, William A.; Prives, Carol; Stern, David L.; Mann, Richard S.; Bussemaker, Harmen J.

    2018-01-01

    Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. PMID:29610332

  12. Context influences on TALE–DNA binding revealed by quantitative profiling

    PubMed Central

    Rogers, Julia M.; Barrera, Luis A.; Reyon, Deepak; Sander, Jeffry D.; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L.

    2015-01-01

    Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE–DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000–20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE–DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design. PMID:26067805

  13. Context influences on TALE-DNA binding revealed by quantitative profiling.

    PubMed

    Rogers, Julia M; Barrera, Luis A; Reyon, Deepak; Sander, Jeffry D; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L

    2015-06-11

    Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE-DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000-20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE-DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design.

  14. Sequence-Based Prediction of RNA-Binding Residues in Proteins.

    PubMed

    Walia, Rasna R; El-Manzalawy, Yasser; Honavar, Vasant G; Dobbs, Drena

    2017-01-01

    Identifying individual residues in the interfaces of protein-RNA complexes is important for understanding the molecular determinants of protein-RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein-RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein-RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner.

  15. Sequence-Based Prediction of RNA-Binding Residues in Proteins

    PubMed Central

    Walia, Rasna R.; EL-Manzalawy, Yasser; Honavar, Vasant G.; Dobbs, Drena

    2017-01-01

    Identifying individual residues in the interfaces of protein–RNA complexes is important for understanding the molecular determinants of protein–RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein–RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein–RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner. PMID:27787829

  16. Swellix: a computational tool to explore RNA conformational space.

    PubMed

    Sloat, Nathan; Liu, Jui-Wen; Schroeder, Susan J

    2017-11-21

    The sequence of nucleotides in an RNA determines the possible base pairs for an RNA fold and thus also determines the overall shape and function of an RNA. The Swellix program presented here combines a helix abstraction with a combinatorial approach to the RNA folding problem in order to compute all possible non-pseudoknotted RNA structures for RNA sequences. The Swellix program builds on the Crumple program and can include experimental constraints on global RNA structures such as the minimum number and lengths of helices from crystallography, cryoelectron microscopy, or in vivo crosslinking and chemical probing methods. The conceptual advance in Swellix is to count helices and generate all possible combinations of helices rather than counting and combining base pairs. Swellix bundles similar helices and includes improvements in memory use and efficient parallelization. Biological applications of Swellix are demonstrated by computing the reduction in conformational space and entropy due to naturally modified nucleotides in tRNA sequences and by motif searches in Human Endogenous Retroviral (HERV) RNA sequences. The Swellix motif search reveals occurrences of protein and drug binding motifs in the HERV RNA ensemble that do not occur in minimum free energy or centroid predicted structures. Swellix presents significant improvements over Crumple in terms of efficiency and memory use. The efficient parallelization of Swellix enables the computation of sequences as long as 418 nucleotides with sufficient experimental constraints. Thus, Swellix provides a practical alternative to free energy minimization tools when multiple structures, kinetically determined structures, or complex RNA-RNA and RNA-protein interactions are present in an RNA folding problem.

  17. Minimal Absent Words in Four Human Genome Assemblies

    PubMed Central

    Garcia, Sara P.; Pinho, Armando J.

    2011-01-01

    Minimal absent words have been computed in genomes of organisms from all domains of life. Here, we aim to contribute to the catalogue of human genomic variation by investigating the variation in number and content of minimal absent words within a species, using four human genome assemblies. We compare the reference human genome GRCh37 assembly, the HuRef assembly of the genome of Craig Venter, the NA12878 assembly from cell line GM12878, and the YH assembly of the genome of a Han Chinese individual. We find the variation in number and content of minimal absent words between assemblies more significant for large and very large minimal absent words, where the biases of sequencing and assembly methodologies become more pronounced. Moreover, we find generally greater similarity between the human genome assemblies sequenced with capillary-based technologies (GRCh37 and HuRef) than between the human genome assemblies sequenced with massively parallel technologies (NA12878 and YH). Finally, as expected, we find the overall variation in number and content of minimal absent words within a species to be generally smaller than the variation between species. PMID:22220210
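
For readers unfamiliar with the term, a word is usually called a minimal absent word of a sequence when it does not occur in the sequence but both its longest proper prefix and its longest proper suffix do. The sketch below enumerates such words by brute force under that definition. It is only an illustration for short strings: it considers a single strand, it ignores single letters, and it is not the method used in this study. The function names and the `max_len` cutoff are assumptions.

```python
def substrings_of_length(seq, k):
    """Return the set of all length-k substrings that occur in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minimal_absent_words(seq, max_len, alphabet="ACGT"):
    """Brute-force enumeration of minimal absent words of length 2..max_len.

    A word w is reported when w is absent from seq while w[:-1] and w[1:]
    both occur in seq. Suitable for toy inputs only."""
    maws = []
    for k in range(2, max_len + 1):
        present_k = substrings_of_length(seq, k)
        present_prev = substrings_of_length(seq, k - 1)
        # Every candidate must have a present prefix, so extend each
        # (k-1)-mer that occurs in seq by one letter on the right.
        for prefix in sorted(present_prev):
            for letter in alphabet:
                word = prefix + letter
                if word not in present_k and word[1:] in present_prev:
                    maws.append(word)
    return maws


# Toy example on a two-letter alphabet.
print(minimal_absent_words("AAC", 3, alphabet="AC"))
```

On genome-scale data the same definition is normally evaluated with suffix-based index structures, because enumerating all substrings as done here would be far too slow and memory-hungry.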

  18. MicroRNAs form triplexes with double stranded DNA at sequence-specific binding sites; a eukaryotic mechanism via which microRNAs could directly alter gene expression

    DOE PAGES

    Paugh, Steven W.; Coss, David R.; Bao, Ju; ...

    2016-02-04

    MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA). Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence that microRNAs form triple-helical structures with duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3 fold, p < 2.2 × 10^-16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. As a result, this work has thus revealed a new mechanism by which microRNAs can interact with gene promoter regions to modify gene transcription.

  19. Generation of Aptamers from A Primer-Free Randomized ssDNA Library Using Magnetic-Assisted Rapid Aptamer Selection

    NASA Astrophysics Data System (ADS)

    Tsao, Shih-Ming; Lai, Ji-Ching; Horng, Horng-Er; Liu, Tu-Chen; Hong, Chin-Yih

    2017-04-01

    Aptamers are oligonucleotides that can bind to specific target molecules. Most aptamers are generated using random libraries in the standard systematic evolution of ligands by exponential enrichment (SELEX). Each random library contains oligonucleotides with a randomized central region and two fixed primer regions at both ends. The fixed primer regions are necessary for amplifying target-bound sequences by PCR. However, these extra-sequences may cause non-specific bindings, which potentially interfere with good binding for random sequences. The Magnetic-Assisted Rapid Aptamer Selection (MARAS) is a newly developed protocol for generating single-strand DNA aptamers. No repeat selection cycle is required in the protocol. This study proposes and demonstrates a method to isolate aptamers for C-reactive proteins (CRP) from a randomized ssDNA library containing no fixed sequences at 5′ and 3′ termini using the MARAS platform. Furthermore, the isolated primer-free aptamer was sequenced and binding affinity for CRP was analyzed. The specificity of the obtained aptamer was validated using blind serum samples. The result was consistent with monoclonal antibody-based nephelometry analysis, which indicated that a primer-free aptamer has high specificity toward targets. MARAS is a feasible platform for efficiently generating primer-free aptamers for clinical diagnoses.

  20. MicroRNAs form triplexes with double stranded DNA at sequence-specific binding sites; a eukaryotic mechanism via which microRNAs could directly alter gene expression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Paugh, Steven W.; Coss, David R.; Bao, Ju

    MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA). Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence that microRNAs form triple-helical structures with duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3 fold, p < 2.2 × 10^-16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. As a result, this work has thus revealed a new mechanism by which microRNAs can interact with gene promoter regions to modify gene transcription.

  1. Structural basis of UGUA recognition by the Nudix protein CFIm25 and implications for a regulatory role in mRNA 3′ processing

    PubMed Central

    Yang, Qin; Gilmartin, Gregory M.; Doublié, Sylvie

    2010-01-01

    Human Cleavage Factor Im (CFIm) is an essential component of the pre-mRNA 3′ processing complex that functions in the regulation of poly(A) site selection through the recognition of UGUA sequences upstream of the poly(A) site. Although the highly conserved 25 kDa subunit (CFIm25) of the CFIm complex possesses a characteristic α/β/α Nudix fold, CFIm25 has no detectable hydrolase activity. Here we report the crystal structures of the human CFIm25 homodimer in complex with UGUAAA and UUGUAU RNA sequences. CFIm25 is the first Nudix protein to be reported to bind RNA in a sequence-specific manner. The UGUA sequence contributes to binding specificity through an intramolecular G:A Watson–Crick/sugar-edge base interaction, an unusual pairing previously found to be involved in the binding specificity of the SAM-III riboswitch. The structures, together with mutational data, suggest a novel mechanism for the simultaneous sequence-specific recognition of two UGUA elements within the pre-mRNA. Furthermore, the mutually exclusive binding of RNA and the signaling molecule Ap4A (diadenosine tetraphosphate) by CFIm25 suggests a potential role for small molecules in the regulation of mRNA 3′ processing. PMID:20479262

  2. Structural basis of UGUA recognition by the Nudix protein CFI(m)25 and implications for a regulatory role in mRNA 3' processing.

    PubMed

    Yang, Qin; Gilmartin, Gregory M; Doublié, Sylvie

    2010-06-01

    Human Cleavage Factor Im (CFI(m)) is an essential component of the pre-mRNA 3' processing complex that functions in the regulation of poly(A) site selection through the recognition of UGUA sequences upstream of the poly(A) site. Although the highly conserved 25 kDa subunit (CFI(m)25) of the CFI(m) complex possesses a characteristic alpha/beta/alpha Nudix fold, CFI(m)25 has no detectable hydrolase activity. Here we report the crystal structures of the human CFI(m)25 homodimer in complex with UGUAAA and UUGUAU RNA sequences. CFI(m)25 is the first Nudix protein to be reported to bind RNA in a sequence-specific manner. The UGUA sequence contributes to binding specificity through an intramolecular G:A Watson-Crick/sugar-edge base interaction, an unusual pairing previously found to be involved in the binding specificity of the SAM-III riboswitch. The structures, together with mutational data, suggest a novel mechanism for the simultaneous sequence-specific recognition of two UGUA elements within the pre-mRNA. Furthermore, the mutually exclusive binding of RNA and the signaling molecule Ap(4)A (diadenosine tetraphosphate) by CFI(m)25 suggests a potential role for small molecules in the regulation of mRNA 3' processing.

  3. Using RSAT to scan genome sequences for transcription factor binding sites and cis-regulatory modules.

    PubMed

    Turatsinze, Jean-Valery; Thomas-Chollier, Morgane; Defrance, Matthieu; van Helden, Jacques

    2008-01-01

    This protocol shows how to detect putative cis-regulatory elements and regions enriched in such elements with the regulatory sequence analysis tools (RSAT) web server (http://rsat.ulb.ac.be/rsat/). The approach applies to known transcription factors, whose binding specificity is represented by position-specific scoring matrices, using the program matrix-scan. The detection of individual binding sites is known to return many false predictions. However, results can be strongly improved by estimating P value, and by searching for combinations of sites (homotypic and heterotypic models). We illustrate the detection of sites and enriched regions with a study case, the upstream sequence of the Drosophila melanogaster gene even-skipped. This protocol is also tested on random control sequences to evaluate the reliability of the predictions. Each task requires a few minutes of computation time on the server. The complete protocol can be executed in about one hour.

  4. Protein sequences bound to mineral surfaces persist into deep time

    PubMed Central

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa; Freeman, Colin L; Woolley, Jos; Crisp, Molly K; Wilson, Julie; Fotakis, Anna; Fischer, Roman; Kessler, Benedikt M; Rakownikow Jersie-Christensen, Rosa; Olsen, Jesper V; Haile, James; Thomas, Jessica; Marean, Curtis W; Parkington, John; Presslee, Samantha; Lee-Thorp, Julia; Ditchfield, Peter; Hamilton, Jacqueline F; Ward, Martyn W; Wang, Chunting Michelle; Shaw, Marvin D; Harrison, Terry; Domínguez-Rodrigo, Manuel; MacPhee, Ross DE; Kwekason, Amandus; Ecker, Michaela; Kolska Horwitz, Liora; Chazan, Michael; Kröger, Roland; Thomas-Oates, Jane; Harding, John H; Cappellini, Enrico; Penkman, Kirsty; Collins, Matthew J

    2016-01-01

    Proteins persist longer in the fossil record than DNA, but the longevity, survival mechanisms and substrates remain contested. Here, we demonstrate the role of mineral binding in preserving the protein sequence in ostrich (Struthionidae) eggshell, including from the palaeontological sites of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Molecular dynamics simulations of struthiocalcin-1 and -2, the dominant proteins within the eggshell, reveal that distinct domains bind to the mineral surface. It is the domain with the strongest calculated binding energy to the calcite surface that is selectively preserved. Thermal age calculations demonstrate that the Laetoli and Olduvai peptides are 50 times older than any previously authenticated sequence (equivalent to ~16 Ma at a constant 10°C). DOI: http://dx.doi.org/10.7554/eLife.17092.001 PMID:27668515

  5. Preferential binding of daunomycin to 5'(A/T)CG and 5'(A/T)GC sequences revealed by footprinting titration experiments.

    PubMed

    Chaires, J B; Herrera, J E; Waring, M J

    1990-07-03

    Results from a high-resolution deoxyribonuclease I (DNase I) footprinting titration procedure are described that identify preferred daunomycin binding sites within the 160 bp tyr T DNA fragment. We have obtained single-bond resolution at 65 of the 160 potential binding sites within the tyr T fragment and have examined the effect of 0-3.0 microM total daunomycin concentration on the susceptibility of these sites toward digestion by DNase I. Four types of behavior are observed: (i) protection from DNase I cleavage; (ii) protection, but only after reaching a critical total daunomycin concentration; (iii) enhanced cleavage; (iv) no effect of added drug. Ten sites were identified as the most strongly protected on the basis of the magnitude of the reduction of their digestion product band areas in the presence of daunomycin. These were identified as the preferred daunomycin binding sites. Seven of these 10 sites are found at the end of the triplet sequences 5'(A/T)GC and 5'(A/T)CG, where the notation (A/T) indicates that either A or T may occupy the position. The remaining three strongly protected sites are found at the ends of the triplet sequence 5'(A/T)C(A/T). Of the preferred daunomycin binding sites we identify in this study, the sequence 5'(A/T)CG is consistent with the specificity predicted by the theoretical studies of Chen et al. [Chen, K.-X., Gresh, N., & Pullman, B. (1985) J. Biomol. Struct. Dyn. 3, 445-466] and is the very sequence to which daunomycin is observed to be bound in two recent X-ray crystallographic studies. Solution studies, theoretical studies, and crystallographic studies have thus converged to provide a consistent and coherent picture of the sequence preference of this important anticancer antibiotic.
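
    As a rough illustration of how the preferred triplets could be located in a sequence, the sketch below reads the abstract's notation as "either A or T" at the first position, giving the triplets (A/T)GC and (A/T)CG, and expresses that position as the regular-expression class [AT]. The example sequence is invented for demonstration and is not the tyr T fragment; the function and variable names are likewise hypothetical.

```python
import re

# "(A/T)" means either A or T; as a regular expression this is the class [AT].
PREFERRED_TRIPLETS = {
    "(A/T)GC": "[AT]GC",
    "(A/T)CG": "[AT]CG",
}


def find_triplets(sequence):
    """Return sorted (start, matched_triplet, motif_label) tuples for a 5'->3' DNA string."""
    hits = []
    for label, pattern in PREFERRED_TRIPLETS.items():
        # The lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((match.start(), match.group(1), label))
    return sorted(hits)


EXAMPLE = "GGATCGTTGCAA"  # made-up sequence, not experimental data
HITS = find_triplets(EXAMPLE)
print(HITS)
```

    A string search of this kind only lists candidate positions; in the study itself the preferred sites were assigned from the measured reduction in DNase I digestion band areas.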

  6. Homologous kappa-neurotoxins exhibit residue-specific interactions with the alpha 3 subunit of the nicotinic acetylcholine receptor: a comparison of the structural requirements for kappa-bungarotoxin and kappa-flavitoxin binding.

    PubMed

    McLane, K E; Weaver, W R; Lei, S; Chiappinelli, V A; Conti-Tronconi, B M

    1993-07-13

    kappa-Flavitoxin (kappa-FTX), a snake neurotoxin that is a selective antagonist of certain neuronal nicotinic acetylcholine receptors (AChRs), has recently been isolated and characterized [Grant, G. A., Frazier, M. W., & Chiappinelli, V. A. (1988) Biochemistry 27, 1532-1537]. Like the related snake toxin kappa-bungarotoxin (kappa-BTX), kappa-FTX binds with high affinity to alpha 3 subtypes of neuronal AChRs, even though there are distinct sequence differences between the two toxins. To further characterize the sequence regions of the neuronal AChR alpha 3 subunit involved in formation of the binding site for this family of kappa-neurotoxins, we investigated kappa-FTX binding to overlapping synthetic peptides screening the alpha 3 subunit sequence. A sequence region forming a "prototope" for kappa-FTX was identified within residues alpha 3 (51-70), confirming the suggestions of previous studies on the binding of kappa-BTX to the alpha 3 subunit [McLane, K. E., Tang, F., & Conti-Tronconi, B. M. (1990) J. Biol. Chem. 265, 1537-1544] and alpha-bungarotoxin to the Torpedo AChR alpha subunit [Conti-Tronconi, B. M., Tang, F., Diethelm, B. M., Spencer, S. R., Reinhardt-Maelicke, S., & Maelicke, A. (1990) Biochemistry 29, 6221-6230] that this sequence region is involved in formation of a cholinergic site. Single residue substituted analogues, where each residue of the sequence alpha 3 (51-70) was sequentially replaced by a glycine, were used to identify the amino acid side chains involved in the interaction of this prototope with kappa-FTX. (ABSTRACT TRUNCATED AT 250 WORDS)

  7. Use of eluted peptide sequence data to identify the binding characteristics of peptides to the insulin-dependent diabetes susceptibility allele HLA-DQ8 (DQ 3.2).

    PubMed

    Godkin, A; Friede, T; Davenport, M; Stevanovic, S; Willis, A; Jewell, D; Hill, A; Rammensee, H G

    1997-06-01

    HLA-DQ8 (A1*0301, B1*0302) and -DQ2 (A1*0501, B1*0201) are both associated with diseases such as insulin-dependent diabetes mellitus and coeliac disease. We used the technique of pool sequencing to look at the requirements of peptides binding to HLA-DQ8, and combined these data with naturally sequenced ligands and in vitro binding assays to describe a novel motif for HLA-DQ8. The motif, which has the same basic format as many HLA-DR molecules, consists of four or five anchor regions, in the positions from the N-terminus of the binding core of n, n + 3, n + 5/6 and n + 8, i.e. P1, P4, P6/7 and P9. P1 and P9 require negative or polar residues, with mainly aliphatic residues at P4 and P6/7. The features of the HLA-DQ8 motif were then compared to a pool sequence of peptides eluted from HLA-DQ2. A consensus motif for the binding of a common peptide which may be involved in disease pathogenesis is described. Neither of the disease-associated alleles HLA-DQ2 and -DQ8 has Asp at position 57 of the beta-chain. This Asp, if present, may form a salt bridge with an Arg at position 79 of the alpha-chain and so alter the binding specificity of P9. HLA-DQ2 and -DQ8 both appear to prefer negatively charged amino acids at P9. In contrast, HLA-DQ7 (A1*0301, B1*0301), which is not associated with diabetes, has Asp at beta 57, allowing positively charged amino acids at P9. This analysis of the sequence features of DQ-binding peptides suggests molecular characteristics which may be useful to predict epitopes involved in disease pathogenesis.
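
    The anchor layout described above (P1, P4, P6/7 and P9 at offsets n, n + 3, n + 5/6 and n + 8 of a nine-residue core) can be expressed as a simple filter over all 9-mer windows of a peptide. The sketch below is a hedged illustration only: the residue groupings for "negative or polar" and "aliphatic" are assumptions made for this example, the peptide is invented, and a real predictor would weight positions rather than apply a strict yes/no rule.

```python
# Assumed residue groupings (not taken from the abstract):
NEGATIVE_OR_POLAR = set("DESTNQ")  # Asp, Glu, Ser, Thr, Asn, Gln
ALIPHATIC = set("AVLI")            # Ala, Val, Leu, Ile


def candidate_cores(peptide):
    """Return (n, core) for every 9-mer whose anchors fit the described pattern.

    n is the 0-based index of P1; P4 = n + 3, P6/P7 = n + 5 / n + 6, P9 = n + 8.
    """
    cores = []
    for n in range(len(peptide) - 8):
        core = peptide[n:n + 9]
        p1, p4, p6, p7, p9 = core[0], core[3], core[5], core[6], core[8]
        if (p1 in NEGATIVE_OR_POLAR and p9 in NEGATIVE_OR_POLAR
                and p4 in ALIPHATIC
                and (p6 in ALIPHATIC or p7 in ALIPHATIC)):
            cores.append((n, core))
    return cores


PEPTIDE = "GGEKKAKVKKDGG"  # made-up peptide for illustration
CORES = candidate_cores(PEPTIDE)
print(CORES)
```

    Here the only window that passes starts at index 2, where Glu occupies P1, Ala P4, Val P6 and Asp P9.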

  8. Plasma proteins of rainbow trout (Oncorhynchus mykiss) isolated by binding to lipopolysaccharide from Aeromonas salmonicida.

    PubMed

    Hoover, G J; el-Mowafi, A; Simko, E; Kocal, T E; Ferguson, H W; Hayes, M A

    1998-07-01

    In an attempt to find plasma proteins that might be involved in the constitutive resistance of rainbow trout to furunculosis, a disease caused by Aeromonas salmonicida (AS), we purified serum and plasma proteins based on their calcium- and carbohydrate-dependent affinity for A. salmonicida lipopolysaccharide (LPS) coupled to an epoxy-activated synthetic matrix (Toyopearl AF Epoxy 650M). A multimeric family of high molecular weight (96 to 200-kDa) LPS-binding proteins exhibiting both calcium and mannose dependent binding was isolated. Upon reduction the multimers collapsed to subunits of approximately 16-kDa as estimated by 1D-PAGE and exhibited pI values of 5.30 and 5.75 as estimated from 2D-PAGE. Their N-terminal sequences were related to rainbow trout ladderlectin (RT-LL), a Sepharose-binding protein. Polyclonal antibodies to the LPS-purified 16-kDa subunits recognized both the reduced 16-kDa subunits and the non-reduced multimeric forms. A calcium- and N-acetylglucosamine (GlcNAc)-dependent LPS-binding multimeric protein (approximately 207-kDa) composed of 34.5-kDa subunits was purified and found to be identical to trout serum amyloid P (SAP) by N-terminal sequence (DLQDLSGKVFV). A protein of 24-kDa, in reduced and non-reduced conditions, was isolated and had N-terminal sequence identity with a known C-reactive protein (CRP) homologue, C-polysaccharide-binding protein 2 (TCBP2) of rainbow trout. A novel calcium-dependent LPS-binding protein was purified and termed rainbow trout lectin 37 (RT-L37). This protein, composed of dimers, tetramers and pentamers of 37 kDa subunits (pI 5.50-6.10) with N-terminal sequence (IQE(D/N)GHAEAPGATTVLNEILR) showed no close homology to proteins known or predicted from cDNA sequences. These findings demonstrate that rainbow trout have several blood proteins with lectin properties for the LPS of A. salmonicida; the biological functions of these proteins in resistance to furunculosis are still unknown.

  9. Affibody Molecules for In vivo Characterization of HER2-Positive Tumors by Near-Infrared Imaging

    PubMed Central

    Lee, Sang Bong; Hassan, Moinuddin; Fisher, Robert; Chertov, Oleg; Chernomordik, Victor; Kramer-Marek, Gabriela; Gandjbakhche, Amir; Capala, Jacek

    2012-01-01

    Purpose: HER2 overexpression has been associated with a poor prognosis and resistance to therapy in breast cancer patients. We are developing molecular probes for in vivo quantitative imaging of HER2 receptors using near-infrared optical imaging. The goal is to provide probes that will minimally interfere with the studied system, i.e., whose binding does not interfere with the binding of the therapeutic agents, and whose effect on the target cells is minimal. Experimental Design: We used three different types of HER2-specific Affibody molecules [monomer ZHER2:342, dimer (ZHER2:477)2, and albumin-binding domain-fused-(ZHER2:342)2] as targeting agents, and labeled them with Alexa Fluor dyes. Trastuzumab was also conjugated, using commercially available kits, as a standard control. The resulting conjugates were characterized in vitro by toxicity assays, Biacore affinity measurements, flow cytometry, and confocal microscopy. Semi-quantitative in vivo near-infrared optical imaging studies were carried out using mice with subcutaneous xenografts of HER2-positive tumors. Results: The HER2-specific Affibody molecules were not toxic to HER2-overexpressing cells, and their binding to HER2 interfered with neither the binding nor the effectiveness of trastuzumab. The binding affinities and specificities of the Affibody-Alexa Fluor fluorescent conjugates to HER2 were unchanged or minimally affected by the modifications. Pharmacokinetics and biodistribution studies showed the albumin-binding domain-fused-(ZHER2:342)2-Alexa Fluor 750 conjugate to be an optimal probe for optical imaging of HER2 in vivo. Conclusion: Our results suggest that Affibody-Alexa Fluor conjugates may be used as a specific near-infrared probe for the non-invasive semi-quantitative imaging of HER2 expression in vivo. PMID:18559604

  10. Identification of multiple binding sites for the THAP domain of the Galileo transposase in the long terminal inverted-repeats

    PubMed Central

    Marzo, Mar; Liu, Danxu; Ruiz, Alfredo; Chalmers, Ronald

    2013-01-01

    Galileo is a DNA transposon responsible for the generation of several chromosomal inversions in Drosophila. In contrast to other members of the P-element superfamily, it has unusually long terminal inverted-repeats (TIRs) that resemble those of Foldback elements. To investigate the function of the long TIRs we derived consensus and ancestral sequences for the Galileo transposase in three species of Drosophilids. Following gene synthesis, we expressed and purified their constituent THAP domains and tested their binding activity towards the respective Galileo TIRs. DNase I footprinting located the most proximal DNA binding site about 70 bp from the transposon end. Using this sequence we identified further binding sites in the tandem repeats that are found within the long TIRs. This suggests that the synaptic complex between Galileo ends may be a complicated structure containing higher-order multimers of the transposase. We also attempted to reconstitute Galileo transposition in Drosophila embryos but no events were detected. Thus, although the limited numbers of Galileo copies in each genome were sufficient to provide functional consensus sequences for the THAP domains, they do not specify a fully active transposase. Since the THAP recognition sequence is short, and will occur many times in a large genome, it seems likely that the multiple binding sites within the long, internally repetitive, TIRs of Galileo and other Foldback-like elements may provide the transposase with its binding specificity. PMID:23648487

  11. Identification of multiple binding sites for the THAP domain of the Galileo transposase in the long terminal inverted-repeats.

    PubMed

    Marzo, Mar; Liu, Danxu; Ruiz, Alfredo; Chalmers, Ronald

    2013-08-01

    Galileo is a DNA transposon responsible for the generation of several chromosomal inversions in Drosophila. In contrast to other members of the P-element superfamily, it has unusually long terminal inverted-repeats (TIRs) that resemble those of Foldback elements. To investigate the function of the long TIRs we derived consensus and ancestral sequences for the Galileo transposase in three species of Drosophilids. Following gene synthesis, we expressed and purified their constituent THAP domains and tested their binding activity towards the respective Galileo TIRs. DNase I footprinting located the most proximal DNA binding site about 70 bp from the transposon end. Using this sequence we identified further binding sites in the tandem repeats that are found within the long TIRs. This suggests that the synaptic complex between Galileo ends may be a complicated structure containing higher-order multimers of the transposase. We also attempted to reconstitute Galileo transposition in Drosophila embryos but no events were detected. Thus, although the limited numbers of Galileo copies in each genome were sufficient to provide functional consensus sequences for the THAP domains, they do not specify a fully active transposase. Since the THAP recognition sequence is short, and will occur many times in a large genome, it seems likely that the multiple binding sites within the long, internally repetitive, TIRs of Galileo and other Foldback-like elements may provide the transposase with its binding specificity. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.

  12. Transcription initiation from the dihydrofolate reductase promoter is positioned by HIP1 binding at the initiation site.

    PubMed

    Means, A L; Farnham, P J

    1990-02-01

    We have identified a sequence element that specifies the position of transcription initiation for the dihydrofolate reductase gene. Unlike the functionally analogous TATA box that directs RNA polymerase II to initiate transcription 30 nucleotides downstream, the positioning element of the dihydrofolate reductase promoter is located directly at the site of transcription initiation. By using DNase I footprint analysis, we have shown that a protein binds to this initiator element. Transcription initiated at the dihydrofolate reductase initiator element when 28 nucleotides were inserted between it and all other upstream sequences, or when it was placed on either side of the DNA helix, suggesting that there is no strict spatial requirement between the initiator and an upstream element. Although neither a single Sp1-binding site nor a single initiator element was sufficient for transcriptional activity, the combination of one Sp1-binding site and the dihydrofolate reductase initiator element cloned into a plasmid vector resulted in transcription starting at the initiator element. We have also shown that the simian virus 40 late major initiation site has striking sequence homology to the dihydrofolate reductase initiation site and that the same, or a similar, protein binds to both sites. Examination of the sequences at other RNA polymerase II initiation sites suggests that we have identified an element that is important in the transcription of other housekeeping genes. We have thus named the protein that binds to the initiator element HIP1 (Housekeeping Initiator Protein 1).

  13. Two phenylalanines in the C-terminus of Epstein-Barr virus Rta protein reciprocally modulate its DNA binding and transactivation function

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, L.-W.; Department of Molecular Biophysics and Biochemistry, Yale University School of Medicine, New Haven, CT 06520; Raghavan, Vineetha

    The Rta (R transactivator) protein plays an essential role in the Epstein-Barr viral (EBV) lytic cascade. Rta activates viral gene expression by several mechanisms including direct and indirect binding to target viral promoters, synergy with EBV ZEBRA protein, and stimulation of cellular signaling pathways. We previously found that Rta proteins with C-terminal truncations of 30 aa were markedly enhanced in their capacity to bind DNA (Chen, L.W., Chang, P.J., Delecluse, H.J., and Miller, G., (2005). Marked variation in response of consensus binding elements for the Rta protein of Epstein-Barr virus. J. Virol. 79(15), 9635-9650.). Here we show that two phenylalanines (F600 and F605) in the C-terminus of Rta play a crucial role in mediating this DNA binding inhibitory function. Amino acids 555 to 605 of Rta constitute a functional DNA binding inhibitory sequence (DBIS) that markedly decreased DNA binding when transferred to a minimal DNA binding domain of Rta (aa 1-350). Alanine substitution mutants, F600A/F605A, abolished activity of the DBIS. F600 and F605 are located in the transcriptional activation domain of Rta. Alanine substitutions, F600A/F605A, decreased transcriptional activation by Rta protein, whereas aromatic substitutions, such as F600Y/F605Y or F600W/F605W, partially restored transcriptional activation. Full-length Rta proteins with F600A/F605A mutations were enhanced in DNA binding compared to wild-type, whereas Rta proteins with F600Y/F605Y or F600W/F605W substitutions were, like wild-type Rta, relatively poor DNA binders. GAL4 (1-147)/Rta (416-605) fusion proteins with F600A/F605A mutations were diminished in transcriptional activation, relative to GAL4/Rta chimeras without such mutations. The results suggest that, in the context of a larger DBIS, F600 and F605 play a role in the reciprocal regulation of DNA binding and transcriptional activation by Rta. Regulation of DNA binding by Rta is likely to be important in controlling its different modes of action.

  14. Two potential calmodulin-binding sequences in the ryanodine receptor contribute to a mobile, intra-subunit calmodulin-binding domain

    PubMed Central

    Huang, Xiaojun; Liu, Ying; Wang, Ruiwu; Zhong, Xiaowei; Liu, Yingjie; Koop, Andrea; Chen, S. R. Wayne; Wagenknecht, Terence; Liu, Zheng

    2013-01-01

    Summary Calmodulin (CaM), a 16 kDa ubiquitous calcium-sensing protein, is known to bind tightly to the calcium release channel/ryanodine receptor (RyR), and modulate RyR function. CaM binding studies using RyR fragments or synthetic peptides have revealed the presence of multiple, potential CaM-binding regions in the primary sequence of RyR. In the present study, we inserted GFP into two of these proposed CaM-binding sequences and mapped them onto the three-dimensional structure of intact cardiac RyR2 by cryo-electron microscopy. Interestingly, we found that the two potential CaM-binding regions encompassing Arg3595 and Lys4269, respectively, are in close proximity and are adjacent to the previously mapped CaM-binding sites. To monitor the conformational dynamics of these CaM-binding regions, we generated a fluorescence resonance energy transfer (FRET) pair, a dual CFP- and YFP-labeled RyR2 (RyR2R3595-CFP/K4269-YFP) with CFP inserted after Arg3595 and YFP inserted after Lys4269. We transfected HEK293 cells with the RyR2R3595-CFP/K4269-YFP cDNA, and examined their FRET signal in live cells. We detected significant FRET signals in transfected cells that are sensitive to the channel activator caffeine, suggesting that caffeine is able to induce conformational changes in these CaM-binding regions. Importantly, no significant FRET signals were detected in cells co-transfected with cDNAs encoding the single CFP (RyR2R3595-CFP) and single YFP (RyR2K4269-YFP) insertions, indicating that the FRET signal stemmed from the interaction between R3595–CFP and K4269–YFP that are in the same RyR subunit. These observations suggest that multiple regions in the RyR2 sequence may contribute to an intra-subunit CaM-binding pocket that undergoes conformational changes during channel gating. PMID:23868982

  15. Mutations in type 3 reovirus that determine binding to sialic acid are contained in the fibrous tail domain of viral attachment protein sigma1.

    PubMed

    Chappell, J D; Gunn, V L; Wetzel, J D; Baer, G S; Dermody, T S

    1997-03-01

    The reovirus attachment protein, sigma1, determines numerous aspects of reovirus-induced disease, including viral virulence, pathways of spread, and tropism for certain types of cells in the central nervous system. The sigma1 protein projects from the virion surface and consists of two distinct morphologic domains, a virion-distal globular domain known as the head and an elongated fibrous domain, termed the tail, which is anchored into the virion capsid. To better understand structure-function relationships of sigma1 protein, we conducted experiments to identify sequences in sigma1 important for viral binding to sialic acid, a component of the receptor for type 3 reovirus. Three serotype 3 reovirus strains incapable of binding sialylated receptors were adapted to growth in murine erythroleukemia (MEL) cells, in which sialic acid is essential for reovirus infectivity. MEL-adapted (MA) mutant viruses isolated by serial passage in MEL cells acquired the capacity to bind sialic acid-containing receptors and demonstrated a dependence on sialic acid for infection of MEL cells. Analysis of reassortant viruses isolated from crosses of an MA mutant virus and a reovirus strain that does not bind sialic acid indicated that the sigma1 protein is solely responsible for efficient growth of MA mutant viruses in MEL cells. The deduced sigma1 amino acid sequences of the MA mutant viruses revealed that each strain contains a substitution within a short region of sequence in the sigma1 tail predicted to form beta-sheet. These studies identify specific sequences that determine the capacity of reovirus to bind sialylated receptors and suggest a location for a sialic acid-binding domain. Furthermore, the results support a model in which type 3 sigma1 protein contains discrete receptor binding domains, one in the head and another in the tail that binds sialic acid.

  16. Activation of IKKalpha and IKKbeta through their fusion with HTLV-I tax protein.

    PubMed

    Xiao, G; Sun, S C

    2000-10-26

    Human T-cell leukemia virus type I (HTLV-I) Tax protein persistently stimulates the activity of IkappaB kinase (IKK), resulting in constitutive activation of the transcription factor NF-kappaB. Tax activation of IKK requires physical interaction of this viral protein with the IKK regulatory subunit, IKKgamma. The Tax/IKKgamma interaction allows Tax to engage the IKK catalytic subunits, IKKalpha and IKKbeta, although it remains unclear whether this linker function of IKKgamma is sufficient for supporting the Tax-specific IKK activation. To address this question, we have examined the sequences of IKKgamma required for modulating the Tax/IKK signaling. We demonstrate that when fused to Tax, a small N-terminal fragment of IKKgamma, containing its minimal IKKalpha/beta-binding domain, is sufficient for bringing Tax to and activating the IKK catalytic subunits. Disruption of the IKKalpha/beta-binding activity of this domain abolishes its function in modulating the Tax/IKK signaling. We further demonstrate that direct fusion of Tax to IKKalpha and IKKbeta leads to activation of these kinases. These findings suggest that the IKKgamma-directed Tax/IKK association serves as a molecular trigger for IKK activation.

  17. CIP1 polypeptides and their uses

    DOEpatents

    Foreman, Pamela [Los Altos, CA]; Van Solingen, Pieter [Naaldwijk, NL]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]

    2011-04-12

    Described herein are novel gene sequences isolated from Trichoderma reesei. Two genes encoding proteins comprising a cellulose binding domain, one encoding an arabinofuranosidase and one encoding an acetylxylan esterase are described. The sequences, CIP1 and CIP2, contain a cellulose binding domain. These proteins are especially useful in the textile and detergent industry and in pulp and paper industry.

  18. A homolog of an Escherichia coli phosphate-binding protein gene from Xanthomonas oryzae pv. oryzae

    NASA Technical Reports Server (NTRS)

    Hopkins, C. M.; White, F. F.; Heaton, L. A.; Guikema, J. A.; Leach, J. E.; Spooner, B. S. (Principal Investigator)

    1995-01-01

    A Xanthomonas oryzae pv. oryzae gene with sequence similarity to an Escherichia coli phosphate-binding protein gene (phoS) produces a periplasmic protein of apparent M(r) 35,000 when expressed in E. coli. Amino terminal sequencing revealed that a signal peptide is removed during transport to the periplasm in E. coli.

  19. Nucleic acids encoding phloem small RNA-binding proteins and transgenic plants comprising them

    DOEpatents

    Lucas, William J.; Yoo, Byung-Chun; Lough, Tony J.; Varkonyi-Gasic, Erika

    2007-03-13

    The present invention provides a polynucleotide sequence encoding a component of the protein machinery involved in small RNA trafficking, Cucurbita maxima phloem small RNA-binding protein (CmPSRB 1), and the corresponding polypeptide sequence. The invention also provides genetic constructs and transgenic plants comprising the polynucleotide sequence encoding a phloem small RNA-binding protein to alter (e.g., prevent, reduce or elevate) non-cell autonomous signaling events in the plants involving small RNA metabolism. These signaling events are involved in a broad spectrum of plant physiological and biochemical processes, including, for example, systemic resistance to pathogens, responses to environmental stresses, e.g., heat, drought, salinity, and systemic gene silencing (e.g., viral infections).

  20. ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data

    PubMed Central

    Krestel, Ralf; Ohler, Uwe; Vingron, Martin; Marsico, Annalisa

    2017-01-01

    Abstract RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between RNA sequence and secondary structure preference of a given RBP. Compared to previous methods which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM’s model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, recovers known RBP motifs from CLIP-Seq data, and scales linearly with input size, being considerably faster than MEMERIS and RNAcontext on large datasets while being on par with GraphProt. It is freely available on GitHub and as a Docker image. PMID:28977546

  1. Structure-affinity relationships for the binding of actinomycin D to DNA

    NASA Astrophysics Data System (ADS)

    Gallego, José; Ortiz, Angel R.; de Pascual-Teresa, Beatriz; Gago, Federico

    1997-03-01

    Molecular models of the complexes between actinomycin D and 14 different DNA hexamers were built based on the X-ray crystal structure of the actinomycin-d(GAAGCTTC)2 complex. The DNA sequences included the canonical GpC binding step flanked by different base pairs, nonclassical binding sites such as GpG and GpT, and sites containing 2,6-diaminopurine. A good correlation was found between the intermolecular interaction energies calculated for the refined complexes and the relative preferences of actinomycin binding to standard and modified DNA. A detailed energy decomposition into van der Waals and electrostatic components for the interactions between the DNA base pairs and either the chromophore or the peptidic part of the antibiotic was performed for each complex. The resulting energy matrix was then subjected to principal component analysis, which showed that actinomycin D discriminates among different DNA sequences by an interplay of hydrogen bonding and stacking interactions. The structure-affinity relationships for this important antitumor drug are thus rationalized and may be used to advantage in the design of novel sequence-specific DNA-binding agents.

  2. Mapping Argonaute and conventional RNA-binding protein interactions with RNA at single-nucleotide resolution using HITS-CLIP and CIMS analysis

    PubMed Central

    Moore, Michael; Zhang, Chaolin; Gantman, Emily Conn; Mele, Aldo; Darnell, Jennifer C.; Darnell, Robert B.

    2014-01-01

    Summary Identifying sites where RNA binding proteins (RNABPs) interact with target RNAs opens the door to understanding the vast complexity of RNA regulation. UV-crosslinking and immunoprecipitation (CLIP) is a transformative technology in which RNAs purified from in vivo cross-linked RNA-protein complexes are sequenced to reveal footprints of RNABP:RNA contacts. CLIP combined with high throughput sequencing (HITS-CLIP) is a generalizable strategy to produce transcriptome-wide RNA binding maps with higher accuracy and resolution than standard RNA immunoprecipitation (RIP) profiling or purely computational approaches. Applying CLIP to Argonaute proteins has expanded the utility of this approach to mapping binding sites for microRNAs and other small regulatory RNAs. Finally, recent advances in data analysis take advantage of crosslink-induced mutation sites (CIMS) to refine RNA-binding maps to single-nucleotide resolution. Once IP conditions are established, HITS-CLIP takes approximately eight days to prepare RNA for sequencing. Established pipelines for data analysis, including for CIMS, take 3-4 days. PMID:24407355

  3. Analytical study of avian reticuloendotheliosis virus dimeric RNA generated in vivo and in vitro.

    PubMed

    Darlix, J L; Gabus, C; Allain, B

    1992-12-01

    The retroviral genome consists of two identical RNA molecules associated at their 5' ends by a stable structure called the dimer linkage structure. The dimer linkage structure, while maintaining the dimer state of the retroviral genome, might also be involved in packaging and reverse transcription, as well as recombination during proviral DNA synthesis. To study the dimer structure of the retroviral genome and the mechanism of dimerization, we analyzed features of the dimeric genome of reticuloendotheliosis virus (REV) type A and identified elements required for its dimerization. Here we report that the REV dimeric genome extracted from virions and infected cells, as well as that synthesized in vitro, is more resistant to heat denaturation than avian sarcoma and leukemia virus, murine leukemia virus, or human immunodeficiency virus type 1 dimeric RNA. The minimal domain required to form a stable REV RNA dimer in vitro was found to map between positions 268 and 452 (KpnI and SalI sites), thus corresponding to the E encapsidation sequence (J. E. Embretson and H. M. Temin, J. Virol. 61:2675-2683, 1987). In addition, both the 5' and 3' halves of E are necessary in cis for RNA dimerization and the extent of RNA dimerization is influenced by viral sequences flanking E. Rapid and efficient dimerization of REV RNA containing gag sequences in addition to the E sequences and annealing of replication primer tRNA(Pro) to the primer-binding site necessitate the nucleocapsid protein.

  4. Analytical study of avian reticuloendotheliosis virus dimeric RNA generated in vivo and in vitro.

    PubMed Central

    Darlix, J L; Gabus, C; Allain, B

    1992-01-01

    The retroviral genome consists of two identical RNA molecules associated at their 5' ends by a stable structure called the dimer linkage structure. The dimer linkage structure, while maintaining the dimer state of the retroviral genome, might also be involved in packaging and reverse transcription, as well as recombination during proviral DNA synthesis. To study the dimer structure of the retroviral genome and the mechanism of dimerization, we analyzed features of the dimeric genome of reticuloendotheliosis virus (REV) type A and identified elements required for its dimerization. Here we report that the REV dimeric genome extracted from virions and infected cells, as well as that synthesized in vitro, is more resistant to heat denaturation than avian sarcoma and leukemia virus, murine leukemia virus, or human immunodeficiency virus type 1 dimeric RNA. The minimal domain required to form a stable REV RNA dimer in vitro was found to map between positions 268 and 452 (KpnI and SalI sites), thus corresponding to the E encapsidation sequence (J. E. Embretson and H. M. Temin, J. Virol. 61:2675-2683, 1987). In addition, both the 5' and 3' halves of E are necessary in cis for RNA dimerization and the extent of RNA dimerization is influenced by viral sequences flanking E. Rapid and efficient dimerization of REV RNA containing gag sequences in addition to the E sequences and annealing of replication primer tRNA(Pro) to the primer-binding site necessitate the nucleocapsid protein. PMID:1331519

  5. By-product formation in repetitive PCR amplification of DNA libraries during SELEX.

    PubMed

    Tolle, Fabian; Wilke, Julian; Wengel, Jesper; Mayer, Günter

    2014-01-01

    The selection of nucleic acid aptamers is an increasingly important approach to generate specific ligands binding to virtually any molecule of choice. However, selection-inherent amplification procedures are prone to artificial by-product formation that prohibits the enrichment of target-recognizing aptamers. Little is known about the formation of such by-products when employing nucleic acid libraries as templates. We report on the formation of two different forms of by-products, named ladder- and non-ladder-type observed during repetitive amplification in the course of in vitro selection experiments. Based on sequence information and the amplification behaviour of defined enriched nucleic acid molecules we suppose a molecular mechanism through which these amplification by-products are built. Better understanding of these mechanisms might help to find solutions minimizing by-product formation and improving the success rate of aptamer selection.

  6. Direct association of Csk homologous kinase (CHK) with the diphosphorylated site Tyr568/570 of the activated c-KIT in megakaryocytes.

    PubMed

    Price, D J; Rivnay, B; Fu, Y; Jiang, S; Avraham, S; Avraham, H

    1997-02-28

    The Csk homologous kinase (CHK), formerly MATK, has previously been shown to bind to activated c-KIT. In this report, we characterize the binding of SH2(CHK) to specific phosphotyrosine sites on the c-KIT protein sequence. Phosphopeptide inhibition of the in vitro interaction of SH2(CHK)-glutathione S-transferase fusion protein/c-KIT from SCF/KL-treated Mo7e megakaryocytic cells indicated that two sites on c-KIT were able to bind SH2(CHK). These sites were the Tyr568/570 diphosphorylated sequence and the monophosphorylated Tyr721 sequence. To confirm this, we precipitated native CHK from cellular extracts using phosphorylated peptides linked to Affi-Gel 15. In addition, purified SH2(CHK)-glutathione S-transferase fusion protein was precipitated with the same peptide beads. All of the peptide bead-binding studies were consistent with the direct binding of SH2(CHK) to phosphorylated Tyr568/570 and Tyr721 sites. Binding of FYN and SHC to the diphosphorylated Tyr568/570 site was observed, while binding of Csk to this site was not observed. The SH2(CHK) binding to the two sites is direct and not through phosphorylated intermediates such as FYN or SHC. Site-directed mutagenesis of the full-length c-KIT cDNA followed by transient transfection indicated that only the Tyr568/570, and not the Tyr721, is able to bind SH2(CHK). This indicates that CHK binds to the same site on c-KIT to which FYN binds, possibly bringing the two into proximity on associated c-KIT subunits and leading to the down-regulation of FYN by CHK.

  7. Incorporating evolution of transcription factor binding sites into annotated alignments.

    PubMed

    Bais, Abha S; Grossmann, Stefen; Vingron, Martin

    2007-08-01

    Identifying transcription factor binding sites (TFBSs) is essential to elucidate putative regulatory mechanisms. A common strategy is to combine cross-species conservation with single sequence TFBS annotation to yield "conserved TFBSs". Most current methods in this field adopt a multi-step approach that segregates the two aspects. Again, it is widely accepted that the evolutionary dynamics of binding sites differ from those of the surrounding sequence. Hence, it is desirable to have an approach that explicitly takes this factor into account. Although a plethora of approaches have been proposed for the prediction of conserved TFBSs, very few explicitly model TFBS evolutionary properties, while additionally being multi-step. Recently, we introduced a novel approach to simultaneously align and annotate conserved TFBSs in a pair of sequences. Building upon the standard Smith-Waterman algorithm for local alignments, SimAnn introduces additional states for profiles to output extended alignments or annotated alignments. That is, alignments with parts annotated as gaplessly aligned TFBSs (pair-profile hits) are generated. Moreover, the pair-profile related parameters are derived in a sound statistical framework. In this article, we extend this approach to explicitly incorporate evolution of binding sites in the SimAnn framework. We demonstrate the extension in the theoretical derivations through two position-specific evolutionary models, previously used for modelling TFBS evolution. In a simulated setting, we provide a proof of concept that the approach works given the underlying assumptions, as compared to the original work. Finally, using a real dataset of experimentally verified binding sites in human-mouse sequence pairs, we compare the new approach (eSimAnn) to an existing multi-step tool that also considers TFBS evolution.
    Although it is widely accepted that binding sites evolve differently from the surrounding sequences, most comparative TFBS identification methods do not explicitly consider this. Additionally, prediction of conserved binding sites is carried out in a multi-step approach that segregates alignment from TFBS annotation. In this paper, we demonstrate how the simultaneous alignment and annotation approach of SimAnn can be further extended to incorporate TFBS evolutionary relationships. We study how alignments and binding site predictions interplay at varying evolutionary distances and for various profile qualities.
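
    The abstract above builds on the standard Smith-Waterman algorithm for local alignment. As a reminder of that baseline only, here is a minimal Python sketch with a linear gap penalty. The function name and the scoring values (match=2, mismatch=-1, gap=-2) are illustrative assumptions; the sketch does not implement the profile states or the evolutionary models of SimAnn or eSimAnn.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Plain Smith-Waterman local alignment with a linear gap penalty.

    Returns (best_score, aligned_a, aligned_b). Scoring values are
    illustrative assumptions, not parameters taken from the abstract.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("GGTTGACCAA", "CCTTGACCTT"))
```

    In this toy input the two sequences share the stretch TTGACC, which the local alignment recovers while ignoring the unrelated flanks.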

  8. Subsite mapping of enzymes. Depolymerase computer modelling.

    PubMed Central

    Allen, J D; Thoma, J A

    1976-01-01

    We have developed a depolymerase computer model that uses a minimization routine. The model is designed so that, given experimental bond-cleavage frequencies for oligomeric substrates and experimental Michaelis parameters as a function of substrate chain length, the optimum subsite map is generated. The minimized sum of the weighted-squared residuals of the experimental and calculated data is used as a criterion of the goodness-of-fit for the optimized subsite map. The application of the minimization procedure to subsite mapping is explored through the use of simulated data. A procedure is developed whereby the minimization model can be used to determine the number of subsites in the enzymic binding region and to locate the position of the catalytic amino acids among these subsites. The degree of propagation of experimental variance into the subsite-binding energies is estimated. The question of whether hydrolytic rate coefficients are constant or a function of the number of filled subsites is examined. PMID:999629
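
    The goodness-of-fit criterion named in the abstract above, a sum of weighted squared residuals between experimental and calculated data, can be written down in a few lines. The following Python sketch is only an illustration of that criterion. The function name, the argument layout, and the example numbers are assumptions and are not taken from the original model.

```python
def weighted_ssr(observed, calculated, weights):
    """Sum of weighted squared residuals: sum_i w_i * (obs_i - calc_i)**2.

    A smaller value means the calculated data (e.g. from a candidate
    subsite map) agree better with the experimental data.
    """
    if not (len(observed) == len(calculated) == len(weights)):
        raise ValueError("observed, calculated and weights must have equal length")
    return sum(w * (o - c) ** 2 for o, c, w in zip(observed, calculated, weights))


if __name__ == "__main__":
    # Hypothetical data: two measurements, the first weighted more heavily.
    print(weighted_ssr([1.0, 2.0], [1.5, 2.0], [4.0, 1.0]))
```

    A minimization routine would repeatedly adjust the model parameters and re-evaluate this quantity, keeping the parameter set that gives the lowest value.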

  9. Regulation of expression of the ada gene controlling the adaptive response. Interactions with the ada promoter of the Ada protein and RNA polymerase.

    PubMed

    Sakumi, K; Sekiguchi, M

    1989-01-20

    The Ada protein of Escherichia coli catalyzes transfer of methyl groups from methylated DNA to its own molecule, and the methylated form of Ada protein promotes transcription of its own gene, ada. Using an in vitro reconstituted system, we found that both the sigma factor and the methylated Ada protein are required for transcription of the ada gene. To elucidate molecular mechanisms involved in the regulation of the ada transcription, we investigated interactions of the non-methylated and methylated forms of Ada protein and the RNA polymerase holo enzyme (the core enzyme and sigma factor) with a DNA fragment carrying the ada promoter region. Footprinting analyses revealed that the methylated Ada protein binds to a region from positions -63 to -31, which includes the ada regulatory sequence AAAGCGCA. No firm binding was observed with the non-methylated Ada protein, although some DNase I-hypersensitive sites were produced in the promoter by both types of Ada protein. RNA polymerase did bind to the promoter once the methylated Ada protein had bound to the upstream sequence. To correlate these phenomena with the process in vivo, we used the DNAs derived from promoter-defective mutants. No binding of Ada protein nor of RNA polymerase occurred with a mutant DNA having a C to G substitution at position -47 within the ada regulatory sequence. In the case of a -35 box mutant with a T to A change at position -34, the methylated Ada protein did bind to the ada regulatory sequence, yet there was no RNA polymerase binding. Thus, the binding of the methylated Ada protein to the upstream region apparently facilitates binding of the RNA polymerase to the proper region of the promoter. The Ada protein possesses two known methyl acceptor sites, Cys69 and Cys321. The role of methylation of each cysteine residue was investigated using mutant forms of the Ada protein. 
The Ada protein with the cysteine residue at position 69 replaced by alanine was incapable of binding to the ada promoter even when the cysteine residue at position 321 of the protein was methylated. When the Ada protein with alanine at position 321 was methylated, it acquired the potential to bind to the ada promoter. These results are compatible with the notion that methylation of the cysteine residue at position 69 causes a conformational change of the Ada protein, thereby facilitating binding of the protein to the upstream regulatory sequence.

  10. Identification of the DNA-Binding Domains of Human Replication Protein A That Recognize G-Quadruplex DNA

    PubMed Central

    Prakash, Aishwarya; Natarajan, Amarnath; Marky, Luis A.; Ouellette, Michel M.; Borgstahl, Gloria E. O.

    2011-01-01

    Replication protein A (RPA), a key player in DNA metabolism, has 6 single-stranded DNA-(ssDNA-) binding domains (DBDs) A-F. SELEX experiments with the DBDs-C, -D, and -E retrieve a 20-nt G-quadruplex forming sequence. Binding studies show that RPA-DE binds preferentially to the G-quadruplex DNA, a unique preference not observed with other RPA constructs. Circular dichroism experiments show that RPA-CDE-core can unfold the G-quadruplex while RPA-DE stabilizes it. Binding studies show that RPA-C binds pyrimidine- and purine-rich sequences similarly. This difference between RPA-C and RPA-DE binding was also indicated by the inability of RPA-CDE-core to unfold an oligonucleotide containing a TC-region 5′ to the G-quadruplex. Molecular modeling studies of RPA-DE and telomere-binding proteins Pot1 and Stn1 reveal structural similarities between the proteins and illuminate potential DNA-binding sites for RPA-DE and Stn1. These data indicate that DBDs of RPA have different ssDNA recognition properties. PMID:21772997

  11. Deciphering minimal antigenic epitopes associated with Burkholderia pseudomallei and Burkholderia mallei lipopolysaccharide O-antigens.

    PubMed

    Tamigney Kenfack, Marielle; Mazur, Marcelina; Nualnoi, Teerapat; Shaffer, Teresa L; Ngassimou, Abba; Blériot, Yves; Marrot, Jérôme; Marchetti, Roberta; Sintiprungrat, Kitisak; Chantratita, Narisara; Silipo, Alba; Molinaro, Antonio; AuCoin, David P; Burtnick, Mary N; Brett, Paul J; Gauthier, Charles

    2017-07-24

    Burkholderia pseudomallei (Bp) and Burkholderia mallei (Bm), the etiologic agents of melioidosis and glanders, respectively, cause severe disease in both humans and animals. Studies have highlighted the importance of Bp and Bm lipopolysaccharides (LPS) as vaccine candidates. Here we describe the synthesis of seven oligosaccharides as the minimal structures featuring all of the reported acetylation/methylation patterns associated with Bp and Bm LPS O-antigens (OAgs). Our approach is based on the conversion of an L-rhamnose into a 6-deoxy-L-talose residue at a late stage of the synthetic sequence. Using biochemical and biophysical methods, we demonstrate the binding of several Bp and Bm LPS-specific monoclonal antibodies with terminal OAg residues. Mice immunized with terminal disaccharide-CRM197 constructs produced high-titer antibody responses that crossreacted with Bm-like OAgs. Collectively, these studies serve as foundation for the development of novel therapeutics, diagnostics, and vaccine candidates to combat diseases caused by Bp and Bm. Melioidosis and glanders are multifaceted infections caused by gram-negative bacteria. Here, the authors synthesize a series of oligosaccharides that mimic the lipopolysaccharides present on the pathogens' surface and use them to develop novel glycoconjugates for vaccine development.

  12. A Silent ABC Transporter Isolated from Streptomyces rochei F20 Induces Multidrug Resistance

    PubMed Central

    Fernández-Moreno, Miguel A.; Carbó, Lázaro; Cuesta, Trinidad; Vallín, Carlos; Malpartida, Francisco

    1998-01-01

    In the search for heterologous activators for actinorhodin production in Streptomyces lividans, 3.4 kb of DNA from Streptomyces rochei F20 (a streptothricin producer) were characterized. Subcloning experiments showed that the minimal DNA fragment required for activation was 0.4 kb in size. The activation is mediated by increasing the levels of transcription of the actII-ORF4 gene. Sequencing of the minimal activating fragment did not reveal any clues about its mechanism; nevertheless, it was shown to overlap the 3′ end of two convergent genes, one of whose translated products (ORF2) strongly resembles that of other genes belonging to the ABC transporter superfamily. Computer-assisted analysis of the 3.4-kb DNA sequence showed the 3′ terminus of an open reading frame (ORF), i.e., ORFA, and three complete ORFs (ORF1, ORF2, and ORFB). Searches in the databases with their respective gene products revealed similarities for ORF1 and ORF2 with ATP-binding proteins and transmembrane proteins, respectively, which are found in members of the ABC transporter superfamily. No similarities for ORFA and ORFB were found in the databases. Insertional inactivation of ORF1 and ORF2, their transcription analysis, and their cloning in heterologous hosts suggested that these genes were not expressed under our experimental conditions; however, cloning of ORF1 and ORF2 together (but not separately) under the control of an expressing promoter induced resistance to several chemically different drugs: oleandomycin, erythromycin, spiramycin, doxorubicin, and tetracycline. Thus, this genetic system, named msr, is a new bacterial multidrug ABC transporter. PMID:9696745

  13. Violation of an Evolutionarily Conserved Immunoglobulin Diversity Gene Sequence Preference Promotes Production of dsDNA-Specific IgG Antibodies

    PubMed Central

    Silva-Sanchez, Aaron; Liu, Cun Ren; Vale, Andre M.; Khass, Mohamed; Kapoor, Pratibha; Elgavish, Ada; Ivanov, Ivaylo I.; Ippolito, Gregory C.; Schelonka, Robert L.; Schoeb, Trenton R.; Burrows, Peter D.; Schroeder, Harry W.

    2015-01-01

    Variability in the developing antibody repertoire is focused on the third complementarity determining region of the H chain (CDR-H3), which lies at the center of the antigen binding site where it often plays a decisive role in antigen binding. The power of VDJ recombination and N nucleotide addition has led to the common conception that the sequence of CDR-H3 is unrestricted in its variability and random in its composition. Under this view, the immune response is solely controlled by somatic positive and negative clonal selection mechanisms that act on individual B cells to promote production of protective antibodies and prevent the production of self-reactive antibodies. This concept of a repertoire of random antigen binding sites is inconsistent with the observation that diversity (DH) gene segment sequence content by reading frame (RF) is evolutionarily conserved, creating biases in the prevalence and distribution of individual amino acids in CDR-H3. For example, arginine, which is often found in the CDR-H3 of dsDNA binding autoantibodies, is under-represented in the commonly used DH RFs rearranged by deletion, but is a frequent component of rarely used inverted RF1 (iRF1), which is rearranged by inversion. To determine the effect of altering this germline bias in DH gene segment sequence on autoantibody production, we generated mice that by genetic manipulation are forced to utilize an iRF1 sequence encoding two arginines. Over a one year period we collected serial serum samples from these unimmunized, specific pathogen-free mice and found that more than one-fifth of them contained elevated levels of dsDNA-binding IgG, but not IgM; whereas mice with a wild type DH sequence did not. Thus, germline bias against the use of arginine enriched DH sequence helps to reduce the likelihood of producing self-reactive antibodies. PMID:25706374

  14. Sequence-Specific Affinity Chromatography of Bacterial Small Regulatory RNA-Binding Proteins from Bacterial Cells.

    PubMed

    Gans, Jonathan; Osborne, Jonathan; Cheng, Juliet; Djapgne, Louise; Oglesby-Sherrouse, Amanda G

    2018-01-01

    Bacterial small RNA molecules (sRNAs) are increasingly recognized as central regulators of bacterial stress responses and pathogenesis. In many cases, RNA-binding proteins are critical for the stability and function of sRNAs. Previous studies have adopted strategies to genetically tag an sRNA of interest, allowing isolation of RNA-protein complexes from cells. Here we present a sequence-specific affinity purification protocol that requires no prior genetic manipulation of bacterial cells, allowing isolation of RNA-binding proteins bound to native RNA molecules.

  15. A calmodulin-like protein (LCALA) is a new Leishmania amazonensis candidate for telomere end-binding protein.

    PubMed

    Morea, Edna G O; Viviescas, Maria Alejandra; Fernandes, Carlos A H; Matioli, Fabio F; Lira, Cristina B B; Fernandez, Maribel F; Moraes, Barbara S; da Silva, Marcelo S; Storti, Camila B; Fontes, Marcos R M; Cano, Maria Isabel N

    2017-11-01

    Leishmania spp. telomeres are composed of 5'-TTAGGG-3' repeats associated with proteins. We have previously identified LaRbp38 and LaRPA-1 as proteins that bind the G-rich telomeric strand. At that time, we had also partially characterized a protein:DNA complex, named LaGT1, but we could not identify its protein component. Using protein-DNA interaction and competition assays, we confirmed that LaGT1 is highly specific to the G-rich telomeric single-stranded DNA. Three protein bands, with LaGT1 activity, were isolated from affinity-purified protein extracts, in-gel digested, and sequenced de novo using mass spectrometry analysis. In silico analysis of the digested peptides identified them as a putative calmodulin with sequences identical to the T. cruzi calmodulin. In the Leishmania genome, the calmodulin ortholog is present in three identical copies. We cloned and sequenced one of the gene copies, named it LCalA, and obtained the recombinant protein. Multiple sequence alignment and molecular modeling showed that LCalA shares homology with most eukaryotic calmodulins. In addition, we demonstrated that LCalA is nuclear, partially co-localizes with telomeres and binds in vivo the G-rich telomeric strand. Recombinant LCalA can bind specifically and with relative affinity to the G-rich telomeric single-strand and to a 3'G-overhang, and DNA binding is calcium dependent. We have described a novel candidate component of Leishmania telomeres, LCalA, a nuclear calmodulin that binds the G-rich telomeric strand with high specificity and relative affinity, in a calcium-dependent manner. LCalA is the first reported calmodulin that binds in vivo telomeric DNA.

  16. Structure-based Analysis of HU-DNA Binding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Swinger, K.; Rice, P.

    2007-01-01

    HU and IHF are prokaryotic proteins that induce very large bends in DNA. They are present in high concentrations in the bacterial nucleoid and aid in chromosomal compaction. They also function as regulatory cofactors in many processes, such as site-specific recombination and the initiation of replication and transcription. HU and IHF have become paradigms for understanding DNA bending and indirect readout of sequence. While IHF shows significant sequence specificity, HU binds preferentially to certain damaged or distorted DNAs. However, none of the structurally diverse HU substrates previously studied in vitro is identical with the distorted substrates in the recently published Anabaena HU(AHU)-DNA cocrystal structures. Here, we report binding affinities for AHU and the DNA in the cocrystal structures. The binding free energies for formation of these AHU-DNA complexes range from 10 to 14.5 kcal/mol, representing Kd values in the nanomolar to low picomolar range, and a maximum stabilization of at least 6.3 kcal/mol relative to complexes with undistorted, non-specific DNA. We investigated IHF binding and found that appropriate structural distortions can greatly enhance its affinity. On the basis of the coupling of structural and relevant binding data, we estimate the amount of conformational strain in an IHF-mediated DNA kink that is relieved by a nick (at least 0.76 kcal/mol) and pinpoint the location of the strain. We show that AHU has a sequence preference for an A+T-rich region in the center of its DNA-binding site, correlating with an unusually narrow minor groove. This is similar to sequence preferences shown by the eukaryotic nucleosome.
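
    The abstract above quotes binding free energies alongside dissociation constants. The two are linked by the standard relation Kd = exp(-ΔG/RT), with ΔG taken as the magnitude of the favorable binding free energy and a 1 M standard state. The Python sketch below illustrates that conversion. The temperature of 298.15 K and the function names are assumptions, since the abstract does not state the experimental temperature.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_binding_free_energy(dg_kcal_per_mol, temperature_k=298.15):
    """Dissociation constant (M) for a favorable binding free energy.

    dg_kcal_per_mol is the magnitude of the binding free energy
    (e.g. 10 for a complex stabilized by 10 kcal/mol).
    temperature_k = 298.15 is an assumed value, not from the abstract.
    """
    return math.exp(-dg_kcal_per_mol / (R_KCAL * temperature_k))


def affinity_ratio(ddg_kcal_per_mol, temperature_k=298.15):
    """Fold change in Kd corresponding to a free-energy difference."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


if __name__ == "__main__":
    for dg in (10.0, 14.5):
        print(f"{dg} kcal/mol -> Kd = {kd_from_binding_free_energy(dg):.2e} M")
    print(f"6.3 kcal/mol -> {affinity_ratio(6.3):.1e}-fold tighter binding")
```

    Under the assumed temperature, 10 kcal/mol corresponds to a Kd of tens of nanomolar and 14.5 kcal/mol to tens of picomolar, which matches the range quoted in the abstract.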

  17. NMR and computational methods applied to the 3-dimensional structure determination of DNA and ligand-DNA complexes in solution

    NASA Astrophysics Data System (ADS)

    Smith, Jarrod Anson

    2D homonuclear 1H NMR methods and restrained molecular dynamics (rMD) calculations have been applied to determining the three-dimensional structures of DNA and minor groove-binding ligand-DNA complexes in solution. The structure of the DNA decamer sequence d(GCGTTAACGC)2 has been solved both with a distance-based rMD protocol and an NOE relaxation matrix backcalculation-based protocol in order to probe the relative merits of the different refinement methods. In addition, three minor groove binding ligand-DNA complexes have been examined. The solution structure of the oligosaccharide moiety of the antitumor DNA scission agent calicheamicin γ1I has been determined in complex with a decamer duplex containing its high affinity 5'-TCCT-3' binding sequence. The structure of the complex reinforces the belief that the oligosaccharide moiety is responsible for the sequence selective minor-groove binding activity of the agent, and critical intermolecular contacts are revealed. The solution structures of both the (+) and (-) enantiomers of the minor groove binding DNA alkylating agent duocarmycin SA have been determined in covalent complex with the undecamer DNA duplex d(GACTAATTGTC).d(GACAATTAGTC). The results support the proposal that the alkylation activity of the duocarmycin antitumor antibiotics is catalyzed by a binding-induced conformational change in the ligand which activates the cyclopropyl group for reaction with the DNA. Comparisons between the structures of the two enantiomers covalently bound to the same DNA sequence at the same 5'-AATTA-3' site have provided insight into the binding orientation and site selectivity, as well as the relative rates of reactivity of these two agents.

  18. Inadequate Reference Datasets Biased toward Short Non-epitopes Confound B-cell Epitope Prediction

    PubMed Central

    Rahman, Kh. Shamsur; Chowdhury, Erfan Ullah; Sachse, Konrad; Kaltenboeck, Bernhard

    2016-01-01

    X-ray crystallography has shown that an antibody paratope typically binds 15–22 amino acids (aa) of an epitope, of which 2–5 randomly distributed amino acids contribute most of the binding energy. In contrast, researchers typically choose for B-cell epitope mapping short peptide antigens in antibody binding assays. Furthermore, short 6–11-aa epitopes, and in particular non-epitopes, are over-represented in published B-cell epitope datasets that are commonly used for development of B-cell epitope prediction approaches from protein antigen sequences. We hypothesized that such suboptimal length peptides result in weak antibody binding and cause false-negative results. We tested the influence of peptide antigen length on antibody binding by analyzing data on more than 900 peptides used for B-cell epitope mapping of immunodominant proteins of Chlamydia spp. We demonstrate that short 7–12-aa peptides of B-cell epitopes bind antibodies poorly; thus, epitope mapping with short peptide antigens falsely classifies many B-cell epitopes as non-epitopes. We also show in published datasets of confirmed epitopes and non-epitopes a direct correlation between length of peptide antigens and antibody binding. Elimination of short, ≤11-aa epitope/non-epitope sequences improved datasets for evaluation of in silico B-cell epitope prediction. Achieving up to 86% accuracy, protein disorder tendency is the best indicator of B-cell epitope regions for chlamydial and published datasets. For B-cell epitope prediction, the most effective approach is plotting disorder of protein sequences with the IUPred-L scale, followed by antibody reactivity testing of 16–30-aa peptides from peak regions. This strategy overcomes the well known inaccuracy of in silico B-cell epitope prediction from primary protein sequences. PMID:27189949

  19. A Novel WRKY transcription factor is required for induction of PR-1a gene expression by salicylic acid and bacterial elicitors.

    PubMed

    van Verk, Marcel C; Pappaioannou, Dimitri; Neeleman, Lyda; Bol, John F; Linthorst, Huub J M

    2008-04-01

    PR-1a is a salicylic acid-inducible defense gene of tobacco (Nicotiana tabacum). One-hybrid screens identified a novel tobacco WRKY transcription factor (NtWRKY12) with specific binding sites in the PR-1a promoter at positions -564 (box WK(1)) and -859 (box WK(2)). NtWRKY12 belongs to the class of transcription factors in which the WRKY sequence is followed by a GKK rather than a GQK sequence. The binding sequence of NtWRKY12 (WK box TTTTCCAC) deviated significantly from the consensus sequence (W box TTGAC[C/T]) shown to be recognized by WRKY factors with the GQK sequence. Mutation of the GKK sequence in NtWRKY12 into GQK or GEK abolished binding to the WK box. The WK(1) box is in close proximity to binding sites in the PR-1a promoter for transcription factors TGA1a (as-1 box) and Myb1 (MBSII box). Expression studies with PR-1a promoter::beta-glucuronidase (GUS) genes in stably and transiently transformed tobacco indicated that NtWRKY12 and TGA1a act synergistically in PR-1a expression induced by salicylic acid and bacterial elicitors. Cotransfection of Arabidopsis thaliana protoplasts with 35S::NtWRKY12 and PR-1a::GUS promoter fusions showed that overexpression of NtWRKY12 resulted in a strong increase in GUS expression, which required functional WK boxes in the PR-1a promoter.
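
    The two recognition sequences quoted in the abstract above, the W box consensus TTGAC[C/T] and the WK box TTTTCCAC, are simple enough to locate with regular expressions. The Python sketch below scans both strands of a DNA string for them. The function names, the hit format, and the test sequence are hypothetical illustrations and are not part of the original study, which identified the sites experimentally.

```python
import re

# Motifs as quoted in the abstract; [CT] encodes the C/T ambiguity.
MOTIFS = {"W box": "TTGAC[CT]", "WK box": "TTTTCCAC"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motifs(seq):
    """Return sorted (motif_name, strand, start) hits on both strands.

    start is the 0-based position of the leftmost base of the hit,
    always expressed in forward-strand coordinates.
    """
    seq = seq.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for strand, target in (("+", seq), ("-", reverse_complement(seq))):
            # A lookahead lets overlapping occurrences be reported too.
            for m in re.finditer(f"(?=({pattern}))", target):
                width = len(m.group(1))
                start = m.start() if strand == "+" else len(seq) - m.start() - width
                hits.append((name, strand, start))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up test sequence containing one WK box and one W box.
    print(find_motifs("AATTTTCCACGGTTGACCAA"))
```

    A scan like this only reports sequence matches. Whether a given match is actually bound by a WRKY factor still has to be shown experimentally, as in the binding and mutation assays the abstract describes.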

  20. BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations

    PubMed Central

    Wang, Junbai; Batmanov, Kirill

    2015-01-01

    Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein–DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein–DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions. PMID:26202972

  1. Core histone genes of Giardia intestinalis: genomic organization, promoter structure, and expression

    PubMed Central

    Yee, Janet; Tang, Anita; Lau, Wei-Ling; Ritter, Heather; Delport, Dewald; Page, Melissa; Adam, Rodney D; Müller, Miklós; Wu, Gang

    2007-01-01

    Background: Giardia intestinalis is a protist found in freshwaters worldwide, and is the most common cause of parasitic diarrhea in humans. The phylogenetic position of this parasite is still much debated. Histones are small, highly conserved proteins that associate tightly with DNA to form chromatin within the nucleus. There are two classes of core histone genes in higher eukaryotes: DNA replication-independent histones and DNA replication-dependent ones. Results: We identified two copies each of the core histone H2a, H2b and H3 genes, and three copies of the H4 gene, at separate locations on chromosomes 3, 4 and 5 within the genome of Giardia intestinalis, but no gene encoding a H1 linker histone could be recognized. The copies of each gene share extensive DNA sequence identities throughout their coding and 5' noncoding regions, which suggests these copies have arisen from relatively recent gene duplications or gene conversions. The transcription start sites are at triplet A sequences 1–27 nucleotides upstream of the translation start codon for each gene. We determined that a 50 bp region upstream from the start of the histone H4 coding region is the minimal promoter, and a highly conserved 15 bp sequence called the histone motif (him) is essential for its activity. The Giardia core histone genes are constitutively expressed at approximately equivalent levels and their mRNAs are polyadenylated. Competition gel-shift experiments suggest that a factor within the protein complex that binds him may also be a part of the protein complexes that bind other promoter elements described previously in Giardia. Conclusion: In contrast to other eukaryotes, the Giardia genome has only a single class of core histone genes that encode replication-independent histones. Our inability to locate a gene encoding the linker histone H1 leads us to speculate that the H1 protein may not be required for the compaction of Giardia's small and gene-rich genome. PMID:17425802

  2. bfr1+, a novel gene of Schizosaccharomyces pombe which confers brefeldin A resistance, is structurally related to the ATP-binding cassette superfamily.

    PubMed Central

    Nagao, K; Taguchi, Y; Arioka, M; Kadokura, H; Takatsuki, A; Yoda, K; Yamasaki, M

    1995-01-01

    We have isolated a Schizosaccharomyces pombe gene, bfr1+, which on a multicopy plasmid vector, pDB248', confers resistance to brefeldin A (BFA), an inhibitor of intracellular protein transport. This gene encodes a novel protein of 1,531 amino acids with an intramolecular duplicated structure, each half containing a single ATP-binding consensus sequence and a set of six transmembrane sequences. This structural characteristic of bfr1+ protein resembles that of mammalian P-glycoprotein, which, by exporting a variety of anticancer drugs, has been shown to be responsible for multidrug resistance in tumor cells. Consistent with this is that S. pombe cells harboring bfr1+ on pDB248' are resistant to actinomycin D, cerulenin, and cytochalasin B, as well as to BFA. The relative positions of the ATP-binding sequences and the clusters of transmembrane sequences within the bfr1+ protein are, however, transposed in comparison with those in P-glycoprotein; the bfr1+ protein has N-terminal ATP-binding sequence followed by transmembrane segments in each half of the molecule. The bfr1+ protein exhibited significant homology in primary and secondary structures with two recently identified multidrug resistance gene products of Saccharomyces cerevisiae, Snq2 and Sts1/Pdr5/Ydr1. The bfr1+ gene is not essential for cell growth or mating, but a delta bfr1 mutant exhibited hypersensitivity to BFA. We propose that the bfr1+ protein is another member of the ATP-binding cassette superfamily and serves as an efflux pump of various antibiotics. PMID:7883711

  3. Heterogeneous RNA-binding protein M4 is a receptor for carcinoembryonic antigen in Kupffer cells.

    PubMed

    Bajenova, O V; Zimmer, R; Stolper, E; Salisbury-Rowswell, J; Nanji, A; Thomas, P

    2001-08-17

    Here we report the isolation of the recombinant cDNA clone from rat macrophages, Kupffer cells (KC) that encodes a protein interacting with carcinoembryonic antigen (CEA). To isolate and identify the CEA receptor gene we used two approaches: screening of a KC cDNA library with a specific antibody and the yeast two-hybrid system for protein interaction using as a bait the N-terminal part of the CEA encoding the binding site. Both techniques resulted in the identification of the rat heterogeneous RNA-binding protein (hnRNP) M4 gene. The rat ortholog cDNA sequence has not been previously described. The open reading frame for this gene contains a 2351-base pair sequence with the polyadenylation signal AATAAA and a termination poly(A) tail. The mRNA shows ubiquitous tissue expression as a 2.4-kilobase transcript. The deduced amino acid sequence comprised a 78-kDa membrane protein with 3 putative RNA-binding domains, arginine/methionine/glutamine-rich C terminus and 3 potential membrane spanning regions. When hnRNP M4 protein is expressed in pGEX4T-3 vector system in Escherichia coli it binds (125)I-labeled CEA in a Ca(2+)-dependent fashion. Transfection of rat hnRNP M4 cDNA into a non-CEA binding mouse macrophage cell line p388D1 resulted in CEA binding. These data provide evidence for a new function of hnRNP M4 protein as a CEA-binding protein in Kupffer cells.

  4. GenProBiS: web server for mapping of sequence variants to protein binding sites.

    PubMed

    Konc, Janez; Skrlj, Blaz; Erzen, Nika; Kunej, Tanja; Janezic, Dusanka

    2017-07-03

    Discovery of potentially deleterious sequence variants is important and has wide implications for research and generation of new hypotheses in human and veterinary medicine, and drug discovery. The GenProBiS web server maps sequence variants to protein structures from the Protein Data Bank (PDB), and further to protein-protein, protein-nucleic acid, protein-compound, and protein-metal ion binding sites. The concept of a protein-compound binding site is understood in the broadest sense, which includes glycosylation and other post-translational modification sites. Binding sites were defined by local structural comparisons of whole protein structures using the Protein Binding Sites (ProBiS) algorithm and transposition of ligands from the similar binding sites found to the query protein using the ProBiS-ligands approach with new improvements introduced in GenProBiS. Binding site surfaces were generated as three-dimensional grids encompassing the space occupied by predicted ligands. The server allows intuitive visual exploration of comprehensively mapped variants, such as human somatic mis-sense mutations related to cancer and non-synonymous single nucleotide polymorphisms from 21 species, within the predicted binding sites regions for about 80 000 PDB protein structures using fast WebGL graphics. The GenProBiS web server is open and free to all users at http://genprobis.insilab.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. A Method for Preparing DNA Sequencing Templates Using a DNA-Binding Microplate

    PubMed Central

    Yang, Yu; Hebron, Haroun R.; Hang, Jun

    2009-01-01

    A DNA-binding matrix was immobilized on the surface of a 96-well microplate and used for plasmid DNA preparation for DNA sequencing. The same DNA-binding plate was used for bacterial growth, cell lysis, DNA purification, and storage. In a single step using one buffer, bacterial cells were lysed by enzymes, and released DNA was captured on the plate simultaneously. After two wash steps, DNA was eluted and stored in the same plate. Inclusion of phosphates in the culture medium was found to enhance the yield of plasmid significantly. Purified DNA samples were used successfully in DNA sequencing with high consistency and reproducibility. Eleven vectors and nine libraries were tested using this method. In 10 μl sequencing reactions using 3 μl sample and 0.25 μl BigDye Terminator v3.1, the results from a 3730xl sequencer gave a success rate of 90–95% and read-lengths of 700 bases or more. The method is fully automatable and convenient for manual operation as well. It enables reproducible, high-throughput, rapid production of DNA with purity and yields sufficient for high-quality DNA sequencing at a substantially reduced cost. PMID:19568455

  6. Spectroscopic studies of the binding of Cu(II) complexes of oxicam NSAIDs to alternating G-C and homopolymeric G-C sequences

    NASA Astrophysics Data System (ADS)

    Chakraborty, Sreeja; Bose, Madhuparna; Sarkar, Munna

    2014-03-01

    Drugs belonging to the non-steroidal anti-inflammatory drug (NSAID) group are not only used as anti-inflammatory, analgesic and anti-pyretic agents, but also show anti-cancer effects. Complexing them with a bioactive metal such as copper enhances their anti-cancer effects compared to the bare drugs, although the exact mechanism of action is not yet fully understood. For the first time, it was shown by our group that Cu(II)-NSAIDs can directly bind to the DNA backbone. The ability of the copper complexes of NSAIDs namely meloxicam and piroxicam to bind to the DNA backbone could be a possible molecular mechanism behind their enhanced anticancer effects. Elucidating base sequence specific interaction of Cu(II)-NSAIDs to the DNA will provide information on their possible binding sites in the genome sequence. In this work, we present how these complexes respond to differences in structure and hydration pattern of GC rich sequences. For this, binding studies of Cu(II) complexes of piroxicam [Cu(II)-(Px)2 (L)2] and meloxicam [Cu(II)-(Mx)2 (L)] with alternating GC (polydG-dC) and homopolymeric GC (polydG-polydC) sequences were carried out using a combination of spectroscopic techniques that include UV-Vis absorption, fluorescence and circular dichroism (CD) spectroscopy. The Cu(II)-NSAIDs show strong binding affinity to both polydG-dC and polydG-polydC. Cu(II)-meloxicam changes from a strong binder of polydG-dC (Kb = 11.5 × 10^3 M^-1) to a weak binder of polydG-polydC (Kb = 5.02 × 10^3 M^-1), while Cu(II)-piroxicam changes from a strong binder of polydG-polydC (Kb = 8.18 × 10^3 M^-1) to a weak one of polydG-dC (Kb = 2.18 × 10^3 M^-1); this role reversal points to the sensitivity of these complexes to changes in the backbone structures/hydration. Changes in the profiles of UV absorption band and CD difference spectra, upon complex binding to polynucleotides and the results of competitive binding assay using ethidium bromide (EtBr) fluorescence indicate different binding modes in each case.

  7. Binding properties of SUMO-interacting motifs (SIMs) in yeast.

    PubMed

    Jardin, Christophe; Horn, Anselm H C; Sticht, Heinrich

    2015-03-01

    Small ubiquitin-like modifier (SUMO) conjugation and interaction play an essential role in many cellular processes. A large number of yeast proteins are known to interact non-covalently with SUMO via short SUMO-interacting motifs (SIMs), but the structural details of this interaction are still poorly characterized. In the present work, sequence analysis of a large dataset of 148 yeast SIMs revealed the existence of a hydrophobic core binding motif and a preference for acidic residues either within or adjacent to the core motif. Thus the sequence properties of yeast SIMs are highly similar to those described for human. Molecular dynamics simulations were performed to investigate the binding preferences for four representative SIM peptides differing in the number and distribution of acidic residues. Furthermore, the relative stability of two previously observed alternative binding orientations (parallel, antiparallel) was assessed. For all SIMs investigated, the antiparallel binding mode remained stable in the simulations and the SIMs were tightly bound via their hydrophobic core residues supplemented by polar interactions of the acidic residues. In contrast, the stability of the parallel binding mode is more dependent on the sequence features of the SIM, such as the number and position of acidic residues or the presence of additional adjacent interaction motifs. This information should be helpful to enhance the prediction of SIMs and their binding properties in different organisms to facilitate the reconstruction of the SUMO interactome.

  8. Prediction of small molecule binding property of protein domains with Bayesian classifiers based on Markov chains.

    PubMed

    Bulashevska, Alla; Stein, Martin; Jackson, David; Eils, Roland

    2009-12-01

    Accurate computational methods that can help to predict biological function of a protein from its sequence are of great interest to research biologists and pharmaceutical companies. One approach to assume the function of proteins is to predict the interactions between proteins and other molecules. In this work, we propose a machine learning method that uses a primary sequence of a domain to predict its propensity for interaction with small molecules. By curating the Pfam database with respect to the small molecule binding ability of its component domains, we have constructed a dataset of small molecule binding and non-binding domains. This dataset was then used as training set to learn a Bayesian classifier, which should distinguish members of each class. The domain sequences of both classes are modelled with Markov chains. In a Jack-knife test, our classification procedure achieved the predictive accuracies of 77.2% and 66.7% for binding and non-binding classes respectively. We demonstrate the applicability of our classifier by using it to identify previously unknown small molecule binding domains. Our predictions are available as supplementary material and can provide very useful information to drug discovery specialists. Given the ubiquitous and essential role small molecules play in biological processes, our method is important for identifying pharmaceutically relevant components of complete proteomes. The software is available from the author upon request.
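
    The classification idea in this abstract (one Markov chain per class, combined through Bayes' rule) can be sketched as follows. This is a minimal illustration only, not the authors' software: the first-order chain, the add-one pseudocount, the equal priors and the toy training sequences are all assumptions made for the example.

```python
import math

# The 20 standard amino acids; sequences with other symbols are not handled here.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def train_markov_chain(sequences, pseudocount=1.0):
    """Estimate start and first-order transition log-probabilities (smoothed)."""
    start = {a: pseudocount for a in ALPHABET}
    trans = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in sequences:
        if not seq:
            continue
        start[seq[0]] += 1
        for prev, cur in zip(seq, seq[1:]):
            trans[prev][cur] += 1
    start_total = sum(start.values())
    log_start = {a: math.log(v / start_total) for a, v in start.items()}
    log_trans = {}
    for a, row in trans.items():
        row_total = sum(row.values())
        log_trans[a] = {b: math.log(v / row_total) for b, v in row.items()}
    return log_start, log_trans


def log_likelihood(seq, model):
    """Log-probability of a non-empty sequence under one class model."""
    log_start, log_trans = model
    total = log_start[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        total += log_trans[prev][cur]
    return total


def classify(seq, models, priors):
    """Return the class label with the highest posterior score."""
    scores = {
        label: math.log(priors[label]) + log_likelihood(seq, model)
        for label, model in models.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy training data, chosen only to make the two classes distinct.
models = {
    "binding": train_markov_chain(["GGGGAGGG", "GGAGGGGG"]),
    "non-binding": train_markov_chain(["LLLLKLLL", "LLKLLLLL"]),
}
priors = {"binding": 0.5, "non-binding": 0.5}
print(classify("GGGG", models, priors))
```

    In practice the chain order and the priors would be chosen by cross-validation on the curated domain set; the abstract does not state which order was used.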

  9. Sequence information gain based motif analysis.

    PubMed

    Maynou, Joan; Pairó, Erola; Marco, Santiago; Perera, Alexandre

    2015-11-09

    The detection of regulatory regions in candidate sequences is essential for the understanding of the regulation of a particular gene and the mechanisms involved. This paper proposes a novel methodology based on information theoretic metrics for finding regulatory sequences in promoter regions. This methodology (SIGMA) has been tested on genomic sequence data for Homo sapiens and Mus musculus. SIGMA has been compared with different publicly available alternatives for motif detection, such as MEME/MAST, Biostrings (Bioconductor package), MotifRegressor, and previous work such as Qresiduals projections or information theoretic based detectors. Comparative results, in the form of Receiver Operating Characteristic curves, show how, in 70% of the studied Transcription Factor Binding Sites, the SIGMA detector has a better performance and behaves more robustly than the methods compared, while having a similar computational time. The performance of SIGMA can be explained by its parametric simplicity in the modelling of the non-linear co-variability in the binding motif positions. Sequence Information Gain based Motif Analysis is a generalisation of a non-linear model of the cis-regulatory sequences detection based on Information Theory. This generalisation allows us to detect transcription factor binding sites with maximum performance disregarding the covariability observed in the positions of the training set of sequences. SIGMA is freely available to the public at http://b2slab.upc.edu.

  10. Sequence Alignment to Predict Across Species Susceptibility ...

    EPA Pesticide Factsheets

    Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to simplify, streamline, and quantitatively assess protein sequence/structural similarity across taxonomic groups as a means to predict relative intrinsic susceptibility. The intent of the tool is to allow for evaluation of any potential protein target, so it is amenable to variable degrees of protein characterization, depending on available information about the chemical/protein interaction and the molecular target itself. To allow for flexibility in the analysis, a layered strategy was adopted for the tool. The first level of the SeqAPASS analysis compares primary amino acid sequences to a query sequence, calculating a metric for sequence similarity (including detection of candidate orthologs), the second level evaluates sequence similarity within selected domains (e.g., ligand-binding domain, DNA binding domain), and the third level of analysis compares individual amino acid residue positions identified as being of importance for protein conformation and/or ligand binding upon chemical perturbation. Each level of the SeqAPASS analysis provides increasing evidence to apply toward rapid, screening-level assessments of probable cross species susceptibility. Such analyses can support prioritization of chemicals for further ev

  11. footprintDB: a database of transcription factors with annotated cis elements and binding interfaces.

    PubMed

    Sebastian, Alvaro; Contreras-Moreira, Bruno

    2014-01-15

    Traditional and high-throughput techniques for determining transcription factor (TF) binding specificities are generating large volumes of data of uneven quality, which are scattered across individual databases. FootprintDB integrates some of the most comprehensive freely available libraries of curated DNA binding sites and systematically annotates the binding interfaces of the corresponding TFs. The first release contains 2422 unique TF sequences, 10 112 DNA binding sites and 3662 DNA motifs. A survey of the included data sources, organisms and TF families was performed together with proprietary database TRANSFAC, finding that footprintDB has a similar coverage of multicellular organisms, while also containing bacterial regulatory data. A search engine has been designed that drives the prediction of DNA motifs for input TFs, or conversely of TF sequences that might recognize input regulatory sequences, by comparison with database entries. Such predictions can also be extended to a single proteome chosen by the user, and results are ranked in terms of interface similarity. Benchmark experiments with bacterial, plant and human data were performed to measure the predictive power of footprintDB searches, which were able to correctly recover 10, 55 and 90% of the tested sequences, respectively. Correctly predicted TFs had a higher interface similarity than the average, confirming its diagnostic value. Web site implemented in PHP, Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.

  12. Cooperative DNA binding and sequence discrimination by the Opaque2 bZIP factor.

    PubMed Central

    Yunes, J A; Vettore, A L; da Silva, M J; Leite, A; Arruda, P

    1998-01-01

    The maize Opaque2 (O2) protein is a basic leucine zipper transcription factor that controls the expression of distinct classes of endosperm genes through the recognition of different cis-acting elements in their promoters. The O2 target region in the promoter of the alpha-coixin gene was analyzed in detail and shown to comprise two closely adjacent binding sites, named O2u and O2d, which are related in sequence to the GCN4 binding site. Quantitative DNase footprint analysis indicated that O2 binding to alpha-coixin target sites is best described by a cooperative model. Transient expression assays showed that the two adjacent sites act synergistically. This synergy is mediated in part by cooperative DNA binding. In tobacco protoplasts, O2 binding at the O2u site is more important for enhancer activity than is binding at the O2d site, suggesting that the architecture of the O2-DNA complex is important for interaction with the transcriptional machinery. PMID:9811800
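
    A cooperative model for two adjacent sites, as mentioned in this abstract, is commonly written as a four-state partition function. The sketch below uses that textbook form; the parameter names (`k1`, `k2`, `omega`) and the numerical values are illustrative assumptions and are not taken from the paper.

```python
def two_site_occupancy(conc, k1, k2, omega):
    """Fractional occupancy of two adjacent sites under a cooperative model.

    conc  : free protein concentration
    k1,k2 : intrinsic association constants of the two sites (1/conc units)
    omega : cooperativity factor (1 = independent, >1 = positive cooperativity)
    """
    w1 = k1 * conc               # only site 1 bound
    w2 = k2 * conc               # only site 2 bound
    w12 = omega * w1 * w2        # both sites bound
    z = 1.0 + w1 + w2 + w12      # partition function (1.0 = both sites free)
    return (w1 + w12) / z, (w2 + w12) / z


# Illustrative values only: the same sites with and without cooperativity.
independent = two_site_occupancy(conc=1.0, k1=1.0, k2=1.0, omega=1.0)
cooperative = two_site_occupancy(conc=1.0, k1=1.0, k2=1.0, omega=10.0)
print(independent, cooperative)
```

    With `omega = 1` each site reduces to a simple binding isotherm; fitting `omega` to footprint titration data is one way such a model is compared against a non-cooperative one.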

  13. Cooperative DNA binding and sequence discrimination by the Opaque2 bZIP factor.

    PubMed

    Yunes, J A; Vettore, A L; da Silva, M J; Leite, A; Arruda, P

    1998-11-01

    The maize Opaque2 (O2) protein is a basic leucine zipper transcription factor that controls the expression of distinct classes of endosperm genes through the recognition of different cis-acting elements in their promoters. The O2 target region in the promoter of the alpha-coixin gene was analyzed in detail and shown to comprise two closely adjacent binding sites, named O2u and O2d, which are related in sequence to the GCN4 binding site. Quantitative DNase footprint analysis indicated that O2 binding to alpha-coixin target sites is best described by a cooperative model. Transient expression assays showed that the two adjacent sites act synergistically. This synergy is mediated in part by cooperative DNA binding. In tobacco protoplasts, O2 binding at the O2u site is more important for enhancer activity than is binding at the O2d site, suggesting that the architecture of the O2-DNA complex is important for interaction with the transcriptional machinery.

  14. Acanthamoeba castellanii contains a ribosomal RNA enhancer binding protein which stimulates TIF-IB binding and transcription under stringent conditions.

    PubMed

    Yang, Q; Radebaugh, C A; Kubaska, W; Geiss, G K; Paule, M R

    1995-11-11

    The intergenic spacer (IGS) of Acanthamoeba castellanii rRNA genes contains repeated elements which are weak enhancers for transcription by RNA polymerase I. A protein, EBF, was identified and partially purified which binds to the enhancers and to several other sequences within the IGS, but not to other DNA fragments, including the rRNA core promoter. No consensus binding sequence could be discerned in these fragments and bound factor is in rapid equilibrium with unbound. EBF has functional characteristics similar to vertebrate upstream binding factors (UBF). Not only does it bind to the enhancer and other IGS elements, but it also stimulates binding of TIF-IB, the fundamental transcription initiation factor, to the core promoter and stimulates transcription from the promoter. Attempts to identify polypeptides with epitopes similar to rat or Xenopus laevis UBF suggest that structurally the protein from A.castellanii is not closely related to vertebrate UBF.

  15. Acanthamoeba castellanii contains a ribosomal RNA enhancer binding protein which stimulates TIF-IB binding and transcription under stringent conditions.

    PubMed Central

    Yang, Q; Radebaugh, C A; Kubaska, W; Geiss, G K; Paule, M R

    1995-01-01

    The intergenic spacer (IGS) of Acanthamoeba castellanii rRNA genes contains repeated elements which are weak enhancers for transcription by RNA polymerase I. A protein, EBF, was identified and partially purified which binds to the enhancers and to several other sequences within the IGS, but not to other DNA fragments, including the rRNA core promoter. No consensus binding sequence could be discerned in these fragments and bound factor is in rapid equilibrium with unbound. EBF has functional characteristics similar to vertebrate upstream binding factors (UBF). Not only does it bind to the enhancer and other IGS elements, but it also stimulates binding of TIF-IB, the fundamental transcription initiation factor, to the core promoter and stimulates transcription from the promoter. Attempts to identify polypeptides with epitopes similar to rat or Xenopus laevis UBF suggest that structurally the protein from A.castellanii is not closely related to vertebrate UBF. PMID:7501455

  16. Evaluation of simultaneous binding of Chromomycin A3 to the multiple sites of DNA by the new restriction enzyme assay.

    PubMed

    Murase, Hirotaka; Noguchi, Tomoharu; Sasaki, Shigeki

    2018-06-01

    Chromomycin A3 (CMA3) is an aureolic acid-type antitumor antibiotic. CMA3 forms dimeric complexes with divalent cations, such as Mg2+, which strongly binds to the GC rich sequence of DNA to inhibit DNA replication and transcription. In this study, the binding property of CMA3 to the DNA sequence containing multiple GC-rich binding sites was investigated by measuring the protection from hydrolysis by the restriction enzymes, AccII and Fnu4HI, for the center of the CGCG site and the 5'-GC↓GGC site, respectively. In contrast to the standard DNase I footprinting method, the DNA substrates are fully hydrolyzed by the restriction enzymes, therefore, the full protection of DNA at all the cleavable sites indicates that CMA3 simultaneously binds to all the binding sites. The restriction enzyme assay has suggested that CMA3 has a high tendency to bind the successive CGCG sites and the CGG repeat. Copyright © 2018 Elsevier Ltd. All rights reserved.

  17. Human mRNA polyadenylate binding protein: evolutionary conservation of a nucleic acid binding motif.

    PubMed Central

    Grange, T; de Sa, C M; Oddos, J; Pictet, R

    1987-01-01

    We have isolated a full length cDNA (cDNA) coding for the human poly(A) binding protein. The cDNA derived 73 kd basic translation product has the same Mr, isoelectric point and peptidic map as the poly(A) binding protein. DNA sequence analysis reveals a 70,244 dalton protein. The N terminal part, highly homologous to the yeast poly(A) binding protein, is sufficient for poly(A) binding activity. This domain consists of a four-fold repeated unit of approximately 80 amino acids present in other nucleic acid binding proteins. In the C terminal part there is, as in the yeast protein, a sequence of approximately 150 amino acids, rich in proline, alanine and glutamine which together account for 48% of the residues. A 2.9 kb mRNA corresponding to this cDNA has been detected in several vertebrate cell types and in Drosophila melanogaster at every developmental stage including oogenesis. PMID:2885805

  18. Engineered proteins with PUF scaffold to manipulate RNA metabolism

    PubMed Central

    Wang, Yang; Wang, Zefeng; Tanaka Hall, Traci M.

    2013-01-01

    Pumilio/fem-3 mRNA binding factor (FBF) proteins are characterized by a sequence-specific RNA-binding domain. This unique single-stranded RNA recognition module, whose sequence specificity can be reprogrammed, has been fused with functional modules to engineer protein factors with various functions. Here we summarize the advancement in developing RNA regulatory tools and opportunities for the future. PMID:23731364

  19. Evaluation of DNA Binding Drugs as Inhibitors of ESX, and ETS Domain Transcription Factor Associated With Breast Cancer: Effects of ESX/DNA Complex Disruption

    DTIC Science & Technology

    2000-08-01

    4). Sequence recognition of all four DNA bases is achieved by positioning an N-methylimidazole opposite guanine or N-methylpyrrole opposite...unique sequences of DNA based upon selective binding motifs to all four DNA bases, although relatively little is known about the ability of these agents to

  20. Identification of a factor in HeLa cells specific for an upstream transcriptional control sequence of an EIA-inducible adenovirus promoter and its relative abundance in infected and uninfected cells.

    PubMed Central

    SivaRaman, L; Subramanian, S; Thimmappaya, B

    1986-01-01

    Utilizing the gel electrophoresis/DNA binding assay, a factor specific for the upstream transcriptional control sequence of the EIA-inducible adenovirus EIIA-early promoter has been detected in HeLa cell nuclear extract. Analysis of linker-scanning mutants of the promoter by DNA binding assays and methylation-interference experiments show that the factor binds to the 17-nucleotide sequence 5' TGGAGATGACGTAGTTT 3' located between positions -66 and -82 upstream from the cap site. This sequence has been shown to be essential for transcription of this promoter. The EIIA-early-promoter specific factor was found to be present at comparable levels in uninfected HeLa cells and in cells infected with either wild-type adenovirus or the EIA-deletion mutant dl312 under conditions in which the EIA proteins are induced to high levels [7 or 20 hr after infection in the presence of arabinonucleoside (cytosine arabinoside)]. Based on the quantitation in DNA binding assays, it appears that the mechanism of EIA-activated transcription of the EIIA-early promoter does not involve a net change in the amounts of this factor. PMID:2942943

  1. In vitro fluorescence studies of transcription factor IIB-DNA interaction.

    PubMed

    Górecki, Andrzej; Figiel, Małgorzata; Dziedzicka-Wasylewska, Marta

    2015-01-01

    General transcription factor TFIIB is one of the basal constituents of the preinitiation complex of eukaryotic RNA polymerase II, acting as a bridge between the preinitiation complex and the polymerase, and binding promoter DNA in an asymmetric manner, thereby defining the direction of the transcription. Methods of fluorescence spectroscopy together with circular dichroism spectroscopy were used to observe conformational changes in the structure of recombinant human TFIIB after binding to specific DNA sequence. To facilitate the exploration of the structural changes, several site-directed mutations have been introduced altering the fluorescence properties of the protein. Our observations showed that binding of specific DNA sequences changed the protein structure and dynamics, and TFIIB may exist in two conformational states, which can be described by a different microenvironment of W52. Fluorescence studies using both intrinsic and exogenous fluorophores showed that these changes significantly depended on the recognition sequence and concerned various regions of the protein, including those interacting with other transcription factors and RNA polymerase II. DNA binding can cause rearrangements in regions of proteins interacting with the polymerase in a manner dependent on the recognized sequences, and therefore, influence the gene expression.

  2. Predicting DNA binding proteins using support vector machine with hybrid fractal features.

    PubMed

    Niu, Xiao-Hui; Hu, Xue-Hai; Shi, Feng; Xia, Jing-Bo

    2014-02-21

    DNA-binding proteins play a vitally important role in many biological processes. Prediction of DNA-binding proteins from amino acid sequence is a significant but not yet well-resolved scientific problem. Chaos game representation (CGR) investigates the patterns hidden in protein sequences, and visually reveals previously unknown structure. Fractal dimensions (FD) are good tools to measure sizes of complex, highly irregular geometric objects. In order to extract the intrinsic correlation with DNA-binding property from protein sequences, CGR algorithm, fractal dimension and amino acid composition are applied to formulate the numerical features of protein samples in this paper. Seven groups of features are extracted, which can be computed directly from the primary sequence, and each group is evaluated by the 10-fold cross-validation test and Jackknife test. Comparing the results of numerical experiments, the group of amino acid composition and fractal dimension (21-dimension vector) gets the best result: the average accuracy is 81.82% and the average Matthews correlation coefficient (MCC) is 0.6017. The resulting predictor is also compared with the existing method DNA-Prot and shows better performance. © 2013 The Authors. Published by Elsevier Ltd. All rights reserved.
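
    Two quantities named in this abstract, the amino acid composition feature and the Matthews correlation coefficient, have standard definitions that can be sketched briefly. The code below is only an illustration under those standard definitions: it does not reproduce the paper's chaos game representation or fractal dimension features, and the helper names and example inputs are hypothetical.

```python
import math

# The 20 standard amino acids, in a fixed order for the feature vector.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20-dimensional vector of residue frequencies."""
    counts = [sequence.count(residue) for residue in ALPHABET]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(ALPHABET)
    return [c / total for c in counts]


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Hypothetical inputs, chosen only to show the call pattern.
composition = amino_acid_composition("AAGG")
perfect = matthews_corrcoef(tp=5, tn=5, fp=0, fn=0)
inverted = matthews_corrcoef(tp=0, tn=0, fp=5, fn=5)
print(composition[:6], perfect, inverted)
```

    A 21-dimensional vector like the one the abstract describes would presumably append one fractal dimension value to these 20 frequencies; how that value is computed is not specified here.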

  3. Trellises and Trellis-Based Decoding Algorithms for Linear Block Codes. Part 3; The Map and Related Decoding Algorithms

    NASA Technical Reports Server (NTRS)

    Lin, Shu; Fossorier, Marc

    1998-01-01

    In a coded communication system with equiprobable signaling, MLD minimizes the word error probability and delivers the most likely codeword associated with the corresponding received sequence. This decoding has two drawbacks. First, minimization of the word error probability is not equivalent to minimization of the bit error probability. Therefore, MLD becomes suboptimum with respect to the bit error probability. Second, MLD delivers a hard-decision estimate of the received sequence, so that information is lost between the input and output of the ML decoder. This information is important in coded schemes where the decoded sequence is further processed, such as concatenated coding schemes, multi-stage and iterative decoding schemes. In this chapter, we first present a decoding algorithm which both minimizes bit error probability, and provides the corresponding soft information at the output of the decoder. This algorithm is referred to as the MAP (maximum a posteriori probability) decoding algorithm.
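
    The difference between the two criteria in this abstract can be seen on a very small example. The sketch below decodes by brute-force enumeration rather than with the trellis-based algorithm of the report, and it assumes a binary symmetric channel, equiprobable codewords and a hypothetical three-word codebook (not a linear code) chosen only so that the two decisions differ.

```python
def codeword_posteriors(received, codebook, p):
    """Posterior probability of each codeword on a binary symmetric channel.

    p is the crossover probability; codewords are assumed equiprobable.
    """
    weights = []
    for codeword in codebook:
        distance = sum(r != c for r, c in zip(received, codeword))
        weights.append((p ** distance) * ((1 - p) ** (len(codeword) - distance)))
    total = sum(weights)
    return [w / total for w in weights]


def ml_decode(received, codebook, p):
    """Return the single most likely codeword (minimizes word error probability)."""
    posteriors = codeword_posteriors(received, codebook, p)
    best = max(range(len(codebook)), key=posteriors.__getitem__)
    return codebook[best]


def bitwise_map_decode(received, codebook, p):
    """Return per-bit hard decisions and soft outputs P(bit = 1 | received)."""
    posteriors = codeword_posteriors(received, codebook, p)
    soft = [
        sum(pr for codeword, pr in zip(codebook, posteriors) if codeword[i] == 1)
        for i in range(len(received))
    ]
    hard = tuple(int(s > 0.5) for s in soft)
    return hard, soft


# Hypothetical toy setup: a noisy channel and a received all-zero word.
codebook = [(1, 0, 0), (0, 1, 1), (1, 0, 1)]
received = (0, 0, 0)
ml_word = ml_decode(received, codebook, p=0.4)
map_bits, soft_out = bitwise_map_decode(received, codebook, p=0.4)
print(ml_word, map_bits, soft_out)
```

    Here the most likely codeword ends in 0, yet two less likely codewords together make a final 1 more probable, so the bit-by-bit decision differs; the `soft_out` values are the kind of soft information a later decoding stage could reuse.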

  4. Octasaccharide is the minimal length unit required for efficient binding of cyclophilin B to heparin and cell surface heparan sulphate.

    PubMed

    Vanpouille, Christophe; Denys, Agnès; Carpentier, Mathieu; Pakula, Rachel; Mazurier, Joël; Allain, Fabrice

    2004-09-01

    Cyclophilin B (CyPB) is a heparin-binding protein first identified as a receptor for cyclosporin A. In previous studies, we reported that CyPB triggers chemotaxis and integrin-mediated adhesion of T-lymphocytes by way of interaction with two types of binding sites. The first site corresponds to a signalling receptor; the second site has been identified as heparan sulphate (HS) and appears crucial to induce cell adhesion. Characterization of the HS-binding unit is critical to understand the requirement of HS in pro-adhesive activity of CyPB. By using a strategy based on gel mobility shift assays with fluorophore-labelled oligosaccharides, we demonstrated that the minimal heparin unit required for efficient binding of CyPB is an octasaccharide. The mutants CyPB(KKK-) [where KKK- refers to the substitutions K3A(Lys3-->Ala)/K4A/K5A] and CyPB(DeltaYFD) (where Tyr14-Phe-Asp16 has been deleted) failed to interact with octasaccharides, confirming that the Y14FD16 and K3KK5 clusters are required for CyPB binding. Molecular modelling revealed that both clusters are spatially arranged so that they may act synergistically to form a binding site for the octasaccharide. We then demonstrated that heparin-derived octasaccharides and higher degree of polymerization oligosaccharides inhibited the interaction between CyPB and fluorophore-labelled HS chains purified from T-lymphocytes, and strongly reduced the HS-dependent pro-adhesive activity of CyPB. However, oligosaccharides or heparin were unable to restore adhesion of heparinase-treated T-lymphocytes, indicating that HS has to be present on the cell membrane to support the pro-adhesive activity of CyPB. Altogether, these results demonstrate that the octasaccharide is likely to be the minimal length unit required for efficient binding of CyPB to cell surface HS and consequent HS-dependent cell responses.

  5. Octasaccharide is the minimal length unit required for efficient binding of cyclophilin B to heparin and cell surface heparan sulphate

    PubMed Central

    2004-01-01

    Cyclophilin B (CyPB) is a heparin-binding protein first identified as a receptor for cyclosporin A. In previous studies, we reported that CyPB triggers chemotaxis and integrin-mediated adhesion of T-lymphocytes by way of interaction with two types of binding sites. The first site corresponds to a signalling receptor; the second site has been identified as heparan sulphate (HS) and appears crucial to induce cell adhesion. Characterization of the HS-binding unit is critical to understand the requirement of HS in pro-adhesive activity of CyPB. By using a strategy based on gel mobility shift assays with fluorophore-labelled oligosaccharides, we demonstrated that the minimal heparin unit required for efficient binding of CyPB is an octasaccharide. The mutants CyPBKKK− [where KKK− refers to the substitutions K3A(Lys3→Ala)/K4A/K5A] and CyPBΔYFD (where Tyr14-Phe-Asp16 has been deleted) failed to interact with octasaccharides, confirming that the Y14FD16 and K3KK5 clusters are required for CyPB binding. Molecular modelling revealed that both clusters are spatially arranged so that they may act synergistically to form a binding site for the octasaccharide. We then demonstrated that heparin-derived octasaccharides and higher degree of polymerization oligosaccharides inhibited the interaction between CyPB and fluorophore-labelled HS chains purified from T-lymphocytes, and strongly reduced the HS-dependent pro-adhesive activity of CyPB. However, oligosaccharides or heparin were unable to restore adhesion of heparinase-treated T-lymphocytes, indicating that HS has to be present on the cell membrane to support the pro-adhesive activity of CyPB. Altogether, these results demonstrate that the octasaccharide is likely to be the minimal length unit required for efficient binding of CyPB to cell surface HS and consequent HS-dependent cell responses. PMID:15109301

  6. A coarse-grained biophysical model of sequence evolution and the population size dependence of the speciation rate

    PubMed Central

    Khatri, Bhavin S.; Goldstein, Richard A.

    2015-01-01

    Speciation is fundamental to understanding the huge diversity of life on Earth. Although still controversial, empirical evidence suggests that the rate of speciation is larger for smaller populations. Here, we explore a biophysical model of speciation by developing a simple coarse-grained theory of transcription factor-DNA binding and how their co-evolution in two geographically isolated lineages leads to incompatibilities. To develop a tractable analytical theory, we derive a Smoluchowski equation for the dynamics of binding energy evolution that accounts for the fact that natural selection acts on phenotypes, but variation arises from mutations in sequences; the Smoluchowski equation includes selection due to both gradients in fitness and gradients in sequence entropy, which is the logarithm of the number of sequences that correspond to a particular binding energy. This simple consideration predicts that smaller populations develop incompatibilities more quickly in the weak mutation regime; this trend arises as sequence entropy poises smaller populations closer to incompatible regions of phenotype space. These results suggest a generic coarse-grained approach to evolutionary stochastic dynamics, allowing realistic modelling at the phenotypic level. PMID:25936759

  7. Diversity of Functionally Permissive Sequences in the Receptor-Binding Site of Influenza Hemagglutinin.

    PubMed

    Wu, Nicholas C; Xie, Jia; Zheng, Tianqing; Nycholat, Corwin M; Grande, Geramie; Paulson, James C; Lerner, Richard A; Wilson, Ian A

    2017-06-14

    Influenza A virus hemagglutinin (HA) initiates viral entry by engaging host receptor sialylated glycans via its receptor-binding site (RBS). The amino acid sequence of the RBS naturally varies across avian and human influenza virus subtypes and is also evolvable. However, functional sequence diversity in the RBS has not been fully explored. Here, we performed a large-scale mutational analysis of the RBS of A/WSN/33 (H1N1) and A/Hong Kong/1/1968 (H3N2) HAs. Many replication-competent mutants not yet observed in nature were identified, including some that could escape from an RBS-targeted broadly neutralizing antibody. This functional sequence diversity is made possible by pervasive epistasis in the RBS 220-loop and can be buffered by avidity in viral receptor binding. Overall, our study reveals that the HA RBS can accommodate a much greater range of sequence diversity than previously thought, which has significant implications for the complex evolutionary interrelationships between receptor specificity and immune escape. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. An immunoassay for the study of DNA-binding activities of herpes simplex virus protein ICP8.

    PubMed

    Lee, C K; Knipe, D M

    1985-06-01

    An immunoassay was used to examine the interaction between a herpes simplex virus protein, ICP8, and various types of DNA. The advantage of this assay is that the protein is not subjected to harsh purification procedures. We characterized the binding of ICP8 to both single-stranded (ss) and double-stranded (ds) DNA. ICP8 bound ss DNA fivefold more efficiently than ds DNA, and both binding activities were most efficient in 150 mM NaCl. Two lines of evidence indicate that the binding activities were not identical: (i) ds DNA failed to compete with ss DNA binding even with a large excess of ds DNA; (ii) Scatchard plots of DNA binding with various amounts of DNA were fundamentally different for ss DNA and ds DNA. However, the two activities were related in that ss DNA efficiently competed with the binding of ds DNA. We conclude that the ds DNA-binding activity of ICP8 is probably distinct from the ss DNA-binding activity. No evidence for sequence-specific ds DNA binding was obtained for either the entire herpes simplex virus genome or cloned viral sequences.

  9. Improve the prediction of RNA-binding residues using structural neighbours.

    PubMed

    Li, Quan; Cao, Zanxia; Liu, Haiyan

    2010-03-01

    The interactions between RNA-binding proteins (RBPs) and RNA play key roles in managing some of the cell's basic functions. The identification and prediction of RNA binding sites is important for understanding the RNA-binding mechanism. Computational approaches are being developed to predict RNA-binding residues based on sequence- or structure-derived features. To achieve higher prediction accuracy, improvements on current prediction methods are necessary. We identified that the structural neighbors of RNA-binding and non-RNA-binding residues have different amino acid compositions. Combining this structure-derived feature with evolutionary (PSSM) and other structural information (secondary structure and solvent accessibility) significantly improves the predictions over existing methods. Using a multiple linear regression approach and 6-fold cross validation, our best model achieves an overall correct rate of 87.8% and an MCC of 0.47, with a specificity of 93.4%, and correctly predicts 52.4% of the RNA-binding residues for a dataset containing 107 non-homologous RNA-binding proteins. Compared with existing methods, including the amino acid compositions of structural neighbors leads to a clear improvement. A web server was developed for predicting RNA binding residues in a protein sequence (or structure), which is available at http://mcgill.3322.org/RNA/.
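
    The four figures quoted in this abstract (overall correct rate, specificity, fraction of binding residues recovered, and MCC) are all functions of a single confusion matrix. The sketch below shows the standard definitions only; the function name and the counts in the usage example are hypothetical and are not taken from the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: binding residues predicted as binding / non-binding.
    tn/fp: non-binding residues predicted as non-binding / binding.
    """
    total = tp + fp + tn + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,       # "overall correct rate"
        "sensitivity": tp / (tp + fn),       # fraction of binding residues found
        "specificity": tn / (tn + fp),       # fraction of non-binding residues rejected
        # Matthews correlation coefficient; defined as 0 when a margin is empty
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Hypothetical counts, chosen only to illustrate the arithmetic
metrics = binary_metrics(tp=50, fp=10, tn=90, fn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Because non-binding residues usually far outnumber binding ones, accuracy and specificity can be high while sensitivity stays modest, which is why the MCC is commonly reported alongside them.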

  10. Lithium cation enhances anion binding in a tripodal phosphine oxide-based ditopic receptor†

    PubMed Central

    Gavette, Jesse V.; Lara, Juven; Berryman, Orion B.; Zakharov, Lev N.; Haley, Michael M.; Johnson, Darren W.

    2012-01-01

    A tripodal ditopic receptor presents H-bond donors and a phosphine oxide to potential guests. In the idealized binding conformation, an endohedral P═O functionality provides enhanced halide binding in the presence of lithium with the greatest ΔΔG° observed for bromide, while minimal changes in Ka are observed in the presence of sodium. PMID:21655566

  11. BFEE: A User-Friendly Graphical Interface Facilitating Absolute Binding Free-Energy Calculations.

    PubMed

    Fu, Haohao; Gumbart, James C; Chen, Haochuan; Shao, Xueguang; Cai, Wensheng; Chipot, Christophe

    2018-03-26

    Quantifying protein-ligand binding has attracted the attention of both theorists and experimentalists for decades. Many methods for estimating binding free energies in silico have been reported in recent years. Proper use of the proposed strategies requires, however, adequate knowledge of the protein-ligand complex, the mathematical background for deriving the underlying theory, and time for setting up the simulations, bookkeeping, and postprocessing. Here, to minimize human intervention, we propose a toolkit aimed at facilitating the accurate estimation of standard binding free energies using a geometrical route, coined the binding free-energy estimator (BFEE), and introduce it as a plug-in of the popular visualization program VMD. Benefitting from recent developments in new collective variables, BFEE can be used to generate the simulation input files, based solely on the structure of the complex. Once the simulations are completed, BFEE can also be utilized to perform the post-treatment of the free-energy calculations, allowing the absolute binding free energy to be estimated directly from the one-dimensional potentials of mean force in simulation outputs. The minimal amount of human intervention required during the whole process combined with the ergonomic graphical interface makes BFEE a very effective and practical tool for the end-user.

  12. Genome-wide survey of DNA-binding proteins in Arabidopsis thaliana: analysis of distribution and functions.

    PubMed

    Malhotra, Sony; Sowdhamini, Ramanathan

    2013-08-01

    The interaction of proteins with their respective DNA targets is known to control many high-fidelity cellular processes. Performing a comprehensive survey of the sequenced genomes for DNA-binding proteins (DBPs) will help in understanding their distribution and the associated functions in a particular genome. Availability of the fully sequenced genome of Arabidopsis thaliana enables the review of distribution of DBPs in this model plant genome. We used profiles of both structure and sequence-based DNA-binding families, derived from PDB and PFam databases, to perform the survey. This resulted in 4471 proteins, identified as DNA-binding in the Arabidopsis genome, which are distributed across 300 different PFam families. Apart from several plant-specific DNA-binding families, certain RING fingers and leucine zippers also had high representation. Our search protocol helped to assign DNA-binding property to several proteins that were previously marked as unknown, putative or hypothetical in function. The distribution of Arabidopsis genes having a role in plant DNA repair was particularly studied and noted for functional mapping. The functions observed to be overrepresented in the plant genome harbour DNA-3-methyladenine glycosylase activity, alkylbase DNA N-glycosylase activity and DNA-(apurinic or apyrimidinic site) lyase activity, suggesting their role in specialized functions such as gene regulation and DNA repair.

  13. Flexible DNA binding of the BTB/POZ-domain protein FBI-1.

    PubMed

    Pessler, Frank; Hernandez, Nouria

    2003-08-01

    POZ-domain transcription factors are characterized by the presence of a protein-protein interaction domain called the POZ or BTB domain at their N terminus and zinc fingers at their C terminus. Despite the large number of POZ-domain transcription factors that have been identified to date and the significant insights that have been gained into their cellular functions, relatively little is known about their DNA binding properties. FBI-1 is a BTB/POZ-domain protein that has been shown to modulate HIV-1 Tat trans-activation and to repress transcription of some cellular genes. We have used various viral and cellular FBI-1 binding sites to characterize the interaction of a POZ-domain protein with DNA in detail. We find that FBI-1 binds to inverted sequence repeats downstream of the HIV-1 transcription start site. Remarkably, it binds efficiently to probes carrying these repeats in various orientations and spacings with no particular rotational alignment, indicating that its interaction with DNA is highly flexible. Indeed, FBI-1 binding sites in the adenovirus 2 major late promoter, the c-fos gene, and the c-myc P1 and P2 promoters reveal variously spaced direct, inverted, and everted sequence repeats with the consensus sequence G(A/G)GGG(T/C)(C/T)(T/C)(C/T) for each repeat.
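
    The consensus G(A/G)GGG(T/C)(C/T)(T/C)(C/T) reported above reduces to one G, one purine, three G's and four pyrimidines, so it can be written as a short regular expression. The following is a minimal sketch for scanning a sequence for single repeats; the helper name and the toy sequence are hypothetical, only the forward strand is scanned, and the spacing and orientation of repeat pairs described in the abstract are not modelled.

```python
import re

# Consensus from the abstract: G(A/G)GGG(T/C)(C/T)(T/C)(C/T).
# A lookahead is used so that overlapping hits are also reported.
CONSENSUS = re.compile(r"(?=(G[AG]GGG[CT]{4}))")

def find_repeats(seq):
    """Return (start, matched_site) pairs for every forward-strand hit."""
    return [(m.start(), m.group(1)) for m in CONSENSUS.finditer(seq.upper())]

# Hypothetical toy sequence, not a real FBI-1 binding site
toy = "ttGGGGGTCTCaaGAGGGCCTTgg"
print(find_repeats(toy))
```

    Inverted or everted repeats would require scanning the reverse complement as well, which this sketch deliberately leaves out.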

  14. The binding modes of carbazole derivatives with telomere G-quadruplex

    NASA Astrophysics Data System (ADS)

    Zhang, Xiu-feng; Zhang, Hui-juan; Xiang, Jun-feng; Li, Qian; Yang, Qian-fan; Shang, Qian; Zhang, Yan-xia; Tang, Ya-lin

    2010-10-01

    It is reported that carbazole derivatives can stabilize the G-quadruplex DNA structure formed by the human telomeric sequence, and therefore, they have the potential to serve as anti-cancer agents. In the present study, in order to further explore the binding mode between carbazole derivatives and the G-quadruplex formed by the human telomeric sequence, two carbazole iodide molecules (BMVEC, MVEC) were synthesized and used to investigate the interaction with the human telomeric parallel and antiparallel G-quadruplex structures by NMR, CD and molecular modeling. Interestingly, it is the cationic charge pendant groups of the pyridinium rings of carbazole that play a pivotal role in the stabilization and binding mode of the human telomeric G-quadruplex structure. It was found that BMVEC, with two cationic charge pendant groups of pyridinium rings of 9-ethylcarbazole, can not only stabilize the parallel G-quadruplex of Hum6 by groove binding and G-tetrad stacking modes and the antiparallel G-quadruplex of Hum22 by groove binding, but also induce the formation of a mixed G-quadruplex of Hum22. By contrast, MVEC, with one cationic charge pendant group on a pyridinium ring, can bind only to the parallel G-quadruplex of Hum6 by stacking onto the G4 G-tetrad and could not interact with the G-quadruplex of Hum22.

  15. Advancements in Aptamer Discovery Technologies.

    PubMed

    Gotrik, Michael R; Feagin, Trevor A; Csordas, Andrew T; Nakamoto, Margaret A; Soh, H Tom

    2016-09-20

    Affinity reagents that specifically bind to their target molecules are invaluable tools in nearly every field of modern biomedicine. Nucleic acid-based aptamers offer many advantages in this domain, because they are chemically synthesized, stable, and economical. Despite these compelling features, aptamers are currently not widely used in comparison to antibodies. This is primarily because conventional aptamer-discovery techniques such as SELEX are time-consuming and labor-intensive and often fail to produce aptamers with comparable binding performance to antibodies. This Account describes a body of work from our laboratory in developing advanced methods for consistently producing high-performance aptamers with higher efficiency, fewer resources, and, most importantly, a greater probability of success. We describe our efforts in systematically transforming each major step of the aptamer discovery process: selection, analysis, and characterization. To improve selection, we have developed microfluidic devices (M-SELEX) that enable discovery of high-affinity aptamers after a minimal number of selection rounds by precisely controlling the target concentration and washing stringency. In terms of improving aptamer pool analysis, our group was the first to use high-throughput sequencing (HTS) for the discovery of new aptamers. We showed that tracking the enrichment trajectory of individual aptamer sequences enables the identification of high-performing aptamers without requiring full convergence of the selected aptamer pool. HTS is now widely used for aptamer discovery, and open-source software has become available to facilitate analysis. To improve binding characterization, we used HTS data to design custom aptamer arrays to measure the affinity and specificity of up to ∼10^4 DNA aptamers in parallel as a means to rapidly discover high-quality aptamers. Most recently, our efforts have culminated in the invention of the "particle display" (PD) screening system, which transforms solution-phase aptamers into "aptamer particles" that can be individually screened at high-throughput via fluorescence-activated cell sorting. Using PD, we have shown the feasibility of rapidly generating aptamers with exceptional affinities, even for proteins that have previously proven intractable to aptamer discovery. We are confident that these advanced aptamer-discovery methods will accelerate the discovery of aptamer reagents with excellent affinities and specificities, perhaps even exceeding those of the best monoclonal antibodies. Since aptamers are reproducible, renewable, stable, and can be distributed as sequence information, we anticipate that these affinity reagents will become even more valuable tools for both research and clinical applications.
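
    The idea of tracking the enrichment trajectory of individual sequences across selection rounds can be illustrated with a simple frequency ratio between two sequenced pools. This is only a sketch under stated assumptions: the function names, the pseudocount, and the toy reads are hypothetical, and the abstract does not specify how the authors actually computed enrichment.

```python
from collections import Counter

def frequencies(reads):
    """Relative frequency of each distinct sequence in one sequenced pool."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}

def fold_enrichment(earlier_round, later_round, pseudo=1e-6):
    """Ratio of later to earlier frequency for every sequence seen in either pool.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for sequences that are absent from one of the two rounds.
    """
    f_early, f_late = frequencies(earlier_round), frequencies(later_round)
    return {
        seq: (f_late.get(seq, 0.0) + pseudo) / (f_early.get(seq, 0.0) + pseudo)
        for seq in set(f_early) | set(f_late)
    }

# Hypothetical toy pools: "AAA" rises from 10% to 50% between rounds
round_1 = ["AAA"] * 1 + ["CCC"] * 9
round_2 = ["AAA"] * 5 + ["CCC"] * 5
ranked = sorted(fold_enrichment(round_1, round_2).items(),
                key=lambda item: item[1], reverse=True)
print(ranked[0][0])  # sequence with the largest fold enrichment
```

    Ranking by enrichment rather than by raw abundance is what allows a rare but rapidly rising sequence to be picked out before the pool has converged.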

  16. Informative priors based on transcription factor structural class improve de novo motif discovery.

    PubMed

    Narlikar, Leelavati; Gordân, Raluca; Ohler, Uwe; Hartemink, Alexander J

    2006-07-15

    An important problem in molecular biology is to identify the locations at which a transcription factor (TF) binds to DNA, given a set of DNA sequences believed to be bound by that TF. In previous work, we showed that information in the DNA sequence of a binding site is sufficient to predict the structural class of the TF that binds it. In particular, this suggests that we can predict which locations in any DNA sequence are more likely to be bound by certain classes of TFs than others. Here, we argue that traditional methods for de novo motif finding can be significantly improved by adopting an informative prior probability that a TF binding site occurs at each sequence location. To demonstrate the utility of such an approach, we present priority, a powerful new de novo motif finding algorithm. Using data from TRANSFAC, we train three classifiers to recognize binding sites of basic leucine zipper, forkhead, and basic helix loop helix TFs. These classifiers are used to equip priority with three class-specific priors, in addition to a default prior to handle TFs of other classes. We apply priority and a number of popular motif finding programs to sets of yeast intergenic regions that are reported by ChIP-chip to be bound by particular TFs. priority identifies motifs the other methods fail to identify, and correctly predicts the structural class of the TF recognizing the identified binding sites. Supplementary material and code can be found at http://www.cs.duke.edu/~amink/.

  17. Changes in solvation during DNA binding and cleavage are critical to altered specificity of the EcoRI endonuclease

    PubMed Central

    Robinson, Clifford R.; Sligar, Stephen G.

    1998-01-01

    Restriction endonucleases such as EcoRI bind and cleave DNA with great specificity and represent a paradigm for protein–DNA interactions and molecular recognition. Using osmotic pressure to induce water release, we demonstrate the participation of bound waters in the sequence discrimination of substrate DNA by EcoRI. Changes in solvation can play a critical role in directing sequence-specific DNA binding by EcoRI and are also crucial in assisting site discrimination during catalysis. By measuring the volume change for complex formation, we show that at the cognate sequence (GAATTC) EcoRI binding releases about 70 fewer water molecules than binding at an alternate DNA sequence (TAATTC), which differs by a single base pair. EcoRI complexation with nonspecific DNA releases substantially less water than either of these specific complexes. In cognate substrates (GAATTC) kcat decreases as osmotic pressure is increased, indicating the binding of about 30 water molecules accompanies the cleavage reaction. For the alternate substrate (TAATTC), release of about 40 water molecules accompanies the reaction, indicated by a dramatic acceleration of the rate when osmotic pressure is raised. These large differences in solvation effects demonstrate that water molecules can be key players in the molecular recognition process during both association and catalytic phases of the EcoRI reaction, acting to change the specificity of the enzyme. For both the protein–DNA complex and the transition state, there may be substantial conformational differences between cognate and alternate sites, accompanied by significant alterations in hydration and solvent accessibility. PMID:9482860

  18. Genome-wide profiling of DNA-binding proteins using barcode-based multiplex Solexa sequencing.

    PubMed

    Raghav, Sunil Kumar; Deplancke, Bart

    2012-01-01

    Chromatin immunoprecipitation (ChIP) is a commonly used technique to detect the in vivo binding of proteins to DNA. ChIP is now routinely paired to microarray analysis (ChIP-chip) or next-generation sequencing (ChIP-Seq) to profile the DNA occupancy of proteins of interest on a genome-wide level. Because ChIP-chip introduces several biases, most notably due to the use of a fixed number of probes, ChIP-Seq has quickly become the method of choice as, depending on the sequencing depth, it is more sensitive, quantitative, and provides a greater binding site location resolution. With the ever increasing number of reads that can be generated per sequencing run, it has now become possible to analyze several samples simultaneously while maintaining sufficient sequence coverage, thus significantly reducing the cost per ChIP-Seq experiment. In this chapter, we provide a step-by-step guide on how to perform multiplexed ChIP-Seq analyses. As a proof-of-concept, we focus on the genome-wide profiling of RNA Polymerase II as measuring its DNA occupancy at different stages of any biological process can provide insights into the gene regulatory mechanisms involved. However, the protocol can also be used to perform multiplexed ChIP-Seq analyses of other DNA-binding proteins such as chromatin modifiers and transcription factors.

  19. Systematic evaluation of the impact of ChIP-seq read designs on genome coverage, peak identification, and allele-specific binding detection.

    PubMed

    Zhang, Qi; Zeng, Xin; Younkin, Sam; Kawli, Trupti; Snyder, Michael P; Keleş, Sündüz

    2016-02-24

    Chromatin immunoprecipitation followed by sequencing (ChIP-seq) experiments revolutionized genome-wide profiling of transcription factors and histone modifications. Although maturing sequencing technologies allow these experiments to be carried out with short (36-50 bps), long (75-100 bps), single-end, or paired-end reads, the impact of these read parameters on the downstream data analysis is not well understood. In this paper, we evaluate the effects of different read parameters on genome sequence alignment, coverage of different classes of genomic features, peak identification, and allele-specific binding detection. We generated 101 bps paired-end ChIP-seq data for many transcription factors from human GM12878 and MCF7 cell lines. Systematic evaluations using in silico variations of these data, as well as fully simulated data, revealed complex interplay between the sequencing parameters and analysis tools, and indicated clear advantages of paired-end designs in several aspects such as alignment accuracy, peak resolution, and most notably, allele-specific binding detection. Our work elucidates the effect of design on the downstream analysis and provides insights to investigators in deciding sequencing parameters in ChIP-seq experiments. We present the first systematic evaluation of the impact of ChIP-seq designs on allele-specific binding detection and highlight the power of paired-end designs in such studies.

  20. cAMP-Mediated Stimulation of Tyrosine Hydroxylase mRNA Translation Is Mediated by Polypyrimidine-Rich Sequences within Its 3′-Untranslated Region and Poly(C)-Binding Protein 2

    PubMed Central

    Xu, Lu; Sterling, Carol R.

    2009-01-01

    Tyrosine hydroxylase (TH) plays a critical role in maintaining the appropriate concentrations of catecholamine neurotransmitters in brain and periphery, particularly during long-term stress, long-term drug treatment, or neurodegenerative diseases. Its expression is controlled by both transcriptional and post-transcriptional mechanisms. In a previous report, we showed that treatment of rat midbrain slice explant cultures or mouse MN9D cells with cAMP analog or forskolin leads to induction of TH protein without concomitant induction of TH mRNA. We further showed that cAMP activates mechanisms that regulate TH mRNA translation via cis-acting sequences within its 3′-untranslated region (UTR). In the present report, we extend these studies to show that MN9D cytoplasmic proteins bind to the same TH mRNA 3′-UTR domain that is required for the cAMP response. RNase T1 mapping demonstrates binding of proteins to a 27-nucleotide polypyrimidine-rich sequence within this domain. A specific mutation within the polypyrimidine-rich sequence inhibits protein binding and cAMP-mediated translational activation. UV-cross-linking studies identify a ∼44-kDa protein as a major TH mRNA 3′-UTR binding factor, and cAMP induces the 40- to 42-kDa poly(C)-binding protein-2 (PCBP2) in MN9D cells. We show that PCBP2 binds to the TH mRNA 3′-UTR domain that participates in the cAMP response. Overexpression of PCBP2 induces TH protein without concomitant induction of TH mRNA. These results support a model in which cAMP induces PCBP2, leading to increased interaction with its cognate polypyrimidine binding site in the TH mRNA 3′-UTR. This increased interaction presumably plays a role in the activation of TH mRNA translation by cAMP in dopaminergic neurons. PMID:19620256
