Sample records for predicting binding specificities

  1. NetMHCIIpan-2.0 - Improved pan-specific HLA-DR predictions using a novel concurrent alignment and weight optimization training procedure.

    PubMed

    Nielsen, Morten; Justesen, Sune; Lund, Ole; Lundegaard, Claus; Buus, Søren

    2010-11-13

    Binding of peptides to Major Histocompatibility class II (MHC-II) molecules plays a central role in governing responses of the adaptive immune system. MHC-II molecules sample peptides from the extracellular space allowing the immune system to detect the presence of foreign microbes from this compartment. Predicting which peptides bind to an MHC-II molecule is therefore of pivotal importance for understanding the immune response and its effect on host-pathogen interactions. The experimental cost associated with characterizing the binding motif of an MHC-II molecule is significant and large efforts have therefore been placed in developing accurate computer methods capable of predicting this binding event. Prediction of peptide binding to MHC-II is complicated by the open binding cleft of the MHC-II molecule, allowing binding of peptides extending out of the binding groove. Moreover, the genes encoding the MHC molecules are immensely diverse leading to a large set of different MHC molecules each potentially binding a unique set of peptides. Characterizing each MHC-II molecule using peptide-screening binding assays is hence not a viable option. Here, we present an MHC-II binding prediction algorithm aiming at dealing with these challenges. The method is a pan-specific version of the earlier published allele-specific NN-align algorithm and does not require any pre-alignment of the input data. This allows the method to benefit also from information from alleles covered by limited binding data. The method is evaluated on a large and diverse set of benchmark data, and is shown to significantly out-perform state-of-the-art MHC-II prediction methods. In particular, the method is found to boost the performance for alleles characterized by limited binding data where conventional allele-specific methods tend to achieve poor prediction accuracy. The method thus shows great potential for efficiently boosting the accuracy of MHC-II binding prediction, as accurate predictions can be obtained for novel alleles at highly reduced experimental costs. Pan-specific binding predictions can be obtained for all alleles with known protein sequence, and the method can benefit from including training data from alleles for which only a few binders are known. The method and benchmark data are available at http://www.cbs.dtu.dk/services/NetMHCIIpan-2.0.

  2. Context influences on TALE–DNA binding revealed by quantitative profiling

    PubMed Central

    Rogers, Julia M.; Barrera, Luis A.; Reyon, Deepak; Sander, Jeffry D.; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L.

    2015-01-01

    Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE–DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000–20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE–DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design. PMID:26067805

  3. Context influences on TALE-DNA binding revealed by quantitative profiling.

    PubMed

    Rogers, Julia M; Barrera, Luis A; Reyon, Deepak; Sander, Jeffry D; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L

    2015-06-11

    Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE-DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000-20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE-DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design.

  4. Development of estrogen receptor beta binding prediction model using large sets of chemicals.

    PubMed

    Sakkiah, Sugunadevi; Selvaraj, Chandrabose; Gong, Ping; Zhang, Chaoyang; Tong, Weida; Hong, Huixiao

    2017-11-03

    We developed an ERβ binding prediction model to facilitate identification of chemicals that specifically bind ERβ or ERα, together with our previously developed ERα binding model. Decision Forest was used to train the ERβ binding prediction model based on a large set of compounds obtained from EADB. Model performance was estimated through 1000 iterations of 5-fold cross validations. Prediction confidence was analyzed using predictions from the cross validations. Informative chemical features for ERβ binding were identified through analysis of the frequency data of chemical descriptors used in the models in the 5-fold cross validations. 1000 permutations were conducted to assess the chance correlation. The average accuracy of 5-fold cross validations was 93.14% with a standard deviation of 0.64%. Prediction confidence analysis indicated that the higher the prediction confidence, the more accurate the predictions. Permutation testing results revealed that the prediction model is unlikely to have been generated by chance. Eighteen informative descriptors were identified as important to ERβ binding prediction. Application of the prediction model to the data from the ToxCast project yielded very high sensitivity of 90-92%. Our results demonstrated that ERβ binding of chemicals could be accurately predicted using the developed model. Coupled with our previously developed ERα prediction model, this model could be expected to facilitate drug development through identification of chemicals that specifically bind ERβ or ERα.

  5. Simultaneous prediction of binding free energy and specificity for PDZ domain-peptide interactions

    NASA Astrophysics Data System (ADS)

    Crivelli, Joseph J.; Lemmon, Gordon; Kaufmann, Kristian W.; Meiler, Jens

    2013-12-01

    Interactions between protein domains and linear peptides underlie many biological processes. Among these interactions, the recognition of C-terminal peptides by PDZ domains is one of the most ubiquitous. In this work, we present a mathematical model for PDZ domain-peptide interactions capable of predicting both affinity and specificity of binding based on X-ray crystal structures and comparative modeling with Rosetta. We developed our mathematical model using a large phage display dataset describing binding specificity for a wild type PDZ domain and 91 single mutants, as well as binding affinity data for a wild type PDZ domain binding to 28 different peptides. Structural refinement was carried out through several Rosetta protocols, the most accurate of which included flexible peptide docking and several iterations of side chain repacking and backbone minimization. Our findings emphasize the importance of backbone flexibility and the energetic contributions of side chain-side chain hydrogen bonds in accurately predicting interactions. We also determined that predicting PDZ domain-peptide interactions became increasingly challenging as the length of the peptide increased in the N-terminal direction. In the training dataset, predicted binding energies correlated with those derived through calorimetry and specificity switches introduced through single mutations at interface positions were recapitulated. In independent tests, our best performing protocol was capable of predicting dissociation constants well within one order of magnitude of the experimental values and specificity profiles at the level of accuracy of previous studies. To our knowledge, this approach represents the first integrated protocol for predicting both affinity and specificity for PDZ domain-peptide interactions.

  6. An assay that may predict the development of IgG enhancing allergen-specific IgE binding during birch immunotherapy

    PubMed Central

    Selb, R.; Eckl-Dorna, J.; Vrtala, S.; Valenta, R.; Niederberger, V.

    2017-01-01

    Background It has been shown that birch pollen immunotherapy can induce IgG antibodies which enhance IgE binding to Bet v 1. We aimed to develop a serological assay to predict the development of antibodies which enhance IgE binding to Bet v 1 during immunotherapy. Methods In 18 patients treated by Bet v 1-fragment-specific immunotherapy, the effects of IgG antibodies specific for the fragments on the binding of IgE antibodies to Bet v 1 were measured by ELISA. Blocking and possible enhancing effects on IgE binding were compared with skin sensitivity to Bet v 1 after treatment. Results We found that fragment-specific IgG enhanced IgE binding to Bet v 1 in two patients who also showed an increase of skin sensitivity to Bet v 1. Conclusion Our results indicate that it may be possible to develop serological tests which predict the induction of unfavourable IgG antibodies enhancing the binding of IgE to Bet v 1 during immunotherapy. PMID:23998344

  7. Predicting protein-binding RNA nucleotides with consideration of binding partners.

    PubMed

    Tuvshinjargal, Narankhuu; Lee, Wook; Park, Byungkyu; Han, Kyungsook

    2015-06-01

    In recent years several computational methods have been developed to predict RNA-binding sites in protein. Most of these methods do not consider interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in protein, the problem of predicting protein-binding sites in RNA has received little attention mainly because it is much more difficult and shows a lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. In an effort to improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and Matthews correlation coefficient (MCC) of 0.69 in a 10-fold cross validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, an NPV of 84.2% and MCC of 0.48 in independent testing. For comparative purposes, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In a 10-fold cross validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, an NPV of 92.2% and MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, an NPV of 85.2% and MCC of 0.45. In both cross-validations and independent testing, the new model that used both RNA and protein sequences showed a better performance than the model that used RNA sequence data alone in most performance measures. To the best of our knowledge, this is the first sequence-based prediction of protein-binding nucleotides in RNA which considers the binding partner of RNA. The new model will provide valuable information for designing biochemical experiments to find putative protein-binding sites in RNA with unknown structure. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
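    The performance measures quoted in this record (sensitivity, specificity, PPV, NPV and MCC) all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification measures from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. The counts used below are illustrative only.
    """
    sensitivity = tp / (tp + fn)   # fraction of true binders that were predicted
    specificity = tn / (tn + fp)   # fraction of non-binders correctly rejected
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=50, fp=10, tn=30, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Because MCC uses all four cells, it is less sensitive to class imbalance than accuracy alone, which may be why records like this one report it alongside sensitivity and specificity.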

  8. Structure-Templated Predictions of Novel Protein Interactions from Sequence Information

    PubMed Central

    Betel, Doron; Breitkreuz, Kevin E; Isserlin, Ruth; Dewar-Darch, Danielle; Tyers, Mike; Hogue, Christopher W. V

    2007-01-01

    The multitude of functions performed in the cell are largely controlled by a set of carefully orchestrated protein interactions often facilitated by specific binding of conserved domains in the interacting proteins. Interacting domains commonly exhibit distinct binding specificity to short and conserved recognition peptides called binding profiles. Although many conserved domains are known in nature, only a few have well-characterized binding profiles. Here, we describe a novel predictive method known as domain–motif interactions from structural topology (D-MIST) for elucidating the binding profiles of interacting domains. A set of domains and their corresponding binding profiles were derived from extant protein structures and protein interaction data and then used to predict novel protein interactions in yeast. A number of the predicted interactions were verified experimentally, including new interactions of the mitotic exit network, RNA polymerases, nucleotide metabolism enzymes, and the chaperone complex. These results demonstrate that new protein interactions can be predicted exclusively from sequence information. PMID:17892321

  9. Probing the human estrogen receptor-α binding requirements for phenolic mono- and di-hydroxyl compounds: A combined synthesis, binding and docking study

    PubMed Central

    McCullough, Christopher; Neumann, Terrence S.; Gone, Jayapal Reddy; He, Zhengjie; Herrild, Christian; Wondergem, Julie; Pandey, Rajesh K.; Donaldson, William A.; Sem, Daniel S.

    2014-01-01

    Various estrogen analogs were synthesized and tested for binding to human ERα using a fluorescence polarization displacement assay. Binding affinity and orientation were also predicted using docking calculations. Docking was able to accurately predict relative binding affinity and orientation for estradiol, but only if a tightly bound water molecule bridging Arg394/Glu353 is present. Di-hydroxyl compounds sometimes bind in two orientations, which are flipped in terms of relative positioning of their hydroxyl groups. Di-hydroxyl compounds were predicted to bind with their aliphatic hydroxyl group interacting with His524 in ERα. One nonsteroid-based di-hydroxyl compound was 1000-fold specific for ERβ over ERα, and was also 25-fold specific for agonist ERβ versus antagonist activity. Docking predictions suggest this specificity may be due to interaction of the aliphatic hydroxyl with His475 in the agonist form of ERβ, versus with Thr299 in the antagonist form. However, the presence of this aliphatic hydroxyl is not required in all compounds, since mono-hydroxyl (phenolic) compounds bind ERα with high affinity, via hydroxyl hydrogen bonding interactions with the ERα Arg394/Glu353/water triad, and van der Waals interactions with the rest of the molecule. PMID:24315190

  10. Elucidation of the binding preferences of peptide recognition modules: SH3 and PDZ domains.

    PubMed

    Teyra, Joan; Sidhu, Sachdev S; Kim, Philip M

    2012-08-14

    Peptide-binding domains play a critical role in regulation of cellular processes by mediating protein interactions involved in signalling. In recent years, the development of large-scale technologies has enabled exhaustive studies on the peptide recognition preferences for a number of peptide-binding domain families. These efforts have provided significant insights into the binding specificities of these modular domains. Many research groups have taken advantage of this unprecedented volume of specificity data and have developed a variety of new algorithms for the prediction of binding specificities of peptide-binding domains and for the prediction of their natural binding targets. This knowledge has also been applied to the design of synthetic peptide-binding domains in order to rewire protein-protein interaction networks. Here, we describe how these experimental technologies have impacted on our understanding of peptide-binding domain specificities and on the elucidation of their natural ligands. We discuss SH3 and PDZ domains as well characterized examples, and we explore the feasibility of expanding high-throughput experiments to other peptide-binding domains. Copyright © 2012. Published by Elsevier B.V.

  11. BiPPred: Combined sequence- and structure-based prediction of peptide binding to the Hsp70 chaperone BiP.

    PubMed

    Schneider, Markus; Rosam, Mathias; Glaser, Manuel; Patronov, Atanas; Shah, Harpreet; Back, Katrin Christiane; Daake, Marina Angelika; Buchner, Johannes; Antes, Iris

    2016-10-01

    Substrate binding to Hsp70 chaperones is involved in many biological processes, and the identification of potential substrates is important for a comprehensive understanding of these events. We present a multi-scale pipeline for an accurate, yet efficient prediction of peptides binding to the Hsp70 chaperone BiP by combining sequence-based prediction with molecular docking and MMPBSA calculations. First, we measured the binding of 15mer peptides from known substrate proteins of BiP by peptide array (PA) experiments and performed an accuracy assessment of the PA data by fluorescence anisotropy studies. Several sequence-based prediction models were fitted using this and other peptide binding data. A structure-based position-specific scoring matrix (SB-PSSM) derived solely from structural modeling data forms the core of all models. The matrix elements are based on a combination of binding energy estimations, molecular dynamics simulations, and analysis of the BiP binding site, which led to new insights into the peptide binding specificities of the chaperone. Using this SB-PSSM, peptide binders could be predicted with high selectivity even without training of the model on experimental data. Additional training further increased the prediction accuracies. Subsequent molecular docking (DynaDock) and MMGBSA/MMPBSA-based binding affinity estimations for predicted binders allowed the identification of the correct binding mode of the peptides as well as the calculation of nearly quantitative binding affinities. The general concept behind the developed multi-scale pipeline can readily be applied to other protein-peptide complexes with linearly bound peptides, for which sufficient experimental binding data for the training of classical sequence-based prediction models is not available. Proteins 2016; 84:1390-1407. © 2016 Wiley Periodicals, Inc.
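    A position-specific scoring matrix of the kind mentioned in this record assigns a score to each residue at each position of a binding window and sums them. The sketch below shows only that generic scoring step on a toy three-position matrix; the matrix values, the window width and the function names are hypothetical, and nothing here reproduces the published SB-PSSM or its training.

```python
from typing import Dict, List, Tuple

# Toy matrix: one dict per window position, mapping residue -> score.
# Values are invented for illustration only.
TOY_PSSM: List[Dict[str, float]] = [
    {"L": 1.0, "A": 0.2},
    {"K": -0.5, "L": 0.5},
    {"V": 0.8},
]


def score_window(window: str, pssm: List[Dict[str, float]], default: float = 0.0) -> float:
    """Sum the per-position scores of a window; unlisted residues score `default`."""
    return sum(pssm[i].get(residue, default) for i, residue in enumerate(window))


def best_window(peptide: str, pssm: List[Dict[str, float]]) -> Tuple[int, str, float]:
    """Slide the matrix along the peptide and return (start, window, score) of the best hit."""
    width = len(pssm)
    if len(peptide) < width:
        raise ValueError("peptide is shorter than the scoring matrix")
    scored = [
        (score_window(peptide[start:start + width], pssm), start)
        for start in range(len(peptide) - width + 1)
    ]
    score, start = max(scored)
    return start, peptide[start:start + width], score


start, window, score = best_window("ALLVK", TOY_PSSM)
print(start, window, round(score, 2))
```

    In practice a score threshold separating predicted binders from non-binders would have to be calibrated against experimental data; none is assumed here.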

  12. Prediction of consensus binding mode geometries for related chemical series of positive allosteric modulators of adenosine and muscarinic acetylcholine receptors.

    PubMed

    Sakkal, Leon A; Rajkowski, Kyle Z; Armen, Roger S

    2017-06-05

    Following insights from recent crystal structures of the muscarinic acetylcholine receptor, binding modes of Positive Allosteric Modulators (PAMs) were predicted under the assumption that PAMs should bind to the extracellular surface of the active state. A series of well-characterized PAMs for adenosine (A1R, A2AR, A3R) and muscarinic acetylcholine (M1R, M5R) receptors were modeled using both rigid and flexible receptor CHARMM-based molecular docking. Studies of adenosine receptors investigated the molecular basis of the probe-dependence of PAM activity by modeling in complex with specific agonist radioligands. Consensus binding modes map common pharmacophore features of several chemical series to specific binding interactions. These models provide a rationalization of how PAM binding slows agonist radioligand dissociation kinetics. M1R PAMs were predicted to bind in the analogous M2R PAM LY2119620 binding site. The M5R NAM (ML-375) was predicted to bind in the PAM (ML-380) binding site with a unique induced-fit receptor conformation. © 2017 Wiley Periodicals, Inc.

  13. Structure and Sequence Search on Aptamer-Protein Docking

    NASA Astrophysics Data System (ADS)

    Xiao, Jiajie; Bonin, Keith; Guthold, Martin; Salsbury, Freddie

    2015-03-01

    Interactions between proteins and deoxyribonucleic acid (DNA) play a significant role in living systems, especially through gene regulation. However, short nucleic acid sequences (aptamers) with specific binding affinity to specific proteins exhibit clinical potential as therapeutics. Our capillary and gel electrophoresis selection experiments show that specific sequences of aptamers can be selected that bind specific proteins. Computationally, given the experimentally determined structure and sequence of a thrombin-binding aptamer, we can successfully dock the aptamer onto thrombin in agreement with experimental structures of the complex. In order to further study the conformational flexibility of this thrombin-binding aptamer and to potentially develop a predictive computational model of aptamer binding, we use GPU-enabled molecular dynamics simulations both to examine the conformational flexibility of the aptamer in the absence of binding to thrombin and to determine our ability to fold an aptamer. This study should help further de novo predictions of aptamer sequences by enabling the study of structural and sequence-dependent effects on aptamer-protein docking specificity.

  14. Accurate and sensitive quantification of protein-DNA binding affinity.

    PubMed

    Rastogi, Chaitanya; Rube, H Tomas; Kribelbauer, Judith F; Crocker, Justin; Loker, Ryan E; Martini, Gabriella D; Laptenko, Oleg; Freed-Pastor, William A; Prives, Carol; Stern, David L; Mann, Richard S; Bussemaker, Harmen J

    2018-04-17

    Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. Copyright © 2018 the Author(s). Published by PNAS.

  15. Accurate and sensitive quantification of protein-DNA binding affinity

    PubMed Central

    Rastogi, Chaitanya; Rube, H. Tomas; Kribelbauer, Judith F.; Crocker, Justin; Loker, Ryan E.; Martini, Gabriella D.; Laptenko, Oleg; Freed-Pastor, William A.; Prives, Carol; Stern, David L.; Mann, Richard S.; Bussemaker, Harmen J.

    2018-01-01

    Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. PMID:29610332

  16. Integration of element specific persistent homology and machine learning for protein-ligand binding affinity prediction.

    PubMed

    Cang, Zixuan; Wei, Guo-Wei

    2018-02-01

    Protein-ligand binding is a fundamental biological process that is paramount to many other biological processes, such as signal transduction, metabolic pathways, enzyme construction, cell secretion, and gene expression. Accurate prediction of protein-ligand binding affinities is vital to rational drug design and the understanding of protein-ligand binding and binding induced function. Existing binding affinity prediction methods are inundated with geometric detail and involve excessively high dimensions, which undermines their predictive power for massive binding data. Topology provides the ultimate level of abstraction and thus incurs too much reduction in geometric information. Persistent homology embeds geometric information into topological invariants and bridges the gap between complex geometry and abstract topology. However, it oversimplifies biological information. This work introduces element specific persistent homology (ESPH) or multicomponent persistent homology to retain crucial biological information during topological simplification. The combination of ESPH and machine learning gives rise to a powerful paradigm for macromolecular analysis. Tests on 2 large data sets indicate that the proposed topology-based machine-learning paradigm outperforms other existing methods in protein-ligand binding affinity predictions. ESPH reveals protein-ligand binding mechanisms that cannot be attained from other conventional techniques. The present approach reveals that protein-ligand hydrophobic interactions extend to 40 Å away from the binding site, which has significant ramifications for drug and protein design. Copyright © 2017 John Wiley & Sons, Ltd.

  17. A deep learning framework for modeling structural features of RNA-binding protein targets

    PubMed Central

    Zhang, Sai; Zhou, Jingtian; Hu, Hailin; Gong, Haipeng; Chen, Ligong; Cheng, Chao; Zeng, Jianyang

    2016-01-01

    RNA-binding proteins (RBPs) play important roles in the post-transcriptional control of RNAs. Identifying RBP binding sites and characterizing RBP binding preferences are key steps toward understanding the basic mechanisms of post-transcriptional gene regulation. Though numerous computational methods have been developed for modeling RBP binding preferences, discovering a complete structural representation of the RBP targets by integrating their available structural features in all three dimensions is still a challenging task. In this paper, we develop a general and flexible deep learning framework for modeling structural binding preferences and predicting binding sites of RBPs, which takes (predicted) RNA tertiary structural information into account for the first time. Our framework constructs a unified representation that characterizes the structural specificities of RBP targets in all three dimensions, which can be further used to predict novel candidate binding sites and discover potential binding motifs. Through testing on real CLIP-seq datasets, we have demonstrated that our deep learning framework can automatically extract effective hidden structural features from the encoded raw sequence and structural profiles, and predict accurate RBP binding sites. In addition, we have conducted the first study to show that integrating the additional RNA tertiary structural features can improve the model performance in predicting RBP binding sites, especially for the polypyrimidine tract-binding protein (PTB), which also provides new evidence to support the view that RBPs may have specific tertiary structural binding preferences. In particular, the tests on the internal ribosome entry site (IRES) segments yield satisfactory results with experimental support from the literature and further demonstrate the necessity of incorporating RNA tertiary structural information into the prediction model. The source code of our approach can be found at https://github.com/thucombio/deepnet-rbp. PMID:26467480

  18. Improve the prediction of RNA-binding residues using structural neighbours.

    PubMed

    Li, Quan; Cao, Zanxia; Liu, Haiyan

    2010-03-01

    The interactions between RNA-binding proteins (RBPs) and RNA play key roles in managing some of the cell's basic functions. The identification and prediction of RNA binding sites is important for understanding the RNA-binding mechanism. Computational approaches are being developed to predict RNA-binding residues based on sequence- or structure-derived features. To achieve higher prediction accuracy, improvements on current prediction methods are necessary. We identified that the structural neighbors of RNA-binding and non-RNA-binding residues have different amino acid compositions. Combining this structure-derived feature with evolutionary (PSSM) and other structural information (secondary structure and solvent accessibility) significantly improves the predictions over existing methods. Using a multiple linear regression approach and 6-fold cross validation, our best model can achieve an overall correct rate of 87.8% and MCC of 0.47, with a specificity of 93.4%, and correctly predicts 52.4% of the RNA-binding residues for a dataset containing 107 non-homologous RNA-binding proteins. Compared with existing methods, including the amino acid compositions of structural neighbors leads to a clear improvement. A web server was developed for predicting RNA binding residues in a protein sequence (or structure), which is available at http://mcgill.3322.org/RNA/.

  19. Deep convolutional neural networks for pan-specific peptide-MHC class I binding prediction.

    PubMed

    Han, Youngmahn; Kim, Dongsup

    2017-12-28

    Computational scanning of peptide candidates that bind to a specific major histocompatibility complex (MHC) can speed up the peptide-based vaccine development process and therefore various methods are being actively developed. Recently, machine-learning-based methods have generated successful results by training large amounts of experimental data. However, many machine learning-based methods are generally less sensitive in recognizing locally-clustered interactions, which can synergistically stabilize peptide binding. Deep convolutional neural network (DCNN) is a deep learning method inspired by visual recognition process of animal brain and it is known to be able to capture meaningful local patterns from 2D images. Once the peptide-MHC interactions can be encoded into image-like array(ILA) data, DCNN can be employed to build a predictive model for peptide-MHC binding prediction. In this study, we demonstrated that DCNN is able to not only reliably predict peptide-MHC binding, but also sensitively detect locally-clustered interactions. Nonapeptide-HLA-A and -B binding data were encoded into ILA data. A DCNN, as a pan-specific prediction model, was trained on the ILA data. The DCNN showed higher performance than other prediction tools for the latest benchmark datasets, which consist of 43 datasets for 15 HLA-A alleles and 25 datasets for 10 HLA-B alleles. In particular, the DCNN outperformed other tools for alleles belonging to the HLA-A3 supertype. The F1 scores of the DCNN were 0.86, 0.94, and 0.67 for HLA-A*31:01, HLA-A*03:01, and HLA-A*68:01 alleles, respectively, which were significantly higher than those of other tools. We found that the DCNN was able to recognize locally-clustered interactions that could synergistically stabilize peptide binding. We developed ConvMHC, a web server to provide user-friendly web interfaces for peptide-MHC class I binding predictions using the DCNN. 
The ConvMHC web server is accessible at http://jumong.kaist.ac.kr:8080/convmhc . We developed a novel method for peptide-HLA-I binding predictions using a DCNN trained on ILA data that encode peptide binding data, and demonstrated the reliable performance of the DCNN in nonapeptide binding predictions through independent evaluation on the latest IEDB benchmark datasets. Our approach can be applied to characterize locally-clustered patterns in molecular interactions, such as protein/DNA, protein/RNA, and drug/protein interactions.
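
    The image-like array (ILA) encoding described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `encode_ila`, the single hydropathy-product channel, and the contact-residue string are assumptions made for the example, since the abstract does not state which pairwise features or MHC residues were used.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_ila(peptide, mhc_contacts):
    """Return a len(mhc_contacts) x len(peptide) grid of pairwise scores.

    Each cell holds the product of two hydropathy values. This single
    channel is a hypothetical stand-in for whatever pairwise interaction
    features a real encoding would stack as image channels.
    """
    return [
        [HYDROPATHY[m] * HYDROPATHY[p] for p in peptide]
        for m in mhc_contacts
    ]


# Example nonapeptide and an illustrative (not curated) contact-residue string.
peptide = "SLYNTVATL"
contacts = "YFAMYGEKVAHTHVD"
grid = encode_ila(peptide, contacts)
print(len(grid), "rows x", len(grid[0]), "columns")
```

    A grid of this kind has the same layout as a small grayscale image, which is what allows 2D convolutional filters to respond to clusters of neighbouring peptide-MHC residue pairs.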

  20. Mismatch repair factor MSH2-MSH3 binds and alters the conformation of branched DNA structures predicted to form during genetic recombination.

    PubMed

    Surtees, Jennifer A; Alani, Eric

    2006-07-14

    Genetic studies in Saccharomyces cerevisiae predict that the mismatch repair (MMR) factor MSH2-MSH3 binds and stabilizes branched recombination intermediates that form during single strand annealing and gene conversion. To test this model, we constructed a series of DNA substrates that are predicted to form during these recombination events. We show in an electrophoretic mobility shift assay that S. cerevisiae MSH2-MSH3 specifically binds branched DNA substrates containing 3' single-stranded DNA and that ATP stimulates its release from these substrates. Chemical footprinting analyses indicate that MSH2-MSH3 specifically binds at the double-strand/single-strand junction of branched substrates, alters its conformation and opens up the junction. Therefore, MSH2-MSH3 binding to its substrates creates a unique nucleoprotein structure that may signal downstream steps in repair that include interactions with MMR and nucleotide excision repair factors.

  1. Knowledge-based grouping of modeled HLA peptide complexes.

    PubMed

    Kangueane, P; Sakharkar, M K; Lim, K S; Hao, H; Lin, K; Chee, R E; Kolatkar, P R

    2000-05-01

    Human leukocyte antigens are the most polymorphic of human genes and multiple sequence alignment shows that such polymorphisms are clustered in the functional peptide binding domains. Because of such polymorphism among the peptide binding residues, the prediction of peptides that bind to specific HLA molecules is very difficult. In recent years two different types of computer-based prediction methods have been developed, and both have their own advantages and disadvantages. The nonavailability of allele-specific binding data restricts the use of knowledge-based prediction methods for a wide range of HLA alleles. Alternatively, the modeling scheme appears to be a promising predictive tool for the selection of peptides that bind to specific HLA molecules. The scoring of the modeled HLA-peptide complexes is a major concern. The use of knowledge-based rules (van der Waals clashes and solvent-exposed hydrophobic residues) to distinguish binders from nonbinders is applied in the present study. The rules based on (1) the number of observed atomic clashes between the modeled peptide and the HLA structure, and (2) the number of solvent-exposed hydrophobic residues on the modeled peptide effectively discriminate experimentally known binders from poor/nonbinders. Solved crystal complexes show no vdW clash (vdWC) in 95% of cases, and no solvent-exposed hydrophobic peptide residues (SEHPR) were seen in 86% of cases. In our attempt to compare experimental binding data with the predicted scores by this scoring scheme, 77% of the peptides are correctly grouped as good binders with a sensitivity of 71%.
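
    The two rules above lend themselves to a very small classifier. The sketch below is illustrative only: the class and function names, the zero-count cutoffs, and the example records are assumptions, since the abstract does not state the thresholds actually used.

```python
from dataclasses import dataclass


@dataclass
class ModeledComplex:
    peptide: str               # label for the modeled peptide
    vdw_clashes: int           # rule 1: atomic clashes with the HLA structure
    exposed_hydrophobics: int  # rule 2: solvent-exposed hydrophobic residues


def group_complex(c, max_clashes=0, max_exposed=0):
    """Group a modeled complex with two knowledge-based rules.

    The default cutoffs of zero are hypothetical; they mirror the
    observation that most solved crystal complexes show neither feature.
    """
    if c.vdw_clashes <= max_clashes and c.exposed_hydrophobics <= max_exposed:
        return "binder"
    return "poor/nonbinder"


# Hypothetical modeled complexes, for illustration only.
examples = [
    ModeledComplex("PEPTIDE_A", vdw_clashes=0, exposed_hydrophobics=0),
    ModeledComplex("PEPTIDE_B", vdw_clashes=3, exposed_hydrophobics=1),
]
labels = {c.peptide: group_complex(c) for c in examples}
print(labels)
```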

  2. Evaluation of OASIS QSAR Models Using ToxCast™ in Vitro Estrogen and Androgen Receptor Binding Data and Application in an Integrated Endocrine Screening Approach

    PubMed Central

    Bhhatarai, Barun; Wilson, Daniel M.; Price, Paul S.; Marty, Sue; Parks, Amanda K.; Carney, Edward

    2016-01-01

    Background: Integrative testing strategies (ITSs) for potential endocrine activity can use tiered in silico and in vitro models. Each component of an ITS should be thoroughly assessed. Objectives: We used the data from three in vitro ToxCast™ binding assays to assess OASIS, a quantitative structure-activity relationship (QSAR) platform covering both estrogen receptor (ER) and androgen receptor (AR) binding. For stronger binders (described here as AC50 < 1 μM), we also examined the relationship of QSAR predictions of ER or AR binding to the results from 18 ER and 10 AR transactivation assays, 72 ER-binding reference compounds, and the in vivo uterotrophic assay. Methods: NovaScreen binding assay data for ER (human, bovine, and mouse) and AR (human, chimpanzee, and rat) were used to assess the sensitivity, specificity, concordance, and applicability domain of two OASIS QSAR models. The binding strength relative to the QSAR-predicted binding strength was examined for the ER data. The relationship of QSAR predictions of binding to transactivation- and pathway-based assays, as well as to in vivo uterotrophic responses, was examined. Results: The QSAR models had both high sensitivity (> 75%) and specificity (> 86%) for ER as well as both high sensitivity (92–100%) and specificity (70–81%) for AR. For compounds within the domains of the ER and AR QSAR models that bound with AC50 < 1 μM, the QSAR models accurately predicted the binding for the parent compounds. The parent compounds were active in all transactivation assays where metabolism was incorporated and, except for those compounds known to require metabolism to manifest activity, all assay platforms where metabolism was not incorporated. Compounds in-domain and predicted to bind by the ER QSAR model that were positive in ToxCast™ ER binding at AC50 < 1 μM were active in the uterotrophic assay. 
Conclusions: We used the extensive ToxCast™ HTS binding data set to show that OASIS ER and AR QSAR models had high sensitivity and specificity when compounds were in-domain of the models. Based on this research, we recommend a tiered screening approach wherein a) QSAR is used to identify compounds in-domain of the ER or AR binding models and predicted to bind; b) those compounds are screened in vitro to assess binding potency; and c) the stronger binders (AC50 < 1 μM) are screened in vivo. This scheme prioritizes compounds for integrative testing and risk assessment. Importantly, compounds that are not in-domain, that are predicted either not to bind or to bind weakly, that are not active in vitro, that require metabolism to manifest activity, or for which in vivo AR testing is in order, need to be assessed differently. Citation: Bhhatarai B, Wilson DM, Price PS, Marty S, Parks AK, Carney E. 2016. Evaluation of OASIS QSAR models using ToxCast™ in vitro estrogen and androgen receptor binding data and application in an integrated endocrine screening approach. Environ Health Perspect 124:1453–1461; http://dx.doi.org/10.1289/EHP184 PMID:27152837
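
    The tiered scheme (a) to (c) recommended above can be expressed as a short decision function. This is a sketch under stated assumptions: the function name, argument names, and returned labels are invented for illustration, and only the 1 μM AC50 cutoff comes from the abstract.

```python
def next_tier(in_domain, predicted_binder, ac50_um=None, strong_cutoff_um=1.0):
    """Suggest the next step for one compound in a tiered screen.

    in_domain        -- compound falls inside the QSAR applicability domain
    predicted_binder -- QSAR model predicts receptor binding
    ac50_um          -- measured in vitro AC50 in micromolar, or None if
                        the compound has not been tested yet
    """
    # Tier (a): only in-domain, predicted binders proceed.
    if not (in_domain and predicted_binder):
        return "assess differently"
    # Tier (b): measure in vitro binding potency.
    if ac50_um is None:
        return "screen in vitro"
    # Tier (c): only the stronger binders go on to in vivo screening.
    if ac50_um < strong_cutoff_um:
        return "screen in vivo"
    return "not prioritized for in vivo screening"


print(next_tier(True, True))          # untested compound
print(next_tier(True, True, 0.3))     # strong binder
print(next_tier(False, True, 0.3))    # outside the applicability domain
```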

  3. Experimental identification of specificity determinants in the domain linker of a LacI/GalR protein: bioinformatics-based predictions generate true positives and false negatives.

    PubMed

    Meinhardt, Sarah; Swint-Kruse, Liskin

    2008-12-01

    In protein families, conserved residues often contribute to a common general function, such as DNA-binding. However, unique attributes for each homolog (e.g. recognition of alternative DNA sequences) must arise from variation in other functionally-important positions. The locations of these "specificity determinant" positions are obscured amongst the background of varied residues that do not make significant contributions to either structure or function. To isolate specificity determinants, a number of bioinformatics algorithms have been developed. When applied to the LacI/GalR family of transcription regulators, several specificity determinants are predicted in the 18 amino acids that link the DNA-binding and regulatory domains. However, results from alternative algorithms are only in partial agreement with each other. Here, we experimentally evaluate these predictions using an engineered repressor comprising the LacI DNA-binding domain, the LacI linker, and the GalR regulatory domain (LLhG). "Wild-type" LLhG has altered DNA specificity and weaker lacO(1) repression compared to LacI or a similar LacI:PurR chimera. Next, predictions of linker specificity determinants were tested, using amino acid substitution and in vivo repression assays to assess functional change. In LLhG, all predicted sites are specificity determinants, as well as three sites not predicted by any algorithm. Strategies are suggested for diminishing the number of false negative predictions. Finally, individual substitutions at LLhG specificity determinants exhibited a broad range of functional changes that are not predicted by bioinformatics algorithms. Results suggest that some variants have altered affinity for DNA, some have altered allosteric response, and some appear to have changed specificity for alternative DNA ligands.

  4. Molecular cloning and analysis of Schizosaccharomyces pombe Reb1p: sequence-specific recognition of two sites in the far upstream rDNA intergenic spacer.

    PubMed Central

    Zhao, A; Guo, A; Liu, Z; Pape, L

    1997-01-01

    The coding sequences for a Schizosaccharomyces pombe sequence-specific DNA binding protein, Reb1p, have been cloned. The predicted S. pombe Reb1p is 24-29% identical to mouse TTF-1 (transcription termination factor-1) and Saccharomyces cerevisiae REB1 protein, both of which direct termination of RNA polymerase I catalyzed transcripts. The S. pombe Reb1 cDNA encodes a predicted polypeptide of 504 amino acids with a predicted molecular weight of 58.4 kDa. The S. pombe Reb1p is unusual in that the bipartite DNA binding motif identified originally in S. cerevisiae and Kluyveromyces lactis REB1 proteins is uninterrupted and thus S. pombe Reb1p may contain the smallest natural REB1 homologous DNA binding domain. Its genomic coding sequences were shown to be interrupted by two introns. A recombinant histidine-tagged Reb1 protein bearing the rDNA binding domain has two homologous, sequence-specific binding sites in the S. pombe rDNA intergenic spacer, located between 289 and 480 nt downstream of the end of the approximately 25S rRNA coding sequences. Each binding site is 13-14 bp downstream of two of the three proposed in vivo termination sites. The core of this 17 bp site, AGGTAAGGGTAATGCAC, is specifically protected by Reb1p in footprinting analysis. PMID:9016645

  5. Differences in DNA Binding Specificity of Floral Homeotic Protein Complexes Predict Organ-Specific Target Genes.

    PubMed

    Smaczniak, Cezary; Muiño, Jose M; Chen, Dijun; Angenent, Gerco C; Kaufmann, Kerstin

    2017-08-01

    Floral organ identities in plants are specified by the combinatorial action of homeotic master regulatory transcription factors. However, how these factors achieve their regulatory specificities is still largely unclear. Genome-wide in vivo DNA binding data show that homeotic MADS domain proteins recognize partly distinct genomic regions, suggesting that DNA binding specificity contributes to functional differences of homeotic protein complexes. We used in vitro systematic evolution of ligands by exponential enrichment followed by high-throughput DNA sequencing (SELEX-seq) on several floral MADS domain protein homo- and heterodimers to measure their DNA binding specificities. We show that specification of reproductive organs is associated with distinct binding preferences of a complex formed by SEPALLATA3 and AGAMOUS. Binding specificity is further modulated by different binding site spacing preferences. Combination of SELEX-seq and genome-wide DNA binding data allows differentiation between targets in specification of reproductive versus perianth organs in the flower. We validate the importance of DNA binding specificity for organ-specific gene regulation by modulating promoter activity through targeted mutagenesis. Our study shows that intrafamily protein interactions affect DNA binding specificity of floral MADS domain proteins. Differential DNA binding of MADS domain protein complexes plays a role in the specificity of target gene regulation. © 2017 American Society of Plant Biologists. All rights reserved.

  6. Antibody specific epitope prediction-emergence of a new paradigm.

    PubMed

    Sela-Culang, Inbal; Ofran, Yanay; Peters, Bjoern

    2015-04-01

    The development of accurate tools for predicting B-cell epitopes is important but difficult. Traditional methods have examined which regions in an antigen are likely binding sites of an antibody. However, it is becoming increasingly clear that most antigen surface residues will be able to bind one or more of the myriad of possible antibodies. In recent years, new approaches have emerged for predicting an epitope for a specific antibody, utilizing information encoded in antibody sequence or structure. Applying such antibody-specific predictions to groups of antibodies in combination with easily obtainable experimental data improves the performance of epitope predictions. We expect that further advances of such tools will be possible with the integration of immunoglobulin repertoire sequencing data. Copyright © 2015 Elsevier B.V. All rights reserved.

  7. DNABP: Identification of DNA-Binding Proteins Based on Feature Selection Using a Random Forest and Predicting Binding Residues.

    PubMed

    Ma, Xin; Guo, Jing; Sun, Xiao

    2016-01-01

    DNA-binding proteins are fundamentally important in cellular processes. Several computational-based methods have been developed to improve the prediction of DNA-binding proteins in previous years. However, insufficient work has been done on the prediction of DNA-binding proteins from protein sequence information. In this paper, a novel predictor, DNABP (DNA-binding proteins), was designed to predict DNA-binding proteins using the random forest (RF) classifier with a hybrid feature. The hybrid feature contains two types of novel sequence features, which reflect information about the conservation of physicochemical properties of the amino acids, and the binding propensity of DNA-binding residues and non-binding propensities of non-binding residues. The comparisons with each feature demonstrated that these two novel features contributed most to the improvement in predictive ability. Furthermore, to improve the prediction performance of the DNABP model, feature selection using the minimum redundancy maximum relevance (mRMR) method combined with incremental feature selection (IFS) was carried out during the model construction. The results showed that the DNABP model could achieve 86.90% accuracy, 83.76% sensitivity, 90.03% specificity and a Matthews correlation coefficient of 0.727. High prediction accuracy and performance comparisons with previous research suggested that DNABP could be a useful approach to identify DNA-binding proteins from sequence information. The DNABP web server system is freely available at http://www.cbi.seu.edu.cn/DNABP/.
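
    The four figures of merit quoted above all derive from the same confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not the DNABP benchmark counts.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion counts."""
    total = tp + tn + fp + fn
    # Denominator of the Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        # By convention MCC is reported as 0 when a marginal sum is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts, for illustration only.
m = binary_metrics(tp=80, tn=90, fp=10, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

    Unlike accuracy, the MCC stays informative when binding and non-binding proteins are unequally represented, which is one reason it is commonly reported alongside sensitivity and specificity.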

  8. Non-B-Form DNA Is Enriched at Centromeres

    PubMed Central

    Henikoff, Steven

    2018-01-01

    Abstract Animal and plant centromeres are embedded in repetitive “satellite” DNA, but are thought to be epigenetically specified. To define genetic characteristics of centromeres, we surveyed satellite DNA from diverse eukaryotes and identified variation in <10-bp dyad symmetries predicted to adopt non-B-form conformations. Organisms lacking centromeric dyad symmetries had binding sites for sequence-specific DNA-binding proteins with DNA-bending activity. For example, human and mouse centromeres are depleted for dyad symmetries, but are enriched for non-B-form DNA and are associated with binding sites for the conserved DNA-binding protein CENP-B, which is required for artificial centromere function but is paradoxically nonessential. We also detected dyad symmetries and predicted non-B-form DNA structures at neocentromeres, which form at ectopic loci. We propose that centromeres form at non-B-form DNA because of dyad symmetries or are strengthened by sequence-specific DNA binding proteins. This may resolve the CENP-B paradox and provide a general basis for centromere specification. PMID:29365169

  9. Prediction of Carbohydrate Binding Sites on Protein Surfaces with 3-Dimensional Probability Density Distributions of Interacting Atoms

    PubMed Central

    Tsai, Keng-Chang; Jian, Jhih-Wei; Yang, Ei-Wen; Hsu, Po-Chiang; Peng, Hung-Pin; Chen, Ching-Tai; Chen, Jun-Bo; Chang, Jeng-Yih; Hsu, Wen-Lian; Yang, An-Suei

    2012-01-01

    Non-covalent protein-carbohydrate interactions mediate molecular targeting in many biological processes. Prediction of non-covalent carbohydrate binding sites on protein surfaces not only provides insights into the functions of the query proteins; information on key carbohydrate-binding residues could suggest site-directed mutagenesis experiments, design therapeutics targeting carbohydrate-binding proteins, and provide guidance in engineering protein-carbohydrate interactions. In this work, we show that non-covalent carbohydrate binding sites on protein surfaces can be predicted with relatively high accuracy when the query protein structures are known. The prediction capabilities were based on a novel encoding scheme of the three-dimensional probability density maps describing the distributions of 36 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types. The machine learning algorithms predicted tentative carbohydrate binding sites on query proteins by recognizing the characteristic interacting atom distribution patterns specific for carbohydrate binding sites from known protein structures. The prediction results for all protein atom types were integrated into surface patches as tentative carbohydrate binding sites based on normalized prediction confidence level. The prediction capabilities of the predictors were benchmarked by a 10-fold cross validation on 497 non-redundant proteins with known carbohydrate binding sites. The predictors were further tested on an independent test set with 108 proteins. The residue-based Matthews correlation coefficient (MCC) for the independent test was 0.45, with prediction precision and sensitivity (or recall) of 0.45 and 0.49 respectively. In addition, 111 unbound carbohydrate-binding protein structures for which the structures were determined in the absence of the carbohydrate ligands were predicted with the trained predictors. 
The overall prediction MCC was 0.49. Independent tests on anti-carbohydrate antibodies showed that the carbohydrate antigen binding sites were predicted with comparable accuracy. These results demonstrate that the predictors are among the best in carbohydrate binding site predictions to date. PMID:22848404

  10. Synergistic use of compound properties and docking scores in neural network modeling of CYP2D6 binding: predicting affinity and conformational sampling.

    PubMed

    Bazeley, Peter S; Prithivi, Sridevi; Struble, Craig A; Povinelli, Richard J; Sem, Daniel S

    2006-01-01

    Cytochrome P450 2D6 (CYP2D6) is used to develop an approach for predicting affinity and relevant binding conformation(s) for highly flexible binding sites. The approach combines the use of docking scores and compound properties as attributes in building a neural network (NN) model. It begins by identifying segments of CYP2D6 that are important for binding specificity, based on structural variability among diverse CYP enzymes. A family of distinct, low-energy conformations of CYP2D6 is generated using simulated annealing (SA), and a collection of 82 compounds with known CYP2D6 affinities is docked. Interestingly, docking poses are observed on the backside of the heme as well as in the known active site. Docking scores for the active site binders, along with compound-specific attributes, are used to train a neural network model to properly bin compounds as strong binders, moderate binders, or nonbinders. Attribute selection is used to preselect the most important scores and compound-specific attributes for the model. A prediction accuracy of 85 ± 6% is achieved. Dominant attributes include docking scores for three of the 20 conformations in the ensemble as well as the compound's formal charge, number of aromatic rings, and AlogP. Although compound properties were highly predictive attributes (12% improvement over baseline) in the NN-based prediction of CYP2D6 binders, their combined use with docking score attributes is synergistic (net increase of 23% above baseline). Beyond prediction of affinity, attribute selection provides a way to identify the most relevant protein conformation(s), in terms of binding competence. In the case of CYP2D6, three out of the ensemble of 20 SA-generated structures are found to be the most predictive for binding.

  11. aPPRove: An HMM-Based Method for Accurate Prediction of RNA-Pentatricopeptide Repeat Protein Binding Events

    PubMed Central

    Harrison, Thomas; Ruiz, Jaime; Sloan, Daniel B.; Ben-Hur, Asa; Boucher, Christina

    2016-01-01

    Pentatricopeptide repeat containing proteins (PPRs) bind to RNA transcripts originating from mitochondria and plastids. There are two classes of PPR proteins. The P class contains tandem P-type motif sequences, and the PLS class contains alternating P, L and S type sequences. In this paper, we describe a novel tool that predicts PPR-RNA interaction; specifically, our method, which we call aPPRove, determines where and how a PLS-class PPR protein will bind to RNA when given a PPR and one or more RNA transcripts by using a combinatorial binding code for site specificity proposed by Barkan et al. Our results demonstrate that aPPRove successfully locates how and where a PPR protein belonging to the PLS class can bind to RNA. For each binding event it outputs the binding site, the amino-acid-nucleotide interaction, and its statistical significance. Furthermore, we show that our method can be used to predict binding events for PLS-class proteins using a known edit site and the statistical significance of aligning the PPR protein to that site. In particular, we use our method to make a conjecture regarding an interaction between CLB19 and the second intronic region of ycf3. The aPPRove web server can be found at www.cs.colostate.edu/~approve. PMID:27560805

  12. Druggable pockets and binding site centric chemical space: a paradigm shift in drug discovery.

    PubMed

    Pérot, Stéphanie; Sperandio, Olivier; Miteva, Maria A; Camproux, Anne-Claude; Villoutreix, Bruno O

    2010-08-01

    Detection, comparison and analyses of binding pockets are pivotal to structure-based drug design endeavors, from hit identification, screening of exosites and de-orphanization of protein functions to the anticipation of specific and non-specific binding to off- and anti-targets. Here, we analyze protein-ligand complexes and discuss methods that assist binding site identification, prediction of druggability and binding site comparison. The full potential of pockets is yet to be harnessed, and we envision that better understanding of the pocket space will have far-reaching implications in the field of drug discovery, such as the design of pocket-specific compound libraries and scoring functions.

  13. Predicting Displaceable Water Sites Using Mixed-Solvent Molecular Dynamics.

    PubMed

    Graham, Sarah E; Smith, Richard D; Carlson, Heather A

    2018-02-26

    Water molecules are an important factor in protein-ligand binding. Upon binding of a ligand with a protein's surface, waters can either be displaced by the ligand or may be conserved and possibly bridge interactions between the protein and ligand. Depending on the specific interactions made by the ligand, displacing waters can yield a gain in binding affinity. The extent to which binding affinity may increase is difficult to predict, as the favorable displacement of a water molecule is dependent on the site-specific interactions made by the water and the potential ligand. Several methods have been developed to predict the location of water sites on a protein's surface, but the majority of methods are not able to take into account both protein dynamics and the interactions made by specific functional groups. Mixed-solvent molecular dynamics (MixMD) is a cosolvent simulation technique that explicitly accounts for the interaction of both water and small molecule probes with a protein's surface, allowing for their direct competition. This method has previously been shown to identify both active and allosteric sites on a protein's surface. Using a test set of eight systems, we have developed a method using MixMD to identify conserved and displaceable water sites. Conserved sites can be determined by an occupancy-based metric to identify sites which are consistently occupied by water even in the presence of probe molecules. Conversely, displaceable water sites can be found by considering the sites which preferentially bind probe molecules. Furthermore, the inclusion of six probe types allows the MixMD method to predict which functional groups are capable of displacing which water sites. The MixMD method consistently identifies sites which are likely to be nondisplaceable and predicts the favorable displacement of water sites that are known to be displaced upon ligand binding.
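
    The occupancy-based idea above can be illustrated with a toy classifier over per-frame site occupants. This is a sketch, not the MixMD workflow itself: the function name, the 0.8 occupancy cutoff, the "empty" label, and the example trajectories are all assumptions.

```python
from collections import Counter


def classify_site(frames, conserved_cutoff=0.8):
    """Classify a site from a list of per-frame occupant labels.

    Each entry of `frames` is "water", "empty", or the name of a probe.
    The cutoff is a hypothetical value chosen for illustration.
    """
    if not frames:
        raise ValueError("need at least one frame")
    counts = Counter(frames)
    n = len(frames)
    water = counts["water"] / n
    # Every label other than water or empty is treated as a probe molecule.
    probe = sum(v for k, v in counts.items() if k not in ("water", "empty")) / n
    if water >= conserved_cutoff:
        return "conserved"       # water persists despite competing probes
    if probe > water:
        return "displaceable"    # probes preferentially occupy the site
    return "undetermined"


# Hypothetical ten-frame trajectories for two sites.
print(classify_site(["water"] * 9 + ["acetonitrile"]))
print(classify_site(["water"] * 2 + ["pyrimidine"] * 8))
```

    Keeping the probe name in each frame label also makes it possible to report which probe type, and hence which functional group, most often replaces the water at a given site.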

  14. Improving binding mode and binding affinity predictions of docking by ligand-based search of protein conformations: evaluation in D3R grand challenge 2015

    NASA Astrophysics Data System (ADS)

    Xu, Xianjin; Yan, Chengfei; Zou, Xiaoqin

    2017-08-01

    The growing number of protein-ligand complex structures, particularly the structures of proteins co-bound with different ligands, in the Protein Data Bank helps us tackle two major challenges in molecular docking studies: protein flexibility and the scoring function. Here, we introduced a systematic strategy that uses the information embedded in the known protein-ligand complex structures to improve both binding mode and binding affinity predictions. Specifically, a ligand similarity calculation method was employed to search for a receptor structure whose bound ligand shares high similarity with the query ligand, for use in docking. The strategy was applied to the two datasets (HSP90 and MAP4K4) in the recent D3R Grand Challenge 2015. In addition, for the HSP90 dataset, a system-specific scoring function (ITScore2_hsp90) was generated by recalibrating our statistical potential-based scoring function (ITScore2) using the known protein-ligand complex structures and the statistical mechanics-based iterative method. For the HSP90 dataset, better performance was achieved for both binding mode and binding affinity predictions compared with the original ITScore2 and with ensemble docking. For the MAP4K4 dataset, although there were only eight known protein-ligand complex structures, our docking strategy achieved performance comparable with ensemble docking. Our method for receptor conformational selection and iterative method for the development of system-specific statistical potential-based scoring functions can be easily applied to other protein targets that have a number of protein-ligand complex structures available to improve predictions on binding.
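
    The receptor-selection step described above amounts to a nearest-neighbour search over bound ligands. The sketch below substitutes a plain Tanimoto coefficient on feature sets for the authors' similarity method, which the abstract does not specify; the function names, structure identifiers, and feature labels are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two collections of features."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_receptor(query_features, known_complexes):
    """Pick the complex whose bound ligand is most similar to the query.

    known_complexes maps a structure identifier to the feature set of the
    ligand bound in that structure.
    """
    return max(
        known_complexes,
        key=lambda sid: tanimoto(query_features, known_complexes[sid]),
    )


# Hypothetical structures and ligand features, for illustration only.
known = {
    "complex_1": {"aromatic_ring", "amide", "hydroxyl"},
    "complex_2": {"aromatic_ring", "sulfonamide"},
    "complex_3": {"amide", "hydroxyl", "chloro", "aromatic_ring"},
}
query = {"aromatic_ring", "amide", "hydroxyl", "chloro"}
best = select_receptor(query, known)
print(best, round(tanimoto(query, known[best]), 2))
```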

  15. Role of Electrostatics in Protein-RNA Binding: The Global vs the Local Energy Landscape.

    PubMed

    Ghaemi, Zhaleh; Guzman, Irisbel; Gnutt, David; Luthey-Schulten, Zaida; Gruebele, Martin

    2017-09-14

    U1A protein-stem loop 2 RNA association is a basic step in the assembly of the spliceosomal U1 small nuclear ribonucleoprotein. Long-range electrostatic interactions due to the positive charge of U1A are thought to provide high binding affinity for the negatively charged RNA. Short range interactions, such as hydrogen bonds and contacts between RNA bases and protein side chains, favor a specific binding site. Here, we propose that electrostatic interactions are as important as local contacts in biasing the protein-RNA energy landscape toward a specific binding site. We show by using molecular dynamics simulations that deletion of two long-range electrostatic interactions (K22Q and K50Q) leads to mutant-specific alternative RNA bound states. One of these states preserves short-range interactions with aromatic residues in the original binding site, while the other one does not. We test the computational prediction with experimental temperature-jump kinetics using a tryptophan probe in the U1A-RNA binding site. The two mutants show the distinct predicted kinetic behaviors. Thus, the stem loop 2 RNA has multiple binding sites on a rough RNA-protein binding landscape. We speculate that the rough protein-RNA binding landscape, when biased to different local minima by electrostatics, could be one way that protein-RNA interactions evolve toward new binding sites and novel function.

  16. Kinetics of heavy metal adsorption and desorption in soil: Developing a unified model based on chemical speciation

    NASA Astrophysics Data System (ADS)

    Peng, Lanfang; Liu, Paiyu; Feng, Xionghan; Wang, Zimeng; Cheng, Tao; Liang, Yuzhen; Lin, Zhang; Shi, Zhenqing

    2018-03-01

    Predicting the kinetics of heavy metal adsorption and desorption in soil requires consideration of multiple heterogeneous soil binding sites and variations of reaction chemistry conditions. Although chemical speciation models have been developed for predicting the equilibrium of metal adsorption on soil organic matter (SOM) and important mineral phases (e.g. Fe and Al (hydr)oxides), there is still a lack of modeling tools for predicting the kinetics of metal adsorption and desorption reactions in soil. In this study, we developed a unified model for the kinetics of heavy metal adsorption and desorption in soil based on the equilibrium models WHAM 7 and CD-MUSIC, which specifically consider metal kinetic reactions with multiple binding sites of SOM and soil minerals simultaneously. For each specific binding site, metal adsorption and desorption rate coefficients were constrained by the local equilibrium partition coefficients predicted by WHAM 7 or CD-MUSIC, and, for each metal, the desorption rate coefficients of various binding sites were constrained by their metal binding constants with those sites. The model had only one fitting parameter for each soil binding phase, and all other parameters were derived from WHAM 7 and CD-MUSIC. A stirred-flow method was used to study the kinetics of Cd, Cu, Ni, Pb, and Zn adsorption and desorption in multiple soils under various pH and metal concentrations, and the model successfully reproduced most of the kinetic data. We quantitatively elucidated the significance of different soil components and important soil binding sites during the adsorption and desorption kinetic processes. Our model has provided a theoretical framework to predict metal adsorption and desorption kinetics, which can be further used to predict the dynamic behavior of heavy metals in soil under various natural conditions by coupling other important soil processes.
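
    The way an equilibrium partition coefficient constrains the rate coefficients can be written out for a single binding site. The following is a sketch that assumes first-order adsorption and desorption; the abstract does not give the rate law, and the symbols $C$ (dissolved metal concentration), $q_i$ (metal bound to site $i$), $k_{a,i}$, $k_{d,i}$ and $K_{p,i}$ are introduced here for illustration.

```latex
\frac{dq_i}{dt} = k_{a,i}\,C - k_{d,i}\,q_i ,
\qquad
\left.\frac{dq_i}{dt}\right|_{\mathrm{eq}} = 0
\;\Longrightarrow\;
K_{p,i} \equiv \frac{q_i^{\mathrm{eq}}}{C^{\mathrm{eq}}} = \frac{k_{a,i}}{k_{d,i}} .
```

    Under this assumption, once $K_{p,i}$ is supplied by an equilibrium speciation model, fixing either rate coefficient determines the other, which is consistent with the small number of fitting parameters reported above.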

  17. Electrostatics, structure prediction, and the energy landscapes for protein folding and binding.

    PubMed

    Tsai, Min-Yeh; Zheng, Weihua; Balamurugan, D; Schafer, Nicholas P; Kim, Bobby L; Cheung, Margaret S; Wolynes, Peter G

    2016-01-01

    While being long in range and therefore weakly specific, electrostatic interactions are able to modulate the stability and folding landscapes of some proteins. The relevance of electrostatic forces for steering the docking of proteins to each other is widely acknowledged; however, the role of electrostatics in establishing specifically funneled landscapes and their relevance for protein structure prediction are still not clear. By introducing Debye-Hückel potentials that mimic long-range electrostatic forces into the Associative memory, Water mediated, Structure, and Energy Model (AWSEM), a transferable protein model capable of predicting tertiary structures, we assess the effects of electrostatics on the landscapes of thirteen monomeric proteins and four dimers. For the monomers, we find that adding electrostatic interactions does not improve structure prediction. Simulations of ribosomal protein S6 show, however, that folding stability depends monotonically on electrostatic strength. The trend in predicted melting temperatures of the S6 variants agrees with experimental observations. Electrostatic effects can play a range of roles in binding. The binding of the protein complex KIX-pKID is largely assisted by electrostatic interactions, which provide direct charge-charge stabilization of the native state and contribute to the funneling of the binding landscape. In contrast, for several other proteins, including the DNA-binding protein FIS, electrostatics causes frustration in the DNA-binding region, which favors its binding with DNA but not with its protein partner. This study highlights the importance of long-range electrostatics in functional responses to problems where proteins interact with their charged partners, such as DNA, RNA, as well as membranes. © 2015 The Protein Society.
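
    For reference, a Debye-Hückel potential of the kind mentioned above takes the general screened-Coulomb form below. The abstract does not give the exact prefactors or parameter values used in AWSEM, so this is the textbook expression, with $q_i$ the charges, $r_{ij}$ the inter-charge distances, $\varepsilon_r$ the relative permittivity and $\lambda_D$ the Debye screening length.

```latex
V_{\mathrm{DH}} = \sum_{i<j} \frac{q_i\,q_j}{4\pi\varepsilon_0\varepsilon_r\,r_{ij}}
\exp\!\left(-\frac{r_{ij}}{\lambda_D}\right)
```

    The exponential factor damps the bare Coulomb interaction beyond roughly one screening length, so varying $\lambda_D$ (in practice, the ionic strength) is a natural way to tune the electrostatic strength.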

  18. An Improved Method for TAL Effectors DNA-Binding Sites Prediction Reveals Functional Convergence in TAL Repertoires of Xanthomonas oryzae Strains

    PubMed Central

    Pérez-Quintero, Alvaro L.; Rodriguez-R, Luis M.; Dereeper, Alexis; López, Camilo; Koebnik, Ralf; Szurek, Boris; Cunnac, Sebastien

    2013-01-01

    Transcription Activators-Like Effectors (TALEs) belong to a family of virulence proteins from the Xanthomonas genus of bacterial plant pathogens that are translocated into the plant cell. In the nucleus, TALEs act as transcription factors inducing the expression of susceptibility genes. A code for TALE-DNA binding specificity and high-resolution three-dimensional structures of TALE-DNA complexes were recently reported. Accurate prediction of TAL Effector Binding Elements (EBEs) is essential to elucidate the biological functions of the many sequenced TALEs as well as for robust design of artificial TALE DNA-binding domains in biotechnological applications. In this work a program with improved EBE prediction performances was developed using an updated specificity matrix and a position weight correction function to account for the matching pattern observed in a validation set of TALE-DNA interactions. To gain a systems perspective on the large TALE repertoires from X. oryzae strains, this program was used to predict rice gene targets for 99 sequenced family members. Integrating predictions and available expression data in a TALE-gene network revealed multiple candidate transcriptional targets for many TALEs as well as several possible instances of functional convergence among TALEs. PMID:23869221

  19. The Length Distribution of Class I-Restricted T Cell Epitopes Is Determined by Both Peptide Supply and MHC Allele-Specific Binding Preference.

    PubMed

    Trolle, Thomas; McMurtrey, Curtis P; Sidney, John; Bardet, Wilfried; Osborn, Sean C; Kaever, Thomas; Sette, Alessandro; Hildebrand, William H; Nielsen, Morten; Peters, Bjoern

    2016-02-15

    HLA class I-binding predictions are widely used to identify candidate peptide targets of human CD8(+) T cell responses. Many such approaches focus exclusively on a limited range of peptide lengths, typically 9 aa and sometimes 9-10 aa, despite multiple examples of dominant epitopes of other lengths. In this study, we examined whether epitope predictions can be improved by incorporating the natural length distribution of HLA class I ligands. We found that, although different HLA alleles have diverse length-binding preferences, the length profiles of ligands that are naturally presented by these alleles are much more homogeneous. We hypothesized that this is due to a defined length profile of peptides available for HLA binding in the endoplasmic reticulum. Based on this, we created a model of HLA allele-specific ligand length profiles and demonstrate how this model, in combination with HLA-binding predictions, greatly improves comprehensive identification of CD8(+) T cell epitopes. Copyright © 2016 by The American Association of Immunologists, Inc.

  20. Sasquatch: predicting the impact of regulatory SNPs on transcription factor binding from cell- and tissue-specific DNase footprints

    PubMed Central

    Suciu, Maria C.; Telenius, Jelena

    2017-01-01

    In the era of genome-wide association studies (GWAS) and personalized medicine, predicting the impact of single nucleotide polymorphisms (SNPs) in regulatory elements is an important goal. Current approaches to determine the potential of regulatory SNPs depend on inadequate knowledge of cell-specific DNA binding motifs. Here, we present Sasquatch, a new computational approach that uses DNase footprint data to estimate and visualize the effects of noncoding variants on transcription factor binding. Sasquatch performs a comprehensive k-mer-based analysis of DNase footprints to determine any k-mer's potential for protein binding in a specific cell type and how this may be changed by sequence variants. Therefore, Sasquatch uses an unbiased approach, independent of known transcription factor binding sites and motifs. Sasquatch only requires a single DNase-seq data set per cell type, from any genotype, and produces consistent predictions from data generated by different experimental procedures and at different sequence depths. Here we demonstrate the effectiveness of Sasquatch using previously validated functional SNPs and benchmark its performance against existing approaches. Sasquatch is available as a versatile webtool incorporating publicly available data, including the human ENCODE collection. Thus, Sasquatch provides a powerful tool and repository for prioritizing likely regulatory SNPs in the noncoding genome. PMID:28904015

  1. HLA mismatches and hematopoietic cell transplantation: structural simulations assess the impact of changes in peptide binding specificity on transplant outcome

    PubMed Central

    Yanover, Chen; Petersdorf, Effie W.; Malkki, Mari; Gooley, Ted; Spellman, Stephen; Velardi, Andrea; Bardy, Peter; Madrigal, Alejandro; Bignon, Jean-Denis; Bradley, Philip

    2013-01-01

    The success of hematopoietic cell transplantation from an unrelated donor depends in part on the degree of Human Histocompatibility Leukocyte Antigen (HLA) matching between donor and patient. We present a structure-based analysis of HLA mismatching, focusing on individual amino acid mismatches and their effect on peptide binding specificity. Using molecular modeling simulations of HLA-peptide interactions, we find evidence that amino acid mismatches predicted to perturb peptide binding specificity are associated with higher risk of mortality in a large and diverse dataset of patient-donor pairs assembled by the International Histocompatibility Working Group in Hematopoietic Cell Transplantation consortium. This analysis may represent a first step toward sequence-based prediction of relative risk for HLA allele mismatches. PMID:24482668

  2. SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.

    PubMed

    Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael

    2018-05-25

    Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific Protein-RNA recognition and binding. Despite the great advance in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy for visual perception. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.

  3. Binding ligand prediction for proteins using partial matching of local surface patches.

    PubMed

    Sael, Lee; Kihara, Daisuke

    2010-01-01

    Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group.

  4. Binding Ligand Prediction for Proteins Using Partial Matching of Local Surface Patches

    PubMed Central

    Sael, Lee; Kihara, Daisuke

    2010-01-01

    Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group. PMID:21614188
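
    The idea of matching surface patches between two pockets can be sketched as a minimum-cost assignment. The code below is a toy illustration, not the authors' method: patches are represented by short made-up descriptor vectors rather than 3D Zernike descriptors, the cost is plain Euclidean distance, and the search is brute force (practical only for a handful of patches; an assignment solver such as the Hungarian algorithm would be the usual choice). The name `match_patches` is hypothetical.

```python
import math
from itertools import permutations


def match_patches(query, target):
    """Minimum-cost one-to-one matching of query patches onto target patches.

    query, target: lists of equal-dimension descriptor vectors (tuples of floats).
    The query may be smaller than the target, so only part of the target
    pocket needs to be matched (a partial matching).
    Returns (total_distance, [(query_index, target_index), ...]).
    """
    if len(query) > len(target):
        raise ValueError("query must not have more patches than target")
    best_cost, best_pairs = math.inf, []
    # Try every ordered choice of len(query) distinct target patches.
    for chosen in permutations(range(len(target)), len(query)):
        cost = sum(math.dist(query[i], target[j]) for i, j in enumerate(chosen))
        if cost < best_cost:
            best_cost, best_pairs = cost, list(enumerate(chosen))
    return best_cost, best_pairs


if __name__ == "__main__":
    query = [(0.0, 0.0), (1.0, 1.0)]
    target = [(1.0, 1.1), (5.0, 5.0), (0.1, 0.0)]
    print(match_patches(query, target))
```

    A lower total distance indicates more similar local surfaces; the unmatched target patch in the example shows how two pockets can be compared even when only part of them corresponds.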

  5. PRISM offers a comprehensive genomic approach to transcription factor function prediction

    PubMed Central

    Wenger, Aaron M.; Clarke, Shoa L.; Guturu, Harendra; Chen, Jenny; Schaar, Bruce T.; McLean, Cory Y.; Bejerano, Gill

    2013-01-01

    The human genome encodes 1500–2000 different transcription factors (TFs). ChIP-seq is revealing the global binding profiles of a fraction of TFs in a fraction of their biological contexts. These data show that the majority of TFs bind directly next to a large number of context-relevant target genes, that most binding is distal, and that binding is context specific. Because of the effort and cost involved, ChIP-seq is seldom used in search of novel TF function. Such exploration is instead done using expression perturbation and genetic screens. Here we propose a comprehensive computational framework for transcription factor function prediction. We curate 332 high-quality nonredundant TF binding motifs that represent all major DNA binding domains, and improve cross-species conserved binding site prediction to obtain 3.3 million conserved, mostly distal, binding site predictions. We combine these with 2.4 million facts about all human and mouse gene functions, in a novel statistical framework, in search of enrichments of particular motifs next to groups of target genes of particular functions. Rigorous parameter tuning and a harsh null are used to minimize false positives. Our novel PRISM (predicting regulatory information from single motifs) approach obtains 2543 TF function predictions in a large variety of contexts, at a false discovery rate of 16%. The predictions are highly enriched for validated TF roles, and 45 of 67 (67%) tested binding site regions in five different contexts act as enhancers in functionally matched cells. PMID:23382538

  6. Discovery and validation of information theory-based transcription factor and cofactor binding site motifs.

    PubMed

    Lu, Ruipeng; Mucaki, Eliseos J; Rogan, Peter K

    2017-03-17

    Data from ChIP-seq experiments can derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach, while obviating the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity and detects adjacent cofactor binding sites that coordinate with the targets of primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, discovered 23 cofactor motifs for 127 TFs and revealed six high-confidence novel motifs. The reliability and accuracy of these iPWMs were determined via four independent validation methods, including the detection of experimentally proven binding sites, explanation of effects of characterized SNPs, comparison with previously published motifs and statistical analyses. We also predict previously unreported TF coregulatory interactions (e.g. TF complexes). These iPWMs constitute a powerful tool for predicting the effects of sequence variants in known binding sites, performing mutation analysis on regulatory SNPs and predicting previously unrecognized binding sites and target genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
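
    For readers unfamiliar with information theory-based weight matrices, the sketch below computes the per-position information content (in bits) of a set of aligned binding sites. It shows only the basic Shannon calculation with a uniform background and no small-sample correction; it is not the recursive, thresholded entropy minimization pipeline described in the abstract, and the function name and example sites are made up for illustration.

```python
import math
from collections import Counter


def information_content(sites, alphabet="ACGT"):
    """Per-position information (bits) of aligned, equal-length binding sites.

    Information at a position is log2(len(alphabet)) minus the Shannon entropy
    of the observed base frequencies, so a fully conserved DNA position
    scores 2 bits and a uniformly random one scores 0 bits.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    max_bits = math.log2(len(alphabet))
    info = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        entropy = 0.0
        for base in alphabet:
            p = counts[base] / len(sites)
            if p > 0:
                entropy -= p * math.log2(p)
        info.append(max_bits - entropy)
    return info


if __name__ == "__main__":
    example_sites = ["ACGT", "ACGA", "ACTC", "AGCG"]
    print(information_content(example_sites))
```

    Summing the per-position values gives the total information of the motif, which is one common way to compare how specific two motifs are.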

  7. A Large-Scale Assessment of Nucleic Acids Binding Site Prediction Programs

    PubMed Central

    Miao, Zhichao; Westhof, Eric

    2015-01-01

    Computational prediction of nucleic acid binding sites in proteins is necessary to disentangle functional mechanisms in most biological processes and to explore the binding mechanisms. Several strategies have been proposed, but the state-of-the-art approaches display a great diversity in i) the definition of nucleic acid binding sites; ii) the training and test datasets; iii) the algorithmic methods for the prediction strategies; iv) the performance measures and v) the distribution and availability of the prediction programs. Here we report a large-scale assessment of 19 web servers and 3 stand-alone programs on 41 datasets including more than 5000 proteins derived from 3D structures of protein-nucleic acid complexes. Well-defined binary assessment criteria (specificity, sensitivity, precision, accuracy…) are applied. We found that i) the tools have been greatly improved over the years; ii) some of the approaches suffer from theoretical defects and there is still room for sorting out the essential mechanisms of binding; iii) RNA binding and DNA binding appear to follow similar driving forces and iv) dataset bias may exist in some methods. PMID:26681179

  8. Recognizing metal and acid radical ion-binding sites by integrating ab initio modeling with template-based transferals.

    PubMed

    Hu, Xiuzhen; Dong, Qiwen; Yang, Jianyi; Zhang, Yang

    2016-11-01

    More than half of proteins require binding of metal and acid radical ions for their structure and function. Identification of the ion-binding locations is important for understanding the biological functions of proteins. Due to the small size and high versatility of the metal and acid radical ions, however, computational prediction of their binding sites remains difficult. We proposed a new ligand-specific approach devoted to the binding site prediction of 13 metal ions (Zn2+, Cu2+, Fe2+, Fe3+, Ca2+, Mg2+, Mn2+, Na+, K+) and acid radical ion ligands (CO3 2-, NO2-, SO4 2-, PO4 3-) that are most frequently seen in protein databases. A sequence-based ab initio model is first trained on sequence profiles, where a modified AdaBoost algorithm is extended to balance binding and non-binding residue samples. A composite method IonCom is then developed to combine the ab initio model with multiple threading alignments for further improving the robustness of the binding site predictions. The pipeline was tested using 5-fold cross validations on a comprehensive set of 2,100 non-redundant proteins bound with 3,075 small ion ligands. Significant advantage was demonstrated compared with the state of the art ligand-binding methods including COACH and TargetS for high-accuracy ion-binding site identification. Detailed data analyses show that the major advantage of IonCom lies at the integration of complementary ab initio and template-based components. Ion-specific feature design and binding library selection also contribute to the improvement of small ion ligand binding predictions. Availability: http://zhanglab.ccmb.med.umich.edu/IonCom. Contact: hxz@imut.edu.cn or zhng@umich.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved.

  9. Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes.

    PubMed

    Srinivasulu, Yerukala Sathipati; Wang, Jyun-Rong; Hsu, Kai-Ti; Tsai, Ming-Ju; Charoenkwan, Phasit; Huang, Wen-Lin; Huang, Hui-Ling; Ho, Shinn-Ying

    2015-01-01

    Protein-protein interactions (PPIs) are involved in various biological processes, and underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded the training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (Pkd) of 200 heterodimeric protein complexes. Prediction performance of a Jackknife test was the correlation coefficient of 0.34 and mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn. The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes.

  10. Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes

    PubMed Central

    2015-01-01

    Background: Protein-protein interactions (PPIs) are involved in various biological processes, and underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. Results: This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded the training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (Pkd) of 200 heterodimeric protein complexes. Prediction performance of a Jackknife test was the correlation coefficient of 0.34 and mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn. Conclusions: The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes. PMID:26681483

  11. Combining transcription factor binding affinities with open-chromatin data for accurate gene expression prediction

    PubMed Central

    Schmidt, Florian; Gasparoni, Nina; Gasparoni, Gilles; Gianmoena, Kathrin; Cadenas, Cristina; Polansky, Julia K.; Ebert, Peter; Nordström, Karl; Barann, Matthias; Sinha, Anupam; Fröhler, Sebastian; Xiong, Jieyi; Dehghani Amirabad, Azim; Behjati Ardakani, Fatemeh; Hutter, Barbara; Zipprich, Gideon; Felder, Bärbel; Eils, Jürgen; Brors, Benedikt; Chen, Wei; Hengstler, Jan G.; Hamann, Alf; Lengauer, Thomas; Rosenstiel, Philip; Walter, Jörn; Schulz, Marcel H.

    2017-01-01

    The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, Histone-Marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find low affinity binding sites to improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively. PMID:27899623

  12. Sasquatch: predicting the impact of regulatory SNPs on transcription factor binding from cell- and tissue-specific DNase footprints.

    PubMed

    Schwessinger, Ron; Suciu, Maria C; McGowan, Simon J; Telenius, Jelena; Taylor, Stephen; Higgs, Doug R; Hughes, Jim R

    2017-10-01

    In the era of genome-wide association studies (GWAS) and personalized medicine, predicting the impact of single nucleotide polymorphisms (SNPs) in regulatory elements is an important goal. Current approaches to determine the potential of regulatory SNPs depend on inadequate knowledge of cell-specific DNA binding motifs. Here, we present Sasquatch, a new computational approach that uses DNase footprint data to estimate and visualize the effects of noncoding variants on transcription factor binding. Sasquatch performs a comprehensive k-mer-based analysis of DNase footprints to determine any k-mer's potential for protein binding in a specific cell type and how this may be changed by sequence variants. Therefore, Sasquatch uses an unbiased approach, independent of known transcription factor binding sites and motifs. Sasquatch only requires a single DNase-seq data set per cell type, from any genotype, and produces consistent predictions from data generated by different experimental procedures and at different sequence depths. Here we demonstrate the effectiveness of Sasquatch using previously validated functional SNPs and benchmark its performance against existing approaches. Sasquatch is available as a versatile webtool incorporating publicly available data, including the human ENCODE collection. Thus, Sasquatch provides a powerful tool and repository for prioritizing likely regulatory SNPs in the noncoding genome. © 2017 Schwessinger et al.; Published by Cold Spring Harbor Laboratory Press.

  13. A flexible docking scheme to explore the binding selectivity of PDZ domains.

    PubMed

    Gerek, Z Nevin; Ozkan, S Banu

    2010-05-01

    Modeling of protein binding site flexibility in molecular docking is still a challenging problem due to the large conformational space that needs sampling. Here, we propose a flexible receptor docking scheme: A dihedral restrained replica exchange molecular dynamics (REMD), where we incorporate the normal modes obtained by the Elastic Network Model (ENM) as dihedral restraints to speed up the search towards correct binding site conformations. To our knowledge, this is the first approach that uses ENM modes to bias REMD simulations towards binding induced fluctuations in docking studies. In our docking scheme, we first obtain the deformed structures of the unbound protein as initial conformations by moving along the binding fluctuation mode, and perform REMD using the ENM modes as dihedral restraints. Then, we generate an ensemble of multiple receptor conformations (MRCs) by clustering the lowest replica trajectory. Using ROSETTALIGAND, we dock ligands to the clustered conformations to predict the binding pose and affinity. We apply this method to postsynaptic density-95/Dlg/ZO-1 (PDZ) domains, whose dynamics govern their binding specificity. Our approach produces the lowest energy bound complexes with an average ligand root mean square deviation of 0.36 Å. We further test our method on (i) homologs and (ii) mutant structures of PDZ where mutations alter the binding selectivity. In both cases, our approach succeeds in predicting the correct pose and the affinity of binding peptides. Overall, with this approach, we generate an ensemble of MRCs that leads to accurate prediction of the binding poses and specificities of a protein complex.

  14. A flexible docking scheme to explore the binding selectivity of PDZ domains

    PubMed Central

    Gerek, Z Nevin; Ozkan, S Banu

    2010-01-01

    Modeling of protein binding site flexibility in molecular docking is still a challenging problem due to the large conformational space that needs sampling. Here, we propose a flexible receptor docking scheme: A dihedral restrained replica exchange molecular dynamics (REMD), where we incorporate the normal modes obtained by the Elastic Network Model (ENM) as dihedral restraints to speed up the search towards correct binding site conformations. To our knowledge, this is the first approach that uses ENM modes to bias REMD simulations towards binding induced fluctuations in docking studies. In our docking scheme, we first obtain the deformed structures of the unbound protein as initial conformations by moving along the binding fluctuation mode, and perform REMD using the ENM modes as dihedral restraints. Then, we generate an ensemble of multiple receptor conformations (MRCs) by clustering the lowest replica trajectory. Using RosettaLigand, we dock ligands to the clustered conformations to predict the binding pose and affinity. We apply this method to postsynaptic density-95/Dlg/ZO-1 (PDZ) domains, whose dynamics govern their binding specificity. Our approach produces the lowest energy bound complexes with an average ligand root mean square deviation of 0.36 Å. We further test our method on (i) homologs and (ii) mutant structures of PDZ where mutations alter the binding selectivity. In both cases, our approach succeeds in predicting the correct pose and the affinity of binding peptides. Overall, with this approach, we generate an ensemble of MRCs that leads to accurate prediction of the binding poses and specificities of a protein complex. PMID:20196074

  15. Binding proteins enhance specific uptake rate by increasing the substrate-transporter encounter rate.

    PubMed

    Bosdriesz, Evert; Magnúsdóttir, Stefanía; Bruggeman, Frank J; Teusink, Bas; Molenaar, Douwe

    2015-06-01

    Microorganisms rely on binding-protein assisted, active transport systems to scavenge for scarce nutrients. Several advantages of using binding proteins in such uptake systems have been proposed. However, a systematic, rigorous and quantitative analysis of the function of binding proteins is lacking. By combining knowledge of selection pressure and physiochemical constraints, we derive kinetic, thermodynamic, and stoichiometric properties of binding-protein dependent transport systems that enable a maximal import activity per amount of transporter. Under the hypothesis that this maximal specific activity of the transport complex is the selection objective, binding protein concentrations should exceed the concentration of both the scarce nutrient and the transporter. This increases the encounter rate of transporter with loaded binding protein at low substrate concentrations, thereby enhancing the affinity and specific uptake rate. These predictions are experimentally testable, and a number of observations confirm them. © 2015 FEBS.

  16. Inference of Expanded Lrp-Like Feast/Famine Transcription Factor Targets in a Non-Model Organism Using Protein Structure-Based Prediction

    PubMed Central

    Ashworth, Justin; Plaisier, Christopher L.; Lo, Fang Yin; Reiss, David J.; Baliga, Nitin S.

    2014-01-01

    Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer. PMID:25255272

  17. Inference of expanded Lrp-like feast/famine transcription factor targets in a non-model organism using protein structure-based prediction.

    PubMed

    Ashworth, Justin; Plaisier, Christopher L; Lo, Fang Yin; Reiss, David J; Baliga, Nitin S

    2014-01-01

    Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer.

  18. Development of a sugar-binding residue prediction system from protein sequences using support vector machine.

    PubMed

    Banno, Masaki; Komiyama, Yusuke; Cao, Wei; Oku, Yuya; Ueki, Kokoro; Sumikoshi, Kazuya; Nakamura, Shugo; Terada, Tohru; Shimizu, Kentaro

    2017-02-01

    Several methods have been proposed for protein-sugar binding site prediction using machine learning algorithms. However, they are not effective at learning the varied properties of binding site residues that arise from the various interactions between proteins and sugars. In this study, we classified sugars into acidic and nonacidic sugars and showed that their binding sites have different amino acid occurrence frequencies. By using this result, we developed sugar-binding residue predictors dedicated to the two classes of sugars: an acidic sugar binding predictor and a nonacidic sugar binding predictor. We also developed a combination predictor which combines the results of the two predictors. We showed that when a sugar is known to be an acidic sugar, the acidic sugar binding predictor achieves the best performance, and that when a sugar is known to be a nonacidic sugar or its class is unknown, the combination predictor achieves the best performance. Our method uses only amino acid sequences for prediction. A support vector machine was used as the machine learning algorithm, and the position-specific scoring matrix created by the position-specific iterative basic local alignment search tool (PSI-BLAST) was used as the feature vector. We evaluated the performance of the predictors using five-fold cross-validation. We have launched our system as an open source freeware tool on the GitHub repository (https://doi.org/10.5281/zenodo.61513). Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
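
    The feature construction and evaluation scheme described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the window half-width, the zero padding at the sequence ends, and the helper names `window_features` and `k_fold_indices` are all hypothetical, and the PSSM is assumed to be a list of per-residue score rows. The classifier itself is left out.

```python
def window_features(pssm, center, half_width):
    """Concatenate PSSM rows in a window around one residue.

    pssm is a list of rows (one row of scores per residue). Positions that
    fall outside the sequence are padded with zeros (an assumed convention).
    """
    n_cols = len(pssm[0])
    features = []
    for i in range(center - half_width, center + half_width + 1):
        if 0 <= i < len(pssm):
            features.extend(pssm[i])
        else:
            features.extend([0.0] * n_cols)
    return features


def k_fold_indices(n_samples, k=5):
    """Split sample indices 0..n_samples-1 into k interleaved folds."""
    return [list(range(fold, n_samples, k)) for fold in range(k)]


# Toy PSSM with 3 residues and 2 score columns (real PSSMs have 20 columns).
toy_pssm = [[1, 2], [3, 4], [5, 6]]
print(window_features(toy_pssm, center=0, half_width=1))
print(k_fold_indices(7, k=5))
```

    Each fold would serve once as the held-out set while a classifier is trained on the remaining four; any SVM implementation could be used for that step.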

  19. Theory on the mechanism of site-specific DNA-protein interactions in the presence of traps

    NASA Astrophysics Data System (ADS)

    Niranjani, G.; Murugan, R.

    2016-08-01

    The speed of site-specific binding of transcription factor (TF) proteins with genomic DNA seems to be strongly retarded by randomly occurring sequence traps. Traps are DNA sequences that share significant similarity with the original specific binding sites (SBSs). It is an intriguing question how naturally occurring TFs and their SBSs are designed to manage the retarding effects of such randomly occurring traps. We develop a simple random walk model of the site-specific binding of TFs with genomic DNA in the presence of sequence traps. Our dynamical model predicts that (a) the retarding effects of traps will be minimal when the traps are arranged around the SBS such that there is a negative correlation between the binding strength of TFs with traps and the distance of traps from the SBS and (b) the retarding effects of sequence traps can be alleviated by the condensed conformational state of DNA. Our computational analysis of the distribution of sequence traps around the putative binding sites of various TFs in the mouse and human genomes agrees well with the theoretical predictions. We propose that the distribution of traps can be used as an additional metric to efficiently identify the SBSs of TFs on genomic DNA.
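
    A toy one-dimensional simulation can make the idea of trap-induced retardation concrete. The sketch below is not the authors' model: the lattice, the reflecting boundary at site 0, the fixed extra dwell time per visit to a trap, and the function names `search_time` and `mean_search_time` are all assumptions chosen for illustration.

```python
import random


def search_time(n_sites, dwell, seed):
    """Time for a walker starting at site 0 to first reach site n_sites.

    Every visit to a site costs one time unit plus dwell[site] extra units
    if the site is a trap. Site 0 is a reflecting boundary.
    """
    rng = random.Random(seed)
    position, elapsed = 0, 0
    while position != n_sites:
        elapsed += 1 + dwell.get(position, 0)
        if position == 0:
            position = 1  # reflect at the left boundary
        else:
            position += rng.choice((-1, 1))
    return elapsed


def mean_search_time(n_sites, dwell, trials=200):
    """Average search time over independent walks (one seed per trial)."""
    return sum(search_time(n_sites, dwell, seed) for seed in range(trials)) / trials


target = 20  # position of the specific binding site
# Two arrangements with the same total trap strength:
strong_near_target = {i: i // 4 for i in range(1, target)}
strong_far_from_target = {i: (target - i) // 4 for i in range(1, target)}

print("no traps:              ", mean_search_time(target, {}))
print("strong traps near SBS: ", mean_search_time(target, strong_near_target))
print("strong traps far away: ", mean_search_time(target, strong_far_from_target))
```

    In this toy setting, sites far from the target are revisited more often before the target is found, so one would expect the arrangement with strong traps close to the target to cost less time. That is in the spirit of prediction (a), but it is only a property of this simplified sketch.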

  20. miRTar2GO: a novel rule-based model learning method for cell line specific microRNA target prediction that integrates Ago2 CLIP-Seq and validated microRNA-target interaction data.

    PubMed

    Ahadi, Alireza; Sablok, Gaurav; Hutvagner, Gyorgy

    2017-04-07

    MicroRNAs (miRNAs) are ∼19-22 nucleotides (nt) long regulatory RNAs that regulate gene expression by recognizing and binding to complementary sequences on mRNAs. The key step in revealing the function of a miRNA is the identification of miRNA target genes. Recent biochemical advances including PAR-CLIP and HITS-CLIP allow for improved miRNA target predictions and are widely used to validate miRNA targets. Here, we present miRTar2GO, a model trained on the common rules of miRNA-target interactions, Argonaute (Ago) CLIP-Seq data and experimentally validated miRNA target interactions. miRTar2GO is designed to predict miRNA target sites using more relaxed miRNA-target binding characteristics. More importantly, miRTar2GO allows for the prediction of cell-type specific miRNA targets. We have evaluated miRTar2GO against other widely used miRNA target prediction algorithms and demonstrated that miRTar2GO produced significantly higher F1 and G scores. Target predictions, binding specifications, results of the pathway analysis and gene ontology enrichment of miRNA targets are freely available at http://www.mirtar2go.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
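
    One widely used rule of miRNA-target interaction is complementarity between the miRNA seed region and the target mRNA. The sketch below shows only that rule, as an assumption about what "common rules" may include; it is not the miRTar2GO algorithm. The function names and the short UTR string are hypothetical, and the miRNA sequence is used purely as an illustrative input.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna, utr, start=2, end=8):
    """Return 0-based UTR positions that pair perfectly with the miRNA seed.

    The seed is taken as miRNA nucleotides start..end (1-based, inclusive);
    positions 2-8 are an assumed default.
    """
    seed = mirna[start - 1:end]
    site = reverse_complement(seed)
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # illustrative miRNA sequence
utr = "AAACUACCUCAAA"             # made-up 3' UTR fragment
print(seed_sites(mirna, utr))     # positions of perfect seed matches
```

    A more relaxed matcher could tolerate G:U wobble pairs or a single mismatch, and CLIP-Seq peaks could then be used to filter candidate sites by cell type.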

  1. Striatal dopamine transporter binding for predicting the development of delayed neuropsychological sequelae in suicide attempters by carbon monoxide poisoning: A SPECT study.

    PubMed

    Yang, Kai-Chun; Ku, Hsiao-Lun; Wu, Chia-Liang; Wang, Shyh-Jen; Yang, Chen-Chang; Deng, Jou-Fang; Lee, Ming-Been; Chou, Yuan-Hwa

    2011-12-30

    Carbon monoxide poisoning (COP) after charcoal burning results in delayed neuropsychological sequelae (DNS), which show clinical resemblance to Parkinson's disease, without adequate predictors at present. This study examined the role of dopamine transporter (DAT) binding for the prediction of DNS. Twenty-seven suicide attempters with COP were recruited. Seven of them developed DNS, while the remainder did not. The striatal DAT binding was measured by single photon emission computed tomography with (99m)Tc-TRODAT. The specific uptake ratio was derived based on a ratio equilibrium model. Using a logistic regression model, multiple clinical variables were examined as potential predictors for DNS. COP patients with DNS had lower left striatal DAT binding than patients without DNS. Logistic regression analysis showed that a combination of initial loss of consciousness and lower left striatal DAT binding predicted the development of DNS. Our data indicate that the left striatal DAT binding could help to predict the development of DNS. This finding not only demonstrates the feasibility of brain imaging techniques for predicting the development of DNS but will also help clinicians to improve the quality of care for COP patients. 2011 Elsevier Ireland Ltd. All rights reserved.

  2. A comparative study of family-specific protein-ligand complex affinity prediction based on random forest approach

    NASA Astrophysics Data System (ADS)

    Wang, Yu; Guo, Yanzhi; Kuang, Qifan; Pu, Xuemei; Ji, Yue; Zhang, Zhihang; Li, Menglong

    2015-04-01

    The assessment of binding affinity between ligands and target proteins plays an essential role in the drug discovery and design process. As an alternative to widely used scoring approaches, machine learning methods have also been proposed for fast prediction of binding affinity with promising results, but most of them were developed as all-purpose models that ignore the specific functions of different protein families, even though proteins from different functional families differ in structure and physicochemical features. In this study, we proposed a random forest method to predict the protein-ligand binding affinity based on a comprehensive feature set covering protein sequence, binding pocket, ligand structure and intermolecular interaction. Feature processing and compression were carried out separately for each protein family dataset; the results indicate that different features contribute to different models, so an individual representation for each protein family is necessary. Three family-specific models were constructed for three important protein target families: HIV-1 protease, trypsin and carbonic anhydrase. As a comparison, two generic models including diverse protein families were also built. The evaluation results show that models built on family-specific datasets perform better than those built on the generic datasets, and the Pearson and Spearman correlation coefficients (Rp and Rs) on the test sets are 0.740, 0.874, 0.735 and 0.697, 0.853, 0.723 for HIV-1 protease, trypsin and carbonic anhydrase respectively. Comparisons with other methods further demonstrate that individual representation and model construction for each protein family is a more reasonable way of predicting the affinity of one particular protein family.
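
    The two evaluation metrics named in this abstract, the Pearson (Rp) and Spearman (Rs) correlation coefficients, can be computed as in the sketch below. The function names and the toy affinity values are hypothetical; ties are handled with average ranks, which is a common convention and an assumption here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up measured vs. predicted affinities for a handful of complexes.
measured = [4.2, 5.1, 6.3, 7.0, 8.4]
predicted = [4.0, 5.5, 6.1, 7.4, 7.9]
print(pearson(measured, predicted), spearman(measured, predicted))
```

    Pearson measures linear agreement between predicted and measured affinities, while Spearman depends only on their ordering, so the two can differ when a model ranks complexes well but is poorly calibrated.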

  3. Structural, kinetic, and thermodynamic studies of specificity designed HIV-1 protease.

    PubMed

    Alvizo, Oscar; Mittal, Seema; Mayo, Stephen L; Schiffer, Celia A

    2012-07-01

    HIV-1 protease recognizes and cleaves more than 12 different substrates leading to viral maturation. While these substrates share no conserved motif, they are specifically selected for and cleaved by protease during viral life cycle. Drug resistant mutations evolve within the protease that compromise inhibitor binding but allow the continued recognition of all these substrates. While the substrate envelope defines a general shape for substrate recognition, successfully predicting the determinants of substrate binding specificity would provide additional insights into the mechanism of altered molecular recognition in resistant proteases. We designed a variant of HIV protease with altered specificity using positive computational design methods and validated the design using X-ray crystallography and enzyme biochemistry. The engineered variant, Pr3 (A28S/D30F/G48R), was designed to preferentially bind to one out of three of HIV protease's natural substrates; RT-RH over p2-NC and CA-p2. In kinetic assays, RT-RH binding specificity for Pr3 increased threefold compared to the wild-type (WT), which was further confirmed by isothermal titration calorimetry. Crystal structures of WT protease and the designed variant in complex with RT-RH, CA-p2, and p2-NC were determined. Structural analysis of the designed complexes revealed that one of the engineered substitutions (G48R) potentially stabilized heterogeneous flap conformations, thereby facilitating alternate modes of substrate binding. Our results demonstrate that while substrate specificity could be engineered in HIV protease, the structural pliability of protease restricted the propagation of interactions as predicted. These results offer new insights into the plasticity and structural determinants of substrate binding specificity of the HIV-1 protease. Copyright © 2012 The Protein Society.

  4. Identification of Specific DNA Binding Residues in the TCP Family of Transcription Factors in Arabidopsis

    PubMed Central

    Aggarwal, Pooja; Das Gupta, Mainak; Joseph, Agnel Praveen; Chatterjee, Nirmalya; Srinivasan, N.; Nath, Utpal

    2010-01-01

    The TCP transcription factors control multiple developmental traits in diverse plant species. Members of this family share an ∼60-residue-long TCP domain that binds to DNA. The TCP domain is predicted to form a basic helix-loop-helix (bHLH) structure but shares little sequence similarity with canonical bHLH domain. This classifies the TCP domain as a novel class of DNA binding domain specific to the plant kingdom. Little is known about how the TCP domain interacts with its target DNA. We report biochemical characterization and DNA binding properties of a TCP member in Arabidopsis thaliana, TCP4. We have shown that the 58-residue domain of TCP4 is essential and sufficient for binding to DNA and possesses DNA binding parameters comparable to canonical bHLH proteins. Using a yeast-based random mutagenesis screen and site-directed mutants, we identified the residues important for DNA binding and dimer formation. Mutants defective in binding and dimerization failed to rescue the phenotype of an Arabidopsis line lacking the endogenous TCP4 activity. By combining structure prediction, functional characterization of the mutants, and molecular modeling, we suggest a possible DNA binding mechanism for this class of transcription factors. PMID:20363772

  5. Physicochemical characteristics of structurally determined metabolite-protein and drug-protein binding events with respect to binding specificity.

    PubMed

    Korkuć, Paula; Walther, Dirk

    2015-01-01

    To better understand and ultimately predict both the metabolic activities as well as the signaling functions of metabolites, a detailed understanding of the physical interactions of metabolites with proteins is highly desirable. Focusing in particular on protein binding specificity vs. promiscuity, we performed a comprehensive analysis of the physicochemical properties of compound-protein binding events as reported in the Protein Data Bank (PDB). We compared the molecular and structural characteristics obtained for metabolites to those of the well-studied interactions of drug compounds with proteins. Promiscuously binding metabolites and drugs are characterized by low molecular weight and high structural flexibility. Unlike reported for drug compounds, low rather than high hydrophobicity appears associated, albeit weakly, with promiscuous binding for the metabolite set investigated in this study. Across several physicochemical properties, drug compounds exhibit characteristic binding propensities that are distinguishable from those associated with metabolites. Prediction of target diversity and compound promiscuity using physicochemical properties was possible at modest accuracy levels only, but was consistently better for drugs than for metabolites. Compound properties capturing structural flexibility and hydrogen-bond formation descriptors proved most informative in PLS-based prediction models. With regard to diversity of enzymatic activities of the respective metabolite target enzymes, the metabolites benzylsuccinate, hypoxanthine, trimethylamine N-oxide, oleoylglycerol, and resorcinol showed very narrow process involvement, while glycine, imidazole, tryptophan, succinate, and glutathione were identified to possess broad enzymatic reaction scopes. Promiscuous metabolites were found to mainly serve as general energy currency compounds, but were identified to also be involved in signaling processes and to appear in diverse organismal systems (digestive and nervous system) suggesting specific molecular and physiological roles of promiscuous metabolites.

  6. Physicochemical characteristics of structurally determined metabolite-protein and drug-protein binding events with respect to binding specificity

    PubMed Central

    Korkuć, Paula; Walther, Dirk

    2015-01-01

    To better understand and ultimately predict both the metabolic activities as well as the signaling functions of metabolites, a detailed understanding of the physical interactions of metabolites with proteins is highly desirable. Focusing in particular on protein binding specificity vs. promiscuity, we performed a comprehensive analysis of the physicochemical properties of compound-protein binding events as reported in the Protein Data Bank (PDB). We compared the molecular and structural characteristics obtained for metabolites to those of the well-studied interactions of drug compounds with proteins. Promiscuously binding metabolites and drugs are characterized by low molecular weight and high structural flexibility. Unlike reported for drug compounds, low rather than high hydrophobicity appears associated, albeit weakly, with promiscuous binding for the metabolite set investigated in this study. Across several physicochemical properties, drug compounds exhibit characteristic binding propensities that are distinguishable from those associated with metabolites. Prediction of target diversity and compound promiscuity using physicochemical properties was possible at modest accuracy levels only, but was consistently better for drugs than for metabolites. Compound properties capturing structural flexibility and hydrogen-bond formation descriptors proved most informative in PLS-based prediction models. With regard to diversity of enzymatic activities of the respective metabolite target enzymes, the metabolites benzylsuccinate, hypoxanthine, trimethylamine N-oxide, oleoylglycerol, and resorcinol showed very narrow process involvement, while glycine, imidazole, tryptophan, succinate, and glutathione were identified to possess broad enzymatic reaction scopes. Promiscuous metabolites were found to mainly serve as general energy currency compounds, but were identified to also be involved in signaling processes and to appear in diverse organismal systems (digestive and nervous system) suggesting specific molecular and physiological roles of promiscuous metabolites. PMID:26442281

  7. Degenerate Pax2 and Senseless binding motifs improve detection of low-affinity sites required for enhancer specificity

    PubMed Central

    Zandvakili, Arya; Campbell, Ian; Weirauch, Matthew T.

    2018-01-01

    Cells use thousands of regulatory sequences to recruit transcription factors (TFs) and produce specific transcriptional outcomes. Since TFs bind degenerate DNA sequences, discriminating functional TF binding sites (TFBSs) from background sequences represents a significant challenge. Here, we show that a Drosophila regulatory element that activates Epidermal Growth Factor signaling requires overlapping, low-affinity TFBSs for competing TFs (Pax2 and Senseless) to ensure cell- and segment-specific activity. Testing available TF binding models for Pax2 and Senseless, however, revealed variable accuracy in predicting such low-affinity TFBSs. To better define parameters that increase accuracy, we developed a method that systematically selects subsets of TFBSs based on predicted affinity to generate hundreds of position-weight matrices (PWMs). Counterintuitively, we found that degenerate PWMs produced from datasets depleted of high-affinity sequences were more accurate in identifying both low- and high-affinity TFBSs for the Pax2 and Senseless TFs. Taken together, these findings reveal how TFBS arrangement can be constrained by competition rather than cooperativity and that degenerate models of TF binding preferences can improve identification of biologically relevant low affinity TFBSs. PMID:29617378
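
    The position-weight matrix (PWM) workflow mentioned in this abstract can be illustrated with a small sketch: build a log-odds PWM from aligned sites, score candidate windows, and optionally build the matrix from a subset of sites depleted of the highest-affinity sequences. This is an assumption-laden illustration, not the authors' method; the pseudocount, the uniform background, the helper names, and the toy sites are all hypothetical.

```python
from math import log2


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PWM (one dict per position) from equal-length DNA sites."""
    pwm = []
    for position in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[position]] += 1
        total = sum(counts.values())
        pwm.append({b: log2((counts[b] / total) / background) for b in "ACGT"})
    return pwm


def score(pwm, window):
    """Sum of per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(pwm)
    return max((score(pwm, sequence[i:i + width]), i)
               for i in range(len(sequence) - width + 1))


def drop_high_affinity(scored_sites, drop_top_fraction):
    """Keep sites after removing the top fraction ranked by affinity."""
    ranked = sorted(scored_sites, key=lambda pair: pair[1], reverse=True)
    n_drop = int(len(ranked) * drop_top_fraction)
    return [site for site, _ in ranked[n_drop:]]


toy_sites = ["TGACA", "TGACA", "TGACT", "TGATA"]  # made-up binding sites
pwm = build_pwm(toy_sites)
print(best_hit(pwm, "CCTGACACC"))
```

    Building one PWM per affinity-selected subset, for example with `build_pwm(drop_high_affinity(scored, 0.25))`, gives a family of matrices whose accuracy on known low-affinity sites can then be compared.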

  8. MHC2NNZ: A novel peptide binding prediction approach for HLA DQ molecules

    NASA Astrophysics Data System (ADS)

    Xie, Jiang; Zeng, Xu; Lu, Dongfang; Liu, Zhixiang; Wang, Jiao

    2017-07-01

    The major histocompatibility complex class II (MHC-II) molecule plays a crucial role in immunology. Computational prediction of MHC-II binding peptides can help researchers understand the mechanism of immune systems and design vaccines. Most MHC-II prediction algorithms to date have focused on human leukocyte antigen (HLA, the name of MHC in humans) molecules encoded in the DR locus. However, HLA DQ molecules are equally important, and less progress has been made on them because they are more difficult to handle experimentally. In this study, we propose an artificial neural network-based approach called MHC2NNZ to predict peptides binding to HLA DQ molecules. Unlike previous artificial neural network-based methods, MHC2NNZ considers not only sequence similarity features but also chemical and physical properties, and a novel method incorporating these properties is proposed to represent peptide flanking regions (PFR). Furthermore, MHC2NNZ improves the prediction accuracy by incorporating amino acid preferences at more specific positions of the peptide binding core. Evaluated on 3549 peptides binding to the six most frequent HLA DQ molecules, MHC2NNZ is shown to outperform other state-of-the-art MHC-II prediction methods.

  9. Combining transcription factor binding affinities with open-chromatin data for accurate gene expression prediction.

    PubMed

    Schmidt, Florian; Gasparoni, Nina; Gasparoni, Gilles; Gianmoena, Kathrin; Cadenas, Cristina; Polansky, Julia K; Ebert, Peter; Nordström, Karl; Barann, Matthias; Sinha, Anupam; Fröhler, Sebastian; Xiong, Jieyi; Dehghani Amirabad, Azim; Behjati Ardakani, Fatemeh; Hutter, Barbara; Zipprich, Gideon; Felder, Bärbel; Eils, Jürgen; Brors, Benedikt; Chen, Wei; Hengstler, Jan G; Hamann, Alf; Lengauer, Thomas; Rosenstiel, Philip; Walter, Jörn; Schulz, Marcel H

    2017-01-09

    The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, Histone-Marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find low affinity binding sites to improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Screening Mixtures of Small Molecules for Binding to Multiple Sites on the Surface Tetanus Toxin C Fragment by Bioaffinity NMR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cosman, M; Zeller, L; Lightstone, F C

    2002-01-01

    The clostridial neurotoxins include the closely related tetanus (TeNT) and botulinum (BoNT) toxins. Botulinum toxin is used to treat severe muscle disorders and as a cosmetic wrinkle reducer. Large quantities of botulinum toxin have also been produced by terrorists for use as a biological weapon. Because there are no known antidotes for these toxins, they pose a potential threat to human health whether by an accidental overdose or by a hostile deployment. Thus, high-specificity, high-affinity compounds that inhibit their binding to neural cells could be used as antidotes or in the design of chemical detectors. Using the crystal structure of the C fragment of the tetanus toxin (TetC), which is the cell recognition and cell surface binding domain, and the computational program DOCK, sets of small molecules have been predicted to bind to two different sites located on the surface of this protein. While Site-1 is common to the TeNT and BoNTs, Site-2 is unique to TeNT. Pairs of these molecules from each site can then be linked together synthetically to increase the specificity and affinity for this toxin. Electrospray ionization mass spectroscopy was used to experimentally screen each compound for binding. Mixtures containing binders were further screened for activity under biologically relevant conditions using nuclear magnetic resonance (NMR) methods. The screening of mixtures of compounds offers increased efficiency and throughput as compared to testing single compounds and can also evaluate how possible structural changes induced by the binding of one ligand can influence the binding of the second ligand. In addition, competitive binding experiments with mixtures containing ligands predicted to bind the same site could identify the best binder for that site. NMR transfer nuclear Overhauser effect (trNOE) experiments confirm that TetC binds doxorubicin but that this molecule is displaced by N-acetylneuraminic acid (sialic acid) in a mixture that also contains 3-sialyllactose (another predicted Site-1 binder) and bisbenzimide 33342 (non-binder). A series of five predicted Site-2 binders were then screened sequentially in the presence of the Site-1 binder doxorubicin. These experiments showed that the compounds lavendustin A and naphthofluorescein-di-(β-D-galactopyranoside) bind along with doxorubicin to TetC. Further experiments indicate that doxorubicin and lavendustin are potential candidates to use in preparing a bidentate inhibitor specific for TetC. The simultaneous binding of two different predicted Site-2 ligands to TetC suggests that they may bind multiple sites. Another possibility is that the conformations of the binding sites are dynamic and can bind multiple diverse ligands at a single site depending on the pre-existing conformation of the protein, especially when doxorubicin is already bound.

  11. Position specific variation in the rate of evolution in transcription factor binding sites

    PubMed Central

    Moses, Alan M; Chiang, Derek Y; Kellis, Manolis; Lander, Eric S; Eisen, Michael B

    2003-01-01

    Background: The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution. Results: Here we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms. Conclusion: As has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA. PMID:12946282
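
    The relationship between per-position degeneracy and per-position evolutionary change can be explored with two simple quantities, sketched below. Shannon entropy is used here as a stand-in for degeneracy, and the fraction of differing bases between paired orthologous sites as a crude proxy for evolutionary rate; both choices, the function names, and the toy sequences are assumptions, not the measures used in the study.

```python
from collections import Counter
from math import log2


def column_entropy(sites):
    """Shannon entropy (bits) of each column of equal-length aligned sites."""
    entropies = []
    for column in zip(*sites):
        n = len(column)
        counts = Counter(column)
        entropies.append(-sum(c / n * log2(c / n) for c in counts.values()))
    return entropies


def substitution_fraction(sites_a, sites_b):
    """Per-position fraction of paired orthologous sites that differ."""
    pairs = list(zip(sites_a, sites_b))
    width = len(sites_a[0])
    return [sum(a[i] != b[i] for a, b in pairs) / len(pairs)
            for i in range(width)]


# Made-up sites within one species, and made-up orthologous pairs.
within_species = ["AC", "AG", "AT", "AA"]
species_a = ["ACG", "ACG"]
species_b = ["ACT", "ACG"]
print(column_entropy(within_species))
print(substitution_fraction(species_a, species_b))
```

    Correlating the two per-position vectors across a motif would then indicate whether more degenerate positions also change more often between species.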

  12. Predicting the binding preference of transcription factors to individual DNA k-mers.

    PubMed

    Alleyne, Trevis M; Peña-Castillo, Lourdes; Badis, Gwenael; Talukder, Shaheynoor; Berger, Michael F; Gehrke, Andrew R; Philippakis, Anthony A; Bulyk, Martha L; Morris, Quaid D; Hughes, Timothy R

    2009-04-15

    Recognition of specific DNA sequences is a central mechanism by which transcription factors (TFs) control gene expression. Many TF-binding preferences, however, are unknown or poorly characterized, in part due to the difficulty associated with determining their specificity experimentally, and an incomplete understanding of the mechanisms governing sequence specificity. New techniques that estimate the affinity of TFs to all possible k-mers provide a new opportunity to study DNA-protein interaction mechanisms, and may facilitate inference of binding preferences for members of a given TF family when such information is available for other family members. We employed a new dataset consisting of the relative preferences of mouse homeodomains for all eight-base DNA sequences in order to ask how well we can predict the binding profiles of homeodomains when only their protein sequences are given. We evaluated a panel of standard statistical inference techniques, as well as variations of the protein features considered. Nearest neighbour among functionally important residues emerged among the most effective methods. Our results underscore the complexity of TF-DNA recognition, and suggest a rational approach for future analyses of TF families.
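
    A nearest-neighbour predictor of the kind this abstract reports as effective can be sketched in a few lines: the query protein inherits the k-mer preference profile of the training protein that shares the most residues at a chosen set of positions. The choice of positions, the identity-count similarity, the function names, and the toy sequences and scores below are all assumptions made for illustration.

```python
def shared_residues(seq_a, seq_b, positions):
    """Number of chosen alignment positions where two sequences agree."""
    return sum(1 for p in positions if seq_a[p] == seq_b[p])


def nearest_neighbour_profile(query, training, positions):
    """Return the k-mer profile of the most similar training protein.

    training is a list of (aligned_sequence, kmer_profile) pairs; all
    sequences are assumed to be aligned and of equal length.
    """
    _, profile = max(
        training,
        key=lambda record: shared_residues(query, record[0], positions),
    )
    return profile


# Made-up aligned domain fragments with made-up k-mer preference scores.
training_set = [
    ("RKQAN", {"TAATTA": 0.9, "TGATTA": 0.2}),
    ("RKKAN", {"TAATTA": 0.3, "TGATTA": 0.8}),
]
key_positions = [2]  # hypothetical DNA-contacting position
print(nearest_neighbour_profile("AAKAA", training_set, key_positions))
```

    Restricting the comparison to functionally important residues, rather than the whole domain, is the design choice the abstract highlights; which residues qualify would have to come from structural or mutational data.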

  13. Prediction of TF target sites based on atomistic models of protein-DNA complexes

    PubMed Central

    Angarica, Vladimir Espinosa; Pérez, Abel González; Vasconcelos, Ana T; Collado-Vides, Julio; Contreras-Moreira, Bruno

    2008-01-01

    Background: The specific recognition of genomic cis-regulatory elements by transcription factors (TFs) plays an essential role in the regulation of coordinated gene expression. Studying the mechanisms determining binding specificity in protein-DNA interactions is thus an important goal. Most current approaches for modeling TF specific recognition rely on the knowledge of large sets of cognate target sites and consider only the information contained in their primary sequence. Results: Here we describe a structure-based methodology for predicting sequence motifs starting from the coordinates of a TF-DNA complex. Our algorithm combines information regarding the direct and indirect readout of DNA into an atomistic statistical model, which is used to estimate the interaction potential. We first measure the ability of our method to correctly estimate the binding specificities of eight prokaryotic and eukaryotic TFs that belong to different structural superfamilies. Secondly, the method is applied to two homology models, finding that sampling of interface side-chain rotamers remarkably improves the results. Thirdly, the algorithm is compared with a reference structural method based on contact counts, obtaining comparable predictions for the experimental complexes and more accurate sequence motifs for the homology models. Conclusion: Our results demonstrate that atomic-detail structural information can be feasibly used to predict TF binding sites. The computational method presented here is universal and might be applied to other systems involving protein-DNA recognition. PMID:18922190

  14. PSSMHCpan: a novel PSSM-based software for predicting class I peptide-HLA binding affinity

    PubMed Central

    Liu, Geng; Li, Dongli; Li, Zhang; Qiu, Si; Li, Wenhui; Chao, Cheng-chi; Yang, Naibo; Li, Handong; Cheng, Zhen; Song, Xin; Cheng, Le; Zhang, Xiuqing; Wang, Jian; Yang, Huanming

    2017-01-01

    Abstract Predicting peptide binding affinity with human leukocyte antigen (HLA) is a crucial step in developing powerful antitumor vaccine for cancer immunotherapy. Currently available methods work quite well in predicting peptide binding affinity with HLA alleles such as HLA-A*0201, HLA-A*0101, and HLA-B*0702 in terms of sensitivity and specificity. However, quite a few types of HLA alleles that are present in the majority of human populations including HLA-A*0202, HLA-A*0203, HLA-A*6802, HLA-B*5101, HLA-B*5301, HLA-B*5401, and HLA-B*5701 still cannot be predicted with satisfactory accuracy using currently available methods. Furthermore, currently the most popularly used methods for predicting peptide binding affinity are inefficient in identifying neoantigens from a large quantity of whole genome and transcriptome sequencing data. Here we present a Position Specific Scoring Matrix (PSSM)-based software called PSSMHCpan to accurately and efficiently predict peptide binding affinity with a broad coverage of HLA class I alleles. We evaluated the performance of PSSMHCpan by analyzing 10-fold cross-validation on a training database containing 87 HLA alleles and obtained an average area under receiver operating characteristic curve (AUC) of 0.94 and accuracy (ACC) of 0.85. In an independent dataset (Peptide Database of Cancer Immunity) evaluation, PSSMHCpan is substantially better than the popularly used NetMHC-4.0, NetMHCpan-3.0, PickPocket, Nebula, and SMM with a sensitivity of 0.90, as compared to 0.74, 0.81, 0.77, 0.24, and 0.79. In addition, PSSMHCpan is more than 197 times faster than NetMHC-4.0, NetMHCpan-3.0, PickPocket, sNebula, and SMM when predicting neoantigens from 661 263 peptides from a breast tumor sample. Finally, we built a neoantigen prediction pipeline and identified 117 017 neoantigens from 467 cancer samples of various cancers from TCGA. 
PSSMHCpan is superior to the currently available methods in predicting peptide binding affinity with a broad coverage of HLA class I alleles. PMID:28327987

  15. Direct detection of methicillin resistance in Staphylococcus aureus in blood culture broth by use of a penicillin binding protein 2a latex agglutination test.

    PubMed

    Qian, Qinfang; Venkataraman, Lata; Kirby, James E; Gold, Howard S; Yamazumi, Toshiaki

    2010-04-01

    We studied the utility of performing a penicillin binding protein 2a latex agglutination (PBP-LA) assay directly on Bactec blood culture broth samples containing Staphylococcus aureus to rapidly detect methicillin resistance. The sensitivity, specificity, positive predictive value, and negative predictive value of this method were 94.1%, 97.5%, 98%, and 92.9%, respectively.

  16. Protein-Protein Interface Predictions by Data-Driven Methods: A Review

    PubMed Central

    Xue, Li C; Dobbs, Drena; Bonvin, Alexandre M.J.J.; Honavar, Vasant

    2015-01-01

    Reliably pinpointing which specific amino acid residues form the interface(s) between a protein and its binding partner(s) is critical for understanding the structural and physicochemical determinants of protein recognition and binding affinity, and has wide applications in modeling and validating protein interactions predicted by high-throughput methods, in engineering proteins, and in prioritizing drug targets. Here, we review the basic concepts, principles and recent advances in computational approaches to the analysis and prediction of protein-protein interfaces. We point out caveats for objectively evaluating interface predictors, and discuss various applications of data-driven interface predictors for improving energy model-driven protein-protein docking. Finally, we stress the importance of exploiting binding partner information in reliably predicting interfaces and highlight recent advances in this emerging direction. PMID:26460190

  17. Curated collection of yeast transcription factor DNA binding specificity data reveals novel structural and gene regulatory insights

    PubMed Central

    2011-01-01

    Background Transcription factors (TFs) play a central role in regulating gene expression by interacting with cis-regulatory DNA elements associated with their target genes. Recent surveys have examined the DNA binding specificities of most Saccharomyces cerevisiae TFs, but a comprehensive evaluation of their data has been lacking. Results We analyzed in vitro and in vivo TF-DNA binding data reported in previous large-scale studies to generate a comprehensive, curated resource of DNA binding specificity data for all characterized S. cerevisiae TFs. Our collection comprises DNA binding site motifs and comprehensive in vitro DNA binding specificity data for all possible 8-bp sequences. Investigation of the DNA binding specificities within the basic leucine zipper (bZIP) and VHT1 regulator (VHR) TF families revealed unexpected plasticity in TF-DNA recognition: intriguingly, the VHR TFs, newly characterized by protein binding microarrays in this study, recognize bZIP-like DNA motifs, while the bZIP TF Hac1 recognizes a motif highly similar to the canonical E-box motif of basic helix-loop-helix (bHLH) TFs. We identified several TFs with distinct primary and secondary motifs, which might be associated with different regulatory functions. Finally, integrated analysis of in vivo TF binding data with protein binding microarray data lends further support for indirect DNA binding in vivo by sequence-specific TFs. Conclusions The comprehensive data in this curated collection allow for more accurate analyses of regulatory TF-DNA interactions, in-depth structural studies of TF-DNA specificity determinants, and future experimental investigations of the TFs' predicted target genes and regulatory roles. PMID:22189060
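    To give a sense of the size of "all possible 8-bp sequences", the following sketch enumerates them. Collapsing each sequence with its reverse complement is an assumption here (a common convention for double-stranded DNA); the abstract does not say how the collection handles strands.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(seq: str) -> str:
    """Pick one representative for a sequence and its reverse complement."""
    return min(seq, reverse_complement(seq))


all_8mers = ["".join(p) for p in product("ACGT", repeat=8)]
canonical_8mers = {canonical(s) for s in all_8mers}

print(len(all_8mers))        # 65536 = 4**8
print(len(canonical_8mers))  # 32896 = (4**8 + 4**4) / 2
```

    The second count is smaller than half of the first plus nothing, because the 256 palindromic 8-mers are their own reverse complement and are not merged with a partner.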

  18. Improved pan-specific MHC class I peptide-binding predictions using a novel representation of the MHC-binding cleft environment.

    PubMed

    Carrasco Pro, S; Zimic, M; Nielsen, M

    2014-02-01

    Major histocompatibility complex (MHC) molecules play a key role in cell-mediated immune responses, presenting bound peptides for recognition by immune system cells. Several in silico methods have been developed to predict the binding affinity of a given peptide to a specific MHC molecule. One of the current state-of-the-art methods for MHC class I is NetMHCpan, a core ingredient of which is the representation of the MHC class I molecule by a pseudo-sequence of the binding cleft amino acid environment. New and large MHC-peptide-binding data sets are constantly being made available, and new structures of MHC class I molecules with a bound peptide have also been published. In order to test if the NetMHCpan method can be improved by integrating this novel information, we created new pseudo-sequence definitions for the MHC-binding cleft environment from sequence and structural analyses of different MHC data sets including human leukocyte antigen (HLA), non-human primates (chimpanzee, macaque and gorilla) and other animal alleles (cattle, mouse and swine). From these constructs, we showed that by focusing on MHC sequence positions found to be polymorphic across the MHC molecules used to train the method, the NetMHCpan method achieved a significant increase in predictive performance, in particular for non-human MHCs. This study hence showed that an improved performance of MHC-binding methods can be achieved not only by the accumulation of more MHC-peptide-binding data but also by a refined definition of the MHC-binding environment including information from non-human species. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  19. Enriching peptide libraries for binding affinity and specificity through computationally directed library design

    PubMed Central

    Foight, Glenna Wink; Chen, T. Scott; Richman, Daniel; Keating, Amy E.

    2017-01-01

    Peptide reagents with high affinity or specificity for their target protein interaction partner are of utility for many important applications. Optimization of peptide binding by screening large libraries is a proven and powerful approach. Libraries designed to be enriched in peptide sequences that are predicted to have desired affinity or specificity characteristics are more likely to yield success than random mutagenesis. We present a library optimization method in which the choice of amino acids to encode at each peptide position can be guided by available experimental data or structure-based predictions. We discuss how to use analysis of predicted library performance to inform rounds of library design. Finally, we include protocols for more complex library design procedures that consider the chemical diversity of the amino acids at each peptide position and optimize a library score based on a user-specified input model. PMID:28236241

  1. Characterization of domain-peptide interaction interface: a case study on the amphiphysin-1 SH3 domain.

    PubMed

    Hou, Tingjun; Zhang, Wei; Case, David A; Wang, Wei

    2008-02-29

    Many important protein-protein interactions are mediated by peptide recognition modular domains, such as the Src homology 3 (SH3), SH2, PDZ, and WW domains. Characterizing the interaction interface of domain-peptide complexes and predicting binding specificity for modular domains are critical for deciphering protein-protein interaction networks. Here, we propose the use of an energetic decomposition analysis to characterize domain-peptide interactions and the molecular interaction energy components (MIECs), including van der Waals, electrostatic, and desolvation energy between residue pairs on the binding interface. We show a proof-of-concept study on the amphiphysin-1 SH3 domain interacting with its peptide ligands. The structures of the human amphiphysin-1 SH3 domain complexed with 884 peptides were first modeled using virtual mutagenesis and optimized by molecular mechanics (MM) minimization. Next, the MIECs between domain and peptide residues were computed using the MM/generalized Born decomposition analysis. We conducted two types of statistical analyses on the MIECs to demonstrate their usefulness for predicting binding affinities of peptides and for classifying peptides into binder and non-binder categories. First, combining partial least squares analysis and genetic algorithm, we fitted linear regression models between the MIECs and the peptide binding affinities on the training data set. These models were then used to predict binding affinities for peptides in the test data set; the predicted values have a correlation coefficient of 0.81 and an unsigned mean error of 0.39 compared with the experimentally measured ones. The partial least squares-genetic algorithm analysis on the MIECs revealed the critical interactions for the binding specificity of the amphiphysin-1 SH3 domain. Next, a support vector machine (SVM) was employed to build classification models based on the MIECs of peptides in the training set. 
A rigorous training-validation procedure was used to assess the performances of different kernel functions in SVM and different combinations of the MIECs. The best SVM classifier gave satisfactory predictions for the test set, indicated by average prediction accuracy rates of 78% and 91% for the binding and non-binding peptides, respectively. We also showed that the performance of our approach on both binding affinity prediction and binder/non-binder classification was superior to the performances of the conventional MM/Poisson-Boltzmann solvent-accessible surface area and MM/generalized Born solvent-accessible surface area calculations. Our study demonstrates that the analysis of the MIECs between peptides and the SH3 domain can successfully characterize the binding interface, and it provides a framework to derive integrated prediction models for different domain-peptide systems.
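    The two figures of merit quoted for the regression models (a correlation coefficient and an unsigned mean error) can be computed as in the sketch below. The numbers are toy values chosen for illustration, not data from the study, and the function names are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def mean_unsigned_error(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(pred)


# Toy affinities in arbitrary units (illustrative only).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]

r = pearson_r(measured, predicted)
mue = mean_unsigned_error(predicted, measured)
print(round(r, 3), mue)
```

    The two measures are complementary: correlation reflects how well the ranking and trend are recovered, while the unsigned error reports the typical size of the deviation in the units of the measurement.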

  2. Application of binding free energy calculations to prediction of binding modes and affinities of MDM2 and MDMX inhibitors.

    PubMed

    Lee, Hui Sun; Jo, Sunhwan; Lim, Hyun-Suk; Im, Wonpil

    2012-07-23

    Molecular docking is widely used to obtain binding modes and binding affinities of a molecule to a given target protein. Despite considerable efforts, however, prediction of both properties by docking remains challenging mainly due to protein's structural flexibility and inaccuracy of scoring functions. Here, an integrated approach has been developed to improve the accuracy of binding mode and affinity prediction and tested for small molecule MDM2 and MDMX antagonists. In this approach, initial candidate models selected from docking are subjected to equilibration MD simulations to further filter the models. Free energy perturbation molecular dynamics (FEP/MD) simulations are then applied to the filtered ligand models to enhance the ability in predicting the near-native ligand conformation. The calculated binding free energies for MDM2 complexes are overestimated compared to experimental measurements mainly due to the difficulties in sampling highly flexible apo-MDM2. Nonetheless, the FEP/MD binding free energy calculations are more promising for discriminating binders from nonbinders than docking scores. In particular, the comparison between the MDM2 and MDMX results suggests that apo-MDMX has lower flexibility than apo-MDM2. In addition, the FEP/MD calculations provide detailed information on the different energetic contributions to ligand binding, leading to a better understanding of the sensitivity and specificity of protein-ligand interactions.

  3. Protein-protein interactions in paralogues: Electrostatics modulates specificity on a conserved steric scaffold

    PubMed Central

    Huber, Roland G.; Bond, Peter J.

    2017-01-01

    An improved knowledge of protein-protein interactions is essential for better understanding of metabolic and signaling networks, and cellular function. Progress tends to be based on structure determination and predictions using known structures, along with computational methods based on evolutionary information or detailed atomistic descriptions. We hypothesized that for the case of interactions across a common interface, between proteins from a pair of paralogue families or within a family of paralogues, a relatively simple interface description could distinguish between binding and non-binding pairs. Using binding data for several systems, and large-scale comparative modeling based on known template complex structures, it is found that charge-charge interactions (for groups bearing net charge) are generally a better discriminant than buried non-polar surface. This is particularly the case for paralogue families that are less divergent, with more reliable comparative modeling. We suggest that electrostatic interactions are major determinants of specificity in such systems, an observation that could be used to predict binding partners. PMID:29016650

  4. Protein-protein interactions in paralogues: Electrostatics modulates specificity on a conserved steric scaffold.

    PubMed

    Ivanov, Stefan M; Cawley, Andrew; Huber, Roland G; Bond, Peter J; Warwicker, Jim

    2017-01-01

    An improved knowledge of protein-protein interactions is essential for better understanding of metabolic and signaling networks, and cellular function. Progress tends to be based on structure determination and predictions using known structures, along with computational methods based on evolutionary information or detailed atomistic descriptions. We hypothesized that for the case of interactions across a common interface, between proteins from a pair of paralogue families or within a family of paralogues, a relatively simple interface description could distinguish between binding and non-binding pairs. Using binding data for several systems, and large-scale comparative modeling based on known template complex structures, it is found that charge-charge interactions (for groups bearing net charge) are generally a better discriminant than buried non-polar surface. This is particularly the case for paralogue families that are less divergent, with more reliable comparative modeling. We suggest that electrostatic interactions are major determinants of specificity in such systems, an observation that could be used to predict binding partners.

  5. Predicting Binding Free Energy Change Caused by Point Mutations with Knowledge-Modified MM/PBSA Method.

    PubMed

    Petukh, Marharyta; Li, Minghui; Alexov, Emil

    2015-07-01

    A new methodology termed Single Amino Acid Mutation based change in Binding free Energy (SAAMBE) was developed to predict the changes of the binding free energy caused by mutations. The method utilizes 3D structures of the corresponding protein-protein complexes and takes advantage of both sequence- and structure-based approaches. The method has two components: an MM/PBSA-based component, and an additional set of statistical terms derived from statistical investigation of physico-chemical properties of protein complexes. While the approach is a rigid-body approach and does not explicitly consider plausible conformational changes caused by the binding, the effect of conformational changes on electrostatics, including changes away from the binding interface, is mimicked with amino acid specific dielectric constants. This provides significant improvement of SAAMBE predictions, as indicated by a better match against experimentally determined binding free energy changes over 1300 mutations in 43 proteins. The final benchmarking resulted in very good agreement with experimental data (correlation coefficient 0.624), while the algorithm is fast enough to allow for large-scale calculations (the average time is less than a minute per mutation).

  6. Predicting Nonspecific Ion Binding Using DelPhi

    PubMed Central

    Petukh, Marharyta; Zhenirovskyy, Maxim; Li, Chuan; Li, Lin; Wang, Lin; Alexov, Emil

    2012-01-01

    Ions are an important component of the cell and affect the corresponding biological macromolecules either via direct binding or as a screening ion cloud. Although some ion binding is highly specific and frequently associated with the function of the macromolecule, other ions bind to the protein surface nonspecifically, presumably because the electrostatic attraction is strong enough to immobilize them. Here, we test such a scenario and demonstrate that experimentally identified surface-bound ions are located at a potential that facilitates binding, which indicates that the major driving force is the electrostatics. Without taking into consideration geometrical factors and structural fluctuations, we show that ions tend to be bound onto the protein surface at positions with strong potential but with polarity opposite to that of the ion. This observation is used to develop a method that uses a DelPhi-calculated potential map in conjunction with an in-house-developed clustering algorithm to predict nonspecific ion-binding sites. Although this approach distinguishes only the polarity of the ions, and not their chemical nature, it can predict nonspecific binding of positively or negatively charged ions with acceptable accuracy. One can use the predictions in the Poisson-Boltzmann approach by placing explicit ions in the predicted positions, which in turn will reduce the magnitude of the local potential and extend the limits of the Poisson-Boltzmann equation. In addition, one can use this approach to place the desired number of ions before conducting molecular-dynamics simulations to neutralize the net charge of the protein, because it was shown to perform better than standard screened Coulomb canned routines, or to predict ion-binding sites in proteins. This latter is especially true for proteins that are involved in ion transport, because such ions are loosely bound and very difficult to detect experimentally. PMID:22735539

  7. FINDSITE-metal: Integrating evolutionary information and machine learning for structure-based metal binding site prediction at the proteome level

    PubMed Central

    Brylinski, Michal; Skolnick, Jeffrey

    2010-01-01

    The rapid accumulation of gene sequences, many of which are hypothetical proteins with unknown function, has stimulated the development of accurate computational tools for protein function prediction with evolution/structure-based approaches showing considerable promise. In this paper, we present FINDSITE-metal, a new threading-based method designed specifically to detect metal binding sites in modeled protein structures. Comprehensive benchmarks using different quality protein structures show that weakly homologous protein models provide sufficient structural information for quite accurate annotation by FINDSITE-metal. Combining structure/evolutionary information with machine learning results in highly accurate metal binding annotations; for protein models constructed by TASSER, whose average Cα RMSD from the native structure is 8.9 Å, 59.5% (71.9%) of the best of top five predicted metal locations are within 4 Å (8 Å) from a bound metal in the crystal structure. For most of the targets, multiple metal binding sites are detected with the best predicted binding site at rank 1 and within the top 2 ranks in 65.6% and 83.1% of the cases, respectively. Furthermore, for iron, copper, zinc, calcium and magnesium ions, the binding metal can be predicted with high, typically 70-90%, accuracy. FINDSITE-metal also provides a set of confidence indexes that help assess the reliability of predictions. Finally, we describe the proteome-wide application of FINDSITE-metal that quantifies the metal binding complement of the human proteome. FINDSITE-metal is freely available to the academic community at http://cssb.biology.gatech.edu/findsite-metal/. PMID:21287609

  8. Predicting changes in cardiac myocyte contractility during early drug discovery with in vitro assays

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morton, M.J., E-mail: michael.morton@astrazeneca.com; Armstrong, D.; Abi Gerges, N.

    2014-09-01

    Cardiovascular-related adverse drug effects are a major concern for the pharmaceutical industry. Activity of an investigational drug at the L-type calcium channel could manifest in a number of ways, including changes in cardiac contractility. The aim of this study was to define which of the two assay technologies – radioligand-binding or automated electrophysiology – was most predictive of contractility effects in an in vitro myocyte contractility assay. The activity of reference and proprietary compounds at the L-type calcium channel was measured by radioligand-binding assays, conventional patch-clamp, automated electrophysiology, and by measurement of contractility in canine isolated cardiac myocytes. Activity in the radioligand-binding assay at the L-type Ca channel phenylalkylamine binding site was most predictive of an inotropic effect in the canine cardiac myocyte assay. The sensitivity was 73%, specificity 83% and predictivity 78%. The radioligand-binding assay may be run at a single test concentration and potency estimated. The least predictive assay was automated electrophysiology which showed a significant bias when compared with other assay formats. Given the importance of the L-type calcium channel, not just in cardiac function, but also in other organ systems, a screening strategy emerges whereby single concentration ligand-binding can be performed early in the discovery process with sufficient predictivity, throughput and turnaround time to influence chemical design and address a significant safety-related liability, at relatively low cost. Highlights: • The L-type calcium channel is a significant safety liability during drug discovery. • Radioligand-binding to the L-type calcium channel can be measured in vitro. • The assay can be run at a single test concentration as part of a screening cascade. • This measurement is highly predictive of changes in cardiac myocyte contractility.

  9. EL_PSSM-RT: DNA-binding residue prediction by integrating ensemble learning with PSSM Relation Transformation.

    PubMed

    Zhou, Jiyun; Lu, Qin; Xu, Ruifeng; He, Yulan; Wang, Hongpeng

    2017-08-29

    Prediction of DNA-binding residues is important for understanding the protein-DNA recognition mechanism. Many computational methods have been proposed for the prediction, but most of them do not consider the relationships of evolutionary information between residues. In this paper, we first propose a novel residue encoding method, referred to as the Position Specific Score Matrix (PSSM) Relation Transformation (PSSM-RT), to encode residues by utilizing the relationships of evolutionary information between residues. PDNA-62 and PDNA-224 are used to evaluate PSSM-RT and two existing PSSM encoding methods by five-fold cross-validation. Performance evaluations indicate that PSSM-RT is more effective than previous methods. This validates the point that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction. An ensemble learning classifier (EL_PSSM-RT) is also proposed by combining an ensemble learning model and PSSM-RT to better handle the imbalance between binding and non-binding residues in datasets. EL_PSSM-RT is evaluated by five-fold cross-validation using PDNA-62 and PDNA-224 as well as two independent datasets, TS-72 and TS-61. Performance comparisons with existing predictors on the four datasets demonstrate that EL_PSSM-RT is the best-performing method among all the prediction methods, with improvements of 0.02-0.07 for MCC, 4.18-21.47% for ST and 0.013-0.131 for AUC. Furthermore, we analyze the importance of the pair-relationships extracted by PSSM-RT, and the results validate the usefulness of PSSM-RT for encoding DNA-binding residues. We propose a novel prediction method for DNA-binding residues that incorporates the relationships of evolutionary information and ensemble learning. 
Performance evaluation shows that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction and ensemble learning can be used to address the data imbalance issue between binding and non-binding residues. A web service of EL_PSSM-RT ( http://hlt.hitsz.edu.cn:8080/PSSM-RT_SVM/ ) is provided for free access to the biological research community.

  10. Diversification of transcription factor-DNA interactions and the evolution of gene regulatory networks.

    PubMed

    Rogers, Julia M; Bulyk, Martha L

    2018-04-25

    Sequence-specific transcription factors (TFs) bind short DNA sequences in the genome to regulate the expression of target genes. In the last decade, numerous technical advances have enabled the determination of the DNA-binding specificities of many of these factors. Large-scale screens of many TFs enabled the creation of databases of TF DNA-binding specificities, typically represented as position weight matrices (PWMs). Although great progress has been made in determining and predicting binding specificities systematically, there are still many surprises to be found when studying a particular TF's interactions with DNA in detail. Paralogous TFs' binding specificities can differ in subtle ways, in a manner that is not immediately apparent from looking at their PWMs. These differences affect gene regulatory outputs and enable TFs to rewire transcriptional networks over evolutionary time. This review discusses recent observations made in the study of TF-DNA interactions that highlight the importance of continued in-depth analysis of TF-DNA interactions and their inherent complexity. This article is categorized under: Biological Mechanisms > Regulatory Biology. © 2018 Wiley Periodicals, Inc.
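    Since the abstract above refers to position weight matrices (PWMs), a minimal sketch of how a log-odds PWM is built from aligned binding sites and used to scan a sequence may help. The binding sites, the pseudocount, and the uniform background are assumptions chosen for illustration; they do not come from any database cited here.

```python
import math
from typing import Dict, List, Tuple

BASES = "ACGT"


def build_pwm(
    sites: List[str], pseudocount: float = 1.0, background: float = 0.25
) -> List[Dict[str, float]]:
    """Build a log2-odds PWM (one dict per position) from equal-length sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append(
            {
                base: math.log2(((column.count(base) + pseudocount) / total) / background)
                for base in BASES
            }
        )
    return pwm


def score(pwm: List[Dict[str, float]], seq: str) -> float:
    """Sum the per-position log-odds for a sequence of the same length as the PWM."""
    return sum(column[base] for column, base in zip(pwm, seq))


def best_window(pwm: List[Dict[str, float]], sequence: str) -> Tuple[float, int]:
    """Return (score, offset) of the highest-scoring window in a longer sequence."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i : i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical aligned sites for one factor (illustrative only).
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = build_pwm(sites)
top_score, offset = best_window(pwm, "CCCTGACTCACC")
print(round(top_score, 2), offset)
```

    A PWM treats positions as independent, which is one reason subtle differences between paralogous factors may not be visible in it, as the review notes.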

  11. Prediction of 3- to 5-Month Outcomes from Signs of Acute Bilirubin Toxicity in Newborn Infants.

    PubMed

    El Houchi, Salma Z; Iskander, Iman; Gamaleldin, Rasha; El Shenawy, Amira; Seoud, Iman; Abou-Youssef, Hazem; Wennberg, Richard P

    2017-04-01

    To evaluate the ability of the bilirubin-induced neurologic dysfunction (BIND) score to predict residual neurologic and auditory disability and to document the relationship of BIND score to total serum bilirubin (TSB) concentration. The BIND score (assessing mental status, muscle tone, and cry patterns) was obtained serially at 6- to 8-hour intervals in 220 near-term and full-term infants with severe hyperbilirubinemia. Neurologic and/or auditory outcomes at 3-5 months of age were correlated with the highest calculated BIND score. The BIND score was also correlated with TSB. Follow-up neurologic and auditory examinations were performed for 145/202 (72%) surviving infants. All infants with severe acute bilirubin encephalopathy (BIND scores 7-9) either died or suffered residual neurologic and auditory impairment. Of 24 cases with moderate encephalopathy (BIND 4-6), 15 (62.5%) resolved following aggressive intervention and were normal at follow-up. Three of 73 infants with mild encephalopathy (BIND scores 1-3) but severe jaundice (TSB ranging 33.5-38 mg/dL; 573-650 µmol/L) had residual neurologic and/or auditory impairment. A BIND score ≥4 had a specificity of 87.3% and a sensitivity of 97.4% for predicting poor neurologic outcomes (receiver operating characteristic analysis). BIND scores trended higher with severe hyperbilirubinemia (r² = 0.54, P < .005), but 5/39 (13%) infants with TSB ≥36.5 mg/dL (624 µmol/L) had BIND scores ≤3, and normal outcomes at 3-5 months. The BIND score can be used to evaluate the severity of acute bilirubin encephalopathy and predict residual neurologic and hearing dysfunction. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. Predicting protein-binding regions in RNA using nucleotide profiles and compositions.

    PubMed

    Choi, Daesik; Park, Byungkyu; Chae, Hanju; Lee, Wook; Han, Kyungsook

    2017-03-14

    Motivated by the increased amount of data on protein-RNA interactions and the availability of complete genome sequences of several organisms, many computational methods have been proposed to predict binding sites in protein-RNA interactions. However, most computational methods are limited to finding RNA-binding sites in proteins instead of protein-binding sites in RNAs. Predicting protein-binding sites in RNA is more challenging than predicting RNA-binding sites in proteins. Recent computational methods for finding protein-binding sites in RNAs have several drawbacks for practical use. We developed a new support vector machine (SVM) model for predicting protein-binding regions in mRNA sequences. The model uses sequence profiles constructed from log-odds scores of mono- and di-nucleotides and nucleotide compositions. The model was evaluated by standard 10-fold cross validation, leave-one-protein-out (LOPO) cross validation and independent testing. Since actual mRNA sequences have more non-binding regions than protein-binding regions, we tested the model on several datasets with different ratios of protein-binding regions to non-binding regions. The best performance of the model was obtained in a balanced dataset of positive and negative instances. 10-fold cross validation with a balanced dataset achieved a sensitivity of 91.6%, a specificity of 92.4%, an accuracy of 92.0%, a positive predictive value (PPV) of 91.7%, a negative predictive value (NPV) of 92.3% and a Matthews correlation coefficient (MCC) of 0.840. LOPO cross validation showed a lower performance than the 10-fold cross validation, but the performance remained high (87.6% accuracy and 0.752 MCC). In testing the model on independent datasets, it achieved an accuracy of 82.2% and an MCC of 0.656. Testing of our model and other state-of-the-art methods on the same dataset showed that our model is better than the others. 
Sequence profiles of log-odds scores of mono- and di-nucleotides were much more powerful features than nucleotide compositions in finding protein-binding regions in RNA sequences. But, a slight performance gain was obtained when using the sequence profiles along with nucleotide compositions. These are preliminary results of ongoing research, but demonstrate the potential of our approach as a powerful predictor of protein-binding regions in RNA. The program and supporting data are available at http://bclab.inha.ac.kr/RBPbinding .
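
    The evaluation measures quoted in this record (sensitivity, specificity, accuracy, PPV, NPV and MCC) all follow from the four counts of a confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical, chosen only to give 91.6% sensitivity and 92.4% specificity on a balanced set of 1000 positives and 1000 negatives; they are not the authors' data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),           # true-positive rate
        "specificity": tn / (tn + fp),           # true-negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                   # positive predictive value
        "npv": tn / (tn + fn),                   # negative predictive value
        # Matthews correlation coefficient; defined as 0 if any margin is empty.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical balanced test set (illustrative counts only).
m = binary_metrics(tp=916, fp=76, tn=924, fn=84)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

    Because MCC uses all four counts, it stays informative when the ratio of binding to non-binding regions is skewed, which is one reason to report it alongside accuracy on imbalanced datasets.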

  13. An Efficient Semi-supervised Learning Approach to Predict SH2 Domain Mediated Interactions.

    PubMed

    Kundu, Kousik; Backofen, Rolf

    2017-01-01

    Src homology 2 (SH2) domain is an important subclass of modular protein domains that plays an indispensable role in several biological processes in eukaryotes. SH2 domains specifically bind to the phosphotyrosine residue of their binding peptides to facilitate various molecular functions. For determining the subtle binding specificities of SH2 domains, it is very important to understand the intriguing mechanisms by which these domains recognize their target peptides in a complex cellular environment. Several attempts have been made to predict SH2-peptide interactions using high-throughput data. However, these high-throughput data are often affected by a low signal-to-noise ratio. Furthermore, the prediction methods have several additional shortcomings, such as a linearity problem, high computational complexity, etc. Thus, computational identification of SH2-peptide interactions using high-throughput data remains challenging. Here, we propose a machine learning approach based on an efficient semi-supervised learning technique for the prediction of 51 SH2 domain mediated interactions in the human proteome. In our study, we have successfully employed several strategies to tackle the major problems in computational identification of SH2-peptide interactions.

  14. Accurate pan-specific prediction of peptide-MHC class II binding affinity with improved binding core identification.

    PubMed

    Andreatta, Massimo; Karosiene, Edita; Rasmussen, Michael; Stryhn, Anette; Buus, Søren; Nielsen, Morten

    2015-11-01

    A key event in the generation of a cellular response against malicious organisms through the endocytic pathway is binding of peptidic antigens by major histocompatibility complex class II (MHC class II) molecules. The bound peptide is then presented on the cell surface where it can be recognized by T helper lymphocytes. NetMHCIIpan is a state-of-the-art method for the quantitative prediction of peptide binding to any human or mouse MHC class II molecule of known sequence. In this paper, we describe an updated version of the method with improved peptide binding register identification. Binding register prediction is concerned with determining the minimal core region of nine residues directly in contact with the MHC binding cleft, a crucial piece of information both for the identification and design of CD4(+) T cell antigens. When applied to a set of 51 crystal structures of peptide-MHC complexes with known binding registers, the new method NetMHCIIpan-3.1 significantly outperformed the earlier 3.0 version. We illustrate the impact of accurate binding core identification for the interpretation of T cell cross-reactivity using tetramer double staining with a CMV epitope and its variants mapped to the epitope binding core. NetMHCIIpan is publicly available at http://www.cbs.dtu.dk/services/NetMHCIIpan-3.1 .

  15. Using RSAT to scan genome sequences for transcription factor binding sites and cis-regulatory modules.

    PubMed

    Turatsinze, Jean-Valery; Thomas-Chollier, Morgane; Defrance, Matthieu; van Helden, Jacques

    2008-01-01

    This protocol shows how to detect putative cis-regulatory elements and regions enriched in such elements with the regulatory sequence analysis tools (RSAT) web server (http://rsat.ulb.ac.be/rsat/). The approach applies to known transcription factors, whose binding specificity is represented by position-specific scoring matrices, using the program matrix-scan. The detection of individual binding sites is known to return many false predictions. However, results can be strongly improved by estimating P value, and by searching for combinations of sites (homotypic and heterotypic models). We illustrate the detection of sites and enriched regions with a study case, the upstream sequence of the Drosophila melanogaster gene even-skipped. This protocol is also tested on random control sequences to evaluate the reliability of the predictions. Each task requires a few minutes of computation time on the server. The complete protocol can be executed in about one hour.
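
    Scanning a sequence with a position-specific scoring matrix, as described in this record, amounts to sliding a window along the sequence and summing per-position log-odds weights. The sketch below illustrates that idea only; it is not the RSAT matrix-scan implementation. The motif counts, the uniform background, the pseudocount scheme and the threshold are all assumptions made for illustration. It scores one strand and does not estimate P values.

```python
import math

# Assumed uniform background; real scans usually estimate this from the genome.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_matrix(counts, pseudocount=1.0):
    """Turn per-position base counts into log-odds weights vs. the background."""
    matrix = []
    for column in counts:
        total = sum(column.values())
        weights = {}
        for base, bg in BACKGROUND.items():
            freq = (column.get(base, 0) + pseudocount * bg) / (total + pseudocount)
            weights[base] = math.log(freq / bg)
        matrix.append(weights)
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, site, score) for every window scoring >= threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[base] for col, base in zip(matrix, window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


# Hypothetical 4-bp motif built from 10 aligned sites, all reading TATA.
toy_counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
pssm = log_odds_matrix(toy_counts)
hits = scan("GGTATACC", pssm, threshold=4.0)  # expects uppercase A/C/G/T only
print(hits)
```

    A fixed score threshold like the one above is what produces many false predictions on long sequences, which is why the protocol recommends P-value estimation and searching for combinations of sites.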

  16. New horizons in mouse immunoinformatics: reliable in silico prediction of mouse class I histocompatibility major complex peptide binding affinity.

    PubMed

    Hattotuwagama, Channa K; Guan, Pingping; Doytchinova, Irini A; Flower, Darren R

    2004-11-21

    Quantitative structure-activity relationship (QSAR) analysis is a main cornerstone of modern informatic disciplines. Predictive computational models, based on QSAR technology, of peptide-major histocompatibility complex (MHC) binding affinity have now become a vital component of modern day computational immunovaccinology. Historically, such approaches have been built around semi-qualitative, classification methods, but these are now giving way to quantitative regression methods. The additive method, an established immunoinformatics technique for the quantitative prediction of peptide-protein affinity, was used here to identify the sequence dependence of peptide binding specificity for three mouse class I MHC alleles: H2-D(b), H2-K(b) and H2-K(k). As we show, in terms of reliability the resulting models represent a significant advance on existing methods. They can be used for the accurate prediction of T-cell epitopes and are freely available online (http://www.jenner.ac.uk/MHCPred).

  17. CisMapper: predicting regulatory interactions from transcription factor ChIP-seq data

    PubMed Central

    O'Connor, Timothy; Bodén, Mikael

    2017-01-01

    Identifying the genomic regions and regulatory factors that control the transcription of genes is an important, unsolved problem. The current method of choice predicts transcription factor (TF) binding sites using chromatin immunoprecipitation followed by sequencing (ChIP-seq), and then links the binding sites to putative target genes solely on the basis of the genomic distance between them. Evidence from chromatin conformation capture experiments shows that this approach is inadequate due to long-distance regulation via chromatin looping. We present CisMapper, which predicts the regulatory targets of a TF using the correlation between a histone mark at the TF's bound sites and the expression of each gene across a panel of tissues. Using both chromatin conformation capture and differential expression data, we show that CisMapper is more accurate at predicting the target genes of a TF than the distance-based approaches currently used, and is particularly advantageous for predicting the long-range regulatory interactions typical of tissue-specific gene expression. CisMapper also predicts which TF binding sites regulate a given gene more accurately than using genomic distance. Unlike distance-based methods, CisMapper can predict which transcription start site of a gene is regulated by a particular binding site of the TF. PMID:28204599
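
    The core idea in this record, linking a bound site to a gene by correlating a histone-mark signal at the site with the gene's expression across tissues, can be sketched in a few lines. This is an illustration under stated assumptions, not CisMapper's actual scoring: the use of Pearson correlation, the gene names and all signal values below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


# Hypothetical histone-mark signal at one TF-bound site in five tissues.
histone_at_site = [1.0, 4.0, 2.0, 8.0, 3.0]

# Hypothetical expression of two candidate target genes in the same tissues.
expression = {
    "near_gene": [5.0, 5.1, 4.9, 5.0, 5.2],      # close by, but flat
    "distant_gene": [2.0, 8.0, 4.0, 16.0, 6.0],  # far away, but co-varying
}

# Rank candidate targets by correlation instead of by genomic distance.
ranked = sorted(
    expression,
    key=lambda gene: pearson(histone_at_site, expression[gene]),
    reverse=True,
)
print(ranked)
```

    In this toy case the distant gene ranks first because its expression tracks the histone signal across tissues, which is the kind of long-range link a purely distance-based assignment would miss.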

  18. Determinants of BH3 Binding Specificity for Mcl-1 versus Bcl-xL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dutta, Sanjib; Gullá, Stefano; Chen, T. Scott

    2010-06-25

    Interactions among Bcl-2 family proteins are important for regulating apoptosis. Prosurvival members of the family interact with proapoptotic BH3 (Bcl-2-homology-3)-only members, inhibiting execution of cell death through the mitochondrial pathway. Structurally, this interaction is mediated by binding of the α-helical BH3 region of the proapoptotic proteins to a conserved hydrophobic groove on the prosurvival proteins. Native BH3-only proteins exhibit selectivity in binding prosurvival members, as do small molecules that block these interactions. Understanding the sequence and structural basis of interaction specificity in this family is important, as it may allow the prediction of new Bcl-2 family associations and/or the design of new classes of selective inhibitors to serve as reagents or therapeutics. In this work, we used two complementary techniques - yeast surface display screening from combinatorial peptide libraries and SPOT peptide array analysis - to elucidate specificity determinants for binding to Bcl-xL versus Mcl-1, two prominent prosurvival proteins. We screened a randomized library and identified BH3 peptides that bound to either Mcl-1 or Bcl-xL selectively or to both with high affinity. The peptides competed with native ligands for binding into the conserved hydrophobic groove, as illustrated in detail by a crystal structure of a specific peptide bound to Mcl-1. Mcl-1-selective peptides from the screen were highly specific for binding Mcl-1 in preference to Bcl-xL, Bcl-2, Bcl-w, and Bfl-1, whereas Bcl-xL-selective peptides showed some cross-interaction with related proteins Bcl-2 and Bcl-w. Mutational analyses using SPOT arrays revealed the effects of 170 point mutations made in the background of a peptide derived from the BH3 region of Bim, and a simple predictive model constructed using these data explained much of the specificity observed in our Mcl-1 versus Bcl-xL binders.

  19. Determinants of BH3 binding specificity for Mcl-1 vs. Bcl-xL

    PubMed Central

    Dutta, Sanjib; Gullá, Stefano; Chen, T. Scott; Fire, Emiko; Grant, Robert A.; Keating, Amy E.

    2010-01-01

    Interactions among Bcl-2 family proteins are important for regulating apoptosis. Pro-survival members of the family interact with pro-apoptotic BH3-only members, inhibiting execution of cell death through the mitochondrial pathway. Structurally, this interaction is mediated by binding of the alpha-helical BH3 region of the pro-apoptotic proteins to a conserved hydrophobic groove on the pro-survival proteins. Native BH3-only proteins exhibit selectivity in binding pro-survival members, as do small molecules that block these interactions. Understanding the sequence and structural basis of interaction specificity in this family is important, as it may allow the prediction of new Bcl-2 family associations and/or the design of new classes of selective inhibitors to serve as reagents or therapeutics. In this work we used two complementary techniques, yeast surface display screening from combinatorial peptide libraries and SPOT peptide array analysis, to elucidate specificity determinants for binding to Bcl-xL vs. Mcl-1, two prominent pro-survival proteins. We screened a randomized library and identified BH3 peptides that bound to either Mcl-1 or Bcl-xL selectively, or to both with high affinity. The peptides competed with native ligands for binding into the conserved hydrophobic groove, as illustrated in detail by a crystal structure of a specific peptide bound to Mcl-1. Mcl-1 selective peptides from the screen were highly specific for binding Mcl-1 in preference to Bcl-xL, Bcl-2, Bcl-w and Bfl-1, whereas Bcl-xL selective peptides showed some cross-interaction with related proteins Bcl-2 and Bcl-w. Mutational analyses using SPOT arrays revealed the effects of 170 point mutations made in the background of a peptide derived from the BH3 region of Bim, and a simple predictive model constructed using these data explained much of the specificity observed in our Mcl-1 vs. Bcl-xL binders. PMID:20363230

  20. SNP2TFBS - a database of regulatory SNPs affecting predicted transcription factor binding site affinity.

    PubMed

    Kumar, Sunil; Ambrosini, Giovanna; Bucher, Philipp

    2017-01-04

    SNP2TFBS is a computational resource intended to support researchers investigating the molecular mechanisms underlying regulatory variation in the human genome. The database essentially consists of a collection of text files providing specific annotations for human single nucleotide polymorphisms (SNPs), namely whether they are predicted to abolish, create or change the affinity of one or several transcription factor (TF) binding sites. A SNP's effect on TF binding is estimated based on a position weight matrix (PWM) model for the binding specificity of the corresponding factor. These data files are regenerated at regular intervals by an automatic procedure that takes as input a reference genome, a comprehensive SNP catalogue and a collection of PWMs. SNP2TFBS is also accessible over a web interface, enabling users to view the information provided for an individual SNP, to extract SNPs based on various search criteria, to annotate uploaded sets of SNPs or to display statistics about the frequencies of binding sites affected by selected SNPs. Homepage: http://ccg.vital-it.ch/snp2tfbs/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
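
    Estimating a SNP's effect on TF binding from a position weight matrix, as described in this record, can be pictured as scoring the reference and alternate alleles with the same PWM and comparing the best site overlapping the variant. The sketch below is only an illustration of that comparison; the PWM weights, the sequence and the "best overlapping window" rule are assumptions, not the SNP2TFBS procedure or its thresholds.

```python
# Hypothetical log-odds PWM for a 4-bp motif with consensus GATA.
PWM = [
    {"A": -2.0, "C": -2.0, "G": 1.2, "T": -2.0},
    {"A": 1.2, "C": -2.0, "G": -2.0, "T": -2.0},
    {"A": -2.0, "C": -2.0, "G": -2.0, "T": 1.2},
    {"A": 1.2, "C": -2.0, "G": -2.0, "T": -2.0},
]


def best_score_overlapping(sequence, snp_pos, pwm):
    """Best PWM score among windows that cover position snp_pos (0-based)."""
    width = len(pwm)
    first = max(0, snp_pos - width + 1)
    last = min(snp_pos, len(sequence) - width)
    best = float("-inf")
    for start in range(first, last + 1):
        window = sequence[start:start + width]
        score = sum(col[base] for col, base in zip(pwm, window))
        best = max(best, score)
    return best


def snp_effect(ref_seq, snp_pos, alt_base, pwm):
    """Score change (alt minus ref); negative means weaker predicted binding."""
    alt_seq = ref_seq[:snp_pos] + alt_base + ref_seq[snp_pos + 1:]
    return (best_score_overlapping(alt_seq, snp_pos, pwm)
            - best_score_overlapping(ref_seq, snp_pos, pwm))


# Hypothetical T>C substitution inside a GATA site (forward strand only).
delta = snp_effect("CCGATACC", snp_pos=4, alt_base="C", pwm=PWM)
print(f"score change: {delta:+.2f}")
```

    Whether a negative change counts as "abolishing" a site, or a positive one as "creating" a site, depends on a score cutoff that this sketch deliberately leaves out.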

  1. Effective Mass Theory of 2D Excitons Revisited

    NASA Astrophysics Data System (ADS)

    Gonzalez, Joseph; Oleynik, Ivan

    Two-dimensional (2D) semiconducting materials possess an exceptionally unique set of electronic and excitonic properties due to the combined effects of quantum and dielectric confinement. Reliable determination of exciton binding energies from both first-principles many-body perturbation theory (GW/BSE) and experiment is very challenging due to the enormous computational expense as well as the tremendous technical difficulties in experiment. Very recently, effective mass theories of 2D excitons have been developed as an attractive alternative for inexpensive and accurate evaluation of the exciton binding energies. In this presentation, we evaluate two effective mass theory approaches by Velizhanin et al. and Olsen et al. in predicting exciton binding energies across a wide range of 2D materials. We specifically analyze the trends related to the varying screening lengths and exciton effective masses. We also extended the effective mass theory of 2D excitons to include effects of electron and hole mass anisotropies (mx ≠ my), the latter showing a substantial influence on exciton binding energies. The recent predictions of exciton binding energies being independent of the exciton effective mass and a linear correlation with the band gap of a specific material are also critically reexamined.

  2. ChIP-seq Accurately Predicts Tissue-Specific Activity of Enhancers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Visel, Axel; Blow, Matthew J.; Li, Zirong

    2009-02-01

    A major yet unresolved quest in decoding the human genome is the identification of the regulatory sequences that control the spatial and temporal expression of genes. Distant-acting transcriptional enhancers are particularly challenging to uncover since they are scattered amongst the vast non-coding portion of the genome. Evolutionary sequence constraint can facilitate the discovery of enhancers, but fails to predict when and where they are active in vivo. Here, we performed chromatin immunoprecipitation with the enhancer-associated protein p300, followed by massively-parallel sequencing, to map several thousand in vivo binding sites of p300 in mouse embryonic forebrain, midbrain, and limb tissue. We tested 86 of these sequences in a transgenic mouse assay, which in nearly all cases revealed reproducible enhancer activity in those tissues predicted by p300 binding. Our results indicate that in vivo mapping of p300 binding is a highly accurate means for identifying enhancers and their associated activities and suggest that such datasets will be useful to study the role of tissue-specific enhancers in human biology and disease on a genome-wide scale.

  3. In silico peptide-binding predictions of passerine MHC class I reveal similarities across distantly related species, suggesting convergence on the level of protein function.

    PubMed

    Follin, Elna; Karlsson, Maria; Lundegaard, Claus; Nielsen, Morten; Wallin, Stefan; Paulsson, Kajsa; Westerdahl, Helena

    2013-04-01

    The major histocompatibility complex (MHC) genes are the most polymorphic genes found in the vertebrate genome, and they encode proteins that play an essential role in the adaptive immune response. Many songbirds (passerines) have been shown to have a large number of transcribed MHC class I genes compared to most mammals. To elucidate the reason for this large number of genes, we compared 14 MHC class I alleles (α1-α3 domains), from great reed warbler, house sparrow and tree sparrow, via phylogenetic analysis, homology modelling and in silico peptide-binding predictions to investigate their functional and genetic relationships. We found more pronounced clustering of the MHC class I allomorphs (allele specific proteins) in regards to their function (peptide-binding specificities) compared to their genetic relationships (amino acid sequences), indicating that the high number of alleles is of functional significance. The MHC class I allomorphs from house sparrow and tree sparrow, species that diverged 10 million years ago (MYA), had overlapping peptide-binding specificities, and these similarities across species were also confirmed in phylogenetic analyses based on amino acid sequences. Notably, there were also overlapping peptide-binding specificities in the allomorphs from house sparrow and great reed warbler, although these species diverged 30 MYA. This overlap was not found in a tree based on amino acid sequences. Our interpretation is that convergent evolution on the level of the protein function, possibly driven by selection from shared pathogens, has resulted in allomorphs with similar peptide-binding repertoires, although trans-species evolution in combination with gene conversion cannot be ruled out.

  4. StaRProtein, A Web Server for Prediction of the Stability of Repeat Proteins

    PubMed Central

    Xu, Yongtao; Zhou, Xu; Huang, Meilan

    2015-01-01

    Repeat proteins have become increasingly important due to their capability to bind to almost any protein and their potential as an alternative therapy to monoclonal antibodies. In the past decade repeat proteins have been designed to mediate specific protein-protein interactions. The tetratricopeptide and ankyrin repeat proteins are two classes of helical repeat proteins that form different binding pockets to accommodate various partners. It is important to understand the factors that define folding and stability of repeat proteins in order to prioritize the most stable designed repeat proteins to further explore their potential binding affinities. Here we developed distance-dependent statistical potentials using two classes of alpha-helical repeat proteins, tetratricopeptide and ankyrin repeat proteins respectively, and evaluated their efficiency in predicting the stability of repeat proteins. We demonstrated that the repeat-specific statistical potentials based on these two classes of repeat proteins showed paramount accuracy compared with non-specific statistical potentials in 1) discriminating correct vs. incorrect models and 2) ranking the stability of designed repeat proteins. In particular, the statistical scores correlate closely with the equilibrium unfolding free energies of repeat proteins and therefore would serve as a novel tool in quickly prioritizing the designed repeat proteins with high stability. StaRProtein web server was developed for predicting the stability of repeat proteins. PMID:25807112

  5. Rational truncation of an RNA aptamer to prostate-specific membrane antigen using computational structural modeling.

    PubMed

    Rockey, William M; Hernandez, Frank J; Huang, Sheng-You; Cao, Song; Howell, Craig A; Thomas, Gregory S; Liu, Xiu Ying; Lapteva, Natalia; Spencer, David M; McNamara, James O; Zou, Xiaoqin; Chen, Shi-Jie; Giangrande, Paloma H

    2011-10-01

    RNA aptamers represent an emerging class of pharmaceuticals with great potential for targeted cancer diagnostics and therapy. Several RNA aptamers that bind cancer cell-surface antigens with high affinity and specificity have been described. However, their clinical potential has yet to be realized. A significant obstacle to the clinical adoption of RNA aptamers is the high cost of manufacturing long RNA sequences through chemical synthesis. Therapeutic aptamers are often truncated postselection by using a trial-and-error process, which is time consuming and inefficient. Here, we used a "rational truncation" approach guided by RNA structural prediction and protein/RNA docking algorithms that enabled us to substantially truncate A9, an RNA aptamer to prostate-specific membrane antigen (PSMA), with great potential for targeted therapeutics. This truncated PSMA aptamer (A9L; 41mer) retains binding activity, functionality, and is amenable to large-scale chemical synthesis for future clinical applications. In addition, the modeled RNA tertiary structure and protein/RNA docking predictions revealed key nucleotides within the aptamer critical for binding to PSMA and inhibiting its enzymatic activity. Finally, this work highlights the utility of existing RNA structural prediction and protein docking techniques that may be generally applicable to developing RNA aptamers optimized for therapeutic use.

  6. Identification of Candidate Transcription Factor Binding Sites in the Cattle Genome

    PubMed Central

    Bickhart, Derek M.; Liu, George E.

    2013-01-01

    A resource that provides candidate transcription factor binding sites (TFBSs) does not currently exist for cattle. Such data is necessary, as predicted sites may serve as excellent starting locations for future omics studies to develop transcriptional regulation hypotheses. In order to generate this resource, we employed a phylogenetic footprinting approach—using sequence conservation across cattle, human and dog—and position-specific scoring matrices to identify 379,333 putative TFBSs upstream of nearly 8000 Mammalian Gene Collection (MGC) annotated genes within the cattle genome. Comparisons of our predictions to known binding site loci within the PCK1, ACTA1 and G6PC promoter regions revealed 75% sensitivity for our method of discovery. Additionally, we intersected our predictions with known cattle SNP variants in dbSNP and on the Illumina BovineHD 770k and Bos 1 SNP chips, finding 7534, 444 and 346 overlaps, respectively. Due to our stringent filtering criteria, these results represent high quality predictions of putative TFBSs within the cattle genome. All binding site predictions are freely available at http://bfgl.anri.barc.usda.gov/BovineTFBS/ or http://199.133.54.77/BovineTFBS. PMID:23433959

  7. Understanding and predicting binding between human leukocyte antigens (HLAs) and peptides by network analysis.

    PubMed

    Luo, Heng; Ye, Hao; Ng, Hui; Shi, Leming; Tong, Weida; Mattes, William; Mendrick, Donna; Hong, Huixiao

    2015-01-01

    As the major histocompatibility complex (MHC), human leukocyte antigens (HLAs) are one of the most polymorphic genes in humans. Patients carrying certain HLA alleles may develop adverse drug reactions (ADRs) after taking specific drugs. Peptides play an important role in HLA related ADRs as they are the necessary co-binders of HLAs with drugs. Many experimental data have been generated for understanding HLA-peptide binding. However, efficiently utilizing the data for understanding and accurately predicting HLA-peptide binding is challenging. Therefore, we developed a network analysis based method to understand and predict HLA-peptide binding. Qualitative Class I HLA-peptide binding data were harvested and prepared from four major databases. An HLA-peptide binding network was constructed from this dataset and modules were identified by the fast greedy modularity optimization algorithm. To examine the significance of signals in the yielded models, the modularity was compared with the modularity values generated from 1,000 random networks. The peptides and HLAs in the modules were characterized by similarity analysis. The neighbor-edges based and unbiased leverage algorithm (Nebula) was developed for predicting HLA-peptide binding. Leave-one-out (LOO) validations and two-fold cross-validations were conducted to evaluate the performance of Nebula using the constructed HLA-peptide binding network. Nine modules were identified from analyzing the HLA-peptide binding network with a highest modularity compared to all the random networks. Peptide length and functional side chains of amino acids at certain positions of the peptides were different among the modules. HLA sequences were module dependent to some extent. Nebula achieved an overall prediction accuracy of 0.816 in the LOO validations and average accuracy of 0.795 in the two-fold cross-validations and outperformed the method reported in the literature.
    Network analysis is a useful approach for analyzing large and sparse datasets such as the HLA-peptide binding dataset. The modules identified from the network analysis clustered peptides and HLAs with similar sequences and properties of amino acids. Nebula performed well in the predictions of HLA-peptide binding. We demonstrated that network analysis coupled with Nebula is an efficient approach to understand and predict HLA-peptide binding interactions and thus, could further our understanding of ADRs.
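
    The module detection described in this record optimizes modularity, a score that compares the fraction of edges falling inside modules with the fraction expected from node degrees alone. The sketch below computes the standard (Newman) modularity for a given partition of a small binding network. It does not implement the fast greedy optimization itself or the Nebula predictor, and the node names and edges are hypothetical; the exact modularity variant used by the authors is not stated in the abstract.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted network.

    edges: list of (node_u, node_v) pairs.
    community: dict mapping every node to its module label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the module
    degree_sum = defaultdict(int)  # summed degree of the module's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical HLA-peptide binding network: two HLAs, four peptides.
edges = [
    ("HLA_1", "pep1"), ("HLA_1", "pep2"),
    ("HLA_2", "pep3"), ("HLA_2", "pep4"),
    ("HLA_1", "pep3"),  # one cross-module binding event
]
two_modules = {"HLA_1": 0, "pep1": 0, "pep2": 0,
               "HLA_2": 1, "pep3": 1, "pep4": 1}
one_module = {node: 0 for node in two_modules}

q_split = modularity(edges, two_modules)
q_single = modularity(edges, one_module)
print(q_split, q_single)
```

    Comparing the observed score against scores from randomized networks, as the record describes, is what indicates whether the detected modules carry signal rather than arising by chance.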

  8. Understanding and predicting binding between human leukocyte antigens (HLAs) and peptides by network analysis

    PubMed Central

    2015-01-01

    Background As the major histocompatibility complex (MHC), human leukocyte antigens (HLAs) are one of the most polymorphic genes in humans. Patients carrying certain HLA alleles may develop adverse drug reactions (ADRs) after taking specific drugs. Peptides play an important role in HLA related ADRs as they are the necessary co-binders of HLAs with drugs. Many experimental data have been generated for understanding HLA-peptide binding. However, efficiently utilizing the data for understanding and accurately predicting HLA-peptide binding is challenging. Therefore, we developed a network analysis based method to understand and predict HLA-peptide binding. Methods Qualitative Class I HLA-peptide binding data were harvested and prepared from four major databases. An HLA-peptide binding network was constructed from this dataset and modules were identified by the fast greedy modularity optimization algorithm. To examine the significance of signals in the yielded models, the modularity was compared with the modularity values generated from 1,000 random networks. The peptides and HLAs in the modules were characterized by similarity analysis. The neighbor-edges based and unbiased leverage algorithm (Nebula) was developed for predicting HLA-peptide binding. Leave-one-out (LOO) validations and two-fold cross-validations were conducted to evaluate the performance of Nebula using the constructed HLA-peptide binding network. Results Nine modules were identified from analyzing the HLA-peptide binding network with a highest modularity compared to all the random networks. Peptide length and functional side chains of amino acids at certain positions of the peptides were different among the modules. HLA sequences were module dependent to some extent. Nebula achieved an overall prediction accuracy of 0.816 in the LOO validations and average accuracy of 0.795 in the two-fold cross-validations and outperformed the method reported in the literature.
    Conclusions Network analysis is a useful approach for analyzing large and sparse datasets such as the HLA-peptide binding dataset. The modules identified from the network analysis clustered peptides and HLAs with similar sequences and properties of amino acids. Nebula performed well in the predictions of HLA-peptide binding. We demonstrated that network analysis coupled with Nebula is an efficient approach to understand and predict HLA-peptide binding interactions and thus, could further our understanding of ADRs. PMID:26424483

  9. NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length datasets.

    PubMed

    Nielsen, Morten; Andreatta, Massimo

    2016-03-30

    Binding of peptides to MHC class I molecules (MHC-I) is essential for antigen presentation to cytotoxic T-cells. Here, we demonstrate how a simple alignment step allowing insertions and deletions in a pan-specific MHC-I binding machine-learning model enables combining information across both multiple MHC molecules and peptide lengths. This pan-allele/pan-length algorithm significantly outperforms state-of-the-art methods, and captures differences in the length profile of binders to different MHC molecules leading to increased accuracy for ligand identification. Using this model, we demonstrate that percentile ranks in contrast to affinity-based thresholds are optimal for ligand identification due to uniform sampling of the MHC space. We have developed a neural network-based machine-learning algorithm leveraging information across multiple receptor specificities and ligand length scales, and demonstrated how this approach significantly improves the accuracy for prediction of peptide binding and identification of MHC ligands. The method is available at www.cbs.dtu.dk/services/NetMHCpan-3.0 .
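
    The contrast drawn in this record between percentile ranks and affinity-based thresholds can be illustrated with a small calculation: a percentile rank places a peptide's predicted score within the score distribution of background peptides for the same MHC molecule, so molecules with different score scales become comparable. The sketch below is a minimal illustration; the allele labels, background scores and the exact rank definition are assumptions, not NetMHCpan-3.0's implementation.

```python
import bisect


def percentile_rank(score, background_scores):
    """Percent of background peptides scoring >= score (lower = stronger)."""
    ordered = sorted(background_scores)
    n_at_least = len(ordered) - bisect.bisect_left(ordered, score)
    return 100.0 * n_at_least / len(ordered)


# Hypothetical predicted scores of random background peptides for two
# MHC molecules whose score scales differ.
background = {
    "allele_X": [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80],
    "allele_Y": [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.08, 0.10, 0.20, 0.45],
}

# The same raw score means different things for the two molecules.
rank_x = percentile_rank(0.45, background["allele_X"])
rank_y = percentile_rank(0.45, background["allele_Y"])
print(rank_x, rank_y)
```

    With a single fixed score cutoff, the molecule with generally higher scores would contribute most of the predicted ligands; ranking within each molecule's own background avoids that imbalance.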

  10. Semi-supervised prediction of SH2-peptide interactions from imbalanced high-throughput data.

    PubMed

    Kundu, Kousik; Costa, Fabrizio; Huber, Michael; Reth, Michael; Backofen, Rolf

    2013-01-01

    Src homology 2 (SH2) domains are the largest family of the peptide-recognition modules (PRMs) that bind to phosphotyrosine containing peptides. Knowledge about binding partners of SH2-domains is key for a deeper understanding of different cellular processes. Given the high binding specificity of SH2, in-silico ligand peptide prediction is of great interest. Currently, however, only a few approaches have been published for the prediction of SH2-peptide interactions. Their main shortcomings range from limited coverage, to restrictive modeling assumptions (they are mainly based on position specific scoring matrices and do not take into consideration complex amino acids inter-dependencies) and high computational complexity. We propose a simple yet effective machine learning approach for a large set of known human SH2 domains. We used comprehensive data from micro-array and peptide-array experiments on 51 human SH2 domains. In order to deal with the high data imbalance problem and the high signal-to-noise ratio, we cast the problem in a semi-supervised setting. We report competitive predictive performance w.r.t. the state of the art. Specifically we obtain 0.83 AUC ROC and 0.93 AUC PR in comparison to 0.71 AUC ROC and 0.87 AUC PR previously achieved by the position specific scoring matrices (PSSMs) based SMALI approach. Our work provides three main contributions. First, we showed that better models can be obtained when the information on the non-interacting peptides (negative examples) is also used. Second, we improve performance when considering high order correlations between the ligand positions employing regularization techniques to effectively avoid overfitting issues. Third, we developed an approach to tackle the data imbalance problem using a semi-supervised strategy. Finally, we performed a genome-wide prediction of human SH2-peptide binding, uncovering several findings of biological relevance.
    We make our models and genome-wide predictions, for all the 51 SH2-domains, freely available to the scientific community under the following URLs: http://www.bioinf.uni-freiburg.de/Software/SH2PepInt/SH2PepInt.tar.gz and http://www.bioinf.uni-freiburg.de/Software/SH2PepInt/Genome-wide-predictions.tar.gz, respectively.

  11. Functional diversification of ROK-family transcriptional regulators of sugar catabolism in the Thermotogae phylum

    PubMed Central

    Kazanov, Marat D.; Li, Xiaoqing; Gelfand, Mikhail S.; Osterman, Andrei L.; Rodionov, Dmitry A.

    2013-01-01

    Large and functionally heterogeneous families of transcription factors have complex evolutionary histories. What shapes specificities toward effectors and DNA sites in paralogous regulators is a fundamental question in biology. Bacteria from the deep-branching lineage Thermotogae possess multiple paralogs of the repressor, open reading frame, kinase (ROK) family regulators that are characterized by carbohydrate-sensing domains shared with sugar kinases. We applied an integrated genomic approach to study functions and specificities of regulators from this family. A comparative analysis of 11 Thermotogae genomes revealed novel mechanisms of transcriptional regulation of the sugar utilization networks, DNA-binding motifs and specific functions. Reconstructed regulons for seven groups of ROK regulators were validated by DNA-binding assays using purified recombinant proteins from the model bacterium Thermotoga maritima. All tested regulators demonstrated specific binding to their predicted cognate DNA sites, and this binding was inhibited by specific effectors, mono- or disaccharides from their respective sugar catabolic pathways. By comparing ligand-binding domains of regulators with structurally characterized kinases from the ROK family, we elucidated signature amino acid residues determining sugar-ligand regulator specificity. Observed correlations between signature residues and the sugar-ligand specificities provide the framework for structure functional classification of the entire ROK family. PMID:23209028

  12. Mapping and analysis of Caenorhabditis elegans transcription factor sequence specificities

    PubMed Central

    Narasimhan, Kamesh; Lambert, Samuel A; Yang, Ally WH; Riddell, Jeremy; Mnaimneh, Sanie; Zheng, Hong; Albu, Mihai; Najafabadi, Hamed S; Reece-Hoyes, John S; Fuxman Bass, Juan I; Walhout, Albertha JM; Weirauch, Matthew T; Hughes, Timothy R

    2015-01-01

    Caenorhabditis elegans is a powerful model for studying gene regulation, as it has a compact genome and a wealth of genomic tools. However, identification of regulatory elements has been limited, as DNA-binding motifs are known for only 71 of the estimated 763 sequence-specific transcription factors (TFs). To address this problem, we performed protein binding microarray experiments on representatives of canonical TF families in C. elegans, obtaining motifs for 129 TFs. Additionally, we predict motifs for many TFs that have DNA-binding domains similar to those already characterized, increasing coverage of binding specificities to 292 C. elegans TFs (∼40%). These data highlight the diversification of binding motifs for the nuclear hormone receptor and C2H2 zinc finger families and reveal unexpected diversity of motifs for T-box and DM families. Motif enrichment in promoters of functionally related genes is consistent with known biology and also identifies putative regulatory roles for unstudied TFs. DOI: http://dx.doi.org/10.7554/eLife.06967.001 PMID:25905672

  13. Site-Specific Phosphorylation of VEGFR2 Is Mediated by Receptor Trafficking: Insights from a Computational Model

    PubMed Central

    Clegg, Lindsay Wendel; Mac Gabhann, Feilim

    2015-01-01

    Matrix-binding isoforms and non-matrix-binding isoforms of vascular endothelial growth factor (VEGF) are both capable of stimulating vascular remodeling, but the resulting blood vessel networks are structurally and functionally different. Here, we develop and validate a computational model of the binding of soluble and immobilized ligands to VEGF receptor 2 (VEGFR2), the endosomal trafficking of VEGFR2, and site-specific VEGFR2 tyrosine phosphorylation to study differences in induced signaling between these VEGF isoforms. In capturing essential features of VEGFR2 signaling and trafficking, our model suggests that VEGFR2 trafficking parameters are largely consistent across multiple endothelial cell lines. Simulations demonstrate distinct localization of VEGFR2 phosphorylated on Y1175 and Y1214. This is the first model to clearly show that differences in site-specific VEGFR2 activation when stimulated with immobilized VEGF compared to soluble VEGF can be accounted for by altered trafficking of VEGFR2 without an intrinsic difference in receptor activation. The model predicts that Neuropilin-1 can induce differences in the surface-to-internal distribution of VEGFR2. Simulations also show that ligated VEGFR2 and phosphorylated VEGFR2 levels diverge over time following stimulation. Using this model, we identify multiple key levers that alter how VEGF binding to VEGFR2 results in different coordinated patterns of multiple downstream signaling pathways. Specifically, simulations predict that VEGF immobilization, interactions with Neuropilin-1, perturbations of VEGFR2 trafficking, and changes in expression or activity of phosphatases acting on VEGFR2 all affect the magnitude, duration, and relative strength of VEGFR2 phosphorylation on tyrosines 1175 and 1214, and they do so predictably within our single consistent model framework. PMID:26067165

  14. A Simple PB/LIE Free Energy Function Accurately Predicts the Peptide Binding Specificity of the Tiam1 PDZ Domain.

    PubMed

    Panel, Nicolas; Sun, Young Joo; Fuentes, Ernesto J; Simonson, Thomas

    2017-01-01

    PDZ domains generally bind short amino acid sequences at the C-terminus of target proteins, and short peptides can be used as inhibitors or model ligands. Here, we used experimental binding assays and molecular dynamics simulations to characterize 51 complexes involving the Tiam1 PDZ domain and to test the performance of a semi-empirical free energy function. The free energy function combined a Poisson-Boltzmann (PB) continuum electrostatic term, a van der Waals interaction energy, and a surface area term. Each term was empirically weighted, giving a Linear Interaction Energy or "PB/LIE" free energy. The model yielded a mean unsigned deviation of 0.43 kcal/mol and a Pearson correlation of 0.64 between experimental and computed free energies, which was superior to a Null model that assumes all complexes have the same affinity. Analyses of the models support several experimental observations that indicate the orientation of the α2 helix is a critical determinant for peptide specificity. The models were also used to predict binding free energies for nine new variants, corresponding to point mutants of the Syndecan1 and Caspr4 peptides. The predictions did not reveal improved binding; however, they suggest that an unnatural amino acid could be used to increase protease resistance and peptide lifetimes in vivo. The overall performance of the model should allow its use in the design of new PDZ ligands in the future.

  15. A Simple PB/LIE Free Energy Function Accurately Predicts the Peptide Binding Specificity of the Tiam1 PDZ Domain

    PubMed Central

    Panel, Nicolas; Sun, Young Joo; Fuentes, Ernesto J.; Simonson, Thomas

    2017-01-01

    PDZ domains generally bind short amino acid sequences at the C-terminus of target proteins, and short peptides can be used as inhibitors or model ligands. Here, we used experimental binding assays and molecular dynamics simulations to characterize 51 complexes involving the Tiam1 PDZ domain and to test the performance of a semi-empirical free energy function. The free energy function combined a Poisson-Boltzmann (PB) continuum electrostatic term, a van der Waals interaction energy, and a surface area term. Each term was empirically weighted, giving a Linear Interaction Energy or “PB/LIE” free energy. The model yielded a mean unsigned deviation of 0.43 kcal/mol and a Pearson correlation of 0.64 between experimental and computed free energies, which was superior to a Null model that assumes all complexes have the same affinity. Analyses of the models support several experimental observations that indicate the orientation of the α2 helix is a critical determinant for peptide specificity. The models were also used to predict binding free energies for nine new variants, corresponding to point mutants of the Syndecan1 and Caspr4 peptides. The predictions did not reveal improved binding; however, they suggest that an unnatural amino acid could be used to increase protease resistance and peptide lifetimes in vivo. The overall performance of the model should allow its use in the design of new PDZ ligands in the future. PMID:29018806
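
As a rough illustration of the quantities named in this abstract, the Python sketch below combines three energy terms with empirical weights and computes the two reported quality measures (mean unsigned deviation and Pearson correlation). The abstract states only that each term was empirically weighted; the exact functional form, the optional `offset`, and all function names here are assumptions.

```python
import math


def pb_lie_energy(e_vdw, e_pb, sasa, w_vdw, w_pb, w_sasa, offset=0.0):
    """Weighted sum of van der Waals, PB electrostatic and surface-area terms.

    The weights (and the optional constant offset) are hypothetical
    fitting parameters, not values from the paper.
    """
    return w_vdw * e_vdw + w_pb * e_pb + w_sasa * sasa + offset


def mean_unsigned_deviation(predicted, observed):
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def null_model(observed):
    """Null model: every complex is assigned the same (mean) affinity."""
    mean = sum(observed) / len(observed)
    return [mean] * len(observed)
```

A fitted model is only informative if its mean unsigned deviation is lower than that of `null_model` on the same data; the Pearson correlation is undefined for the null model because its predictions have zero variance.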

  16. Specificity determinants for the abscisic acid response element.

    PubMed

    Sarkar, Aditya Kumar; Lahiri, Ansuman

    2013-01-01

    Abscisic acid (ABA) response elements (ABREs) are a group of cis-acting DNA elements that have been identified from promoter analysis of many ABA-regulated genes in plants. We are interested in understanding the mechanism of binding specificity between ABREs and a class of bZIP transcription factors known as ABRE binding factors (ABFs). In this work, we have modeled the homodimeric structure of the bZIP domain of ABRE binding factor 1 from Arabidopsis thaliana (AtABF1) and studied its interaction with ACGT core motif-containing ABRE sequences. We have also examined the variation in the stability of the protein-DNA complex upon mutating ABRE sequences using the protein design algorithm FoldX. The high throughput free energy calculations successfully predicted the ability of ABF1 to bind to alternative core motifs like GCGT or AAGT and also rationalized the role of the flanking sequences in determining the specificity of the protein-DNA interaction.

  17. A Thermoacidophile-Specific Protein Family, DUF3211, Functions as a Fatty Acid Carrier with Novel Binding Mode

    PubMed Central

    Miyakawa, Takuya; Sawano, Yoriko; Miyazono, Ken-ichi; Miyauchi, Yumiko; Hatano, Ken-ichi

    2013-01-01

    STK_08120 is a member of the thermoacidophile-specific DUF3211 protein family from Sulfolobus tokodaii strain 7. Its molecular function remains obscure, and sequence similarities for obtaining functional remarks are not available. In this study, the crystal structure of STK_08120 was determined at 1.79-Å resolution to predict its probable function using structure similarity searches. The structure adopts an α/β structure of a helix-grip fold, which is found in the START domain proteins with cavities for hydrophobic substrates or ligands. The detailed structural features implied that fatty acids are the primary ligand candidates for STK_08120, and binding assays revealed that the protein bound long-chain saturated fatty acids (>C14) and their trans-unsaturated types with an affinity equal to that for major fatty acid binding proteins in mammals and plants. Moreover, the structure of an STK_08120-myristic acid complex revealed a unique binding mode among fatty acid binding proteins. These results suggest that the thermoacidophile-specific protein family DUF3211 functions as a fatty acid carrier with a novel binding mode. PMID:23836863

  18. Nucleolar Trafficking of Nucleostemin Family Proteins: Common versus Protein-Specific Mechanisms

    PubMed Central

    Meng, Lingjun; Zhu, Qubo; Tsai, Robert Y. L.

    2007-01-01

    The nucleolus has begun to emerge as a subnuclear organelle capable of modulating the activities of nuclear proteins in a dynamic and cell type-dependent manner. It remains unclear whether one can extrapolate a rule that predicts the nucleolar localization of multiple proteins based on protein sequence. Here, we address this issue by determining the shared and unique mechanisms that regulate the static and dynamic distributions of a family of nucleolar GTP-binding proteins, consisting of nucleostemin (NS), guanine nucleotide binding protein-like 3 (GNL3L), and Ngp1. The nucleolar residence of GNL3L is short and primarily controlled by its basic-coiled-coil domain, whereas the nucleolar residence of NS and Ngp1 is long and requires the basic and the GTP-binding domains, the latter of which functions as a retention signal. All three proteins contain a nucleoplasmic localization signal (NpLS) that prevents their nucleolar accumulation. Unlike that of the basic domain, the activity of NpLS is dynamically controlled by the GTP-binding domain. The nucleolar retention and the NpLS-regulating functions of the G domain involve specific residues that cannot be predicted by overall protein homology. This work reveals common and protein-specific mechanisms underlying the nucleolar movement of NS family proteins. PMID:17923687

  19. Prediction of fatty acid-binding residues on protein surfaces with three-dimensional probability distributions of interacting atoms.

    PubMed

    Mahalingam, Rajasekaran; Peng, Hung-Pin; Yang, An-Suei

    2014-08-01

    Protein-fatty acid interaction is vital for many cellular processes and understanding this interaction is important for functional annotation as well as drug discovery. In this work, we present a method for predicting the fatty acid (FA)-binding residues by using three-dimensional probability density distributions of interacting atoms of FAs on protein surfaces which are derived from the known protein-FA complex structures. A machine learning algorithm was established to learn the characteristic patterns of the probability density maps specific to the FA-binding sites. The predictor was trained with five-fold cross validation on a non-redundant training set and then evaluated with an independent test set as well as on a dataset of holo-apo pairs. The results showed good accuracy in predicting the FA-binding residues. Further, the predictor developed in this study is implemented as an online server which is freely accessible at the following website, http://ismblab.genomics.sinica.edu.tw/. Copyright © 2014 Elsevier B.V. All rights reserved.

  20. Computational Predictions Provide Insights into the Biology of TAL Effector Target Sites

    PubMed Central

    Grau, Jan; Wolf, Annett; Reschke, Maik; Bonas, Ulla; Posch, Stefan; Boch, Jens

    2013-01-01

    Transcription activator-like (TAL) effectors are injected into host plant cells by Xanthomonas bacteria to function as transcriptional activators for the benefit of the pathogen. The DNA binding domain of TAL effectors is composed of conserved amino acid repeat structures containing repeat-variable diresidues (RVDs) that determine DNA binding specificity. In this paper, we present TALgetter, a new approach for predicting TAL effector target sites based on a statistical model. In contrast to previous approaches, the parameters of TALgetter are estimated from training data computationally. We demonstrate that TALgetter successfully predicts known TAL effector target sites and often yields a greater number of predictions that are consistent with up-regulation in gene expression microarrays than an existing approach, Target Finder of the TALE-NT suite. We study the binding specificities estimated by TALgetter and confirm that different RVDs differ in their importance for transcriptional activation. In subsequent studies, the predictions of TALgetter indicate a previously unreported positional preference of TAL effector target sites relative to the transcription start site. In addition, several TAL effectors are predicted to bind to the TATA-box, which might constitute one general mode of transcriptional activation by TAL effectors. Scrutinizing the predicted target sites of TALgetter, we propose several novel TAL effector virulence targets in rice and sweet orange. TAL-mediated induction of the candidates is supported by gene expression microarrays. Validity of these targets is also supported by functional analogy to known TAL effector targets, by an over-representation of TAL effector targets with similar function, or by a biological function related to pathogen infection. Hence, these predicted TAL effector virulence targets are promising candidates for studying the virulence function of TAL effectors.
TALgetter is implemented as part of the open-source Java library Jstacs, and is freely available as a web-application and a command line program. PMID:23526890

  1. Predicting Cortisol Exposure from Paediatric Hydrocortisone Formulation Using a Semi-Mechanistic Pharmacokinetic Model Established in Healthy Adults.

    PubMed

    Melin, Johanna; Parra-Guillen, Zinnia P; Hartung, Niklas; Huisinga, Wilhelm; Ross, Richard J; Whitaker, Martin J; Kloft, Charlotte

    2018-04-01

    Optimisation of hydrocortisone replacement therapy in children is challenging as there is currently no licensed formulation and dose in Europe for children under 6 years of age. In addition, hydrocortisone has non-linear pharmacokinetics caused by saturable plasma protein binding. A paediatric hydrocortisone formulation, Infacort® oral hydrocortisone granules with taste masking, has therefore been developed. The objective of this study was to establish a population pharmacokinetic model based on studies in healthy adult volunteers to predict hydrocortisone exposure in paediatric patients with adrenal insufficiency. Cortisol and binding protein concentrations were evaluated in the absence and presence of dexamethasone in healthy volunteers (n = 30). Dexamethasone was used to suppress endogenous cortisol concentrations prior to and after single doses of 0.5, 2, 5 and 10 mg of Infacort® or 20 mg of Infacort®/hydrocortisone tablet/hydrocortisone intravenously. A plasma protein binding model was established using unbound and total cortisol concentrations, and sequentially integrated into the pharmacokinetic model. Both specific (non-linear) and non-specific (linear) protein binding were included in the cortisol binding model. A two-compartment disposition model with saturable absorption and constant endogenous cortisol baseline (Baseline_cort, 15.5 nmol/L) described the data accurately. The predicted cortisol exposure for a given dose varied considerably within a small body weight range in individuals weighing <20 kg. Our semi-mechanistic population pharmacokinetic model for hydrocortisone captures the complex pharmacokinetics of hydrocortisone in a simplified but comprehensive framework. The predicted cortisol exposure indicated the importance of defining an accurate hydrocortisone dose to mimic physiological concentrations for neonates and infants weighing <20 kg. EudraCT number: 2013-000260-28, 2013-000259-42.
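
The combination of specific (non-linear, saturable) and non-specific (linear) binding described in this abstract can be sketched with a generic textbook form: total = unbound + Bmax·Cu/(Kd + Cu) + NS·Cu. This Python sketch assumes that form; the parameter names `bmax`, `kd` and `ns` and any values passed to them are hypothetical and are not the estimates from the study.

```python
def total_from_unbound(c_unbound, bmax, kd, ns):
    """Total concentration from unbound concentration.

    specific binding     : bmax * Cu / (kd + Cu)   (saturable, non-linear)
    non-specific binding : ns * Cu                 (linear)
    """
    specific = bmax * c_unbound / (kd + c_unbound)
    non_specific = ns * c_unbound
    return c_unbound + specific + non_specific


def unbound_from_total(c_total, bmax, kd, ns, tol=1e-9):
    """Invert total_from_unbound by bisection.

    The forward function increases monotonically with Cu and the unbound
    concentration can never exceed the total, so [0, c_total] brackets
    the solution.
    """
    lo, hi = 0.0, c_total
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_from_unbound(mid, bmax, kd, ns) < c_total:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fraction_unbound(c_total, bmax, kd, ns):
    """Unbound fraction; it rises with concentration as specific sites saturate."""
    return unbound_from_total(c_total, bmax, kd, ns) / c_total
```

With saturable binding the unbound fraction is not constant, which is one way to see why exposure does not scale proportionally with dose in such a model.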

  2. A map of human PRDM9 binding provides evidence for novel behaviors of PRDM9 and other zinc-finger proteins in meiosis

    PubMed Central

    Noor, Nudrat; Bitoun, Emmanuelle; Tumian, Afidalina; Imbeault, Michael; Chapman, J Ross; Aricescu, A Radu

    2017-01-01

    PRDM9 binding localizes almost all meiotic recombination sites in humans and mice. However, most PRDM9-bound loci do not become recombination hotspots. To explore factors that affect binding and subsequent recombination outcomes, we mapped human PRDM9 binding sites in a transfected human cell line and measured PRDM9-induced histone modifications. These data reveal varied DNA-binding modalities of PRDM9. We also find that human PRDM9 frequently binds promoters, despite their low recombination rates, and it can activate expression of a small number of genes including CTCFL and VCX. Furthermore, we identify specific sequence motifs that predict consistent, localized meiotic recombination suppression around a subset of PRDM9 binding sites. These motifs strongly associate with KRAB-ZNF protein binding, TRIM28 recruitment, and specific histone modifications. Finally, we demonstrate that, in addition to binding DNA, PRDM9's zinc fingers also mediate its multimerization, and we show that a pair of highly diverged alleles preferentially form homo-multimers. PMID:29072575

  3. A method for predicting individual residue contributions to enzyme specificity and binding-site energies, and its application to MTH1.

    PubMed

    Stewart, James J P

    2016-11-01

    A new method for predicting the energy contributions to substrate binding and to specificity has been developed. Conventional global optimization methods do not permit the subtle effects responsible for these properties to be modeled with sufficient precision to allow confidence to be placed in the results, but by making simple alterations to the model, the precisions of the various energies involved can be improved from about ±2 kcal mol⁻¹ to ±0.1 kcal mol⁻¹. This technique was applied to the oxidized nucleotide pyrophosphohydrolase enzyme MTH1. MTH1 is unusual in that the binding and reaction sites are well separated, an advantage from a computational chemistry perspective, as it allows the energetics involved in docking to be modeled without the need to consider any issues relating to reaction mechanisms. In this study, two types of energy terms were investigated: the noncovalent interactions between the binding site and the substrate, and those responsible for discriminating between the oxidized nucleotide 8-oxo-dGTP and the normal dGTP. Both of these were investigated using the semiempirical method PM7 in the program MOPAC. The contributions of the individual residues to both the binding energy and the specificity of MTH1 were calculated by simulating the effect of mutations. Where comparisons were possible, all calculated results were in agreement with experimental observations. This technique provides fresh insight into the binding mechanism that enzymes use for discriminating between possible substrates.

  4. Structural and functional characterization of solute binding proteins for aromatic compounds derived from lignin: p-coumaric acid and related aromatic acids.

    PubMed

    Tan, Kemin; Chang, Changsoo; Cuff, Marianne; Osipiuk, Jerzy; Landorf, Elizabeth; Mack, Jamey C; Zerbs, Sarah; Joachimiak, Andrzej; Collart, Frank R

    2013-10-01

    Lignin comprises 15-25% of plant biomass and represents a major environmental carbon source for utilization by soil microorganisms. Access to this energy resource requires the action of fungal and bacterial enzymes to break down the lignin polymer into a complex assortment of aromatic compounds that can be transported into the cells. To improve our understanding of the utilization of lignin by microorganisms, we characterized the molecular properties of solute binding proteins of ATP-binding cassette transporter proteins that interact with these compounds. A combination of functional screens and structural studies characterized the binding specificity of the solute binding proteins for aromatic compounds derived from lignin such as p-coumarate, 3-phenylpropionic acid and compounds with more complex ring substitutions. A ligand screen based on thermal stabilization identified several binding protein clusters that exhibit preferences based on the size or number of aromatic ring substituents. Multiple X-ray crystal structures of protein-ligand complexes for these clusters identified the molecular basis of the binding specificity for the lignin-derived aromatic compounds. The screens and structural data provide new functional assignments for these solute-binding proteins which can be used to infer their transport specificity. This knowledge of the functional roles and molecular binding specificity of these proteins will support the identification of the specific enzymes and regulatory proteins of peripheral pathways that funnel these compounds to central metabolic pathways and will improve the predictive power of sequence-based functional annotation methods for this family of proteins. Copyright © 2013 Wiley Periodicals, Inc.

  5. Structural and functional characterization of solute binding proteins for aromatic compounds derived from lignin: p-coumaric acid and related aromatic acids

    PubMed Central

    Tan, Kemin; Chang, Changsoo; Cuff, Marianne; Osipiuk, Jerzy; Landorf, Elizabeth; Mack, Jamey C.; Zerbs, Sarah; Joachimiak, Andrzej; Collart, Frank R.

    2013-01-01

    Lignin comprises 15-25% of plant biomass and represents a major environmental carbon source for utilization by soil microorganisms. Access to this energy resource requires the action of fungal and bacterial enzymes to break down the lignin polymer into a complex assortment of aromatic compounds that can be transported into the cells. To improve our understanding of the utilization of lignin by microorganisms, we characterized the molecular properties of solute binding proteins of ATP-binding cassette transporter proteins that interact with these compounds. A combination of functional screens and structural studies characterized the binding specificity of the solute binding proteins for aromatic compounds derived from lignin such as p-coumarate, 3-phenylpropionic acid and compounds with more complex ring substitutions. A ligand screen based on thermal stabilization identified several binding protein clusters that exhibit preferences based on the size or number of aromatic ring substituents. Multiple X-ray crystal structures of protein-ligand complexes for these clusters identified the molecular basis of the binding specificity for the lignin-derived aromatic compounds. The screens and structural data provide new functional assignments for these solute-binding proteins which can be used to infer their transport specificity. This knowledge of the functional roles and molecular binding specificity of these proteins will support the identification of the specific enzymes and regulatory proteins of peripheral pathways that funnel these compounds to central metabolic pathways and will improve the predictive power of sequence-based functional annotation methods for this family of proteins. PMID:23606130

  6. TAL effector-DNA specificity.

    PubMed

    Scholze, Heidi; Boch, Jens

    2010-01-01

    TAL effectors are important virulence factors of bacterial plant pathogenic Xanthomonas, which infect a wide variety of plants including valuable crops like pepper, rice, and citrus. TAL proteins are translocated via the bacterial type III secretion system into host cells and induce transcription of plant genes by binding to target gene promoters. Members of the TAL effector family differ mainly in their central domain of tandemly arranged repeats of typically 34 amino acids each with hypervariable di-amino acids at positions 12 and 13. We recently showed that target DNA-recognition specificity of TAL effectors is encoded in a modular and clearly predictable mode. The repeats of TAL effectors feature a surprising one repeat-to-one-bp correlation with different repeat types exhibiting a different DNA base pair specificity. Accordingly, we predicted DNA specificities of TAL effectors and generated artificial TAL proteins with novel DNA recognition specificities. We describe here novel artificial TALs and discuss implications for the DNA recognition specificity. The unique TAL-DNA binding domain allows design of proteins with potentially any given DNA recognition specificity enabling many uses for biotechnology.
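
The one-repeat-to-one-base-pair correspondence described in this abstract lends itself to a lookup table. The Python sketch below uses four widely cited repeat-variable diresidue assignments (HD to C, NG to T, NI to A, NN to A or G); these assignments are not listed in the abstract itself and are included here as an assumption, and the function names are hypothetical. Degenerate positions are written with IUPAC codes.

```python
# Assumed RVD-to-base preferences (commonly cited; not stated in the abstract).
# "R" is the IUPAC code for A or G.
RVD_TO_BASE = {"HD": "C", "NG": "T", "NI": "A", "NN": "R"}

# Bases allowed by each IUPAC symbol used in this sketch.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "N": "ACGT"}


def predict_target(rvds, unknown="N"):
    """Translate a list of RVDs into a target pattern, one base per repeat."""
    return "".join(RVD_TO_BASE.get(rvd, unknown) for rvd in rvds)


def matches(pattern, site):
    """Return True if a concrete DNA site is compatible with the pattern."""
    if len(pattern) != len(site):
        return False
    return all(base in IUPAC[symbol] for symbol, base in zip(pattern, site))


if __name__ == "__main__":
    pattern = predict_target(["NI", "HD", "NG", "NN"])
    print(pattern, matches(pattern, "ACTG"), matches(pattern, "ACTC"))
```

A plain lookup treats every repeat as equally informative; statistical models weight repeats differently, so this table should be read as a first approximation only.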

  7. Finding the target sites of RNA-binding proteins

    PubMed Central

    Li, Xiao; Kazan, Hilal; Lipshitz, Howard D; Morris, Quaid D

    2014-01-01

    RNA–protein interactions differ from DNA–protein interactions because of the central role of RNA secondary structure. Some RNA-binding domains (RBDs) recognize their target sites mainly by their shape and geometry and others are sequence-specific but are sensitive to secondary structure context. A number of small- and large-scale experimental approaches have been developed to measure RNAs associated in vitro and in vivo with RNA-binding proteins (RBPs). Generalizing outside of the experimental conditions tested by these assays requires computational motif finding. Often RBP motif finding is done by adapting DNA motif finding methods; but modeling secondary structure context leads to better recovery of RBP-binding preferences. Genome-wide assessment of mRNA secondary structure has recently become possible, but these data must be combined with computational predictions of secondary structure before they add value in predicting in vivo binding. There are two main approaches to incorporating structural information into motif models: supplementing primary sequence motif models with preferred secondary structure contexts (e.g., MEMERIS and RNAcontext) and directly modeling secondary structure recognized by the RBP using stochastic context-free grammars (e.g., CMfinder and RNApromo). The former better reconstruct known binding preferences for sequence-specific RBPs but are not suitable for modeling RBPs that recognize shape and geometry of RNAs. Future work in RBP motif finding should incorporate interactions between multiple RBDs and multiple RBPs in binding to RNA. WIREs RNA 2014, 5:111–130. doi: 10.1002/wrna.1201 PMID:24217996

  8. Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach.

    PubMed

    Nielsen, Morten; Lundegaard, Claus; Worning, Peder; Hvid, Christina Sylvester; Lamberth, Kasper; Buus, Søren; Brunak, Søren; Lund, Ole

    2004-06-12

    Prediction of which peptides will bind a specific major histocompatibility complex (MHC) constitutes an important step in identifying potential T-cell epitopes suitable as vaccine candidates. MHC class II binding peptides have a broad length distribution complicating such predictions. Thus, identifying the correct alignment is a crucial part of identifying the core of an MHC class II binding motif. In this context, we wish to describe a novel Gibbs motif sampler method ideally suited for recognizing such weak sequence motifs. The method is based on the Gibbs sampling method, and it incorporates novel features optimized for the task of recognizing the binding motif of MHC classes I and II. The method locates the binding motif in a set of sequences and characterizes the motif in terms of a weight-matrix. Subsequently, the weight-matrix can be applied to effectively identify potential MHC binding peptides and to guide the process of rational vaccine design. We apply the motif sampler method to the complex problem of MHC class II binding. The input to the method is amino acid peptide sequences extracted from the public databases of SYFPEITHI and MHCPEP and known to bind to the MHC class II complex HLA-DR4(B1*0401). Prior identification of information-rich (anchor) positions in the binding motif is shown to improve the predictive performance of the Gibbs sampler. Similarly, a consensus solution obtained from an ensemble average over suboptimal solutions is shown to outperform the use of a single optimal solution. In a large-scale benchmark calculation, the performance is quantified using relative operating characteristics curve (ROC) plots and we make a detailed comparison of the performance with that of both the TEPITOPE method and a weight-matrix derived using the conventional alignment algorithm of ClustalW.
The calculation demonstrates that the predictive performance of the Gibbs sampler is higher than that of ClustalW and in most cases also higher than that of the TEPITOPE method.

  9. Great interactions: How binding incorrect partners can teach us about protein recognition and function.

    PubMed

    Vamparys, Lydie; Laurent, Benoist; Carbone, Alessandra; Sacquin-Mora, Sophie

    2016-10-01

    Protein-protein interactions play a key part in most biological processes and understanding their mechanism is a fundamental problem leading to numerous practical applications. The prediction of protein binding sites in particular is of paramount importance since proteins now represent a major class of therapeutic targets. Among other methods, docking simulations between two proteins known to interact can be a useful tool for the prediction of likely binding patches on a protein surface. From the analysis of the protein interfaces generated by a massive cross-docking experiment using the 168 proteins of the Docking Benchmark 2.0, where all possible protein pairs, and not only experimental ones, have been docked together, we show that it is also possible to predict a protein's binding residues without having any prior knowledge regarding its potential interaction partners. Evaluating the performance of cross-docking predictions using the area under the specificity-sensitivity ROC curve (AUC) leads to an AUC value of 0.77 for the complete benchmark (compared to the 0.5 AUC value obtained for random predictions). Furthermore, a new clustering analysis performed on the binding patches that are scattered on the protein surface shows that their distribution and growth will depend on the protein's functional group. Finally, in several cases, the binding-site predictions resulting from the cross-docking simulations will lead to the identification of an alternate interface, which corresponds to the interaction with a biomolecular partner that is not included in the original benchmark. Proteins 2016; 84:1408-1421. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
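
The AUC figures quoted in this abstract (0.77 versus 0.5 for random predictions) have a simple interpretation: the probability that a randomly chosen true binding residue receives a higher score than a randomly chosen non-binding residue. The Python sketch below computes the AUC directly from that definition; the function name and the example scores are hypothetical.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example scores higher; ties count as half a win.
    Runs in O(P * N) time, which is adequate for small illustrations.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


if __name__ == "__main__":
    # Hypothetical per-residue scores for binding and non-binding residues.
    print(roc_auc([0.9, 0.4], [0.5, 0.1]))
```

A predictor that assigns the same score to every residue ties in every pair and therefore lands at 0.5, the random baseline mentioned in the abstract.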

  10. Comprehensive analysis of T cell epitope discovery strategies using 17DD yellow fever virus structural proteins and BALB/c (H2d) mice model.

    PubMed

    Maciel, Milton; Kellathur, Srinivasan N; Chikhlikar, Pryia; Dhalia, Rafael; Sidney, John; Sette, Alessandro; August, Thomas J; Marques, Ernesto T A

    2008-08-15

    Immunomics research uses in silico epitope prediction, as well as in vivo and in vitro approaches. We inoculated BALB/c (H2d) mice with 17DD yellow fever vaccine to investigate the correlations between approaches used for epitope discovery: ELISPOT assays, binding assays, and prediction software. Our results showed good agreement between ELISPOT and binding assays, which seemed to correlate with the protein immunogenicity. PREDBALB/c prediction software partially agreed with the ELISPOT and binding assay results, but presented low specificity. The use of prediction software to exclude peptides containing no epitopes, followed by high throughput screening of the remaining peptides by ELISPOT, and the use of MHC-binding assays to characterize the MHC restrictions proved to be an efficient strategy. The results allowed the characterization of 2 MHC class I and 17 class II epitopes in the envelope protein of the YF virus in BALB/c (H2d) mice.

  11. Tb3+-cleavage assays reveal specific Mg2+ binding sites necessary to pre-fold the btuB riboswitch for AdoCbl binding

    NASA Astrophysics Data System (ADS)

    Choudhary, Pallavi K.; Gallo, Sofia; Sigel, Roland K. O.

    2017-03-01

    Riboswitches are RNA elements that bind specific metabolites in order to regulate the gene expression involved in controlling the cellular concentration of the respective molecule or ion. Ligand recognition is mostly facilitated by Mg2+ mediated pre-organization of the riboswitch to an active tertiary fold. To predict these specific Mg2+ induced tertiary interactions of the btuB riboswitch from E. coli, we here report Mg2+ binding pockets in its aptameric part in both, the ligand-free and the ligand-bound form. An ensemble of weak and strong metal ion binding sites distributed over the entire aptamer was detected by terbium(III) cleavage assays, Tb3+ being an established Mg2+ mimic. Interestingly many of the Mn+ (n = 2 or 3) binding sites involve conserved bases within the class of coenzyme B12-binding riboswitches. Comparison with the published crystal structure of the coenzyme B12 riboswitch of S. thermophilum aided in identifying a common set of Mn+ binding sites that might be crucial for tertiary interactions involved in the organization of the aptamer. Our results suggest that Mn+ binding at strategic locations of the btuB riboswitch indeed facilitates the assembly of the binding pocket needed for ligand recognition. Binding of the specific ligand, coenzyme B12 (AdoCbl), to the btuB aptamer does however not lead to drastic alterations of these Mn+ binding cores, indicating the lack of a major rearrangement within the three-dimensional structure of the RNA. This finding is strengthened by Tb3+ mediated footprints of the riboswitch's structure in its ligand-free and ligand-bound state indicating that AdoCbl indeed induces local changes rather than a global structural rearrangement.

  12. Rate and extent of protein localization is controlled by peptide-binding domain association kinetics and morphology.

    PubMed

    Mills, Evan; Truong, Kevin

    2009-06-01

    Protein localization is an important regulatory mechanism in many cell signaling pathways such as cytoskeletal organization and genetic regulation. The specific mechanism of protein localization determines the kinetics and morphological constraints of protein translocation, and thus affects the rate and extent of localization. To investigate the effect of localization kinetics and morphology on protein localization, we designed a protein localization system based on Ca(2+)-calmodulin and Src homology 3 domain binding peptides that can translocate between specific localizations in response to a Ca(2+) signal. We used a stochastic biomolecular simulator to predict that such a protein localization system will exhibit slower and less complete translocations when the association kinetics of a binding domain and peptide are reduced. We also predicted that increasing the diffusion resistance by manipulating the morphology of the system would similarly impair translocation speed and completeness. We then constructed a network of synthetic fusion proteins and showed that these predictions could be qualitatively confirmed in vitro. This work provides a basis for explaining the different characteristics (rate and extent) of protein transport and localization in cells as a consequence of the kinetics and morphology of the transport mechanism.
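
    The dependence of binding extent on association kinetics can be sketched with a generic Gillespie simulation of reversible domain-peptide binding. This is not the simulator used in the cited work; the rate constants, molecule counts, and function names are illustrative assumptions in arbitrary units, and spatial effects such as diffusion resistance are ignored.

```python
import random


def simulate_binding(k_on, k_off, n_domain, n_peptide, t_end, seed=0):
    """Gillespie simulation of domain + peptide <-> complex.

    Returns the number of complexes present at time t_end.
    k_on is a per-pair association propensity, k_off a per-complex rate.
    """
    rng = random.Random(seed)
    free_d, free_p, bound = n_domain, n_peptide, 0
    t = 0.0
    while True:
        a_on = k_on * free_d * free_p   # propensity of association
        a_off = k_off * bound           # propensity of dissociation
        a_total = a_on + a_off
        if a_total == 0.0:
            break                       # nothing can happen any more
        t += rng.expovariate(a_total)   # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * a_total < a_on:
            free_d, free_p, bound = free_d - 1, free_p - 1, bound + 1
        else:
            free_d, free_p, bound = free_d + 1, free_p + 1, bound - 1
    return bound


def mean_bound(k_on, runs=200):
    """Average complex count at t = 1 over independent seeded runs."""
    total = sum(simulate_binding(k_on, 0.1, 50, 50, t_end=1.0, seed=s)
                for s in range(runs))
    return total / runs


fast = mean_bound(k_on=0.05)    # hypothetical fast association
slow = mean_bound(k_on=0.005)   # tenfold slower association
```

    Under these assumed parameters the slower association rate leaves fewer complexes formed at the same time point, which is the qualitative trend the abstract describes.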

  13. Prediction of striatal D2 receptor binding by DRD2/ANKK1 TaqIA allele status

    PubMed Central

    Eisenstein, Sarah A.; Bogdan, Ryan; Love-Gregory, Latisha; Corral-Frías, Nadia S.; Koller, Jonathan M.; Black, Kevin J.; Moerlein, Stephen M.; Perlmutter, Joel S.; Barch, Deanna M.; Hershey, Tamara

    2016-01-01

    In humans, the A1 (T) allele of the dopamine (DA) D2 receptor/ankyrin repeat and kinase domain containing 1 (DRD2/ANKK1) TaqIA (rs1800497) single nucleotide polymorphism has been associated with reduced striatal DA D2/D3 receptor (D2/D3R) availability. However, radioligands used to estimate D2/D3R are displaceable by endogenous DA and are non-selective for D2R, leaving the relationship between TaqIA genotype and D2R specific binding uncertain. Using the positron emission tomography (PET) radioligand, (N‐[11C]methyl)benperidol ([11C]NMB), which is highly selective for D2R over D3R and is not displaceable by endogenous DA, the current study examined whether DRD2/ANKK1 TaqIA genotype predicts D2R specific binding in 2 independent samples. Sample 1 (n = 39) was composed of obese and non-obese adults; sample 2 (n = 18) was composed of healthy controls, unmedicated individuals with schizophrenia, and siblings of individuals with schizophrenia. Across both samples, A1 allele carriers (A1+) had 5-12% less striatal D2R specific binding relative to individuals homozygous for the A2 allele (A1−), regardless of body mass index or diagnostic group. This reduction is comparable to previous PET studies of D2/D3R availability (10-14%). The pooled effect size for the difference in total striatal D2R binding between A1+ and A1− was large (0.84). In summary, in line with studies using displaceable D2/D3R radioligands, our results indicate that DRD2/ANKK1 TaqIA allele status predicts striatal D2R specific binding as measured by D2R-selective [11C]NMB. These findings support the hypothesis that DRD2/ANKK1 TaqIA allele status may modify D2R, perhaps conferring risk for certain disease states. Graphical abstract: We investigated the difference in striatal dopamine D2 receptor binding, as measured by PET with (N-[11C]methyl)benperidol ([11C]NMB), between A1 allele carriers (A1+) and individuals homozygous for the A2 allele (A1−) of the DRD2/ANKK1 TaqIA single nucleotide polymorphism. In Study 1, A1+ had 5-12% less striatal [11C]NMB binding than A1−. PMID:27241797

  14. Steric and thermodynamic limits of design for the incorporation of large unnatural amino acids in aminoacyl-tRNA synthetase enzymes.

    PubMed

    Armen, Roger S; Schiller, Stefan M; Brooks, Charles L

    2010-06-01

    Orthogonal aminoacyl-tRNA synthetase/tRNA pairs from archaea have been evolved to facilitate site-specific in vivo incorporation of unnatural amino acids into proteins in Escherichia coli. Using this approach, unnatural amino acids have been successfully incorporated with high translational efficiency and fidelity. In this study, CHARMM-based molecular docking and free energy calculations were used to evaluate rational design of specific protein-ligand interactions for aminoacyl-tRNA synthetases. A series of novel unnatural amino acid ligands were docked into the p-benzoyl-L-phenylalanine tRNA synthetase, which revealed that the binding pocket of the enzyme does not provide sufficient space for significantly larger ligands. Specific binding site residues were mutated to alanine to create additional space to accommodate larger target ligands, and then mutations were introduced to improve binding free energy. This approach was used to redesign binding sites for several different target ligands, which were then tested against the standard 20 amino acids to verify target specificity. Only the synthetase designed to bind Man-alpha-O-Tyr was predicted to be sufficiently selective for the target ligand and also thermodynamically stable. Our study suggests that extensive redesign of the tRNA synthetase binding pocket for large bulky ligands may be quite thermodynamically unfavorable.

  15. Engineering the Pseudomonas aeruginosa II lectin: designing mutants with changed affinity and specificity

    NASA Astrophysics Data System (ADS)

    Kříž, Zdeněk; Adam, Jan; Mrázková, Jana; Zotos, Petros; Chatzipavlou, Thomais; Wimmerová, Michaela; Koča, Jaroslav

    2014-09-01

    This article focuses on designing mutations of the PA-IIL lectin from Pseudomonas aeruginosa that lead to a change in specificity. Following the previous results revealing the importance of the amino acid triad 22-23-24 (the so-called specificity-binding loop), saturation in silico mutagenesis was performed, with the intent of finding mutations that increase the lectin's affinity and modify its specificity. For that purpose, a combination of docking, molecular dynamics and binding free energy calculation was used. The combination of methods revealed mutations that changed the performance of the wild-type lectin and its mutants to their preferred partners. The mutation at position 22 resulted in 85 % inactivation of the binding site, and the mutation at position 23 did not have strong effects because the side chain points away from the binding site. Molecular dynamics simulations followed by binding free energy calculation were performed on mutants with promising results from docking, and also on those where the amino acid at position 24 was replaced with a bulkier or longer polar chain. The key mutants were also prepared in vitro and their binding properties determined by isothermal titration calorimetry. The combination of methods used proved able to predict changes in the lectin performance and helped in explaining the data observed experimentally.

  16. Fast and reliable prediction of domain-peptide binding affinity using coarse-grained structure models.

    PubMed

    Tian, Feifei; Tan, Rui; Guo, Tailin; Zhou, Peng; Yang, Li

    2013-07-01

    Domain-peptide recognition and interaction are fundamentally important for eukaryotic signaling and regulatory networks. It is thus essential to quantitatively infer the binding stability and specificity of such interactions based upon large-scale but low-accuracy complex structure models, which can be readily obtained from sophisticated molecular modeling procedures. In the present study, a new method is described for the fast and reliable prediction of domain-peptide binding affinity with coarse-grained structure models. This method is designed to tolerate the strong random noise involved in domain-peptide complex structures and uses a statistical modeling approach to eliminate systematic bias associated with a group of investigated samples. As a paradigm, this method was employed to model and predict the binding behavior of various peptides to four evolutionarily unrelated peptide-recognition domains (PRDs), i.e. human amph SH3, human nherf PDZ, yeast syh GYF and yeast bmh 14-3-3, and moreover, we explored the molecular mechanism and biological implication underlying the binding of cognate and noncognate peptide ligands to their domain receptors. It is expected that the newly proposed method could be further used to perform genome-wide inference of domain-peptide binding at the three-dimensional structure level. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  17. Characterizing SH2 Domain Specificity and Network Interactions Using SPOT Peptide Arrays.

    PubMed

    Liu, Bernard A

    2017-01-01

    Src Homology 2 (SH2) domains are protein interaction modules that recognize and bind tyrosine phosphorylated ligands. Their ability to discriminate among thousands of potential phosphotyrosine (pTyr) ligands within the cell is critical for the fidelity of receptor tyrosine kinase (RTK) signaling. Within humans there are over a hundred SH2 domains with several thousand potential ligands across many cell types and cell states. Therefore, defining the specificity of individual SH2 domains is critical for predicting and identifying their physiological ligands. In this chapter, I describe the broad use of SPOT peptide arrays for examining SH2 domain specificity. An orientated peptide array library (OPAL) approach can uncover both favorable and non-favorable residues, thus providing an in-depth analysis of SH2 specificity. Moreover, I discuss the application of SPOT arrays for paneling SH2 ligand binding with physiological peptides.

  18. The solution structure of the pentatricopeptide repeat protein PPR10 upon binding atpH RNA

    PubMed Central

    Gully, Benjamin S.; Cowieson, Nathan; Stanley, Will A.; Shearston, Kate; Small, Ian D.; Barkan, Alice; Bond, Charles S.

    2015-01-01

    The pentatricopeptide repeat (PPR) protein family is a large family of RNA-binding proteins that is characterized by tandem arrays of a degenerate 35-amino-acid motif which form an α-solenoid structure. PPR proteins influence the editing, splicing, translation and stability of specific RNAs in mitochondria and chloroplasts. Zea mays PPR10 is amongst the best studied PPR proteins, where sequence-specific binding to two RNA transcripts, atpH and psaJ, has been demonstrated to follow a recognition code where the identity of two amino acids per repeat determines the base-specificity. A recently solved ZmPPR10:psaJ complex crystal structure suggested a homodimeric complex with considerably fewer sequence-specific protein–RNA contacts than inferred previously. Here we describe the solution structure of the ZmPPR10:atpH complex using size-exclusion chromatography-coupled synchrotron small-angle X-ray scattering (SEC-SY-SAXS). Our results support prior evidence that PPR10 binds RNA as a monomer, and that it does so in a manner that is commensurate with a canonical and predictable RNA-binding mode across much of the RNA–protein interface. PMID:25609698

  19. MuPeXI: prediction of neo-epitopes from tumor sequencing data.

    PubMed

    Bjerregaard, Anne-Mette; Nielsen, Morten; Hadrup, Sine Reker; Szallasi, Zoltan; Eklund, Aron Charles

    2017-09-01

    Personalization of immunotherapies such as cancer vaccines and adoptive T cell therapy depends on identification of patient-specific neo-epitopes that can be specifically targeted. MuPeXI, the mutant peptide extractor and informer, is a program to identify tumor-specific peptides and assess their potential to be neo-epitopes. The program input is a file with somatic mutation calls, a list of HLA types, and optionally a gene expression profile. The output is a table with all tumor-specific peptides derived from nucleotide substitutions, insertions, and deletions, along with comprehensive annotation, including HLA binding and similarity to normal peptides. The peptides are sorted according to a priority score which is intended to roughly predict immunogenicity. We applied MuPeXI to three tumors for which predicted MHC-binding peptides had been screened for T cell reactivity, and found that MuPeXI was able to prioritize immunogenic peptides with an area under the curve of 0.63. Compared to other available tools, MuPeXI provides more information and is easier to use. MuPeXI is available as stand-alone software and as a web server at http://www.cbs.dtu.dk/services/MuPeXI .
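
    The peptide extraction step described above can be sketched for the simplest case, a single amino acid substitution. This is an illustrative sketch, not MuPeXI's code: the function name, the 0-based position convention, the default peptide lengths, and the example sequence are all assumptions.

```python
def mutant_peptides(protein, pos, alt, lengths=(9, 10, 11)):
    """Return (mutant, normal) peptide pairs spanning a missense substitution.

    protein: wild-type amino acid sequence.
    pos: 0-based index of the substituted residue.
    alt: the mutant amino acid.
    """
    if not 0 <= pos < len(protein):
        raise ValueError("position outside protein")
    mutated = protein[:pos] + alt + protein[pos + 1:]
    pairs = []
    for k in lengths:
        # Every window of length k that covers the mutated position.
        first = max(0, pos - k + 1)
        last = min(pos, len(protein) - k)
        for start in range(first, last + 1):
            pairs.append((mutated[start:start + k], protein[start:start + k]))
    return pairs


wild_type = "MKTAYIAKQRQISFVKSHFSRQ"  # made-up sequence for illustration
pairs = mutant_peptides(wild_type, pos=10, alt="L", lengths=(9,))
```

    Each mutant peptide is paired with its normal counterpart so that downstream steps, such as HLA binding prediction and similarity to normal peptides, can compare the two.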

  20. The electrostatic surface of MDM2 modulates the specificity of its interaction with phosphorylated and unphosphorylated p53 peptides.

    PubMed

    Brown, Christopher John; Srinivasan, Deepa; Jun, Lee Hui; Coomber, David; Verma, Chandra S; Lane, David P

    2008-03-01

    Fluorescence anisotropy measurements using FAM-labelled p53 peptides showed that the binding of the peptides to MDM2 was dependent upon the phosphorylation of p53 at Thr18 and that this binding was modulated by the electrostatic properties of MDM2. In agreement with computational predictions, the binding to phosphorylated p53 peptide, in comparison to the unphosphorylated p53 peptide, was enhanced upon mutation of 3 key residues on the MDM2 surface.

  1. Learning a peptide-protein binding affinity predictor with kernel ridge regression

    PubMed Central

    2013-01-01

    Background: The cellular function of a vast majority of proteins is performed through physical interactions with other biomolecules, which, most of the time, are other proteins. Peptides represent templates of choice for mimicking a secondary structure in order to modulate protein-protein interaction. They are thus an interesting class of therapeutics since they also display strong activity, high selectivity, low toxicity and few drug-drug interactions. Furthermore, predicting peptides that would bind to specific MHC alleles would be of tremendous benefit to improve vaccine-based therapy and possibly generate antibodies with greater affinity. Modern computational methods have the potential to accelerate and lower the cost of drug and vaccine discovery by selecting potential compounds for testing in silico prior to biological validation. Results: We propose a specialized string kernel for small bio-molecules, peptides and pseudo-sequences of binding interfaces. The kernel incorporates physico-chemical properties of amino acids and elegantly generalizes eight kernels, including the Oligo, the Weighted Degree, the Blended Spectrum, and the Radial Basis Function. We provide a low complexity dynamic programming algorithm for the exact computation of the kernel and a linear time algorithm for its approximation. Combined with kernel ridge regression and SupCK, a novel binding pocket kernel, the proposed kernel yields biologically relevant and good prediction accuracy on the PepX database. For the first time, a machine learning predictor is capable of predicting the binding affinity of any peptide to any protein with reasonable accuracy. The method was also applied to both single-target and pan-specific Major Histocompatibility Complex class II benchmark datasets and three Quantitative Structure Affinity Model benchmark datasets. Conclusion: On all benchmarks, our method significantly (p-value ≤ 0.057) outperforms the current state-of-the-art methods at predicting peptide-protein binding affinities. The proposed approach is flexible and can be applied to predict any quantitative biological activity. Moreover, generating reliable peptide-protein binding affinities will also improve systems biology modelling of interaction pathways. Lastly, the method should be of value to a large segment of the research community with the potential to accelerate the discovery of peptide-based drugs and facilitate vaccine development. The proposed kernel is freely available at http://graal.ift.ulaval.ca/downloads/gs-kernel/. PMID:23497081
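
    Kernel ridge regression over peptide strings can be sketched as follows. The sketch swaps in a plain 2-mer spectrum kernel, which is much simpler than the kernel proposed in this record and ignores physico-chemical properties; the peptides, affinities, regularization value, and function names are made up for illustration.

```python
from collections import Counter


def spectrum_kernel(a, b, k=2):
    """Count shared k-mers between two peptides (a simple spectrum kernel)."""
    ca = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    cb = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    return float(sum(ca[m] * cb[m] for m in ca))


def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def krr_fit(peptides, affinities, lam=1.0):
    """Return dual coefficients alpha solving (K + lam * I) alpha = y."""
    K = [[spectrum_kernel(p, q) + (lam if i == j else 0.0)
          for j, q in enumerate(peptides)]
         for i, p in enumerate(peptides)]
    return solve(K, affinities)


def krr_predict(peptide, peptides, alpha):
    """Predict affinity as a kernel-weighted sum over training peptides."""
    return sum(a * spectrum_kernel(peptide, p) for a, p in zip(alpha, peptides))


# Hypothetical training set: two strong binders and two weak binders.
train = ["AAKL", "AAKV", "GGSP", "GGSA"]
y = [1.0, 0.9, 0.1, 0.2]
alpha = krr_fit(train, y, lam=1.0)
strong = krr_predict("AAKI", train, alpha)
weak = krr_predict("GGST", train, alpha)
```

    Because the model only sees sequences through the kernel, replacing `spectrum_kernel` with a richer similarity function changes the predictor without touching the regression code.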

  2. Saccharomyces cerevisiae MSH2-MSH3 and MSH2-MSH6 complexes display distinct requirements for DNA binding Domain I in mismatch recognition.

    PubMed Central

    Lee, Susan D.; Surtees, Jennifer A.; Alani, Eric

    2007-01-01

    In eukaryotic mismatch repair (MMR) MSH2-MSH6 initiates the repair of base-base and small insertion/deletion mismatches while MSH2-MSH3 repairs larger insertion/deletion mismatches. In this study we showed that the msh2Δ1 mutation, containing a complete deletion of the conserved mismatch recognition Domain I of MSH2, conferred a separation of function phenotype with respect to MSH2-MSH3 and MSH2-MSH6 functions. Strains bearing the msh2Δ1 mutation were nearly wild-type in MSH2-MSH6-mediated MMR and in suppressing recombination between DNA sequences predicted to form mismatches recognized by MSH2-MSH6. However, these strains were completely defective in MSH2-MSH3-mediated MMR and recombination functions. This information encouraged us to analyze the contributions of Domain I to the mismatch binding specificity of MSH2-MSH3 in genetic and biochemical assays. We found that Domain I in MSH2 contributed a non-specific DNA binding activity while Domain I of MSH3 appeared important for mismatch binding specificity and for suppressing non-specific DNA-binding. These observations reveal distinct requirements for the MSH2 DNA binding Domain I in the repair of DNA mismatches and suggest that the binding of MSH2-MSH3 to mismatch DNA involves protein-DNA contacts that appear very different from those required for MSH2-MSH6 mismatch binding. PMID:17157869

  3. Saccharomyces cerevisiae MSH2-MSH3 and MSH2-MSH6 complexes display distinct requirements for DNA binding domain I in mismatch recognition.

    PubMed

    Lee, Susan D; Surtees, Jennifer A; Alani, Eric

    2007-02-09

    In eukaryotic mismatch repair (MMR) MSH2-MSH6 initiates the repair of base-base and small insertion/deletion mismatches while MSH2-MSH3 repairs larger insertion/deletion mismatches. Here, we show that the msh2Delta1 mutation, containing a complete deletion of the conserved mismatch recognition domain I of MSH2, conferred a separation of function phenotype with respect to MSH2-MSH3 and MSH2-MSH6 functions. Strains bearing the msh2Delta1 mutation were nearly wild-type in MSH2-MSH6-mediated MMR and in suppressing recombination between DNA sequences predicted to form mismatches recognized by MSH2-MSH6. However, these strains were completely defective in MSH2-MSH3-mediated MMR and recombination functions. This information encouraged us to analyze the contributions of domain I to the mismatch binding specificity of MSH2-MSH3 in genetic and biochemical assays. We found that domain I in MSH2 contributed a non-specific DNA binding activity while domain I of MSH3 appeared important for mismatch binding specificity and for suppressing non-specific DNA binding. These observations reveal distinct requirements for the MSH2 DNA binding domain I in the repair of DNA mismatches and suggest that the binding of MSH2-MSH3 to mismatch DNA involves protein-DNA contacts that appear very different from those required for MSH2-MSH6 mismatch binding.

  4. Characterization of the molecular basis of group II intron RNA recognition by CRS1-CRM domains.

    PubMed

    Keren, Ido; Klipcan, Liron; Bezawork-Geleta, Ayenachew; Kolton, Max; Shaya, Felix; Ostersetzer-Biran, Oren

    2008-08-22

    CRM (chloroplast RNA splicing and ribosome maturation) is a recently recognized RNA-binding domain of ancient origin that has been retained in eukaryotic genomes only within the plant lineage. Whereas in bacteria CRM domains exist as single domain proteins involved in ribosome maturation, in plants they are found in a family of proteins that contain between one and four repeats. Several members of this family with multiple CRM domains have been shown to be required for the splicing of specific plastidic group II introns. Detailed biochemical analysis of one of these factors in maize, CRS1, demonstrated its high affinity and specific binding to the single group II intron whose splicing it facilitates, the plastid-encoded atpF intron RNA. Through its association with two intronic regions, CRS1 guides the folding of atpF intron RNA into its predicted "catalytically active" form. To understand how multiple CRM domains cooperate to achieve high affinity sequence-specific binding to RNA, we analyzed the RNA binding affinity and specificity associated with each individual CRM domain in CRS1; whereas CRM3 bound tightly to the RNA, CRM1 associated specifically with a unique region found within atpF intron domain I. CRM2, which demonstrated only low binding affinity, also seems to form specific interactions with regions localized to domains I, III, and IV. We further show that CRM domains share structural similarities and RNA binding characteristics with the well known RNA recognition motif domain.

  5. Ribonucleoprotein complexes in neurologic diseases.

    PubMed

    Ule, Jernej

    2008-10-01

    Ribonucleoprotein (RNP) complexes regulate the tissue-specific RNA processing and transport that increases the coding capacity of our genome and the ability to respond quickly and precisely to a diverse set of signals. This review focuses on three proteins that are part of RNP complexes in most cells of our body: TAR DNA-binding protein (TDP-43), the survival motor neuron protein (SMN), and fragile-X mental retardation protein (FMRP). In particular, the review asks why these ubiquitous proteins are primarily associated with defects in specific regions of the central nervous system. To answer this question, it is important to understand the role of genetic and cellular environment in causing the defect in the protein, as well as how the defective protein leads to misregulation of specific target RNAs. Two approaches for comprehensive analysis of defective RNA-protein interactions are presented. The first approach defines the RNA code, or the collection of proteins that bind to a certain cis-acting RNA site in order to lead to a predictable outcome. The second approach defines the RNA map, or the summary of positions on target RNAs where binding of a particular RNA-binding protein leads to a predictable outcome. As we learn more about the RNA codes and maps that guide the action of the dynamic RNP world in our brain, possibilities for new treatments of neurologic diseases are bound to emerge.

  6. Improved binding affinity and interesting selectivities of aminopyrimidine-bearing carbohydrate receptors in comparison with their aminopyridine analogues.

    PubMed

    Lippe, Jan; Seichter, Wilhelm; Mazik, Monika

    2015-12-28

    Due to the problems with the exact prediction of the binding properties of an artificial carbohydrate receptor, the identification of characteristic structural features, having the ability to influence the binding properties in a predictable way, is of high importance. The purpose of our investigation was to examine whether the previously observed higher affinity of 2-aminopyrimidine-bearing carbohydrate receptors in comparison with aminopyridine substituted analogues represents a general tendency of aminopyrimidine-bearing compounds. Systematic binding studies on new compounds consisting of 2-aminopyrimidine groups confirmed such a tendency and allowed the identification of interesting structure-activity relationships. Receptors having different symmetries showed systematic preferences for specific glycosides, which are remarkable for such simple receptor systems. Particularly suitable receptor architectures for the recognition of selected glycosides were identified and represent a valuable base for further developments in this field.

  7. Measurements of nonlinear Hall-driven reconnection in the reversed field pinch

    NASA Astrophysics Data System (ADS)

    Tharp, Timothy D.

    Complex organisms are able to develop because of the complex regulatory systems that control their gene expression. The first step in this regulation, transcription initiation, is controlled by transcription factors. Transcription factors are modular proteins composed of two distinct domains, the DNA binding domain and the regulatory domain. These molecules are involved in a plethora of important biological processes including embryogenesis, development, cell health, and cancer. Tissue enriched transcription factors Nkx-2.5 and Gata4 are involved in cardiac development and cardiac health. In this thesis the DNA binding specificity of Nkx-2.5 will be analyzed using a high throughput double stranded DNA platform called Cognate Site Identifier (CSI) arrays (Chapter 2). The full DNA binding specificity of Nkx-2.5 and Nkx-2.5 mutants will be visualized using Sequence Specificity Landscapes (SSLs). In Chapter 3, the definition of binding specificity will be investigated by evaluating a number of different DNA binding folds by CSI and SSLs. CSI and SSLs will also be used to evaluate different pyrrole/imidazole hairpin polyamides in order to better characterize these small molecule DNA binding domains. CSI and SSL data will be applied to the genome in order to explain the biological function of an artificial transcription factor. Chapter 4 will discuss the mechanism of nonspecific DNA binding. The historical means of predicting DNA binding will be challenged by utilizing high throughput experiments. The effect of salt concentration on both specific and nonspecific binding will also be investigated. Finally, in Chapter 5, a generation of Protein DNA Dimerizer will be discussed. A PDD that regulates transcription on genomic DNA by binding cooperatively with the heart IF Gata4 will be characterized. These studies provide understanding of, and a means to control, how transcription factors sample the endless sea of DNA in the genome in order to regulate gene expression with such wonderful specificity.

  8. Genome-wide inference of transcription factor-DNA binding specificity in cell regeneration using a combination strategy.

    PubMed

    Wang, Xiaofeng; Zhang, Aiqun; Ren, Weizheng; Chen, Caiyu; Dong, Jiahong

    2012-11-01

    The cell growth, development, and regeneration of tissues and organs are associated with a large number of gene regulation events, which are mediated in part by transcription factors (TFs) binding to cis-regulatory elements in the genome. Predicting the binding affinity and inferring the binding specificity of TF-DNA interactions at the genomic level would be fundamentally helpful for our understanding of the molecular mechanism and biological implication underlying sequence-specific TF-DNA recognition. In this study, we report the development of a combination method to characterize the interaction behavior of an 11-mer oligonucleotide segment and its mutations with the Gcn4p protein, a homodimeric, basic leucine zipper TF, and to predict the binding affinity and specificity of potential Gcn4p binders at the genome-wide scale. In this procedure, a position-mutated energy matrix is created based on molecular modeling analysis of native and mutated Gcn4p-DNA complex structures to describe the position-independent interaction energy profile of Gcn4p with different nucleotide types at each position of the oligonucleotide, and the energy terms extracted from the matrix and their interactions are then correlated with experimentally measured affinities of 19268 distinct oligonucleotides using statistical modeling methodology. Subsequently, the best of the built regression models is successfully applied to screen potential high-affinity Gcn4p binders from the complete genome. The findings arising from this study are briefly listed below: (i) The 11 positions of oligonucleotides are highly interactive and non-additive in contribution to Gcn4p-DNA binding affinity; (ii) Indirect conformational effects upon nucleotide mutations as well as associated subtle changes in interfacial atomic contacts, but not the direct nonbonded interactions, are primarily responsible for the sequence-specific recognition; (iii) The intrinsic synergistic effects among the sequence positions of oligonucleotides determine Gcn4p-DNA binding affinity and specificity; (iv) Linear regression models in conjunction with variable selection seem to perform fairly well in capturing the internal dependences hidden in the Gcn4p-DNA system, although ignoring nonlinear factors may lead the models to systematically underestimate and overestimate high- and low-affinity samples, respectively. © 2012 John Wiley & Sons A/S.
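
    Screening a sequence with a per-position energy matrix can be sketched as below. The sketch is purely additive, so it is only the baseline that the abstract's non-additivity finding argues against, and it omits the regression step. The 4-position matrix, its energy values, and the scanned sequence are hypothetical; the study itself works with 11-mers.

```python
def site_energy(site, matrix):
    """Sum per-position energies; matrix[i][base] is the energy at position i."""
    return sum(matrix[i][base] for i, base in enumerate(site))


def scan(sequence, matrix, top=3):
    """Return the `top` lowest-energy windows as (energy, start, site) tuples."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        hits.append((site_energy(site, matrix), start, site))
    return sorted(hits)[:top]


# Hypothetical 4-position matrix (arbitrary units, lower = tighter binding).
matrix = [
    {"A": 0.5, "C": 0.8, "G": 0.9, "T": -1.0},
    {"A": 0.7, "C": 0.6, "G": -1.2, "T": 0.4},
    {"A": -0.9, "C": 0.5, "G": 0.3, "T": 0.6},
    {"A": 0.2, "C": -1.1, "G": 0.7, "T": 0.5},
]
best = scan("ACGTGACTTGACA", matrix, top=1)[0]
```

    Capturing the interactions between positions that the study reports would require extra terms, for example pairwise products of position energies fed into a regression model.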

  9. A multiprotein binding interface in an intrinsically disordered region of the tumor suppressor protein interferon regulatory factor-1.

    PubMed

    Narayan, Vikram; Halada, Petr; Hernychová, Lenka; Chong, Yuh Ping; Žáková, Jitka; Hupp, Ted R; Vojtesek, Borivoj; Ball, Kathryn L

    2011-04-22

    The interferon-regulated transcription factor and tumor suppressor protein IRF-1 is predicted to be largely disordered outside of the DNA-binding domain. One of the advantages of intrinsically disordered protein domains is thought to be their ability to take part in multiple, specific but low affinity protein interactions; however, relatively few IRF-1-interacting proteins have been described. The recent identification of a functional binding interface for the E3-ubiquitin ligase CHIP within the major disordered domain of IRF-1 led us to ask whether this region might be employed more widely by regulators of IRF-1 function. Here we describe the use of peptide aptamer-based affinity chromatography coupled with mass spectrometry to define a multiprotein binding interface on IRF-1 (Mf2 domain; amino acids 106-140) and to identify Mf2-binding proteins from A375 cells. Based on their function as known transcriptional regulators, a selection of the Mf2 domain-binding proteins (NPM1, TRIM28, and YB-1) have been validated using in vitro and cell-based assays. Interestingly, although NPM1, TRIM28, and YB-1 all bind to the Mf2 domain, they have differing amino acid specificities, demonstrating the degree of combinatorial diversity and specificity available through linear interaction motifs.

  10. Motif discovery with data mining in 3D protein structure databases: discovery, validation and prediction of the U-shape zinc binding ("Huf-Zinc") motif.

    PubMed

    Maurer-Stroh, Sebastian; Gao, He; Han, Hao; Baeten, Lies; Schymkowitz, Joost; Rousseau, Frederic; Zhang, Louxin; Eisenhaber, Frank

    2013-02-01

    Data mining in protein databases, derivatives from more fundamental protein 3D structure and sequence databases, has considerable unearthed potential for the discovery of sequence motif--structural motif--function relationships as the finding of the U-shape (Huf-Zinc) motif, originally a small student's project, exemplifies. The metal ion zinc is critically involved in universal biological processes, ranging from protein-DNA complexes and transcription regulation to enzymatic catalysis and metabolic pathways. Proteins have evolved a series of motifs to specifically recognize and bind zinc ions. Many of these, so called zinc fingers, are structurally independent globular domains with discontinuous binding motifs made up of residues mostly far apart in sequence. Through a systematic approach starting from the BRIX structure fragment database, we discovered that there exists another predictable subset of zinc-binding motifs that not only have a conserved continuous sequence pattern but also share a characteristic local conformation, despite being included in totally different overall folds. While this does not allow general prediction of all Zn binding motifs, a HMM-based web server, Huf-Zinc, is available for prediction of these novel, as well as conventional, zinc finger motifs in protein sequences. The Huf-Zinc webserver can be freely accessed through this URL (http://mendel.bii.a-star.edu.sg/METHODS/hufzinc/).

  11. Understanding Transcription Factor Regulation by Integrating Gene Expression and DNase I Hypersensitive Sites.

    PubMed

    Wang, Guohua; Wang, Fang; Huang, Qian; Li, Yu; Liu, Yunlong; Wang, Yadong

    2015-01-01

    Transcription factors are proteins that bind to DNA sequences to regulate gene transcription. The transcription factor binding sites are short DNA sequences (5-20 bp long) specifically bound by one or more transcription factors. The identification of transcription factor binding sites and prediction of their function continue to be challenging problems in computational biology. In this study, by integrating the DNase I hypersensitive sites with known position weight matrices in the TRANSFAC database, the transcription factor binding sites in gene regulatory regions are identified. Based on the global gene expression patterns in cervical cancer HeLaS3 cells and HelaS3-ifnα4h cells (HeLaS3 cells treated with interferon for 4 hours), we present a model-based computational approach to predict a set of transcription factors that potentially cause such differential gene expression. Significantly, 6 out of 10 predicted functional factors, including IRF, IRF-2, IRF-9, IRF-1, IRF-3 and ICSBP, belong to the interferon regulatory factor family and upregulate gene expression levels in response to the interferon treatment. Another factor, ISGF-3, is also a transcriptional activator induced by interferon alpha. The predictions of our model remain consistent under different selection criteria for transcription factor binding sites. Our model demonstrated the potential to computationally identify the functional transcription factors in gene regulation.
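
    The integration step in this abstract, combining position weight matrices with DNase I hypersensitive sites, can be illustrated with a minimal sketch: score every window of a sequence against a log-odds matrix, but keep only hits that lie inside an accessible interval. The count matrix, sequence, interval, pseudocount and threshold below are invented for illustration; they are not TRANSFAC data and this is not the authors' pipeline.

```python
import math

# Hypothetical 4-position count matrix (one dict per motif position).
COUNTS = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
BACKGROUND = 0.25   # assumed uniform base composition
PSEUDOCOUNT = 0.5   # assumed smoothing value to avoid log(0)

def log_odds_matrix(counts):
    """Convert counts to log2(frequency / background) per position and base."""
    pwm = []
    for col in counts:
        total = sum(col.values()) + 4 * PSEUDOCOUNT
        pwm.append({b: math.log2(((col[b] + PSEUDOCOUNT) / total) / BACKGROUND)
                    for b in "ACGT"})
    return pwm

def scan(sequence, pwm, open_regions, threshold):
    """Return (start, score) for windows inside an open region scoring >= threshold.

    open_regions holds half-open (start, end) intervals, standing in for
    DNase I hypersensitive sites.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        end = start + width
        if not any(lo <= start and end <= hi for lo, hi in open_regions):
            continue  # window is not fully inside accessible chromatin
        score = sum(pwm[i][sequence[start + i]] for i in range(width))
        if score >= threshold:
            hits.append((start, round(score, 2)))
    return hits

PWM = log_odds_matrix(COUNTS)
SEQ = "TTACGTGGACGTAA"
hits_open = scan(SEQ, PWM, open_regions=[(0, 8)], threshold=4.0)
hits_all = scan(SEQ, PWM, open_regions=[(0, len(SEQ))], threshold=4.0)
```

    Here the motif occurs twice in `SEQ`, but only the first occurrence falls inside the assumed accessible interval, so `hits_open` contains one hit while `hits_all` contains two. Restricting the scan in this way is one simple means of reducing matches in inaccessible chromatin.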

  12. footprintDB: a database of transcription factors with annotated cis elements and binding interfaces.

    PubMed

    Sebastian, Alvaro; Contreras-Moreira, Bruno

    2014-01-15

    Traditional and high-throughput techniques for determining transcription factor (TF) binding specificities are generating large volumes of data of uneven quality, which are scattered across individual databases. FootprintDB integrates some of the most comprehensive freely available libraries of curated DNA binding sites and systematically annotates the binding interfaces of the corresponding TFs. The first release contains 2422 unique TF sequences, 10 112 DNA binding sites and 3662 DNA motifs. A survey of the included data sources, organisms and TF families was performed together with proprietary database TRANSFAC, finding that footprintDB has a similar coverage of multicellular organisms, while also containing bacterial regulatory data. A search engine has been designed that drives the prediction of DNA motifs for input TFs, or conversely of TF sequences that might recognize input regulatory sequences, by comparison with database entries. Such predictions can also be extended to a single proteome chosen by the user, and results are ranked in terms of interface similarity. Benchmark experiments with bacterial, plant and human data were performed to measure the predictive power of footprintDB searches, which were able to correctly recover 10, 55 and 90% of the tested sequences, respectively. Correctly predicted TFs had a higher interface similarity than the average, confirming its diagnostic value. Web site implemented in PHP, Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.

  13. RNA-protein binding motifs mining with a new hybrid deep learning based cross-domain knowledge integration approach.

    PubMed

    Pan, Xiaoyong; Shen, Hong-Bin

    2017-02-28

    RNAs play key roles in cells through their interactions with proteins known as RNA-binding proteins (RBPs), and their binding motifs enable crucial understanding of the post-transcriptional regulation of RNAs. How the RBPs correctly recognize the target RNAs and why they bind specific positions is still far from clear. Machine learning-based algorithms are widely acknowledged to be capable of speeding up this process. Although many automatic tools have been developed to predict the RNA-protein binding sites from the rapidly growing multi-resource data, e.g. sequence, structure, their domain specific features and formats have posed significant computational challenges. One of the current difficulties is that the cross-source shared common knowledge is at a higher abstraction level beyond the observed data, resulting in a low efficiency of direct integration of observed data across domains. The other difficulty is how to interpret the prediction results. Existing approaches tend to terminate after outputting the potential discrete binding sites on the sequences, but how to assemble them into meaningful binding motifs is a topic worthy of further investigation. In view of these challenges, we propose a deep learning-based framework (iDeep) that uses a novel hybrid convolutional neural network and deep belief network to predict the RBP interaction sites and motifs on RNAs. This new protocol is characterized by transforming the original observed data into a high-level abstraction feature space using multiple layers of learning blocks, where the shared representations across different domains are integrated. 
To validate our iDeep method, we performed experiments on 31 large-scale CLIP-seq datasets, and our results show that by integrating multiple sources of data, the average AUC can be improved by 8% compared to the best single-source-based predictor; and through cross-domain knowledge integration at an abstraction level, it outperforms the state-of-the-art predictors by 6%. Besides the overall enhanced prediction performance, the convolutional neural network module embedded in iDeep is also able to automatically capture the interpretable binding motifs for RBPs. Large-scale experiments demonstrate that these mined binding motifs agree well with the experimentally verified results, suggesting iDeep is a promising approach in real-world applications. The iDeep framework not only achieves better performance than the state-of-the-art predictors, but also readily captures interpretable binding motifs. iDeep is available at http://www.csbio.sjtu.edu.cn/bioinf/iDeep.

  14. Insights into an original pocket-ligand pair classification: a promising tool for ligand profile prediction.

    PubMed

    Pérot, Stéphanie; Regad, Leslie; Reynès, Christelle; Spérandio, Olivier; Miteva, Maria A; Villoutreix, Bruno O; Camproux, Anne-Claude

    2013-01-01

    Pockets are today at the cornerstones of modern drug discovery projects and at the crossroad of several research fields, from structural biology to mathematical modeling. Being able to predict if a small molecule could bind to one or more protein targets or if a protein could bind to some given ligands is very useful for drug discovery endeavors, anticipation of binding to off- and anti-targets. To date, several studies explore such questions from chemogenomic approach to reverse docking methods. Most of these studies have been performed either from the viewpoint of ligands or targets. However it seems valuable to use information from both ligands and target binding pockets. Hence, we present a multivariate approach relating ligand properties with protein pocket properties from the analysis of known ligand-protein interactions. We explored and optimized the pocket-ligand pair space by combining pocket and ligand descriptors using Principal Component Analysis and developed a classification engine on this paired space, revealing five main clusters of pocket-ligand pairs sharing specific and similar structural or physico-chemical properties. These pocket-ligand pair clusters highlight correspondences between pocket and ligand topological and physico-chemical properties and capture relevant information with respect to protein-ligand interactions. Based on these pocket-ligand correspondences, a protocol of prediction of clusters sharing similarity in terms of recognition characteristics is developed for a given pocket-ligand complex and gives high performances. It is then extended to cluster prediction for a given pocket in order to acquire knowledge about its expected ligand profile or to cluster prediction for a given ligand in order to acquire knowledge about its expected pocket profile. 
This prediction approach shows promising results and could contribute to predict some ligand properties critical for binding to a given pocket, and conversely, some key pocket properties for ligand binding.

  15. Insights into an Original Pocket-Ligand Pair Classification: A Promising Tool for Ligand Profile Prediction

    PubMed Central

    Reynès, Christelle; Spérandio, Olivier; Miteva, Maria A.; Villoutreix, Bruno O.; Camproux, Anne-Claude

    2013-01-01

    Pockets are today at the cornerstones of modern drug discovery projects and at the crossroad of several research fields, from structural biology to mathematical modeling. Being able to predict if a small molecule could bind to one or more protein targets or if a protein could bind to some given ligands is very useful for drug discovery endeavors, anticipation of binding to off- and anti-targets. To date, several studies explore such questions from chemogenomic approach to reverse docking methods. Most of these studies have been performed either from the viewpoint of ligands or targets. However it seems valuable to use information from both ligands and target binding pockets. Hence, we present a multivariate approach relating ligand properties with protein pocket properties from the analysis of known ligand-protein interactions. We explored and optimized the pocket-ligand pair space by combining pocket and ligand descriptors using Principal Component Analysis and developed a classification engine on this paired space, revealing five main clusters of pocket-ligand pairs sharing specific and similar structural or physico-chemical properties. These pocket-ligand pair clusters highlight correspondences between pocket and ligand topological and physico-chemical properties and capture relevant information with respect to protein-ligand interactions. Based on these pocket-ligand correspondences, a protocol of prediction of clusters sharing similarity in terms of recognition characteristics is developed for a given pocket-ligand complex and gives high performances. It is then extended to cluster prediction for a given pocket in order to acquire knowledge about its expected ligand profile or to cluster prediction for a given ligand in order to acquire knowledge about its expected pocket profile. 
This prediction approach shows promising results and could contribute to predict some ligand properties critical for binding to a given pocket, and conversely, some key pocket properties for ligand binding. PMID:23840299

  16. Electrostatic Interactions Mediate Binding of Obscurin to Small Ankyrin 1: Biochemical and Molecular Modeling Studies

    PubMed Central

    Busby, Ben; Oashi, Taiji; Willis, Chris D.; Ackermann, Maegen A.; Kontrogianni-Konstantopoulos, Aikaterini; MacKerell, Alexander D.; Bloch, Robert J.

    2012-01-01

    Small ankyrin 1 (sAnk1; also Ank1.5) is an integral protein of the sarcoplasmic reticulum in skeletal and cardiac muscle cells, where it is thought to bind to the C-terminal region of obscurin, a large modular protein that surrounds the contractile apparatus. Using fusion proteins in vitro, in combination with site directed mutagenesis and surface plasmon resonance measurements, we previously showed that the binding site on sAnk1 for obscurin consists in part of six lysine and arginine residues. Here we show that four charged residues in the high affinity binding site on obscurin for sAnk1, between residues 6316-6345, consisting of three glutamates and a lysine, are necessary, but not sufficient, for this site on obscurin to bind with high affinity to sAnk1. We also identify specific complementary mutations in sAnk1 that can partially or completely compensate for the changes in binding caused by charge-switching mutations in obscurin. We used molecular modeling to develop structural models of residues 6322-6339 of obscurin bound to sAnk1. The models, based on a combination of Brownian and molecular dynamics simulations, predict that the binding site on sAnk1 for obscurin is organized as two ankyrin-like repeats, with the last α-helical segment oriented at an angle to the nearby helices, allowing lysine-6338 of obscurin to form an ionic interaction with aspartate-111 of sAnk1. This prediction was validated by double mutant cycle experiments. Our results are consistent with a model in which electrostatic interactions between specific pairs of side chains on obscurin and sAnk1 promote binding and complex formation. PMID:21333652
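
    The abstract reports validation by double mutant cycle experiments without giving numbers. The usual arithmetic behind such a cycle can be sketched as below, assuming binding free energies are obtained from dissociation constants as ΔG = RT ln(Kd) and the coupling energy is ΔG(double) - ΔG(mutant A) - ΔG(mutant B) + ΔG(wild type). The Kd values are invented for illustration, and sign conventions differ between papers.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    return R_KCAL * temperature_k * math.log(kd_molar)

def coupling_energy(kd_wt, kd_mut_a, kd_mut_b, kd_double, temperature_k=298.15):
    """Double mutant cycle coupling energy in kcal/mol.

    A value near zero means the two mutations act independently; a value
    clearly different from zero suggests the two residues interact.
    """
    def g(kd):
        return binding_free_energy(kd, temperature_k)
    return g(kd_double) - g(kd_mut_a) - g(kd_mut_b) + g(kd_wt)

# Hypothetical case 1: effects are additive (10-fold and 100-fold losses multiply).
independent = coupling_energy(1e-7, 1e-6, 1e-5, 1e-4)
# Hypothetical case 2: the double mutant binds 10-fold better than additivity predicts.
coupled = coupling_energy(1e-7, 1e-6, 1e-5, 1e-5)
```

    In the first case the coupling energy is essentially zero, while in the second it is about -1.4 kcal/mol, the signature one would look for when testing a predicted side-chain pair such as the lysine-aspartate interaction described above.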

  17. Cloning of an SNF2/SWI2-related protein that binds specifically to the SPH motifs of the SV40 enhancer and to the HIV-1 promoter.

    PubMed

    Sheridan, P L; Schorpp, M; Voz, M L; Jones, K A

    1995-03-03

    We have isolated a human cDNA clone encoding HIP116, a protein that binds to the SPH repeats of the SV40 enhancer and to the TATA/inhibitor region of the human immunodeficiency virus (HIV)-1 promoter. The predicted HIP116 protein is related to the yeast SNF2/SWI2 transcription factor and to other members of this extended family and contains seven domains similar to those found in the vaccinia NTP1 ATPase. Interestingly, HIP116 also contains a C3HC4 zinc-binding motif (RING finger) interspersed between the ATPase motifs in an arrangement similar to that found in the yeast RAD5 and RAD16 proteins. The HIP116 amino terminus is unique among the members of this family, and houses a specific DNA-binding domain. Antiserum raised against HIP116 recognizes a 116-kDa nuclear protein in Western blots and specifically supershifts SV40 and HIV-1 protein-DNA complexes in gel shift experiments. The binding site for HIP116 on the SV40 enhancer directly overlaps the site for TEF-1, and like TEF-1, binding of HIP116 to the SV40 enhancer is destroyed by mutations that inhibit SPH enhancer activity in vivo. Purified fractions of HIP116 display strong ATPase activity that is preferentially stimulated by SPH DNA and can be inhibited specifically by antibodies to HIP116. These findings suggest that HIP116 might affect transcription, directly or indirectly, by acting as a DNA binding site-specific ATPase.

  18. DNA breathing dynamics distinguish binding from nonbinding consensus sites for transcription factor YY1 in cells.

    PubMed

    Alexandrov, Boian S; Fukuyo, Yayoi; Lange, Martin; Horikoshi, Nobuo; Gelev, Vladimir; Rasmussen, Kim Ø; Bishop, Alan R; Usheva, Anny

    2012-11-01

    The genome-wide mapping of the major gene expression regulators, the transcription factors (TFs) and their DNA binding sites, is of great importance for describing cellular behavior and phenotypic diversity. Presently, the methods for prediction of genomic TF binding produce a large number of false positives, most likely due to insufficient description of the physicochemical mechanisms of protein-DNA binding. Growing evidence suggests that, in the cell, the double-stranded DNA (dsDNA) is subject to local transient strand separations (breathing) that contribute to genomic functions. By using site-specific chromatin immunoprecipitations, gel shifts, BIOBASE data, and our model that accurately describes the melting behavior and breathing dynamics of dsDNA, we report a specific DNA breathing profile found at YY1 binding sites in cells. We find that the genomic flanking sequence variations and SNPs may exert long-range effects on DNA dynamics and predetermine YY1 binding. The ubiquitous TF YY1 has a fundamental role in essential biological processes by activating, initiating or repressing transcription depending upon the sequence context it binds. We anticipate that consensus binding sequences together with the related DNA dynamics profile may significantly improve the accuracy of predicting genomic TF binding sites and TF binding-related functional SNPs.

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Giuliani, Sarah E; Frank, Ashley M; Corgliano, Danielle M

    Background: Transporter proteins are one of an organism's primary interfaces with the environment. The expressed set of transporters mediates cellular metabolic capabilities and influences signal transduction pathways and regulatory networks. The functional annotation of most transporters is currently limited to general classification into families. The development of capabilities to map ligands with specific transporters would improve our knowledge of the function of these proteins, improve the annotation of related genomes, and facilitate predictions for their role in cellular responses to environmental changes. Results: To improve the utility of the functional annotation for ABC transporters, we expressed and purified the set of solute binding proteins from Rhodopseudomonas palustris and characterized their ligand-binding specificity. Our approach utilized ligand libraries consisting of environmental and cellular metabolic compounds, and fluorescence thermal shift based high throughput ligand binding screens. This process resulted in the identification of specific binding ligands for approximately 64% of the purified and screened proteins. The collection of binding ligands is representative of common functionalities associated with many bacterial organisms as well as specific capabilities linked to the ecological niche occupied by R. palustris. Conclusion: The functional screen identified specific ligands that bound to ABC transporter periplasmic binding subunits from R. palustris. These assignments provide unique insight for the metabolic capabilities of this organism and are consistent with the ecological niche of strain isolation. This functional insight can be used to improve the annotation of related organisms and provides a route to evaluate the evolution of this important and diverse group of transporter proteins.

  20. Steric and Thermodynamic Limits of Design for the Incorporation of Large UnNatural Amino Acids in Aminoacyl-tRNA Synthetase Enzymes

    PubMed Central

    Armen, Roger S.; Schiller, Stefan M.; Brooks, Charles L.

    2015-01-01

    Orthogonal aminoacyl-tRNA synthetase/tRNA pairs from archaea have been evolved to facilitate site specific in vivo incorporation of unnatural amino acids into proteins in Escherichia coli. Using this approach, unnatural amino acids have been successfully incorporated with high translational efficiency and fidelity. In this study, CHARMM-based molecular docking and free energy calculations were used to evaluate rational design of specific protein-ligand interactions for aminoacyl-tRNA synthetases. A series of novel unnatural amino acid ligands were docked into the p-benzoyl-L-phenylalanine tRNA synthetase, which revealed that the binding pocket of the enzyme does not provide sufficient space for significantly larger ligands. Specific binding site residues were mutated to alanine to create additional space to accommodate larger target ligands, and then mutations were introduced to improve binding free energy. This approach was used to redesign binding sites for several different target ligands, which were then tested against the standard 20 amino acids to verify target specificity. Only the synthetase designed to bind Man-α-O-Tyr was predicted to be sufficiently selective for the target ligand and also thermodynamically stable. Our study suggests that extensive redesign of the tRNA synthetase binding pocket for large bulky ligands may be quite thermodynamically unfavorable. PMID:20310065

  1. Two residues in the basic region of the yeast transcription factor Yap8 are crucial for its DNA-binding specificity.

    PubMed

    Amaral, Catarina; Pimentel, Catarina; Matos, Rute G; Arraiano, Cecília M; Matzapetakis, Manolis; Rodrigues-Pousada, Claudina

    2013-01-01

    In Saccharomyces cerevisiae, the transcription factor Yap8 is a key determinant in arsenic stress response. Contrary to Yap1, another basic region-leucine zipper (bZIP) yeast regulator, Yap8 has a very restricted DNA-binding specificity and only orchestrates the expression of ACR2 and ACR3 genes. In the DNA-binding basic region, Yap8 has three distinct amino acid residues, Leu26, Ser29 and Asn31, at positions that are highly conserved in the other members of the Yap family of transcriptional regulators and in Pap1 of Schizosaccharomyces pombe. To evaluate whether these residues are relevant to Yap8 specificity, we first built a homology model of the complex Yap8bZIP-DNA based on Pap1-DNA crystal structure. Several Yap8 mutants were then generated in order to confirm the contribution of the residues predicted to interact with DNA. Using bioinformatics analysis together with in vivo and in vitro approaches, we have identified several conserved residues critical for Yap8-DNA binding. Moreover, our data suggest that Leu26 is required for Yap8 binding to DNA and that this residue, together with Asn31, hinders Yap1 response element recognition by Yap8, thus narrowing its DNA-binding specificity. Furthermore, our results point to a role of these two amino acids in the stability of the Yap8-DNA complex.

  2. Sequence2Vec: a novel embedding approach for modeling transcription factor binding affinity landscape.

    PubMed

    Dai, Hanjun; Umarov, Ramzan; Kuwahara, Hiroyuki; Li, Yu; Song, Le; Gao, Xin

    2017-11-15

    An accurate characterization of transcription factor (TF)-DNA affinity landscape is crucial to a quantitative understanding of the molecular mechanisms underpinning endogenous gene regulation. While recent advances in biotechnology have brought the opportunity for building binding affinity prediction methods, the accurate characterization of TF-DNA binding affinity landscape still remains a challenging problem. Here we propose a novel sequence embedding approach for modeling the transcription factor binding affinity landscape. Our method represents DNA binding sequences as a hidden Markov model which captures both position specific information and long-range dependency in the sequence. A cornerstone of our method is a novel message passing-like embedding algorithm, called Sequence2Vec, which maps these hidden Markov models into a common nonlinear feature space and uses these embedded features to build a predictive model. Our method is a novel combination of the strength of probabilistic graphical models, feature space embedding and deep learning. We conducted comprehensive experiments on over 90 large-scale TF-DNA datasets which were measured by different high-throughput experimental technologies. Sequence2Vec outperforms alternative machine learning methods as well as the state-of-the-art binding affinity prediction methods. Our program is freely available at https://github.com/ramzan1990/sequence2vec. Contact: xin.gao@kaust.edu.sa or lsong@cc.gatech.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  3. Binding Free Energy Calculations for Lead Optimization: Assessment of Their Accuracy in an Industrial Drug Design Context.

    PubMed

    Homeyer, Nadine; Stoll, Friederike; Hillisch, Alexander; Gohlke, Holger

    2014-08-12

    Correctly ranking compounds according to their computed relative binding affinities will be of great value for decision making in the lead optimization phase of industrial drug discovery. However, the performance of existing computationally demanding binding free energy calculation methods in this context is largely unknown. We analyzed the performance of the molecular mechanics continuum solvent, the linear interaction energy (LIE), and the thermodynamic integration (TI) approach for three sets of compounds from industrial lead optimization projects. The data sets pose challenges typical for this early stage of drug discovery. None of the methods was sufficiently predictive when applied out of the box without considering these challenges. Detailed investigations of failures revealed critical points that are essential for good binding free energy predictions. When data set-specific features were considered accordingly, predictions valuable for lead optimization could be obtained for all approaches but LIE. Our findings lead to clear recommendations for when to use which of the above approaches. Our findings also stress the important role of expert knowledge in this process, not least for estimating the accuracy of prediction results by TI, using indicators such as the size and chemical structure of exchanged groups and the statistical error in the predictions. Such knowledge will be invaluable when it comes to the question which of the TI results can be trusted for decision making.

  4. Predicting permanent and transient protein-protein interfaces.

    PubMed

    La, David; Kong, Misun; Hoffman, William; Choi, Youn Im; Kihara, Daisuke

    2013-05-01

    Protein-protein interactions (PPIs) are involved in diverse functions in a cell. To optimize functional roles of interactions, proteins interact with a spectrum of binding affinities. Interactions are conventionally classified into permanent and transient, where the former denotes tight binding between proteins that results in strong complexes, whereas the latter comprises relatively weak interactions that can dissociate after binding to regulate functional activity at specific time points. Knowing the type of interactions has significant implications for understanding the nature and function of PPIs. In this study, we constructed amino acid substitution models that capture mutation patterns at permanent and transient types of protein interfaces, which were found to be different with statistical significance. Using the substitution models, we developed a novel computational method that predicts permanent and transient protein binding interfaces (PBIs) in protein surfaces. Without knowledge of the interacting partner, the method uses a single query protein structure and a multiple sequence alignment of the sequence family. Using a large dataset of permanent and transient proteins, we show that our method, BindML+, performs very well in protein interface classification. A very high area under the curve (AUC) value of 0.957 was observed when predicted protein binding sites were classified. Remarkably, near-perfect accuracy was achieved with an AUC of 0.991 when actual binding sites were classified. The developed method will also be useful for protein design of permanent and transient PBIs. Copyright © 2013 Wiley Periodicals, Inc.
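
    For readers unfamiliar with the AUC values quoted above, the following minimal sketch computes the area under the ROC curve as the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The labels and scores are invented for illustration and have no connection to the BindML+ benchmark.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier scores, higher meaning more likely positive.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical example: one positive (0.4) is ranked below one negative (0.5).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
example_auc = auc(labels, scores)
```

    In this toy case 8 of the 9 positive/negative pairs are ordered correctly, so `example_auc` is 8/9. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported 0.957 and 0.991 in context. The quadratic pairwise loop is chosen for clarity; a rank-based formula gives the same value more efficiently on large datasets.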

  5. The structure of distractor-response bindings: Conditions for configural and elemental integration.

    PubMed

    Moeller, Birte; Frings, Christian; Pfister, Roland

    2016-04-01

    Human action control is influenced by bindings between perceived stimuli and responses carried out in their presence. Notably, responses given to a target stimulus can also be integrated with additional response-irrelevant distractor stimuli that accompany the target (distractor-response binding). Subsequently reencountering such a distractor then retrieves the associated response. Although a large body of evidence supports the existence of this effect, the specific structure of distractor-response bindings is still unclear. Here, we test the predictions derived from 2 possible assumptions about the structure of bindings between distractors and responses. According to a configural approach, the entire distractor object is integrated with a response, and the associated response would be retrieved only upon repetition of the entire distractor object. According to an elemental approach, one would predict integration of individual distractor features with the response and retrieval due to the repetition of an individual distractor feature. Four experiments indicate that both configural and elemental bindings exist and specify boundary conditions for each type of binding. These findings provide detailed insights into the architecture of bindings between response-irrelevant stimuli and actions and thus allow for specifying how distractor stimuli influence human behavior. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  6. Structure and DNA-Binding Sites of the SWI1 AT-rich Interaction Domain (ARID) Suggest Determinants for Sequence-Specific DNA Recognition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Suhkmann; Zhang, Ziming; Upchurch, Sean

    2004-04-16

    ARID is a homologous family of DNA-binding domains that occur in DNA-binding proteins from a wide variety of species, ranging from yeast to nematodes, insects, mammals and plants. SWI1, a member of the SWI/SNF protein complex that is involved in chromatin remodeling during transcription, contains the ARID motif. The ARID domain of human SWI1 (also known as p270) does not select for a specific DNA sequence from a random sequence pool. The lack of sequence specificity shown by the SWI1 ARID domain stands in contrast to the other characterized ARID domains, which recognize specific AT-rich sequences. We have solved the three-dimensional structure of human SWI1 ARID using solution NMR methods. In addition, we have characterized non-specific DNA binding by the SWI1 ARID domain. Results from this study indicate that a flexible long internal loop in the ARID motif is likely to be important for sequence-specific DNA recognition. The structure of the human SWI1 ARID domain also represents a distinct structural subfamily. Studies of ARID indicate that the boundary of the DNA-binding structural and functional domains can extend beyond the sequence-homologous region in a homologous family of proteins. Structural studies of homologous domains such as the ARID family of DNA-binding domains should provide information to better predict the boundary of structural and functional domains in structural genomic studies. Key Words: ARID, SWI1, NMR, structural genomics, protein-DNA interaction.

  7. A novel assay to identify the trafficking proteins that bind to specific vesicle populations

    PubMed Central

    Bentley, Marvin; Banker, Gary

    2016-01-01

    Here we describe a method capable of identifying interactions between candidate trafficking proteins and a defined vesicle population in intact cells. The assay involves the expression of an FKBP12-rapamycin–binding domain (FRB)–tagged candidate vesicle-binding protein that can be inducibly linked to an FKBP-tagged molecular motor. If the FRB-tagged candidate protein binds the labeled vesicles, then linking the FRB and FKBP domains recruits motors to the vesicles and causes a predictable, highly distinctive change in vesicle trafficking. We describe two versions of the assay: a general protocol for use in cells with a typical microtubule-organizing center and a specialized protocol designed to detect protein-vesicle interactions in cultured neurons. We have successfully used this assay to identify kinesins and Rabs that bind to a variety of different vesicle populations. In principle, this assay could be used to investigate interactions between any category of vesicle trafficking proteins and any vesicle population that can be specifically labeled. PMID:26621371

  8. Analysis of functional importance of binding sites in the Drosophila gap gene network model.

    PubMed

    Kozlov, Konstantin; Gursky, Vitaly V; Kulakovskiy, Ivan V; Dymova, Arina; Samsonova, Maria

    2015-01-01

    The statistical thermodynamics based approach provides a promising framework for construction of the genotype-phenotype map in many biological systems. Among the important aspects of a good model connecting the DNA sequence information with that of a molecular phenotype (gene expression) is the selection of regulatory interactions and relevant transcription factor binding sites. As the model may predict different levels of the functional importance of specific binding sites in different genomic and regulatory contexts, it is essential to formulate and study such models under different modeling assumptions. We elaborate a two-layer model for the Drosophila gap gene network and include in the model a combined set of transcription factor binding sites and a concentration-dependent regulatory interaction between the gap genes hunchback and Kruppel. We show that the new variants of the model are more consistent in terms of gene expression predictions for various genetic constructs in comparison to previous work. We quantify the functional importance of binding sites by calculating their impact on gene expression in the model and calculate how these impacts correlate across all sites under different modeling assumptions. The assumption about the dual interaction between hb and Kr leads to the most consistent modeling results, but, on the other hand, may obscure the existence of indirect interactions between binding sites in regulatory regions of distinct genes. The analysis confirms the previously formulated regulation concept of many weak binding sites working in concert. The model predicts a more or less uniform distribution of functionally important binding sites over the sets of experimentally characterized regulatory modules and other open chromatin domains.

  9. NetMHCcons: a consensus method for the major histocompatibility complex class I predictions.

    PubMed

    Karosiene, Edita; Lundegaard, Claus; Lund, Ole; Nielsen, Morten

    2012-03-01

    A key role in cell-mediated immunity is dedicated to the major histocompatibility complex (MHC) molecules that bind peptides for presentation on the cell surface. Several in silico methods capable of predicting peptide binding to MHC class I have been developed. The accuracy of these methods depends on the data available characterizing the binding specificity of the MHC molecules. It has, moreover, been demonstrated that consensus methods defined as combinations of two or more different methods lead to improved prediction accuracy. This plethora of methods makes it very difficult for the non-expert user to choose the most suitable method for predicting binding to a given MHC molecule. In this study, we have therefore made an in-depth analysis of combinations of three state-of-the-art MHC-peptide binding prediction methods (NetMHC, NetMHCpan and PickPocket). We demonstrate that a simple combination of NetMHC and NetMHCpan gives the highest performance when the allele in question is included in the training and is characterized by at least 50 data points with at least ten binders. Otherwise, NetMHCpan is the best predictor. When an allele has not been characterized, the performance depends on the distance to the training data. NetMHCpan has the highest performance when close neighbours are present in the training set, while the combination of NetMHCpan and PickPocket outperforms either of the two methods for alleles with more remote neighbours. The final method, NetMHCcons, is publicly available at www.cbs.dtu.dk/services/NetMHCcons, and allows the user to automatically obtain the most accurate predictions for any given MHC molecule.
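
    The selection rule described in this abstract can be sketched as a small decision function. This is only an illustration of the stated logic, not the NetMHCcons implementation: the function name, the `neighbour_distance` argument and especially the `close_threshold` value are hypothetical, since the abstract does not say how "close" and "remote" neighbours are separated.

```python
from typing import Optional


def choose_predictor(in_training: bool,
                     n_data_points: int = 0,
                     n_binders: int = 0,
                     neighbour_distance: Optional[float] = None,
                     close_threshold: float = 0.1) -> str:
    """Pick a predictor (or combination) following the rule in the abstract.

    close_threshold is a made-up placeholder; the abstract gives no cut-off.
    """
    if in_training:
        # Well-characterized allele: at least 50 data points and 10 binders.
        if n_data_points >= 50 and n_binders >= 10:
            return "NetMHC+NetMHCpan"
        return "NetMHCpan"
    # Allele not characterized: decide by distance to the training data.
    if neighbour_distance is None:
        raise ValueError("neighbour_distance is required for uncharacterized alleles")
    if neighbour_distance <= close_threshold:
        return "NetMHCpan"
    return "NetMHCpan+PickPocket"


print(choose_predictor(True, n_data_points=120, n_binders=25))
print(choose_predictor(False, neighbour_distance=0.4))
```

    Keeping the threshold as an explicit parameter makes the one unknown in the rule visible rather than burying an invented number in the logic.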

  10. Druggability of methyl-lysine binding sites

    NASA Astrophysics Data System (ADS)

    Santiago, C.; Nguyen, K.; Schapira, M.

    2011-12-01

    Structural modules that specifically recognize—or read—methylated or acetylated lysine residues on histone peptides are important components of chromatin-mediated signaling and epigenetic regulation of gene expression. Deregulation of epigenetic mechanisms is associated with disease conditions, and antagonists of acetyl-lysine binding bromodomains are efficacious in animal models of cancer and inflammation, but little is known regarding the druggability of methyl-lysine binding modules. We conducted a systematic structural analysis of readers of methyl marks and derived a predictive druggability landscape of methyl-lysine binding modules. We show that these target classes are generally less druggable than bromodomains, but that some proteins stand as notable exceptions.

  11. Quantum annealing versus classical machine learning applied to a simplified computational biology problem

    PubMed Central

    Li, Richard Y.; Di Felice, Rosa; Rohs, Remo; Lidar, Daniel A.

    2018-01-01

    Transcription factors regulate gene expression, but how these proteins recognize and specifically bind to their DNA targets is still debated. Machine learning models are effective means to reveal interaction mechanisms. Here we studied the ability of a quantum machine learning approach to predict binding specificity. Using simplified datasets of a small number of DNA sequences derived from actual binding affinity experiments, we trained a commercially available quantum annealer to classify and rank transcription factor binding. The results were compared to state-of-the-art classical approaches for the same simplified datasets, including simulated annealing, simulated quantum annealing, multiple linear regression, LASSO, and extreme gradient boosting. Despite technological limitations, we find a slight advantage in classification performance and nearly equal ranking performance using the quantum annealer for these fairly small training data sets. Thus, we propose that quantum annealing might be an effective method to implement machine learning for certain computational biology problems. PMID:29652405

  12. Deep-sea vent phage DNA polymerase specifically initiates DNA synthesis in the absence of primers.

    PubMed

    Zhu, Bin; Wang, Longfei; Mitsunobu, Hitoshi; Lu, Xueling; Hernandez, Alfredo J; Yoshida-Takashima, Yukari; Nunoura, Takuro; Tabor, Stanley; Richardson, Charles C

    2017-03-21

    A DNA polymerase is encoded by the deep-sea vent phage NrS-1. NrS-1 has a unique genome organization containing genes that are predicted to encode a helicase and a single-stranded DNA (ssDNA)-binding protein. The gene for an unknown protein shares weak homology with the bifunctional primase-polymerases (prim-pols) from archaeal plasmids but is missing the zinc-binding domain typically found in primases. We show that this gene product has efficient DNA polymerase activity and is processive in DNA synthesis in the presence of the NrS-1 helicase and ssDNA-binding protein. Remarkably, this NrS-1 DNA polymerase initiates DNA synthesis from a specific template DNA sequence in the absence of any primer. The de novo DNA polymerase activity resides in the N-terminal domain of the protein, whereas the C-terminal domain enhances DNA binding.

  13. Serial interactome capture of the human cell nucleus.

    PubMed

    Conrad, Thomas; Albrecht, Anne-Susann; de Melo Costa, Veronica Rodrigues; Sauer, Sascha; Meierhofer, David; Ørom, Ulf Andersson

    2016-04-04

    Novel RNA-guided cellular functions are paralleled by an increasing number of RNA-binding proteins (RBPs). Here we present 'serial RNA interactome capture' (serIC), a multiple purification procedure of ultraviolet-crosslinked poly(A)-RNA-protein complexes that enables global RBP detection with high specificity. We apply serIC to the nuclei of proliferating K562 cells to obtain the first human nuclear RNA interactome. The domain composition of the 382 identified nuclear RBPs markedly differs from previous IC experiments, including few factors without known RNA-binding domains that are in good agreement with computationally predicted RNA binding. serIC extends the number of DNA-RNA-binding proteins (DRBPs), and reveals a network of RBPs involved in p53 signalling and double-strand break repair. serIC is an effective tool to couple global RBP capture with additional selection or labelling steps for specific detection of highly purified RBPs.

  14. Screening and structure-based modeling of T-cell epitopes of Nipah virus proteome: an immunoinformatic approach for designing peptide-based vaccine.

    PubMed

    Kamthania, Mohit; Sharma, D K

    2015-12-01

    Identification of Nipah virus (NiV) T-cell-specific antigens is urgently needed for appropriate diagnostics and vaccination. In the present study, prediction and modeling of T-cell epitopes of the Nipah virus antigenic proteins nucleocapsid, phosphoprotein, matrix, fusion, glycoprotein, L protein, W protein, V protein and C protein were performed, followed by binding simulation studies of the predicted highest binding scorers with their corresponding MHC class I alleles. The immunoinformatic tool ProPred1 was used to predict the promiscuous MHC class I epitopes of the viral antigenic proteins. Molecular modeling of the epitopes was done with the PEPstr server, and allele structures were predicted with MODELLER 9.10. Molecular dynamics (MD) simulation studies were performed through the NAMD graphical user interface embedded in Visual Molecular Dynamics. Epitopes VPATNSPEL, NPTAVPFTL and LLFVFGPNL of Nucleocapsid, V protein and Fusion protein have considerable binding energy and score with the HLA-B7, HLA-B*2705 and HLA-A2 MHC class I alleles, respectively. These three predicted peptides have high potential to induce a T-cell-mediated immune response and are expected to be useful in designing epitope-based vaccines against Nipah virus after further testing in wet laboratory studies.

  15. Sex- and Tissue-specific Functions of Drosophila Doublesex Transcription Factor Target Genes

    PubMed Central

    Clough, Emily; Jimenez, Erin; Kim, Yoo-Ah; Whitworth, Cale; Neville, Megan C.; Hempel, Leonie; Pavlou, Hania J.; Chen, Zhen-Xia; Sturgill, David; Dale, Ryan; Smith, Harold E.; Przytycka, Teresa M.; Goodwin, Stephen F.; Van Doren, Mark; Oliver, Brian

    2014-01-01

    Primary sex determination “switches” evolve rapidly, but Doublesex (DSX) related transcription factors (DMRTs) act downstream of these switches to control sexual development in most animal species. Drosophila dsx encodes female- and male-specific isoforms (DSXF and DSXM), but little is known about how dsx controls sexual development, whether DSXF and DSXM bind different targets, or how DSX proteins direct different outcomes in diverse tissues. We undertook genome-wide analyses to identify DSX targets using in vivo occupancy, binding site prediction, and evolutionary conservation. We find that DSXF and DSXM bind thousands of the same targets in multiple tissues in both sexes, yet these targets have sex- and tissue-specific functions. Interestingly, DSX targets show considerable overlap with targets identified for mouse DMRT1. DSX targets include transcription factors and signaling pathway components providing for direct and indirect regulation of sex-biased expression. PMID:25535918

  16. A Hot-Spot Motif Characterizes the Interface between a Designed Ankyrin-Repeat Protein and Its Target Ligand

    PubMed Central

    Cheung, Luthur Siu-Lun; Kanwar, Manu; Ostermeier, Marc; Konstantopoulos, Konstantinos

    2012-01-01

    Nonantibody scaffolds such as designed ankyrin repeat proteins (DARPins) can be rapidly engineered to detect diverse target proteins with high specificity and offer an attractive alternative to antibodies. Using molecular simulations, we predicted that the binding interface between DARPin off7 and its ligand (maltose binding protein; MBP) is characterized by a hot-spot motif in which binding energy is largely concentrated on a few amino acids. To experimentally test this prediction, we fused MBP to a transmembrane domain to properly orient the protein into a polymer-cushioned lipid bilayer, and characterized its interaction with off7 using force spectroscopy. Using this, to our knowledge, novel technique along with surface plasmon resonance, we validated the simulation predictions and characterized the effects of select mutations on the kinetics of the off7-MBP interaction. Our integrated approach offers scientific insights on how the engineered protein interacts with the target molecule. PMID:22325262

  17. Protein interactions and ligand binding: from protein subfamilies to functional specificity.

    PubMed

    Rausell, Antonio; Juan, David; Pazos, Florencio; Valencia, Alfonso

    2010-02-02

    The divergence accumulated during the evolution of protein families translates into their internal organization as subfamilies, and it is directly reflected in the characteristic patterns of differentially conserved residues. These specifically conserved positions in protein subfamilies are known as "specificity determining positions" (SDPs). Previous studies have limited their analysis to the study of the relationship between these positions and ligand-binding specificity, demonstrating significant yet limited predictive capacity. We have systematically extended this observation to include the role of differential protein interactions in the segregation of protein subfamilies and explored in detail the structural distribution of SDPs at protein interfaces. Our results show the extensive influence of protein interactions in the evolution of protein families and the widespread association of SDPs with protein interfaces. The combined analysis of SDPs in interfaces and ligand-binding sites provides a more complete picture of the organization of protein families, constituting the necessary framework for a large scale analysis of the evolution of protein function.

  18. Exploring the role of water in molecular recognition: predicting protein ligandability using a combinatorial search of surface hydration sites.

    PubMed

    Vukovic, Sinisa; Brennan, Paul E; Huggins, David J

    2016-09-01

    The interaction between any two biological molecules must compete with their interaction with water molecules. This makes water the most important molecule in medicine, as it controls the interactions of every therapeutic with its target. A small molecule binding to a protein is able to recognize a unique binding site on a protein by displacing bound water molecules from specific hydration sites. Quantifying the interactions of these water molecules allows us to estimate the potential of the protein to bind a small molecule. This is referred to as ligandability. In the study, we describe a method to predict ligandability by performing a search of all possible combinations of hydration sites on protein surfaces. We predict ligandability as the summed binding free energy for each of the constituent hydration sites, computed using inhomogeneous fluid solvation theory. We compared the predicted ligandability with the maximum observed binding affinity for 20 proteins in the human bromodomain family. Based on this comparison, it was determined that effective inhibitors have been developed for the majority of bromodomains, in the range from 10 to 100 nM. However, we predict that more potent inhibitors can be developed for the bromodomains BPTF and BRD7 with relative ease, but that further efforts to develop inhibitors for ATAD2 will be extremely challenging. We have also made predictions for the 14 bromodomains with no reported small molecule Kd values by isothermal titration calorimetry. The calculations predict that PBRM1(1) will be a challenging target, while others such as TAF1L(2), PBRM1(4) and TAF1(2), should be highly ligandable. As an outcome of this work, we assembled a database of experimental maximal Kd that can serve as a community resource assisting medicinal chemistry efforts focused on BRDs. Effective prediction of ligandability would be a very useful tool in the drug discovery process.
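
    As a rough illustration of the idea of summing per-site free energies over combinations of hydration sites, the sketch below enumerates all subsets of a fixed size and keeps the one with the lowest total. It is a simplified toy under stated assumptions, not the authors' method: the per-site values are invented, more negative is taken to mean more favourable for ligand binding, and no spatial constraint between sites is applied. The helper `kd_to_dg` uses the standard relation ΔG = RT ln Kd to put a measured dissociation constant on the same energy scale.

```python
import math
from itertools import combinations

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def best_site_combination(site_dg, k):
    """Return (indices, total) of the k-site subset with the lowest summed free energy."""
    best = min(combinations(range(len(site_dg)), k),
               key=lambda idx: sum(site_dg[i] for i in idx))
    return best, sum(site_dg[i] for i in best)


def kd_to_dg(kd_molar, temperature=298.15):
    """Convert a dissociation constant (mol/L) to a binding free energy in kcal/mol."""
    return R_KCAL * temperature * math.log(kd_molar)


# Hypothetical per-site free energies (kcal/mol); not taken from the study.
site_dg = [-1.2, 0.4, -2.1, -0.3, -1.8]
sites, total = best_site_combination(site_dg, 3)
print(sites, round(total, 2))

# The 10 to 100 nM range quoted in the abstract, expressed as free energies.
print(round(kd_to_dg(10e-9), 2), round(kd_to_dg(100e-9), 2))
```

    Exhaustive enumeration grows combinatorially with the number of sites, so a practical search would need pruning or spatial restrictions that this sketch leaves out.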

  19. Exploring the role of water in molecular recognition: predicting protein ligandability using a combinatorial search of surface hydration sites

    NASA Astrophysics Data System (ADS)

    Vukovic, Sinisa; Brennan, Paul E.; Huggins, David J.

    2016-09-01

    The interaction between any two biological molecules must compete with their interaction with water molecules. This makes water the most important molecule in medicine, as it controls the interactions of every therapeutic with its target. A small molecule binding to a protein is able to recognize a unique binding site on a protein by displacing bound water molecules from specific hydration sites. Quantifying the interactions of these water molecules allows us to estimate the potential of the protein to bind a small molecule. This is referred to as ligandability. In the study, we describe a method to predict ligandability by performing a search of all possible combinations of hydration sites on protein surfaces. We predict ligandability as the summed binding free energy for each of the constituent hydration sites, computed using inhomogeneous fluid solvation theory. We compared the predicted ligandability with the maximum observed binding affinity for 20 proteins in the human bromodomain family. Based on this comparison, it was determined that effective inhibitors have been developed for the majority of bromodomains, in the range from 10 to 100 nM. However, we predict that more potent inhibitors can be developed for the bromodomains BPTF and BRD7 with relative ease, but that further efforts to develop inhibitors for ATAD2 will be extremely challenging. We have also made predictions for the 14 bromodomains with no reported small molecule Kd values by isothermal titration calorimetry. The calculations predict that PBRM1(1) will be a challenging target, while others such as TAF1L(2), PBRM1(4) and TAF1(2), should be highly ligandable. As an outcome of this work, we assembled a database of experimental maximal Kd that can serve as a community resource assisting medicinal chemistry efforts focused on BRDs. Effective prediction of ligandability would be a very useful tool in the drug discovery process.

  20. A Multiprotein Binding Interface in an Intrinsically Disordered Region of the Tumor Suppressor Protein Interferon Regulatory Factor-1*

    PubMed Central

    Narayan, Vikram; Halada, Petr; Hernychová, Lenka; Chong, Yuh Ping; Žáková, Jitka; Hupp, Ted R.; Vojtesek, Borivoj; Ball, Kathryn L.

    2011-01-01

    The interferon-regulated transcription factor and tumor suppressor protein IRF-1 is predicted to be largely disordered outside of the DNA-binding domain. One of the advantages of intrinsically disordered protein domains is thought to be their ability to take part in multiple, specific but low affinity protein interactions; however, relatively few IRF-1-interacting proteins have been described. The recent identification of a functional binding interface for the E3-ubiquitin ligase CHIP within the major disordered domain of IRF-1 led us to ask whether this region might be employed more widely by regulators of IRF-1 function. Here we describe the use of peptide aptamer-based affinity chromatography coupled with mass spectrometry to define a multiprotein binding interface on IRF-1 (Mf2 domain; amino acids 106–140) and to identify Mf2-binding proteins from A375 cells. Based on their function as known transcriptional regulators, a selection of the Mf2 domain-binding proteins (NPM1, TRIM28, and YB-1) have been validated using in vitro and cell-based assays. Interestingly, although NPM1, TRIM28, and YB-1 all bind to the Mf2 domain, they have differing amino acid specificities, demonstrating the degree of combinatorial diversity and specificity available through linear interaction motifs. PMID:21245151

  1. A new method for the construction of a mutant library with a predictable occurrence rate using Poisson distribution.

    PubMed

    Seong, Ki Moon; Park, Hweon; Kim, Seong Jung; Ha, Hyo Nam; Lee, Jae Yung; Kim, Joon

    2007-06-01

    A yeast transcriptional activator, Gcn4p, induces the expression of genes that are involved in amino acid and purine biosynthetic pathways under amino acid starvation. Gcn4p has an acidic activation domain in the central region and a bZIP domain in the C-terminus that is divided into the DNA-binding motif and the dimerization leucine zipper motif. In order to identify amino acids in the DNA-binding motif of Gcn4p that are involved in transcriptional activation, we constructed mutant libraries in the DNA-binding motif through an innovative application of random mutagenesis. Mutant libraries made with oligonucleotides that were mutated randomly according to the Poisson distribution showed actual mutation frequencies in good agreement with expected values. This method could save the time and effort needed to create a mutant library with a predictable mutation frequency. Based on studies using the mutant libraries constructed by the new method, specific residues of the DNA-binding domain in Gcn4p appear to be involved in transcriptional activity on a conserved binding site.
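
    To make the "predictable occurrence rate" concrete, the sketch below computes the expected fraction of library members carrying 0, 1, 2, ... mutations when the number of mutations per oligonucleotide follows a Poisson distribution. The oligonucleotide length and per-base mutation rate are hypothetical values chosen for illustration, and taking their product as the Poisson mean is an assumption, not a detail reported in the abstract.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k mutations when the expected number is `mean`."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


def expected_library(mean: float, max_k: int = 4) -> dict:
    """Expected fraction of clones with 0..max_k-1 mutations, plus the remaining tail."""
    fractions = {str(k): poisson_pmf(k, mean) for k in range(max_k)}
    fractions[f">={max_k}"] = 1.0 - sum(fractions.values())
    return fractions


# Hypothetical design: 30 mutagenized bases, 5% substitution rate per base.
oligo_length = 30
per_base_rate = 0.05
mean_mutations = oligo_length * per_base_rate

for n_mut, fraction in expected_library(mean_mutations).items():
    print(f"{n_mut} mutations: {fraction:.3f}")
```

    Comparing these expected fractions with the mutation counts observed in sequenced clones is one simple way to check agreement between actual and expected frequencies.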

  2. Effects of serotonin-2A receptor binding and gender on personality traits and suicidal behavior in borderline personality disorder.

    PubMed

    Soloff, Paul H; Chiappetta, Laurel; Mason, Neale Scott; Becker, Carl; Price, Julie C

    2014-06-30

    Impulsivity and aggressiveness are personality traits associated with a vulnerability to suicidal behavior. Behavioral expression of these traits differs by gender and has been related to central serotonergic function. We assessed the relationships between serotonin-2A receptor function, gender, and personality traits in borderline personality disorder (BPD), a disorder characterized by impulsive-aggression and recurrent suicidal behavior. Participants, who included 33 BPD patients and 27 healthy controls (HC), were assessed for Axis I and II disorders with the Structured Clinical Interview for DSM-IV and the International Personality Disorders Examination, and with the Diagnostic Interview for Borderline Patients-Revised for BPD. Depressed mood, impulsivity, aggression, and temperament were assessed with standardized measures. Positron emission tomography with [18F]altanserin as ligand and arterial blood sampling was used to determine the binding potentials (BPND) of serotonin-2A receptors in 11 regions of interest. Data were analyzed using Logan graphical analysis, controlling for age and non-specific binding. Among BPD subjects, aggression, Cluster B co-morbidity, antisocial PD, and childhood abuse were each related to altanserin binding. BPND values predicted impulsivity and aggression in BPD females (but not BPD males), and in HC males (but not HC females). Altanserin binding was greater in BPD females than males in every contrast, but it did not discriminate suicide attempters from non-attempters. Region-specific differences in serotonin-2A receptor binding related to diagnosis and gender predicted clinical expression of aggression and impulsivity. Vulnerability to suicidal behavior in BPD may be related to serotonin-2A binding through expression of personality risk factors. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  3. Genome-Wide Prediction and Validation of Peptides That Bind Human Prosurvival Bcl-2 Proteins

    PubMed Central

    DeBartolo, Joe; Taipale, Mikko; Keating, Amy E.

    2014-01-01

    Programmed cell death is regulated by interactions between pro-apoptotic and prosurvival members of the Bcl-2 family. Pro-apoptotic family members contain a weakly conserved BH3 motif that can adopt an alpha-helical structure and bind to a groove on prosurvival partners Bcl-xL, Bcl-w, Bcl-2, Mcl-1 and Bfl-1. Peptides corresponding to roughly 13 reported BH3 motifs have been verified to bind in this manner. Due to their short lengths and low sequence conservation, BH3 motifs are not detected using standard sequence-based bioinformatics approaches. Thus, it is possible that many additional proteins harbor BH3-like sequences that can mediate interactions with the Bcl-2 family. In this work, we used structure-based and data-based Bcl-2 interaction models to find new BH3-like peptides in the human proteome. We used peptide SPOT arrays to test candidate peptides for interaction with one or more of the prosurvival proteins Bcl-xL, Bcl-w, Bcl-2, Mcl-1 and Bfl-1. For the 36 most promising array candidates, we quantified binding to all five human receptors using direct and competition binding assays in solution. All 36 peptides showed evidence of interaction with at least one prosurvival protein, and 22 peptides bound at least one prosurvival protein with a dissociation constant between 1 and 500 nM; many peptides had specificity profiles not previously observed. We also screened the full-length parent proteins of a subset of array-tested peptides for binding to Bcl-xL and Mcl-1. Finally, we used the peptide binding data, in conjunction with previously reported interactions, to assess the affinity and specificity prediction performance of different models. PMID:24967846

  4. Side-chain conformational space analysis (SCSA): A multi conformation-based QSAR approach for modeling and prediction of protein-peptide binding affinities

    NASA Astrophysics Data System (ADS)

    Zhou, Peng; Chen, Xiang; Shang, Zhicai

    2009-03-01

    In this article, the concept of multi conformation-based quantitative structure-activity relationship (MCB-QSAR) is proposed, and based upon that, we describe a new approach called side-chain conformational space analysis (SCSA) to model and predict protein-peptide binding affinities. In SCSA, multiple conformations (rather than the traditional single conformation) are considered, and the statistical average information on the multiple conformations of side chains is determined using self-consistent mean field theory based upon a side-chain rotamer library. Thereby, enthalpy contributions (including electrostatic, steric, hydrophobic interaction and hydrogen bond) and conformational entropy effects on binding are investigated in terms of the occurrence probability of residue rotamers. SCSA was then applied to a dataset of 419 HLA-A*0201 binding peptides, and the nonbonding contributions of each position in the peptide ligands were well determined. For the peptides, the hydrogen bond and electrostatic interactions of the two ends are essential to the binding specificity, van der Waals and hydrophobic interactions of all the positions ensure strong binding affinity, and the loss of conformational entropy at anchor positions partially counteracts other favorable nonbonding effects.
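
    The link between rotamer occurrence probabilities and conformational entropy can be illustrated with the standard Gibbs entropy, S = -R Σ p ln p. The sketch below assumes this textbook formula and uses invented rotamer probabilities; the abstract does not state the exact expression used in SCSA, so this should be read as an illustration of the concept only.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def conformational_entropy(probs):
    """Gibbs entropy (kcal/(mol*K)) of a side chain from its rotamer probabilities."""
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-6):
        raise ValueError("rotamer probabilities must sum to 1")
    # Rotamers with zero probability contribute nothing to the sum.
    return -R_KCAL * sum(p * math.log(p) for p in probs if p > 0)


def entropy_penalty(free_probs, bound_probs, temperature=298.15):
    """-T*dS (kcal/mol) for going from the free to the bound rotamer distribution."""
    delta_s = conformational_entropy(bound_probs) - conformational_entropy(free_probs)
    return -temperature * delta_s


# Hypothetical anchor residue: four equally likely rotamers when free,
# largely locked into a single rotamer when bound.
free = [0.25, 0.25, 0.25, 0.25]
bound = [0.91, 0.03, 0.03, 0.03]
print(f"{entropy_penalty(free, bound):.3f} kcal/mol")
```

    A positive penalty means that restricting the side chain on binding opposes the favourable nonbonding terms, which is the qualitative effect the abstract reports for anchor positions.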

  5. The development and application of a quantitative peptide microarray platform to SH2 domain specificity space

    NASA Astrophysics Data System (ADS)

    Engelmann, Brett Warren

    The Src homology 2 (SH2) domains evolved alongside protein tyrosine kinases (PTKs) and phosphatases (PTPs) in metazoans to recognize the phosphotyrosine (pY) post-translational modification. The human genome encodes 121 SH2 domains within 111 SH2 domain containing proteins that represent the primary mechanism for cellular signal transduction immediately downstream of PTKs. Despite pY recognition contributing to roughly half of the binding energy, SH2 domains possess substantial binding specificity, or affinity discrimination between phosphopeptide ligands. This specificity is largely imparted by amino acids (AAs) adjacent to the pY, typically from positions +1 to +4 C-terminal to the pY. Much experimental effort has been undertaken to construct preferred binding motifs for many SH2 domains. However, due to limitations in previous experimental methodologies these motifs do not account for the interplay between AAs. It was therefore not known how AAs within the context of individual peptides function to impart SH2 domain specificity. In this work we identified the critical role context plays in defining SH2 domain specificity for physiological ligands. We also constructed a high quality interactome using 50 SH2 domains and 192 physiological ligands. We next developed a quantitative high-throughput (Q-HTP) peptide microarray platform to assess the affinities four SH2 domains have for 124 physiological ligands. We demonstrated the superior characteristics of our platform relative to preceding approaches and validated our results using established biophysical techniques, literature corroboration, and predictive algorithms. The quantitative information provided by the arrays was leveraged to investigate SH2 domain binding distributions and identify points of binding overlap. Our microarray derived affinity estimates were integrated to produce quantitative interaction motifs capable of predicting interactions. 
    Furthermore, our microarrays proved capable of resolving subtle contextual differences within motifs that modulate interaction affinities. We conclude that contextually informed specificity profiling of protein interaction domains using the methodologies developed in this study can inform efforts to understand the interconnectivity of signaling networks in normal and aberrant states. Three supplementary tables containing detailed lists of peptides, interactions, and sources of corroborative information are provided.

  6. Subfamily-specific adaptations in the structures of two penicillin-binding proteins from Mycobacterium tuberculosis

    DOE PAGES

    Prigozhin, Daniil M.; Krieger, Inna V.; Huizar, John P.; ...

    2014-12-31

    Beta-lactam antibiotics target penicillin-binding proteins including several enzyme classes essential for bacterial cell-wall homeostasis. To better understand the functional and inhibitor-binding specificities of penicillin-binding proteins from the pathogen, Mycobacterium tuberculosis, we carried out structural and phylogenetic analysis of two predicted D,D-carboxypeptidases, Rv2911 and Rv3330. Optimization of Rv2911 for crystallization using directed evolution and the GFP folding reporter method yielded a soluble quadruple mutant. Structures of optimized Rv2911 bound to phenylmethylsulfonyl fluoride and Rv3330 bound to meropenem show that, in contrast to the nonspecific inhibitor, meropenem forms an extended interaction with the enzyme along a conserved surface. Phylogenetic analysis shows that Rv2911 and Rv3330 belong to different clades that emerged in Actinobacteria and are not represented in model organisms such as Escherichia coli and Bacillus subtilis. Clade-specific adaptations allow these enzymes to fulfill distinct physiological roles despite strict conservation of core catalytic residues. The characteristic differences include potential protein-protein interaction surfaces and specificity-determining residues surrounding the catalytic site. Overall, these structural insights lay the groundwork to develop improved beta-lactam therapeutics for tuberculosis.

  7. Subfamily-specific adaptations in the structures of two penicillin-binding proteins from Mycobacterium tuberculosis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Prigozhin, Daniil M.; Krieger, Inna V.; Huizar, John P.

    Beta-lactam antibiotics target penicillin-binding proteins including several enzyme classes essential for bacterial cell-wall homeostasis. To better understand the functional and inhibitor-binding specificities of penicillin-binding proteins from the pathogen, Mycobacterium tuberculosis, we carried out structural and phylogenetic analysis of two predicted D,D-carboxypeptidases, Rv2911 and Rv3330. Optimization of Rv2911 for crystallization using directed evolution and the GFP folding reporter method yielded a soluble quadruple mutant. Structures of optimized Rv2911 bound to phenylmethylsulfonyl fluoride and Rv3330 bound to meropenem show that, in contrast to the nonspecific inhibitor, meropenem forms an extended interaction with the enzyme along a conserved surface. Phylogenetic analysis shows that Rv2911 and Rv3330 belong to different clades that emerged in Actinobacteria and are not represented in model organisms such as Escherichia coli and Bacillus subtilis. Clade-specific adaptations allow these enzymes to fulfill distinct physiological roles despite strict conservation of core catalytic residues. The characteristic differences include potential protein-protein interaction surfaces and specificity-determining residues surrounding the catalytic site. Overall, these structural insights lay the groundwork to develop improved beta-lactam therapeutics for tuberculosis.

  8. A novel class of plant-specific zinc-dependent DNA-binding protein that binds to A/T-rich DNA sequences

    PubMed Central

    Nagano, Yukio; Furuhashi, Hirofumi; Inaba, Takehito; Sasaki, Yukiko

    2001-01-01

    Complementary DNA encoding a DNA-binding protein, designated PLATZ1 (plant AT-rich sequence- and zinc-binding protein 1), was isolated from peas. The amino acid sequence of the protein is similar to those of other uncharacterized proteins predicted from the genome sequences of higher plants. However, no paralogous sequences have been found outside the plant kingdom. Multiple alignments among these paralogous proteins show that several cysteine and histidine residues are invariant, suggesting that these proteins are a novel class of zinc-dependent DNA-binding proteins with two distantly located regions, C-x2-H-x11-C-x2-C-x(4–5)-C-x2-C-x(3–7)-H-x2-H and C-x2-C-x(10–11)-C-x3-C. In an electrophoretic mobility shift assay, the zinc chelator 1,10-o-phenanthroline inhibited DNA binding, and two distant zinc-binding regions were required for DNA binding. A protein blot with 65ZnCl2 showed that both regions are required for zinc-binding activity. The PLATZ1 protein non-specifically binds to A/T-rich sequences, including the upstream region of the pea GTPase pra2 and plastocyanin petE genes. Expression of PLATZ1 repressed expression of the reporter constructs containing the coding sequence of the luciferase gene driven by the cauliflower mosaic virus (CaMV) 35S90 promoter fused to the tandem repeat of the A/T-rich sequences. These results indicate that PLATZ1 is a novel class of plant-specific zinc-dependent DNA-binding protein responsible for A/T-rich sequence-mediated transcriptional repression. PMID:11600698

  9. Nonlinear scoring functions for similarity-based ligand docking and binding affinity prediction.

    PubMed

    Brylinski, Michal

    2013-11-25

    A common strategy for virtual screening considers a systematic docking of a large library of organic compounds into the target sites in protein receptors with promising leads selected based on favorable intermolecular interactions. Despite continuous progress in the modeling of protein-ligand interactions for pharmaceutical design, important challenges still remain, thus the development of novel techniques is required. In this communication, we describe eSimDock, a new approach to ligand docking and binding affinity prediction. eSimDock employs nonlinear machine learning-based scoring functions to improve the accuracy of ligand ranking and similarity-based binding pose prediction, and to increase the tolerance to structural imperfections in the target structures. In large-scale benchmarking using the Astex/CCDC data set, we show that 53.9% (67.9%) of the predicted ligand poses have RMSD of <2 Å (<3 Å). Moreover, using binding sites predicted by recently developed eFindSite, eSimDock models ligand binding poses with an RMSD of 4 Å for 50.0-39.7% of the complexes at the protein homology level limited to 80-40%. Simulations against non-native receptor structures, whose mean backbone rearrangements vary from 0.5 to 5.0 Å Cα-RMSD, show that the ratio of docking accuracy and the estimated upper bound is at a constant level of ∼0.65. The Pearson correlation coefficient between experimental Ki values and those predicted by eSimDock for a large data set of the crystal structures of protein-ligand complexes from BindingDB is 0.58, which decreases only to 0.46 when target structures distorted to 3.0 Å Cα-RMSD are used. Finally, two case studies demonstrate that eSimDock can be customized to specific applications as well. These encouraging results show that the performance of eSimDock is largely unaffected by the deformations of ligand binding regions, thus it represents a practical strategy for across-proteome virtual screening using protein models. 
eSimDock is freely available to the academic community as a Web server at http://www.brylinski.org/esimdock .
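
    The pose-accuracy figures quoted above (fractions of poses under 2 Å and 3 Å RMSD) can be illustrated with a minimal sketch. It assumes that the predicted and reference ligand poses are given as matched lists of atom coordinates in the same receptor frame, so no superposition is applied; the function names and the sample values are hypothetical and are not taken from eSimDock.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) atoms.

    Assumption: atoms are paired by index and both poses share one frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def fraction_within(rmsd_values, cutoff):
    """Fraction of poses whose RMSD is strictly below the cutoff (in Å)."""
    return sum(1 for value in rmsd_values if value < cutoff) / len(rmsd_values)


# Illustrative values only, not benchmark data.
example_rmsds = [1.0, 2.5, 3.5, 1.9]
print(fraction_within(example_rmsds, 2.0))
print(fraction_within(example_rmsds, 3.0))
```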

  10. Jenner-predict server: prediction of protein vaccine candidates (PVCs) in bacteria based on host-pathogen interactions

    PubMed Central

    2013-01-01

    Background Subunit vaccines based on recombinant proteins have been effective in preventing infectious diseases and are expected to meet the demands of future vaccine development. Computational approaches, especially the reverse vaccinology (RV) method, have enormous potential for identification of protein vaccine candidates (PVCs) from a proteome. The existing protective antigen prediction software and web servers have low prediction accuracy leading to limited applications for vaccine development. Besides machine learning techniques, those software and web servers have considered only protein’s adhesin-likeliness as the criterion for identification of PVCs. Several non-adhesin functional classes of proteins involved in host-pathogen interactions and pathogenesis are known to provide protection against bacterial infections. Therefore, knowledge of bacterial pathogenesis has potential to identify PVCs. Results A web server, Jenner-Predict, has been developed for prediction of PVCs from proteomes of bacterial pathogens. The web server targets host-pathogen interactions and pathogenesis by considering known functional domains from protein classes such as adhesin, virulence, invasin, porin, flagellin, colonization, toxin, choline-binding, penicillin-binding, transferrin-binding, fibronectin-binding and solute-binding. It predicts non-cytosolic proteins containing the above domains as PVCs. It also provides the vaccine potential of PVCs in terms of their possible immunogenicity by comparing with experimentally known IEDB epitopes, absence of autoimmunity and conservation in different strains. Predicted PVCs are prioritized so that only a few prospective PVCs need to be validated experimentally. The performance of the web server was evaluated against known protective antigens from diverse classes of bacteria reported in the Protegen database and datasets used for VaxiJen server development. 
The web server efficiently predicted known vaccine candidates reported from Streptococcus pneumoniae and Escherichia coli proteomes. The Jenner-Predict server outperformed NERVE, Vaxign and VaxiJen methods. It has sensitivity of 0.774 and 0.711 for Protegen and VaxiJen dataset, respectively while specificity of 0.940 has been obtained for the latter dataset. Conclusions Better prediction accuracy of Jenner-Predict web server signifies that domains involved in host-pathogen interactions and pathogenesis are better criteria for prediction of PVCs. The web server has successfully predicted maximum known PVCs belonging to different functional classes. Jenner-Predict server is freely accessible at http://117.211.115.67/vaccine/home.html PMID:23815072

  11. Protein–DNA Interactions: The Story so Far and a New Method for Prediction

    DOE PAGES

    Jones, Susan; Thornton, Janet M.

    2003-01-01

    This review describes methods for the prediction of DNA binding function, and specifically summarizes a new method using 3D structural templates. The new method features the HTH motif that is found in approximately one-third of DNA-binding protein families. A library of 3D structural templates of HTH motifs was derived from proteins in the PDB. Templates were scanned against complete protein structures and the optimal superposition of a template on a structure calculated. Significance thresholds in terms of a minimum root mean squared deviation (rmsd) of an optimal superposition, and a minimum motif accessible surface area (ASA), have been calculated. In this way, it is possible to scan the template library against proteins of unknown function to make predictions about DNA-binding functionality.

  12. Sequence Based Prediction of DNA-Binding Proteins Based on Hybrid Feature Selection Using Random Forest and Gaussian Naïve Bayes

    PubMed Central

    Lou, Wangchao; Wang, Xiaoqing; Chen, Fan; Chen, Yixiao; Jiang, Bo; Zhang, Hua

    2014-01-01

    Developing an efficient method for determination of the DNA-binding proteins, due to their vital roles in gene regulation, is becoming highly desired since it would be invaluable to advance our understanding of protein functions. In this study, we proposed a new method for the prediction of the DNA-binding proteins, by performing the feature rank using random forest and the wrapper-based feature selection using forward best-first search strategy. The features comprise information from primary sequence, predicted secondary structure, predicted relative solvent accessibility, and position specific scoring matrix. The proposed method, called DBPPred, used Gaussian naïve Bayes as the underlying classifier since it outperformed five other classifiers, including decision tree, logistic regression, k-nearest neighbor, support vector machine with polynomial kernel, and support vector machine with radial basis function. As a result, the proposed DBPPred yields the highest average accuracy of 0.791 and average MCC of 0.583 according to the five-fold cross validation with ten runs on the training benchmark dataset PDB594. Subsequently, blind tests on the independent dataset PDB186 by the proposed model trained on the entire PDB594 dataset and by other five existing methods (including iDNA-Prot, DNA-Prot, DNAbinder, DNABIND and DBD-Threader) were performed, resulting in that the proposed DBPPred yielded the highest accuracy of 0.769, MCC of 0.538, and AUC of 0.790. The independent tests performed by the proposed DBPPred on completely a large non-DNA binding protein dataset and two RNA binding protein datasets also showed improved or comparable quality when compared with the relevant prediction methods. Moreover, we observed that majority of the selected features by the proposed method are statistically significantly different between the mean feature values of the DNA-binding and the non DNA-binding proteins. 
All of the experimental results indicate that the proposed DBPPred can be an alternative perspective predictor for large-scale determination of DNA-binding proteins. PMID:24475169
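
    The accuracy and MCC (Matthews correlation coefficient) values reported above are both derived from a binary confusion matrix. The sketch below shows the standard definitions; the counts in the usage lines are made up for illustration and are not results from DBPPred or PDB186.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal is empty (the usual convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts: 40 true positives, 45 true negatives,
# 5 false positives, 10 false negatives.
print(accuracy(40, 45, 5, 10))
print(round(mcc(40, 45, 5, 10), 3))
```

    Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is commonly reported alongside accuracy on benchmarks like these.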

  13. OST-HTH: a novel predicted RNA-binding domain

    PubMed Central

    2010-01-01

    Background The mechanism by which the arthropod Oskar and vertebrate TDRD5/TDRD7 proteins nucleate or organize structurally related ribonucleoprotein (RNP) complexes, the polar granule and nuage, is poorly understood. Using sequence profile searches we identify a novel domain in these proteins that is widely conserved across eukaryotes and bacteria. Results Using contextual information from domain architectures, sequence-structure superpositions and available functional information we predict that this domain is likely to adopt the winged helix-turn-helix fold and bind RNA with a potential specificity for dsRNA. We show that in eukaryotes this domain is often combined in the same polypeptide with protein-protein- or lipid- interaction domains that might play a role in anchoring these proteins to specific cytoskeletal structures. Conclusions Thus, proteins with this domain might have a key role in the recognition and localization of dsRNA, including miRNAs, rasiRNAs and piRNAs hybridized to their targets. In other cases, this domain is fused to ubiquitin-binding, E3 ligase and ubiquitin-like domains indicating a previously under-appreciated role for ubiquitination in regulating the assembly and stability of nuage-like RNP complexes. Both bacteria and eukaryotes encode a conserved family of proteins that combines this predicted RNA-binding domain with a previously uncharacterized domain (DUF88). We present evidence that it is an RNAse belonging to the superfamily that includes the 5'->3' nucleases, PIN and NYN domains and might be recruited to degrade certain RNAs. Reviewers This article was reviewed by Sandor Pongor and Arcady Mushegian. PMID:20302647

  14. Improved detection of DNA-binding proteins via compression technology on PSSM information.

    PubMed

    Wang, Yubo; Ding, Yijie; Guo, Fei; Wei, Leyi; Tang, Jijun

    2017-01-01

    Since the importance of DNA-binding proteins in multiple biomolecular functions has been recognized, an increasing number of researchers are attempting to identify DNA-binding proteins. In recent years, machine learning methods have become more and more compelling as the volume of protein sequence data soars, because of their favorable speed and accuracy. In this paper, we extract three features from the protein sequence, namely NMBAC (Normalized Moreau-Broto Autocorrelation), PSSM-DWT (Position-specific scoring matrix-Discrete Wavelet Transform), and PSSM-DCT (Position-specific scoring matrix-Discrete Cosine Transform). We also employ a feature selection algorithm on these feature vectors. Then, these features are used to train an SVM (support vector machine) model as the classifier to predict DNA-binding proteins. Our method uses three datasets, namely PDB1075, PDB594 and PDB186, to evaluate the performance of our approach. The PDB1075 and PDB594 datasets are employed for the Jackknife test and the PDB186 dataset is used for the independent test. Our method achieves the best accuracy in the Jackknife test, from 79.20% to 86.23% and 80.5% to 86.20% on PDB1075 and PDB594 datasets, respectively. In the independent test, the accuracy of our method comes to 76.3%. The performance of the independent test also shows that our method has a certain ability to be effectively used for DNA-binding protein prediction. The data and source code are at https://doi.org/10.6084/m9.figshare.5104084.

  15. Differential binding of calmodulin-related proteins to their targets revealed through high-density Arabidopsis protein microarrays

    PubMed Central

    Popescu, Sorina C.; Popescu, George V.; Bachan, Shawn; Zhang, Zimei; Seay, Montrell; Gerstein, Mark; Snyder, Michael; Dinesh-Kumar, S. P.

    2007-01-01

    Calmodulins (CaMs) are the most ubiquitous calcium sensors in eukaryotes. A number of CaM-binding proteins have been identified through classical methods, and many proteins have been predicted to bind CaMs based on their structural homology with known targets. However, multicellular organisms typically contain many CaM-like (CML) proteins, and a global identification of their targets and specificity of interaction is lacking. In an effort to develop a platform for large-scale analysis of proteins in plants we have developed a protein microarray and used it to study the global analysis of CaM/CML interactions. An Arabidopsis thaliana expression collection containing 1,133 ORFs was generated and used to produce proteins with an optimized medium-throughput plant-based expression system. Protein microarrays were prepared and screened with several CaMs/CMLs. A large number of previously known and novel CaM/CML targets were identified, including transcription factors, receptor and intracellular protein kinases, F-box proteins, RNA-binding proteins, and proteins of unknown function. Multiple CaM/CML proteins bound many binding partners, but the majority of targets were specific to one or a few CaMs/CMLs indicating that different CaM family members function through different targets. Based on our analyses, the emergent CaM/CML interactome is more extensive than previously predicted. Our results suggest that calcium functions through distinct CaM/CML proteins to regulate a wide range of targets and cellular activities. PMID:17360592

  16. Crystallographic and Computational Studies of a Class II MHC Complex with a Nonconforming Peptide: HLA-DRA/DRB3*0101

    NASA Astrophysics Data System (ADS)

    Parry, Christian S.; Gorski, Jack; Stern, Lawrence J.

    2003-03-01

    The stable binding of processed foreign peptide to a class II major histocompatibility (MHC) molecule and subsequent presentation to a T cell receptor is a central event in immune recognition and regulation. Polymorphic residues on the floor of the peptide binding site form pockets that anchor peptide side chains. These and other residues in the helical wall of the groove determine the specificity of each allele and define a motif. Allele specific motifs allow the prediction of epitopes from the sequence of pathogens. There are, however, known epitopes that do not satisfy these motifs: anchor motifs are not adequate for predicting epitopes as there are apparently major and minor motifs. We present crystallographic studies into the nature of the interactions that govern the binding of these so called nonconforming peptides. We would like to understand the role of the P10 pocket and find out whether the peptides that do not obey the consensus anchor motif bind in the canonical conformation observed in prior structures of class II MHC-peptide complexes. HLA-DRB3*0101 complexed with peptide crystallized in unit cell 92.10 x 92.10 x 248.30 (90, 90, 90), P41212, and the diffraction data is reliable to 2.2 Å. We are complementing our studies with dynamical long time simulations to answer these questions, particularly the interplay of the anchor motifs in peptide binding, the range of protein and ligand conformations, and water hydration structures.

  17. Rigid-Docking Approaches to Explore Protein-Protein Interaction Space.

    PubMed

    Matsuzaki, Yuri; Uchikoga, Nobuyuki; Ohue, Masahito; Akiyama, Yutaka

    Protein-protein interactions play core roles in living cells, especially in the regulatory systems. As information on proteins has rapidly accumulated on publicly available databases, much effort has been made to obtain a better picture of protein-protein interaction networks using protein tertiary structure data. Predicting relevant interacting partners from their tertiary structure is a challenging task and computer science methods have the potential to assist with this. Protein-protein rigid docking has been utilized by several projects, docking-based approaches having the advantages that they can suggest binding poses of predicted binding partners which would help in understanding the interaction mechanisms and that comparing docking results of both non-binders and binders can lead to understanding the specificity of protein-protein interactions from structural viewpoints. In this review we focus on explaining current computational prediction methods to predict pairwise direct protein-protein interactions that form protein complexes.

  18. Predicting MHC-II binding affinity using multiple instance regression

    PubMed Central

    EL-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant

    2011-01-01

    Reliably predicting the ability of antigen peptides to bind to major histocompatibility complex class II (MHC-II) molecules is an essential step in developing new vaccines. Uncovering the amino acid sequence correlates of the binding affinity of MHC-II binding peptides is important for understanding pathogenesis and immune response. The task of predicting MHC-II binding peptides is complicated by the significant variability in their length. Most existing computational methods for predicting MHC-II binding peptides focus on identifying a nine amino acids core region in each binding peptide. We formulate the problems of qualitatively and quantitatively predicting flexible length MHC-II peptides as multiple instance learning and multiple instance regression problems, respectively. Based on this formulation, we introduce MHCMIR, a novel method for predicting MHC-II binding affinity using multiple instance regression. We present results of experiments using several benchmark datasets that show that MHCMIR is competitive with the state-of-the-art methods for predicting MHC-II binding peptides. An online web server that implements the MHCMIR method for MHC-II binding affinity prediction is freely accessible at http://ailab.cs.iastate.edu/mhcmir. PMID:20855923
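
    The multiple instance view described above can be sketched as follows: a variable-length peptide becomes a "bag" whose instances are all of its contiguous 9-residue windows, and a bag-level score is derived from the instance scores. This is a minimal illustration under stated assumptions; taking the maximum over instances is one common multiple-instance convention and is not confirmed here as MHCMIR's aggregation rule, and the toy scoring function is purely hypothetical.

```python
def nine_mer_bag(peptide, core_length=9):
    """Return every contiguous window of core_length residues (the 'bag').

    Peptides shorter than core_length yield an empty bag.
    """
    return [
        peptide[start:start + core_length]
        for start in range(len(peptide) - core_length + 1)
    ]


def bag_score(peptide, instance_score, core_length=9):
    """Score a peptide as the maximum score over its candidate cores.

    Assumption: the best-scoring core determines the peptide's score.
    """
    bag = nine_mer_bag(peptide, core_length)
    if not bag:
        raise ValueError("peptide is shorter than the core length")
    return max(instance_score(core) for core in bag)


def toy_score(core):
    """Hypothetical stand-in for a trained instance-level model."""
    return core.count("L")


peptide = "PKYVKQNTLKLAT"  # 13 residues, so 5 candidate 9-mer cores
print(nine_mer_bag(peptide))
print(bag_score(peptide, toy_score))
```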

  19. Co-Occurring Atomic Contacts for the Characterization of Protein Binding Hot Spots.

    PubMed

    Liu, Qian; Ren, Jing; Song, Jiangning; Li, Jinyan

    2015-01-01

    A binding hot spot is a small area at a protein-protein interface that can make significant contribution to binding free energy. This work investigates the substantial contribution made by some special co-occurring atomic contacts at a binding hot spot. A co-occurring atomic contact is a pair of atomic contacts that are close to each other with no more than three covalent-bond steps. We found that two kinds of co-occurring atomic contacts can play an important part in the accurate prediction of binding hot spot residues. One is the co-occurrence of two nearby hydrogen bonds. For example, mutations of any residue in a hydrogen bond network consisting of multiple co-occurring hydrogen bonds could disrupt the interaction considerably. The other kind of co-occurring atomic contact is the co-occurrence of a hydrophobic carbon contact and a contact between a hydrophobic carbon atom and a π ring. In fact, this co-occurrence signifies the collective effect of hydrophobic contacts. We also found that the B-factor measurements of several specific groups of amino acids are useful for the prediction of hot spots. Taking the B-factor, individual atomic contacts and the co-occurring contacts as features, we developed a new prediction method and thoroughly assessed its performance via cross-validation and independent dataset test. The results show that our method achieves higher prediction performance than well-known methods such as Robetta, FoldX and Hotpoint. We conclude that these contact descriptors, in particular the novel co-occurring atomic contacts, can be used to facilitate accurate and interpretable characterization of protein binding hot spots.

  20. Co-Occurring Atomic Contacts for the Characterization of Protein Binding Hot Spots

    PubMed Central

    Liu, Qian; Ren, Jing; Song, Jiangning; Li, Jinyan

    2015-01-01

    A binding hot spot is a small area at a protein-protein interface that can make significant contribution to binding free energy. This work investigates the substantial contribution made by some special co-occurring atomic contacts at a binding hot spot. A co-occurring atomic contact is a pair of atomic contacts that are close to each other with no more than three covalent-bond steps. We found that two kinds of co-occurring atomic contacts can play an important part in the accurate prediction of binding hot spot residues. One is the co-occurrence of two nearby hydrogen bonds. For example, mutations of any residue in a hydrogen bond network consisting of multiple co-occurring hydrogen bonds could disrupt the interaction considerably. The other kind of co-occurring atomic contact is the co-occurrence of a hydrophobic carbon contact and a contact between a hydrophobic carbon atom and a π ring. In fact, this co-occurrence signifies the collective effect of hydrophobic contacts. We also found that the B-factor measurements of several specific groups of amino acids are useful for the prediction of hot spots. Taking the B-factor, individual atomic contacts and the co-occurring contacts as features, we developed a new prediction method and thoroughly assessed its performance via cross-validation and independent dataset test. The results show that our method achieves higher prediction performance than well-known methods such as Robetta, FoldX and Hotpoint. We conclude that these contact descriptors, in particular the novel co-occurring atomic contacts, can be used to facilitate accurate and interpretable characterization of protein binding hot spots. PMID:26675422

  1. Predicting RNA-protein binding sites and motifs through combining local and global deep convolutional neural networks.

    PubMed

    Pan, Xiaoyong; Shen, Hong-Bin

    2018-05-02

    RNA-binding proteins (RBPs) account for 5∼10% of the eukaryotic proteome and play key roles in many biological processes, e.g. gene regulation. Experimental detection of RBP binding sites is still time-intensive and costly. Instead, computational prediction of the RBP binding sites using patterns learned from existing annotation knowledge is a fast approach. From the biological point of view, the local structure context derived from local sequences will be recognized by specific RBPs. However, in computational modeling using deep learning, to the best of our knowledge, only global representations of entire RNA sequences are employed. So far, the local sequence information is ignored in the deep model construction process. In this study, we present a computational method iDeepE to predict RNA-protein binding sites from RNA sequences by combining global and local convolutional neural networks (CNNs). For the global CNN, we pad the RNA sequences to the same length. For the local CNN, we split an RNA sequence into multiple overlapping fixed-length subsequences, where each subsequence is a signal channel of the whole sequence. Next, we train deep CNNs for multiple subsequences and the padded sequences to learn high-level features, respectively. Finally, the outputs from local and global CNNs are combined to improve the prediction. iDeepE demonstrates a better performance over state-of-the-art methods on two large-scale datasets derived from CLIP-seq. We also find that the local CNN runs 1.8 times faster than the global CNN with comparable performance when using GPUs. Our results show that iDeepE has captured experimentally verified binding motifs. Availability: https://github.com/xypan1232/iDeepE. Contact: xypan172436@gmail.com or hbshen@sjtu.edu.cn. Supplementary data are available at Bioinformatics online.
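
    The two preprocessing steps described above, padding sequences to a common length for the global CNN and splitting a sequence into overlapping fixed-length subsequences for the local CNN, can be sketched in a few lines. This is an illustrative sketch only: the window length, overlap, padding character "N", and function names are assumptions, not the settings or code used by iDeepE.

```python
def pad_sequence(seq, target_length, pad_char="N"):
    """Right-pad (or truncate) a sequence to exactly target_length characters."""
    if len(seq) >= target_length:
        return seq[:target_length]
    return seq + pad_char * (target_length - len(seq))


def split_overlapping(seq, window, overlap, pad_char="N"):
    """Split seq into fixed-length windows that share `overlap` characters.

    The final window is right-padded so that every piece has the same length.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    step = window - overlap
    pieces = []
    start = 0
    while True:
        pieces.append(pad_sequence(seq[start:start + window], window, pad_char))
        if start + window >= len(seq):
            break
        start += step
    return pieces


print(pad_sequence("ACG", 5))
print(split_overlapping("ACGUACGUAC", window=4, overlap=2))
print(split_overlapping("ACGUA", window=4, overlap=0))
```

    In a local/global setup of this kind, each fixed-length piece would be one-hot encoded and treated as a separate input channel, while the padded full sequence feeds the global model.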

  2. Binding Properties of General Odorant Binding Proteins from the Oriental Fruit Moth, Grapholita molesta (Busck) (Lepidoptera: Tortricidae)

    PubMed Central

    Li, Guangwei; Chen, Xiulin; Li, Boliao; Zhang, Guohui; Li, Yiping; Wu, Junxiang

    2016-01-01

    Background The oriental fruit moth Grapholita molesta is a host-switching pest species. The adults highly depend on olfactory cues in locating optimal host plants and oviposition sites. Odorant binding proteins (OBPs) are thought to be responsible for recognizing and transporting hydrophobic odorants across the aqueous sensillum lymph to stimulate the odorant receptors (ORs) within the antennal sensilla and activate the olfactory signal transduction pathway. Exploring the physiological function of these OBPs could facilitate understanding insect chemical communications. Methodology/Principal Finding Two antennae-specific general OBPs (GOBPs) of G. molesta were expressed and purified in vitro. The binding affinities of G. molesta GOBP1 and 2 (GmolGOBP1 and 2) for sex pheromone components and host plant volatiles were measured by fluorescence ligand-binding assays. The distribution of GmolGOBP1 and 2 in the antennal sensillum was defined by whole mount fluorescence immunohistochemistry (WM-FIHC) experiments. The binding sites of GmolGOBP2 were predicted using homology modeling, molecular docking and site-directed mutagenesis. Both GmolGOBP1 and 2 are housed in sensilla basiconica, with no differences between male and female antennae. Recombinant GmolGOBP1 (rGmolGOBP1) exhibited broad binding properties towards host plant volatiles and sex pheromone components; rGmolGOBP2 could not effectively bind host plant volatiles but showed specific binding affinity with a minor sex pheromone component dodecanol. We chose GmolGOBP2 and dodecanol for further homology modeling, molecular docking, and site-directed mutagenesis. Binding affinities of mutants demonstrated that Thr9 was the key binding site and confirmed that dodecanol binding to the protein involves a hydrogen bond. Combined with the pH effect on binding affinities of rGmolGOBP2, ligand binding and release of GmolGOBP2 were related to a pH-dependent conformational transition. 
Conclusion Two rGmolGOBPs exhibit different binding characteristics for tested ligands. rGmolGOBP1 has dual functions in recognition of host plant volatiles and sex pheromone components, while rGmolGOBP2 is mainly involved in minor sex pheromone component dodecanol perception. This study also provides empirical evidence for the predicted functions of key amino acids in recombinant protein ligand-binding characteristics. PMID:27152703

  3. Computational prediction and biochemical characterization of novel RNA aptamers to Rift Valley fever virus nucleocapsid protein.

    PubMed

    Ellenbecker, Mary; St Goddard, Jeremy; Sundet, Alec; Lanchy, Jean-Marc; Raiford, Douglas; Lodmell, J Stephen

    2015-10-01

    Rift Valley fever virus (RVFV) is a potent human and livestock pathogen endemic to sub-Saharan Africa and the Arabian Peninsula that has potential to spread to other parts of the world. Although there is no proven effective and safe treatment for RVFV infections, a potential therapeutic target is the virally encoded nucleocapsid protein (N). During the course of infection, N binds to viral RNA, and perturbation of this interaction can inhibit viral replication. To gain insight into how N recognizes viral RNA specifically, we designed an algorithm that uses a distance matrix and multidimensional scaling to compare the predicted secondary structures of known N-binding RNAs, or aptamers, that were isolated and characterized in a previous in vitro evolution experiment. These aptamers did not exhibit overt sequence or predicted structure similarity, so we employed bioinformatic methods to propose novel aptamers based on analysis and clustering of secondary structures. We screened and scored the predicted secondary structures of novel randomly generated RNA sequences in silico and selected several of these putative N-binding RNAs whose secondary structures were similar to those of known N-binding RNAs. We found that overall the in silico generated RNA sequences bound well to N in vitro. Furthermore, introduction of these RNAs into cells prior to infection with RVFV inhibited viral replication in cell culture. This proof of concept study demonstrates how the predictive power of bioinformatics and the empirical power of biochemistry can be jointly harnessed to discover, synthesize, and test new RNA sequences that bind tightly to RVFV N protein. The approach would be easily generalizable to other applications. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. Characterization of Escherichia coli Type 1 Pilus Mutants with Altered Binding Specificities

    PubMed Central

    Harris, Sandra L.; Spears, Patricia A.; Havell, Edward A.; Hamrick, Terri S.; Horton, John R.; Orndorff, Paul E.

    2001-01-01

    PCR mutagenesis and a unique enrichment scheme were used to obtain two mutants, each with a single lesion in fimH, the chromosomal gene that encodes the adhesin protein (FimH) of Escherichia coli type 1 pili. These mutants were noteworthy in part because both were altered in the normal range of cell types bound by FimH. One mutation altered an amino acid at a site previously shown to be involved in temperature-dependent binding, and the other altered an amino acid lining the predicted FimH binding pocket. PMID:11395476

  5. Biochemical profiling in silico--predicting substrate specificities of large enzyme families.

    PubMed

    Tyagi, Sadhna; Pleiss, Juergen

    2006-06-25

    A general high-throughput method for in silico biochemical profiling of enzyme families has been developed based on covalent docking of potential substrates into the binding sites of target enzymes. The method has been tested by systematically docking transition-state-analogous intermediates of 12 substrates into the binding sites of 20 alpha/beta hydrolases from 15 homologous families. To evaluate the effect of side chain orientations on the docking results, 137 crystal structures were included in the analysis. A good substrate must fulfil two criteria: it must bind in a productive geometry with four hydrogen bonds between the substrate and the catalytic histidine and the oxyanion hole, and the enzyme-substrate complex must have a high affinity, as predicted by a high docking score. The modelling results in general reproduce experimental data on substrate specificity and stereoselectivity: the differences in substrate specificity of cholinesterases toward acetyl- and butyrylcholine, the changes in activity of lipases and esterases with the size of the acid moieties, the activity of lipases and esterases toward tertiary alcohols, and the stereopreference of lipases and esterases toward chiral secondary alcohols. Rigidity of the docking procedure was the major reason for false positive and false negative predictions, as the geometry of the complex and the docking score may depend sensitively on the orientation of individual side chains. Therefore, appropriate structures have to be identified. In silico biochemical profiling provides a time-efficient and cost-saving protocol for virtual screening to identify the potential substrates of the members of large enzyme families from a library of molecules.
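
The two-criteria rule stated in the abstract above (a productive geometry with four hydrogen bonds, plus a high docking score) can be sketched as a simple filter. The `DockingPose` type, the score cutoff, and the example values are hypothetical; the abstract does not state a numeric threshold, and treating "four hydrogen bonds" as "at least four" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class DockingPose:
    substrate: str
    hydrogen_bonds: int  # bonds to the catalytic histidine and oxyanion hole
    score: float         # docking score; higher means higher predicted affinity


REQUIRED_H_BONDS = 4     # from the abstract
SCORE_CUTOFF = 50.0      # hypothetical cutoff, not taken from the paper


def is_predicted_substrate(pose: DockingPose, cutoff: float = SCORE_CUTOFF) -> bool:
    """A pose passes only if BOTH criteria hold: geometry and score."""
    return pose.hydrogen_bonds >= REQUIRED_H_BONDS and pose.score >= cutoff


def profile(poses, cutoff: float = SCORE_CUTOFF):
    """Return the substrates that have at least one pose meeting both criteria."""
    return {p.substrate for p in poses if is_predicted_substrate(p, cutoff)}


# Invented poses: several side-chain orientations per substrate.
poses = [
    DockingPose("substrate_A", 4, 62.0),  # passes both criteria
    DockingPose("substrate_A", 2, 70.0),  # good score, unproductive geometry
    DockingPose("substrate_B", 4, 41.5),  # productive geometry, low score
    DockingPose("substrate_B", 3, 55.0),  # fails geometry
]

if __name__ == "__main__":
    print(profile(poses))
```

Evaluating several poses per substrate mirrors the abstract's point that results depend on side-chain orientation: a substrate is retained if any of the tested structures yields a pose that satisfies both criteria.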

  6. Diverse RNA-binding proteins interact with functionally related sets of RNAs, suggesting an extensive regulatory system.

    PubMed

    Hogan, Daniel J; Riordan, Daniel P; Gerber, André P; Herschlag, Daniel; Brown, Patrick O

    2008-10-28

    RNA-binding proteins (RBPs) have roles in the regulation of many post-transcriptional steps in gene expression, but relatively few RBPs have been systematically studied. We searched for the RNA targets of 40 proteins in the yeast Saccharomyces cerevisiae: a selective sample of the approximately 600 annotated and predicted RBPs, as well as several proteins not annotated as RBPs. At least 33 of these 40 proteins, including three of the four proteins that were not previously known or predicted to be RBPs, were reproducibly associated with specific sets of a few to several hundred RNAs. Remarkably, many of the RBPs we studied bound mRNAs whose protein products share identifiable functional or cytotopic features. We identified specific sequences or predicted structures significantly enriched in target mRNAs of 16 RBPs. These potential RNA-recognition elements were diverse in sequence, structure, and location: some were found predominantly in 3'-untranslated regions, others in 5'-untranslated regions, some in coding sequences, and many in two or more of these features. Although this study only examined a small fraction of the universe of yeast RBPs, 70% of the mRNA transcriptome had significant associations with at least one of these RBPs, and on average, each distinct yeast mRNA interacted with three of the RBPs, suggesting the potential for a rich, multidimensional network of regulation. These results strongly suggest that combinatorial binding of RBPs to specific recognition elements in mRNAs is a pervasive mechanism for multi-dimensional regulation of their post-transcriptional fate.

  7. Impact of mutations on the allosteric conformational equilibrium

    PubMed Central

    Weinkam, Patrick; Chen, Yao Chi; Pons, Jaume; Sali, Andrej

    2012-01-01

    Allostery in a protein involves effector binding at an allosteric site that changes the structure and/or dynamics at a distant, functional site. In addition to the chemical equilibrium of ligand binding, allostery involves a conformational equilibrium between one protein substate that binds the effector and a second substate that less strongly binds the effector. We run molecular dynamics simulations using simple, smooth energy landscapes to sample specific ligand-induced conformational transitions, as defined by the effector-bound and unbound protein structures. These simulations can be performed using our web server: http://salilab.org/allosmod/. We then develop a set of features to analyze the simulations and capture the relevant thermodynamic properties of the allosteric conformational equilibrium. These features are based on molecular mechanics energy functions, stereochemical effects, and structural/dynamic coupling between sites. Using a machine-learning algorithm on a dataset of 10 proteins and 179 mutations, we predict both the magnitude and sign of the allosteric conformational equilibrium shift by the mutation; the impact of a large identifiable fraction of the mutations can be predicted with an average unsigned error of 1 kBT. With similar accuracy, we predict the mutation effects for an 11th protein that was omitted from the initial training and testing of the machine-learning algorithm. We also assess which calculated thermodynamic properties contribute most to the accuracy of the prediction. PMID:23228330
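
The quoted error of 1 kBT can be put in terms of the equilibrium itself with a short worked relation. The symbols below (substates A and B, equilibrium constant K) and the sign convention are assumptions introduced for illustration; they are not taken from the paper.

```latex
\begin{align}
K &= \frac{[\mathrm{A}]}{[\mathrm{B}]}, &
\Delta G &= -k_B T \ln K, \\
\Delta\Delta G &= \Delta G_{\mathrm{mut}} - \Delta G_{\mathrm{wt}}
  = -k_B T \ln\frac{K_{\mathrm{mut}}}{K_{\mathrm{wt}}}, &
\frac{K_{\mathrm{mut}}}{K_{\mathrm{wt}}} &= e^{-\Delta\Delta G / k_B T}.
\end{align}
```

Under this convention, an unsigned error of 1 kBT in the predicted shift corresponds to an uncertainty of a factor of e (about 2.7) in the ratio of mutant to wild-type equilibrium constants.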

  8. Inter-species chimeras of leukaemia inhibitory factor define a major human receptor-binding determinant.

    PubMed Central

    Owczarek, C M; Layton, M J; Metcalf, D; Lock, P; Willson, T A; Gough, N M; Nicola, N A

    1993-01-01

    Human leukaemia inhibitory factor (hLIF) binds to both human and mouse LIF receptors (LIF-R), while mouse LIF (mLIF) binds only to mouse LIF-R. Moreover, hLIF binds with higher affinity to the mLIF-R than does mLIF. In order to define the regions of the hLIF molecule responsible for species-specific interaction with the hLIF-R and for the unusual high-affinity binding to the mLIF-R, a series of 15 mouse/human LIF hybrids has been generated. Perhaps surprisingly, both of these properties mapped to the same region of the hLIF molecule. The predominant contribution was from residues in the loop linking the third and fourth helices, with lesser contributions from residues in the third helix and the loop connecting the second and third helices in the predicted three-dimensional structure. Since all chimeras retained full biological activity and receptor-binding activity on mouse cells, and there was little variation in the specific biological activity of the purified proteins, it can be concluded that the overall secondary and tertiary structures of each chimera were intact. This observation also implied that the primary binding sites on mLIF and hLIF for the mLIF-R were unaltered by inter-species domain swapping. Consequently, the site on the hLIF molecule that confers species-specific binding to the hLIF-R and higher affinity binding to the mLIF-R must constitute an additional interaction site to that used by both mLIF and hLIF to bind to the mLIF-R. These studies define a maximum of 15 amino acid differences between hLIF and mLIF that are responsible for the different properties of these proteins. PMID:8253075

  9. A novel assay reveals preferential binding between Rabs, kinesins, and specific endosomal subpopulations

    PubMed Central

    Bentley, Marvin; Decker, Helena; Luisi, Julie

    2015-01-01

    Identifying the proteins that regulate vesicle trafficking is a fundamental problem in cell biology. In this paper, we introduce a new assay that involves the expression of an FKBP12-rapamycin–binding domain–tagged candidate vesicle-binding protein, which can be inducibly linked to dynein or kinesin. Vesicles can be labeled by any convenient method. If the candidate protein binds the labeled vesicles, addition of the linker drug results in a predictable, highly distinctive change in vesicle localization. This assay generates robust and easily interpretable results that provide direct experimental evidence of binding between a candidate protein and the vesicle population of interest. We used this approach to compare the binding of Kinesin-3 family members with different endosomal populations. We found that KIF13A and KIF13B bind preferentially to early endosomes and that KIF1A and KIF1Bβ bind preferentially to late endosomes and lysosomes. This assay may have broad utility for identifying the trafficking proteins that bind to different vesicle populations. PMID:25624392

  10. Genome-wide Expression Profiling, In Vivo DNA Binding Analysis, and Probabilistic Motif Prediction Reveal Novel Abf1 Target Genes during Fermentation, Respiration, and Sporulation in Yeast

    PubMed Central

    Schlecht, Ulrich; Erb, Ionas; Demougin, Philippe; Robine, Nicolas; Borde, Valérie; van Nimwegen, Erik; Nicolas, Alain

    2008-01-01

    The autonomously replicating sequence binding factor 1 (Abf1) was initially identified as an essential DNA replication factor and later shown to be a component of the regulatory network controlling mitotic and meiotic cell cycle progression in budding yeast. The protein is thought to exert its functions via specific interaction with its target site as part of distinct protein complexes, but its roles during mitotic growth and meiotic development are only partially understood. Here, we report a comprehensive approach aiming at the identification of direct Abf1-target genes expressed during fermentation, respiration, and sporulation. Computational prediction of the protein's target sites was integrated with a genome-wide DNA binding assay in growing and sporulating cells. The resulting data were combined with the output of expression profiling studies using wild-type versus temperature-sensitive alleles. This work identified 434 protein-coding loci as being transcriptionally dependent on Abf1. More than 60% of their putative promoter regions contained a computationally predicted Abf1 binding site and/or were bound by Abf1 in vivo, identifying them as direct targets. The present study revealed numerous loci previously unknown to be under Abf1 control, and it yielded evidence for the protein's variable DNA binding pattern during mitotic growth and meiotic development. PMID:18305101

  11. Structure, Function, and Evolution of Biogenic Amine-binding Proteins in Soft Ticks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mans, Ben J.; Ribeiro, Jose M.C.; Andersen, John F.

    2008-08-19

    Two highly abundant lipocalins, monomine and monotonin, have been isolated from the salivary gland of the soft tick Argas monolakensis and shown to bind histamine and 5-hydroxytryptamine (5-HT), respectively. The crystal structures of monomine and a paralog of monotonin were determined in the presence of ligands to compare the determinants of ligand binding. Both the structures and binding measurements indicate that the proteins have a single binding site rather than the two sites previously described for the female-specific histamine-binding protein (FS-HBP), the histamine-binding lipocalin of the tick Rhipicephalus appendiculatus. The binding sites of monomine and monotonin are similar to the lower, low affinity site of FS-HBP. The interaction of the protein with the aliphatic amine group of the ligand is very similar for all of the proteins, whereas specificity is determined by interactions with the aromatic portion of the ligand. Interestingly, protein interaction with the imidazole ring of histamine differs significantly between the low affinity binding site of FS-HBP and monomine, suggesting that histamine binding has evolved independently in the two lineages. From the conserved features of these proteins, a tick lipocalin biogenic amine-binding motif could be derived that was used to predict biogenic amine-binding function in other tick lipocalins. Heterologous expression of genes from salivary gland libraries led to the discovery of biogenic amine-binding proteins in soft (Ornithodoros) and hard (Ixodes) tick genera. The data generated were used to reconstruct the most probable evolutionary pathway for the evolution of biogenic amine-binding in tick lipocalins.

  12. Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space

    PubMed Central

    Karnik, Rahul; Beer, Michael A.

    2015-01-01

    The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs. PMID:26465884
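
The abstract above contrasts full PWM models with k-mer words and describes classifying sequences by a threshold on the PWM score. A minimal sketch of that scoring step follows. It does not reproduce MotifSpec's dynamic search space or its learned threshold: the threshold here is fixed by hand, the uniform background and pseudocount are assumptions, and the aligned sites and test sequences are invented.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds position weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm


def best_window_score(pwm, sequence):
    """Return the maximum PWM score over all windows of the sequence.

    Raises ValueError if the sequence is shorter than the motif.
    """
    width = len(pwm)
    return max(
        sum(col[base] for col, base in zip(pwm, sequence[i:i + width]))
        for i in range(len(sequence) - width + 1)
    )


def classify(pwm, sequences, threshold):
    """Label each sequence as bound (True) if its best window reaches the threshold."""
    return [best_window_score(pwm, s) >= threshold for s in sequences]


# Invented aligned binding sites used to build the illustrative PWM.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTCA"]

if __name__ == "__main__":
    pwm = build_pwm(sites)
    print(classify(pwm, ["AAATGACTCAGG", "CCCCCCCCCCCC"], threshold=5.0))
```

In a discriminative setting, the threshold would be chosen to best separate bound from unbound sequences rather than fixed in advance as it is here.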

  13. Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.

    PubMed

    Karnik, Rahul; Beer, Michael A

    2015-01-01

    The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.

  14. Binding Specificity of Two PBPs in the Yellow Peach Moth Conogethes punctiferalis (Guenée)

    PubMed Central

    Ge, Xing; Ahmed, Tofael; Zhang, Tiantao; Wang, Zhenying; He, Kanglai; Bai, Shuxiong

    2018-01-01

    Pheromone binding proteins (PBPs) play an important role in olfaction of insects by transporting sex pheromones across the sensillum lymph to odorant receptors. To obtain a better understanding of the molecular basis of interactions between PBPs and semiochemicals, we have cloned, expressed, and purified two PBPs (CpunPBP2 and CpunPBP5) from the antennae of Conogethes punctiferalis. Fluorescence competitive binding assays were used to investigate binding affinities of CpunPBP2 and CpunPBP5 to sex pheromone and volatiles. Results indicate both CpunPBP2 and CpunPBP5 bind sex pheromones E10-16:Ald, Z10-16:Ald and hexadecanal with higher affinities. In addition, CpunPBP2 and CpunPBP5 also could bind some odorants, such as 1-tetradecanol, trans-caryophyllene, farnesene, and β-farnesene. Homology modeling to predict 3D structure and molecular docking to predict key binding sites were used to better understand interactions of CpunPBP2 and CpunPBP5 with sex pheromones E10-16:Ald and Z10-16:Ald. According to the results, Phe9, Phe33, Ser53, and Phe115 were key binding sites predicted for CpunPBP2, as were Ser9, Phe12, Val115, and Arg120 for CpunPBP5. Binding affinities of four mutants of CpunPBP2 and four mutants of CpunPBP5 with the two sex pheromones were investigated by fluorescence competitive binding assays. Results indicate that single nucleotide mutations may affect interactions between PBPs and sex pheromones. Expression levels of CpunPBP2 and CpunPBP5 in different tissues were evaluated using qPCR. Results show that CpunPBP2 and CpunPBP5 were largely amplified in the antennae, with low expression levels in other tissues. CpunPBP2 was expressed mainly in male antennae, whereas CpunPBP5 was expressed mainly in female antennae. These results provide new insights into understanding the recognition between PBPs and ligands. PMID:29666585

  15. Secondary structure prediction and structure-specific sequence analysis of single-stranded DNA.

    PubMed

    Dong, F; Allawi, H T; Anderson, T; Neri, B P; Lyamichev, V I

    2001-08-01

    DNA sequence analysis by oligonucleotide binding is often affected by interference with the secondary structure of the target DNA. Here we describe an approach that improves DNA secondary structure prediction by combining enzymatic probing of DNA by structure-specific 5'-nucleases with an energy minimization algorithm that utilizes the 5'-nuclease cleavage sites as constraints. The method can identify structural differences between two DNA molecules caused by minor sequence variations such as a single nucleotide mutation. It also demonstrates the existence of long-range interactions between DNA regions separated by >300 nt and the formation of multiple alternative structures by a 244 nt DNA molecule. The differences in the secondary structure of DNA molecules revealed by 5'-nuclease probing were used to design structure-specific probes for mutation discrimination that target the regions of structural, rather than sequence, differences. We also demonstrate the performance of structure-specific 'bridge' probes complementary to non-contiguous regions of the target molecule. The structure-specific probes do not require the high stringency binding conditions necessary for methods based on mismatch formation and permit mutation detection at temperatures from 4 to 37 degrees C. Structure-specific sequence analysis is applied for mutation detection in the Mycobacterium tuberculosis katG gene and for genotyping of the hepatitis C virus.

  16. PreCisIon: PREdiction of CIS-regulatory elements improved by gene's positION.

    PubMed

    Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François

    2013-02-01

    Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases.
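
The idea of adaptively combining weak classifiers, one based on local sequence and one on gene position, can be sketched with an AdaBoost-style loop. Reading "adaptively combining" as AdaBoost is an assumption; the abstract does not name the algorithm. The feature names (`motif_score`, `neighbor_distance`), the cutoffs, and the training data are all hypothetical.

```python
import math


def train_adaboost(samples, labels, weak_classifiers, rounds=5):
    """Return a list of (alpha, classifier) pairs.

    samples: list of feature dicts; labels: +1 or -1;
    weak_classifiers: functions mapping a sample to +1 or -1.
    """
    n = len(samples)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        def weighted_error(clf):
            return sum(w for w, x, y in zip(weights, samples, labels) if clf(x) != y)

        # Pick the weak classifier with the lowest weighted error this round.
        best = min(weak_classifiers, key=weighted_error)
        err = min(max(weighted_error(best), 1e-10), 1 - 1e-10)
        if err >= 0.5:
            break  # no weak classifier is better than chance
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best))
        # Re-weight: misclassified samples gain weight, correct ones lose it.
        weights = [w * math.exp(-alpha * y * best(x))
                   for w, x, y in zip(weights, samples, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, sample):
    """Weighted vote of the selected weak classifiers."""
    score = sum(alpha * clf(sample) for alpha, clf in ensemble)
    return 1 if score >= 0 else -1


def sequence_classifier(gene):
    """Hypothetical weak classifier: strong local motif match."""
    return 1 if gene["motif_score"] >= 5 else -1


def position_classifier(gene):
    """Hypothetical weak classifier: close to a known target on the chromosome."""
    return 1 if gene["neighbor_distance"] <= 2 else -1


# Invented training genes; label +1 marks a true target of the TF.
samples = [
    {"motif_score": 8, "neighbor_distance": 1},
    {"motif_score": 7, "neighbor_distance": 9},
    {"motif_score": 2, "neighbor_distance": 1},
    {"motif_score": 1, "neighbor_distance": 8},
    {"motif_score": 6, "neighbor_distance": 7},
    {"motif_score": 1, "neighbor_distance": 9},
]
labels = [1, -1, 1, -1, 1, -1]

ensemble = train_adaboost(samples, labels, [sequence_classifier, position_classifier])

if __name__ == "__main__":
    print(predict(ensemble, {"motif_score": 9, "neighbor_distance": 1}))
    print(predict(ensemble, {"motif_score": 1, "neighbor_distance": 9}))
```

With only two weak classifiers the result is effectively a weighted vote between them; the adaptive re-weighting becomes more useful as more sequence-based or position-based classifiers are added to the pool.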

  17. HITS-CLIP yields genome-wide insights into brain alternative RNA processing

    NASA Astrophysics Data System (ADS)

    Licatalosi, Donny D.; Mele, Aldo; Fak, John J.; Ule, Jernej; Kayikci, Melis; Chi, Sung Wook; Clark, Tyson A.; Schweitzer, Anthony C.; Blume, John E.; Wang, Xuning; Darnell, Jennifer C.; Darnell, Robert B.

    2008-11-01

    Protein-RNA interactions have critical roles in all aspects of gene expression. However, applying biochemical methods to understand such interactions in living tissues has been challenging. Here we develop a genome-wide means of mapping protein-RNA binding sites in vivo, by high-throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP). HITS-CLIP analysis of the neuron-specific splicing factor Nova revealed extremely reproducible RNA-binding maps in multiple mouse brains. These maps provide genome-wide in vivo biochemical footprints confirming the previous prediction that the position of Nova binding determines the outcome of alternative splicing; moreover, they are sufficiently powerful to predict Nova action de novo. HITS-CLIP revealed a large number of Nova-RNA interactions in 3' untranslated regions, leading to the discovery that Nova regulates alternative polyadenylation in the brain. HITS-CLIP, therefore, provides a robust, unbiased means to identify functional protein-RNA interactions in vivo.

  18. Molecular Mechanotransduction: how forces trigger cytoskeletal dynamics

    NASA Astrophysics Data System (ADS)

    Ehrlicher, Allen

    2012-02-01

    Mechanical stresses elicit cellular reactions mediated by chemical signals. Defective responses to forces underlie human medical disorders, such as cardiac failure and pulmonary injury. Despite detailed knowledge of the cytoskeleton's structure, the specific molecular switches that convert mechanical stimuli into chemical signals have remained elusive. Here we identify the actin-binding protein, filamin A (FLNa) as a central mechanotransduction element of the cytoskeleton by using Fluorescence Loss After photoConversion (FLAC), a novel high-speed alternative to FRAP. We reconstituted a minimal system consisting of actin filaments, FLNa and two FLNa-binding partners: the cytoplasmic tail of β-integrin, and FilGAP. Integrins form an essential mechanical linkage between extracellular and intracellular environments, with β-integrin tails connecting to the actin cytoskeleton by binding directly to filamin. FilGAP is a FLNa-binding GTPase-activating protein specific for Rac, which in vivo regulates cell spreading and bleb formation. We demonstrate that both externally-imposed bulk shear and myosin II driven forces differentially regulate the binding of integrin and FilGAP to FLNa. Consistent with structural predictions, strain increases β-integrin binding to FLNa, whereas it causes FilGAP to dissociate from FLNa, providing a direct and specific molecular basis for cellular mechanotransduction. These results identify the first molecular mechanotransduction element within the actin cytoskeleton, revealing that mechanical strain of key proteins regulates the binding of signaling molecules. Moreover, GAP activity has been shown to switch cell movement from mesenchymal to amoeboid motility, suggesting that mechanical forces directly impact the invasiveness of cancer.

  19. Identification and positional distribution analysis of transcription factor binding sites for genes from the wheat fl-cDNA sequences.

    PubMed

    Chen, Zhen-Yong; Guo, Xiao-Jiang; Chen, Zhong-Xu; Chen, Wei-Ying; Wang, Ji-Rui

    2017-06-01

    The binding sites of transcription factors (TFs) in upstream DNA regions are called transcription factor binding sites (TFBSs). TFBSs are important elements for regulating gene expression. To date, there have been few studies on the profiles of TFBSs in plants. In total, 4873 sequences with 5' upstream regions from 8530 wheat fl-cDNA sequences were used to predict TFBSs. We found 4572 TFBSs for the MADS TF family, which was twice as many as for bHLH (1951), B3 (1951), HB superfamily (1914), ERF (1820), and AP2/ERF (1725) TFs, and was approximately four times higher than the remaining TFBS types. The percentage of TFBSs and TF members showed a distinct distribution in different tissues. Overall, the distribution of TFBSs in the upstream regions of wheat fl-cDNA sequences differed significantly. Meanwhile, high frequencies of some types of TFBSs were found in specific regions in the upstream sequences. Both TFs and fl-cDNA with TFBSs predicted in the same tissues exhibited specific distribution preferences for regulating gene expression. The tissue-specific analysis of TFs and fl-cDNA with TFBSs provides useful information for functional research, and can be used to identify relationships between tissue-specific TFs and fl-cDNA with TFBSs. Moreover, the positional distribution of TFBSs indicates that some types of wheat TFBS have different positional distribution preferences in the upstream regions of genes.

  20. Coupling Protein Side-Chain and Backbone Flexibility Improves the Re-design of Protein-Ligand Specificity.

    PubMed

    Ollikainen, Noah; de Jong, René M; Kortemme, Tanja

    2015-01-01

    Interactions between small molecules and proteins play critical roles in regulating and facilitating diverse biological functions, yet our ability to accurately re-engineer the specificity of these interactions using computational approaches has been limited. One main difficulty, in addition to inaccuracies in energy functions, is the exquisite sensitivity of protein-ligand interactions to subtle conformational changes, coupled with the computational problem of sampling the large conformational search space of degrees of freedom of ligands, amino acid side chains, and the protein backbone. Here, we describe two benchmarks for evaluating the accuracy of computational approaches for re-engineering protein-ligand interactions: (i) prediction of enzyme specificity altering mutations and (ii) prediction of sequence tolerance in ligand binding sites. After finding that current state-of-the-art "fixed backbone" design methods perform poorly on these tests, we develop a new "coupled moves" design method in the program Rosetta that couples changes to protein sequence with alterations in both protein side-chain and protein backbone conformations, and allows for changes in ligand rigid-body and torsion degrees of freedom. We show significantly increased accuracy in both predicting ligand specificity altering mutations and binding site sequences. These methodological improvements should be useful for many applications of protein-ligand design. The approach also provides insights into the role of subtle conformational adjustments that enable functional changes not only in engineering applications but also in natural protein evolution.

  1. Structure of adenovirus bound to cellular receptor CAR

    DOEpatents

    Freimuth, Paul I.

    2004-05-18

    Disclosed is a mutant adenovirus which has a genome comprising one or more mutations in sequences which encode the fiber protein knob domain wherein the mutation causes the encoded viral particle to have significantly weakened binding affinity for CARD1 relative to wild-type adenovirus. Such mutations may be in sequences which encode either the AB loop, or the HI loop of the fiber protein knob domain. Specific residues and mutations are described. Also disclosed is a method for generating a mutant adenovirus which is characterized by a receptor binding affinity or specificity which differs substantially from wild type. In the method, residues of the adenovirus fiber protein knob domain which are predicted to alter D1 binding when mutated, are identified from the crystal structure coordinates of the AD12knob:CAR-D1 complex. A mutation which alters one or more of the identified residues is introduced into the genome of the adenovirus to generate a mutant adenovirus. Whether or not the mutant produced exhibits altered adenovirus-CAR binding properties is then determined.

  2. A close relative of the nuclear, chromosomal high-mobility group protein HMG1 in yeast mitochondria.

    PubMed Central

    Diffley, J F; Stillman, B

    1991-01-01

    ABF2 (ARS-binding factor 2), a small, basic DNA-binding protein that binds specifically to the autonomously replicating sequence ARS1, is located primarily in the mitochondria of the yeast Saccharomyces cerevisiae. The abundance of ABF2 and the phenotype of abf2- null mutants argue that this protein plays a key role in the structure, maintenance, and expression of the yeast mitochondrial genome. The predicted amino acid sequence of ABF2 is closely related to the high-mobility group proteins HMG1 and HMG2 from vertebrate cell nuclei and to several other DNA-binding proteins. Additionally, ABF2 and the other HMG-related proteins are related to a globular domain from the heat shock protein hsp70 family. ABF2 interacts with DNA both nonspecifically and in a specific manner within regulatory regions, suggesting a mechanism whereby it may aid in compacting the mitochondrial genome without interfering with expression. PMID:1881919

  3. Interaction of E. coli outer-membrane protein A with sugars on the receptors of the brain microvascular endothelial cells.

    PubMed

    Datta, Deepshikha; Vaidehi, Nagarajan; Floriano, Wely B; Kim, Kwang S; Prasadarao, Nemani V; Goddard, William A

    2003-02-01

    Escherichia coli, the most common gram-negative bacterium, can penetrate the brain microvascular endothelial cells (BMECs) during the neonatal period to cause meningitis with significant morbidity and mortality. Experimental studies have shown that outer-membrane protein A (OmpA) of E. coli plays a key role in the initial steps of the invasion process by binding to specific sugar moieties present on the glycoproteins of BMEC. These experiments also show that polymers of chitobiose (GlcNAcbeta1-4GlcNAc) block the invasion, while epitopes substituted with the L-fucosyl group do not. We used the HierDock computational technique, which consists of a hierarchy of coarse-grain docking methods with molecular dynamics (MD), to predict the binding sites and energies of interactions of GlcNAcbeta1-4GlcNAc and other sugars with OmpA. The results suggest two important binding sites for the interaction of carbohydrate epitopes of BMEC glycoproteins with OmpA. We identify one site as the binding pocket for chitobiose (GlcNAcbeta1-4GlcNAc) in OmpA, while the second region (including loops 1 and 2) may be important for recognition of specific sugars. We find that the site involving loops 1 and 2 has relative binding energies that correlate well with experimental observations. This theoretical study elucidates the interaction sites of chitobiose with OmpA, and the binding site predictions made in this article are testable either by mutation studies or invasion assays. These results can be further extended in suggesting possible peptide antagonists and drug design for therapeutic strategies.

  4. NetMHCpan, a Method for Quantitative Predictions of Peptide Binding to Any HLA-A and -B Locus Protein of Known Sequence

    PubMed Central

    Nielsen, Morten; Lundegaard, Claus; Blicher, Thomas; Lamberth, Kasper; Harndahl, Mikkel; Justesen, Sune; Røder, Gustav; Peters, Bjoern; Sette, Alessandro; Lund, Ole; Buus, Søren

    2007-01-01

    Background Binding of peptides to Major Histocompatibility Complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC class I system (HLA-I) is extremely polymorphic. The number of registered HLA-I molecules has now surpassed 1500. Characterizing the specificity of each separately would be a major undertaking. Principal Findings Here, we have drawn on a large database of known peptide-HLA-I interactions to develop a bioinformatics method, which takes both peptide and HLA sequence information into account, and generates quantitative predictions of the affinity of any peptide-HLA-I interaction. Prospective experimental validation of peptides predicted to bind to previously untested HLA-I molecules, cross-validation, and retrospective prediction of known HIV immune epitopes and endogenous presented peptides, all successfully validate this method. We further demonstrate that the method can be applied to perform a clustering analysis of MHC specificities and suggest using this clustering to select particularly informative novel MHC molecules for future biochemical and functional analysis. Conclusions Encompassing all HLA molecules, this high-throughput computational method lends itself to epitope searches that are not only genome- and pathogen-wide, but also HLA-wide. Thus, it offers a truly global analysis of immune responses supporting rational development of vaccines and immunotherapy. It also promises to provide new basic insights into HLA structure-function relationships. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan. PMID:17726526

  5. Theoretical studies on beta and delta isoform-specific binding mechanisms of phosphoinositide 3-kinase inhibitors.

    PubMed

    Zhu, Jingyu; Pan, Peichen; Li, Youyong; Wang, Man; Li, Dan; Cao, Biyin; Mao, Xinliang; Hou, Tingjun

    2014-03-04

    Phosphoinositide 3-kinase (PI3K) is known to be closely related to tumorigenesis and cell proliferation, and controls a variety of cellular processes, including proliferation, growth, apoptosis, migration, metabolism, etc. The PI3K family comprises eight catalytic isoforms, which are subdivided into three classes. Recently, the discovery of inhibitors that block a single isoform of PI3K has continued to attract special attention because they may have higher selectivity for certain tumors and less toxicity for healthy cells. PI3Kβ and PI3Kδ have been studied less than PI3Kα and PI3Kγ; therefore, in this work, the combination of molecular dynamics simulations and free energy calculations was employed to explore the binding of three isoform-specific PI3K inhibitors (COM8, IC87114, and GDC-0941) to PI3Kβ or PI3Kδ. The isoform specificities of the studied inhibitors derived from the predicted binding free energies are in good agreement with the experimental data. In addition, the key residues critical for PI3Kβ or PI3Kδ selectivity were highlighted by decomposing the binding free energies into the contributions from individual residues. It was observed that although PI3Kβ and PI3Kδ share the conserved ATP-binding pockets, individual residues do behave differently, particularly the residues critical for PI3Kβ or PI3Kδ selectivity. It can be concluded that the inhibitor specificity between PI3Kβ and PI3Kδ is determined by the additive contributions from multiple residues, not just a single one. This study provides valuable information for understanding the isoform-specific binding mechanisms of PI3K inhibitors, and should be useful for the rational design of novel and selective PI3K inhibitors.

  6. Computational chemistry in 25 years

    NASA Astrophysics Data System (ADS)

    Abagyan, Ruben

    2012-01-01

    Here we are making some predictions based on three methods: a straightforward extrapolation of the existing trends; a self-fulfilling prophecy; and picking some current grievances and predicting that they will be addressed or solved. We predict the growth of multicore computing and dramatic growth of data, as well as the improvements in force fields and sampling methods. We also predict that effects of therapeutic and environmental molecules on the human body, as well as complex natural chemical signalling, will be understood in terms of three-dimensional models of their binding to specific pockets.

  7. Nucleolin forms a specific complex with a fragment of the viral (minus) strand of minute virus of mice DNA.

    PubMed Central

    Barrijal, S; Perros, M; Gu, Z; Avalosse, B L; Belenguer, P; Amalric, F; Rommelaere, J

    1992-01-01

    Nucleolin, a major nucleolar protein, forms a specific complex with the genome (a single-stranded DNA molecule of minus polarity) of parvovirus MVMp in vitro. By means of South-western blotting experiments, we mapped the binding site to a 222-nucleotide motif within the non-structural transcription unit, referred to as NUBE (nucleolin-binding element). The specificity of the interaction was confirmed by competitive gel retardation assays. DNaseI and nuclease S1 probing showed that NUBE folds into a secondary structure, in agreement with a computer-assisted conformational prediction. The whole NUBE may be necessary for the interaction with nucleolin, as suggested by the failure of NUBE subfragments to bind the protein and by the nuclease footprinting experiments. The present work extends the previously reported ability of nucleolin to form a specific complex with ribosomal RNA, to a defined DNA substrate. Considering the tropism of MVMp DNA replication for host cell nucleoli, these data raise the possibility that nucleolin may contribute to the regulation of the parvoviral life-cycle. Images PMID:1408821

  8. DNA-binding specificity prediction with FoldX.

    PubMed

    Nadra, Alejandro D; Serrano, Luis; Alibés, Andreu

    2011-01-01

    With the advent of Synthetic Biology, a field between basic science and applied engineering, new computational tools are needed to help scientists reach their design goals while optimizing resources. In this chapter, we present a simple and powerful method to either determine the DNA specificity of a wild-type protein or design new specificities by using the protein design algorithm FoldX. The only basic requirement is having a good-resolution structure of the complex. Protein-DNA interaction design may aid the development of new parts designed to be orthogonal, decoupled, and precise in their targets. Further, it could help to fine-tune the systems in terms of specificity, discrimination, and binding constants. In the age of newly developed devices and invented systems, computer-aided engineering promises to be an invaluable tool. Copyright © 2011 Elsevier Inc. All rights reserved.

  9. Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

    PubMed Central

    Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis

    2012-01-01

    Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606

  10. Relative Binding Free Energy Calculations in Drug Discovery: Recent Advances and Practical Considerations.

    PubMed

    Cournia, Zoe; Allen, Bryce; Sherman, Woody

    2017-12-26

    Accurate in silico prediction of protein-ligand binding affinities has been a primary objective of structure-based drug design for decades due to the putative value it would bring to the drug discovery process. However, computational methods have historically failed to deliver value in real-world drug discovery applications due to a variety of scientific, technical, and practical challenges. Recently, a family of approaches commonly referred to as relative binding free energy (RBFE) calculations, which rely on physics-based molecular simulations and statistical mechanics, have shown promise in reliably generating accurate predictions in the context of drug discovery projects. This advance arises from accumulating developments in the underlying scientific methods (decades of research on force fields and sampling algorithms) coupled with vast increases in computational resources (graphics processing units and cloud infrastructures). Mounting evidence from retrospective validation studies, blind challenge predictions, and prospective applications suggests that RBFE simulations can now predict the affinity differences for congeneric ligands with sufficient accuracy and throughput to deliver considerable value in hit-to-lead and lead optimization efforts. Here, we present an overview of current RBFE implementations, highlighting recent advances and remaining challenges, along with examples that emphasize practical considerations for obtaining reliable RBFE results. We focus specifically on relative binding free energies because the calculations are less computationally intensive than absolute binding free energy (ABFE) calculations and map directly onto the hit-to-lead and lead optimization processes, where the prediction of relative binding energies between a reference molecule and new ideas (virtual molecules) can be used to prioritize molecules for synthesis. 
    We describe the critical aspects of running RBFE calculations, from both theoretical and applied perspectives, using a combination of retrospective literature examples and prospective studies from drug discovery projects. This work is intended to provide a contemporary overview of the scientific, technical, and practical issues associated with running relative binding free energy simulations, with a focus on real-world drug discovery applications. We offer guidelines for improving the accuracy of RBFE simulations, especially for challenging cases, and emphasize unresolved issues that could be improved by further research in the field.

  11. Dynamics simulations for engineering macromolecular interactions

    NASA Astrophysics Data System (ADS)

    Robinson-Mosher, Avi; Shinar, Tamar; Silver, Pamela A.; Way, Jeffrey

    2013-06-01

    The predictable engineering of well-behaved transcriptional circuits is a central goal of synthetic biology. The artificial attachment of promoters to transcription factor genes usually results in noisy or chaotic behaviors, and such systems are unlikely to be useful in practical applications. Natural transcriptional regulation relies extensively on protein-protein interactions to insure tightly controlled behavior, but such tight control has been elusive in engineered systems. To help engineer protein-protein interactions, we have developed a molecular dynamics simulation framework that simplifies features of proteins moving by constrained Brownian motion, with the goal of performing long simulations. The behavior of a simulated protein system is determined by summation of forces that include a Brownian force, a drag force, excluded volume constraints, relative position constraints, and binding constraints that relate to experimentally determined on-rates and off-rates for chosen protein elements in a system. Proteins are abstracted as spheres. Binding surfaces are defined radially within a protein. Peptide linkers are abstracted as small protein-like spheres with rigid connections. To address whether our framework could generate useful predictions, we simulated the behavior of an engineered fusion protein consisting of two 20 000 Da proteins attached by flexible glycine/serine-type linkers. The two protein elements remained closely associated, as if constrained by a random walk in three dimensions of the peptide linker, as opposed to showing a distribution of distances expected if movement were dominated by Brownian motion of the protein domains only. We also simulated the behavior of fluorescent proteins tethered by a linker of varying length, compared the predicted Förster resonance energy transfer with previous experimental observations, and obtained a good correspondence. 
    Finally, we simulated the binding behavior of a fusion of two ligands that could simultaneously bind to distinct cell-surface receptors, and explored the landscape of linker lengths and stiffnesses that could enhance receptor binding of one ligand when the other ligand has already bound to its receptor, thus, addressing potential mechanisms for improving targeted signal transduction proteins. These specific results have implications for the design of targeted fusion proteins and artificial transcription factors involving fusion of natural domains. More broadly, the simulation framework described here could be extended to include more detailed system features such as non-spherical protein shapes and electrostatics, without requiring detailed, computationally expensive specifications. This framework should be useful in predicting behavior of engineered protein systems including binding and dissociation reactions.

  12. Dynamics simulations for engineering macromolecular interactions.

    PubMed

    Robinson-Mosher, Avi; Shinar, Tamar; Silver, Pamela A; Way, Jeffrey

    2013-06-01

    The predictable engineering of well-behaved transcriptional circuits is a central goal of synthetic biology. The artificial attachment of promoters to transcription factor genes usually results in noisy or chaotic behaviors, and such systems are unlikely to be useful in practical applications. Natural transcriptional regulation relies extensively on protein-protein interactions to insure tightly controlled behavior, but such tight control has been elusive in engineered systems. To help engineer protein-protein interactions, we have developed a molecular dynamics simulation framework that simplifies features of proteins moving by constrained Brownian motion, with the goal of performing long simulations. The behavior of a simulated protein system is determined by summation of forces that include a Brownian force, a drag force, excluded volume constraints, relative position constraints, and binding constraints that relate to experimentally determined on-rates and off-rates for chosen protein elements in a system. Proteins are abstracted as spheres. Binding surfaces are defined radially within a protein. Peptide linkers are abstracted as small protein-like spheres with rigid connections. To address whether our framework could generate useful predictions, we simulated the behavior of an engineered fusion protein consisting of two 20,000 Da proteins attached by flexible glycine/serine-type linkers. The two protein elements remained closely associated, as if constrained by a random walk in three dimensions of the peptide linker, as opposed to showing a distribution of distances expected if movement were dominated by Brownian motion of the protein domains only. We also simulated the behavior of fluorescent proteins tethered by a linker of varying length, compared the predicted Förster resonance energy transfer with previous experimental observations, and obtained a good correspondence. 
    Finally, we simulated the binding behavior of a fusion of two ligands that could simultaneously bind to distinct cell-surface receptors, and explored the landscape of linker lengths and stiffnesses that could enhance receptor binding of one ligand when the other ligand has already bound to its receptor, thus, addressing potential mechanisms for improving targeted signal transduction proteins. These specific results have implications for the design of targeted fusion proteins and artificial transcription factors involving fusion of natural domains. More broadly, the simulation framework described here could be extended to include more detailed system features such as non-spherical protein shapes and electrostatics, without requiring detailed, computationally expensive specifications. This framework should be useful in predicting behavior of engineered protein systems including binding and dissociation reactions.

  13. Computational analysis of protein-protein interfaces involving an alpha helix: insights for terphenyl-like molecules binding.

    PubMed

    Isvoran, Adriana; Craciun, Dana; Martiny, Virginie; Sperandio, Olivier; Miteva, Maria A

    2013-06-14

    Protein-Protein Interactions (PPIs) are key for many cellular processes. The characterization of PPI interfaces and the prediction of putative ligand binding sites and hot spot residues are essential to design efficient small-molecule modulators of PPI. Terphenyl and its derivatives are small organic molecules known to mimic one face of protein-binding alpha-helical peptides. In this work we focus on several PPIs mediated by alpha-helical peptides. We performed computational sequence- and structure-based analyses in order to evaluate several key physicochemical and surface properties of proteins known to interact with alpha-helical peptides and/or terphenyl and its derivatives. Sequence-based analysis revealed low sequence identity between some of the analyzed proteins binding alpha-helical peptides. Structure-based analysis was performed to calculate the volume, the fractal dimension roughness and the hydrophobicity of the binding regions. Besides the overall hydrophobic character of the binding pockets, some specificities were detected. We showed that the hydrophobicity is not uniformly distributed in different alpha-helix binding pockets, which can help to identify key hydrophobic hot spots. The presence of hydrophobic cavities at the protein surface with a more complex shape than the entire protein surface seems to be an important property related to the ability of proteins to bind alpha-helical peptides and low molecular weight mimetics. Characterization of similarities and specificities of PPI binding sites can be helpful for further development of small molecules targeting alpha-helix binding proteins.

  14. Approaching Pharmacological Space: Events and Components.

    PubMed

    Vistoli, Giulio; Pedretti, Alessandro; Mazzolari, Angelica; Testa, Bernard

    2018-01-01

    With a view to introducing the concept of pharmacological space and its potential applications in investigating and predicting the toxic mechanisms of xenobiotics, this opening chapter describes the logical relations between conformational behavior, physicochemical properties and binding spaces, which are seen as the three key elements composing the pharmacological space. While the concept of conformational space is routinely used to encode molecular flexibility, the concepts of property spaces and, particularly, of binding spaces are more innovative. Indeed, their descriptors can find fruitful applications (a) in describing the dynamic adaptability a given ligand experiences when inserted into a specific environment, and (b) in parameterizing the flexibility a ligand retains when bound to a biological target. Overall, these descriptors can conveniently account for the often disregarded entropic factors and as such they prove successful when inserted in ligand- or structure-based predictive models. Notably, and although binding space parameters can clearly be derived from MD simulations, the chapter will illustrate how docking calculations, despite their static nature, are able to evaluate ligand's flexibility by analyzing several poses for each ligand. Such an approach, which represents the founding core of the binding space concept, can find various applications in which the related descriptors show an impressive enhancing effect on the statistical performances of the resulting predictive models.

  15. Use of thermodynamic coupling between antibody-antigen binding and phospholipid acyl chain phase transition energetics to predict immunoliposome targeting affinity.

    PubMed

    Klegerman, Melvin E; Zou, Yuejiao; Golunski, Eva; Peng, Tao; Huang, Shao-Ling; McPherson, David D

    2014-09-01

    Thermodynamic analysis of ligand-target binding has been a useful tool for dissecting the nature of the binding mechanism and, therefore, potentially can provide valuable information regarding the utility of targeted formulations. Based on a consistent coupling of antibody-antigen binding and gel-liquid crystal transition energetics observed for antibody-phosphatidylethanolamine (Ab-PE) conjugates, we hypothesized that the thermodynamic parameters and the affinity for antigen of the Ab-PE conjugates could be effectively predicted once the corresponding information for the unconjugated antibody is determined. This hypothesis has now been tested in nine different antibody-targeted echogenic liposome (ELIP) preparations, where antibody is conjugated to dipalmitoylphosphatidylethanolamine (DPPE) head groups through a thioether linkage. Predictions were satisfactory (affinity not significantly different from the population of values found) in five cases (55.6%), but the affinity of the unconjugated antibody was not significantly different from the population of values found in six cases (66.7%), indicating that the affinities of the conjugated antibody tended not to deviate appreciably from those of the free antibody. While knowledge of the affinities of free antibodies may be sufficient to judge their suitability as targeting agents, thermodynamic analysis may still provide valuable information regarding their usefulness for specific applications.

  16. Detecting Local Ligand-Binding Site Similarity in Non-Homologous Proteins by Surface Patch Comparison

    PubMed Central

    Sael, Lee; Kihara, Daisuke

    2012-01-01

    Functional elucidation of proteins is one of the essential tasks in biology. Function of a protein, specifically, small ligand molecules that bind to a protein, can be predicted by finding similar local surface regions in binding sites of known proteins. Here, we developed an alignment free local surface comparison method for predicting a ligand molecule which binds to a query protein. The algorithm, named Patch-Surfer, represents a binding pocket as a combination of segmented surface patches, each of which is characterized by its geometrical shape, the electrostatic potential, the hydrophobicity, and the concaveness. Representing a pocket by a set of patches is effective to absorb difference of global pocket shape while capturing local similarity of pockets. The shape and the physicochemical properties of surface patches are represented using the 3D Zernike descriptor, which is a series expansion of mathematical 3D function. Two pockets are compared using a modified weighted bipartite matching algorithm, which matches similar patches from the two pockets. Patch-Surfer was benchmarked on three datasets, which consist in total of 390 proteins that bind to one of 21 ligands. Patch-Surfer showed superior performance to existing methods including a global pocket comparison method, Pocket-Surfer, which we have previously introduced. Particularly, as intended, the accuracy showed large improvement for flexible ligand molecules, which bind to pockets in different conformations. PMID:22275074

  17. Detecting local ligand-binding site similarity in nonhomologous proteins by surface patch comparison.

    PubMed

    Sael, Lee; Kihara, Daisuke

    2012-04-01

    Functional elucidation of proteins is one of the essential tasks in biology. Function of a protein, specifically, small ligand molecules that bind to a protein, can be predicted by finding similar local surface regions in binding sites of known proteins. Here, we developed an alignment free local surface comparison method for predicting a ligand molecule which binds to a query protein. The algorithm, named Patch-Surfer, represents a binding pocket as a combination of segmented surface patches, each of which is characterized by its geometrical shape, the electrostatic potential, the hydrophobicity, and the concaveness. Representing a pocket by a set of patches is effective to absorb difference of global pocket shape while capturing local similarity of pockets. The shape and the physicochemical properties of surface patches are represented using the 3D Zernike descriptor, which is a series expansion of mathematical 3D function. Two pockets are compared using a modified weighted bipartite matching algorithm, which matches similar patches from the two pockets. Patch-Surfer was benchmarked on three datasets, which consist in total of 390 proteins that bind to one of 21 ligands. Patch-Surfer showed superior performance to existing methods including a global pocket comparison method, Pocket-Surfer, which we have previously introduced. Particularly, as intended, the accuracy showed large improvement for flexible ligand molecules, which bind to pockets in different conformations. Copyright © 2011 Wiley Periodicals, Inc.

  18. Structural characterization of acyl-CoA oxidases reveals a direct link between pheromone biosynthesis and metabolic state in Caenorhabditis elegans

    PubMed Central

    Zhang, Xinxing; Jones, Rachel A.; Bruner, Steven D.; Butcher, Rebecca A.

    2016-01-01

    Caenorhabditis elegans secretes ascarosides as pheromones to communicate with other worms and to coordinate the development and behavior of the population. Peroxisomal β-oxidation cycles shorten the side chains of ascaroside precursors to produce the short-chain ascaroside pheromones. Acyl-CoA oxidases, which catalyze the first step in these β-oxidation cycles, have different side chain-length specificities and enable C. elegans to regulate the production of specific ascaroside pheromones. Here, we determine the crystal structure of the acyl-CoA oxidase 1 (ACOX-1) homodimer and the ACOX-2 homodimer bound to its substrate. Our results provide a molecular basis for the substrate specificities of the acyl-CoA oxidases and reveal why some of these enzymes have a very broad substrate range, whereas others are quite specific. Our results also enable predictions to be made for the roles of uncharacterized acyl-CoA oxidases in C. elegans and in other nematode species. Remarkably, we show that most of the C. elegans acyl-CoA oxidases that participate in ascaroside biosynthesis contain a conserved ATP-binding pocket that lies at the dimer interface, and we identify key residues in this binding pocket. ATP binding induces a structural change that is associated with tighter binding of the FAD cofactor. Mutations that disrupt ATP binding reduce FAD binding and reduce enzyme activity. Thus, ATP may serve as a regulator of acyl-CoA oxidase activity, thereby directly linking ascaroside biosynthesis to ATP concentration and metabolic state. PMID:27551084

  19. Informative priors based on transcription factor structural class improve de novo motif discovery.

    PubMed

    Narlikar, Leelavati; Gordân, Raluca; Ohler, Uwe; Hartemink, Alexander J

    2006-07-15

    An important problem in molecular biology is to identify the locations at which a transcription factor (TF) binds to DNA, given a set of DNA sequences believed to be bound by that TF. In previous work, we showed that information in the DNA sequence of a binding site is sufficient to predict the structural class of the TF that binds it. In particular, this suggests that we can predict which locations in any DNA sequence are more likely to be bound by certain classes of TFs than others. Here, we argue that traditional methods for de novo motif finding can be significantly improved by adopting an informative prior probability that a TF binding site occurs at each sequence location. To demonstrate the utility of such an approach, we present priority, a powerful new de novo motif finding algorithm. Using data from TRANSFAC, we train three classifiers to recognize binding sites of basic leucine zipper, forkhead, and basic helix loop helix TFs. These classifiers are used to equip priority with three class-specific priors, in addition to a default prior to handle TFs of other classes. We apply priority and a number of popular motif finding programs to sets of yeast intergenic regions that are reported by ChIP-chip to be bound by particular TFs. priority identifies motifs the other methods fail to identify, and correctly predicts the structural class of the TF recognizing the identified binding sites. Supplementary material and code can be found at http://www.cs.duke.edu/~amink/.

  20. An automated decision-tree approach to predicting protein interaction hot spots.

    PubMed

    Darnell, Steven J; Page, David; Mitchell, Julie C

    2007-09-01

    Protein-protein interactions can be altered by mutating one or more "hot spots," the subset of residues that account for most of the interface's binding free energy. The identification of hot spots requires a significant experimental effort, highlighting the practical value of hot spot predictions. We present two knowledge-based models that improve the ability to predict hot spots: K-FADE uses shape specificity features calculated by the Fast Atomic Density Evaluation (FADE) program, and K-CON uses biochemical contact features. The combined K-FADE/CON (KFC) model displays better overall predictive accuracy than computational alanine scanning (Robetta-Ala). In addition, because these methods predict different subsets of known hot spots, a large and significant increase in accuracy is achieved by combining KFC and Robetta-Ala. The KFC analysis is applied to the calmodulin (CaM)/smooth muscle myosin light chain kinase (smMLCK) interface, and to the bone morphogenetic protein-2 (BMP-2)/BMP receptor-type I (BMPR-IA) interface. The results indicate a strong correlation between KFC hot spot predictions and mutations that significantly reduce the binding affinity of the interface. 2007 Wiley-Liss, Inc.

  1. Estrogen receptor expert system overview and examples

    EPA Science Inventory

    The estrogen receptor expert system (ERES) is a rule-based system developed to prioritize chemicals based upon their potential for binding to the ER. The ERES was initially developed to predict ER affinity of chemicals from two specific EPA chemical inventories, antimicrobial pe...

  2. Modulation of FadR Binding Capacity for Acyl-CoA Fatty Acids Through Structure-Guided Mutagenesis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bacik, John-Paul; Yeager, Chris M.; Twary, Scott N.

    FadR is a versatile global regulator in Escherichia coli that controls fatty acid metabolism and thereby modulates the ability of this bacterium to grow using fatty acids or acetate as the sole carbon source. FadR regulates fatty acid metabolism in response to intra-cellular concentrations of acyl-CoA lipids. The ability of FadR to bind acyl-CoA fatty acids is hence of significant interest for the engineering of biosynthetic pathways for the production of lipid-based biofuels and commodity chemicals. Based on the available crystal structure of E. coli FadR bound to myristoyl-CoA, we predicted amino acid positions within the effector binding pocket that would alter the ability of FadR to bind acyl-CoA fatty acids without affecting DNA binding. We utilized fluorescence polarization to characterize the in-vitro binding properties of wild type and mutant FadR. We found that a Leu102Ala mutant enhanced binding of the effector, likely by increasing the size of the binding pocket for the acyl moiety of the molecule. Conversely, the elimination of the guanidine side chain (Arg213Ala and Arg213Met mutants) of the CoA moiety binding site severely diminished the ability of FadR to bind the acyl-CoA effector. These results demonstrate the ability to fine tune FadR binding capacity. The validation of an efficient method to fully characterize all the binding events involved in the specific activity (effector and DNA operator binding) of FadR has allowed us to increase our understanding of the role of specific amino acids in the binding and recognition of acyl-CoA fatty acids and will greatly facilitate efforts aimed at engineering tunable FadR regulators for synthetic biology.

  3. Modulation of FadR Binding Capacity for Acyl-CoA Fatty Acids Through Structure-Guided Mutagenesis

    DOE PAGES

    Bacik, John-Paul; Yeager, Chris M.; Twary, Scott N.; ...

    2015-09-18

    FadR is a versatile global regulator in Escherichia coli that controls fatty acid metabolism and thereby modulates the ability of this bacterium to grow using fatty acids or acetate as the sole carbon source. FadR regulates fatty acid metabolism in response to intra-cellular concentrations of acyl-CoA lipids. The ability of FadR to bind acyl-CoA fatty acids is hence of significant interest for the engineering of biosynthetic pathways for the production of lipid-based biofuels and commodity chemicals. Based on the available crystal structure of E. coli FadR bound to myristoyl-CoA, we predicted amino acid positions within the effector binding pocket that would alter the ability of FadR to bind acyl-CoA fatty acids without affecting DNA binding. We utilized fluorescence polarization to characterize the in-vitro binding properties of wild type and mutant FadR. We found that a Leu102Ala mutant enhanced binding of the effector, likely by increasing the size of the binding pocket for the acyl moiety of the molecule. Conversely, the elimination of the guanidine side chain (Arg213Ala and Arg213Met mutants) of the CoA moiety binding site severely diminished the ability of FadR to bind the acyl-CoA effector. These results demonstrate the ability to fine tune FadR binding capacity. The validation of an efficient method to fully characterize all the binding events involved in the specific activity (effector and DNA operator binding) of FadR has allowed us to increase our understanding of the role of specific amino acids in the binding and recognition of acyl-CoA fatty acids and will greatly facilitate efforts aimed at engineering tunable FadR regulators for synthetic biology.

  4. Fragmentation cross sections and binding energies of neutron-rich nuclei

    NASA Astrophysics Data System (ADS)

    Tsang, M. B.; Lynch, W. G.; Friedman, W. A.; Mocko, M.; Sun, Z. Y.; Aoi, N.; Cook, J. M.; Delaunay, F.; Famiano, M. A.; Hui, H.; Imai, N.; Iwasaki, H.; Motobayashi, T.; Niikura, M.; Onishi, T.; Rogers, A. M.; Sakurai, H.; Suzuki, H.; Takeshita, E.; Takeuchi, S.; Wallace, M. S.

    2007-10-01

    An exponential dependence of the fragmentation cross section on the average binding energy is observed and reproduced with a statistical model. The observed functional dependence is robust and allows the extraction of binding energies from measured cross sections. From the systematics of Cu isotope cross sections, the binding energies of ⁷⁶Cu, ⁷⁷Cu, ⁷⁸Cu, and ⁷⁹Cu have been extracted. They are 636.94±0.4, 647.1±0.4, 651.6±0.4, and 657.8±0.5 MeV, respectively. Specifically, the uncertainty of the binding energy of ⁷⁵Cu is reduced from 980 keV, as listed in the 2003 mass table of Audi, Wapstra, and Thibault, to 400 keV. The predicted cross sections of two near drip-line nuclei, ³⁹Na and ⁴⁰Mg, from the fragmentation of ⁴⁸Ca are discussed.

  5. Prediction of site-specific interactions in antibody-antigen complexes: the proABC method and server.

    PubMed

    Olimpieri, Pier Paolo; Chailyan, Anna; Tramontano, Anna; Marcatili, Paolo

    2013-09-15

    Antibodies or immunoglobulins are proteins of paramount importance in the immune system. They are extremely relevant as diagnostic, biotechnological and therapeutic tools. Their modular structure makes it easy to re-engineer them for specific purposes. Short of undergoing a trial and error process, these experiments, as well as others, need to rely on an understanding of the specific determinants of the antibody binding mode. In this article, we present a method to identify, on the basis of the antibody sequence alone, which residues of an antibody directly interact with its cognate antigen. The method, based on the random forest automatic learning technique, reaches a recall and specificity as high as 80% and is implemented as a free and easy-to-use server, named proABC (prediction of Antibody Contacts). We believe that it can be of great help in re-design experiments as well as a guide for molecular docking experiments. The results that we obtained also allowed us to dissect which features of the antibody sequence contribute most to the involvement of specific residues in binding to the antigen. Availability: http://www.biocomputing.it/proABC. Contact: anna.tramontano@uniroma1.it or paolo.marcatili@gmail.com. Supplementary data are available at Bioinformatics online.

  6. Transcripts with in silico predicted RNA structure are enriched everywhere in the mouse brain

    PubMed Central

    2012-01-01

    Background Post-transcriptional control of gene expression is mostly conducted by specific elements in untranslated regions (UTRs) of mRNAs, in collaboration with specific binding proteins and RNAs. In several well characterized cases, these RNA elements are known to form stable secondary structures. RNA secondary structures also may have major functional implications for long noncoding RNAs (lncRNAs). Recent transcriptional data has indicated the importance of lncRNAs in brain development and function. However, no methodical efforts to investigate this have been undertaken. Here, we aim to systematically analyze the potential for RNA structure in brain-expressed transcripts. Results By comprehensive spatial expression analysis of the adult mouse in situ hybridization data of the Allen Mouse Brain Atlas, we show that transcripts (coding as well as non-coding) associated with in silico predicted structured probes are highly and significantly enriched in almost all analyzed brain regions. Functional implications of these RNA structures and their role in the brain are discussed in detail along with specific examples. We observe that mRNAs with a structure prediction in their UTRs are enriched for binding, transport and localization gene ontology categories. In addition, after manual examination we observe agreement between RNA binding protein interaction sites near the 3’ UTR structures and correlated expression patterns. Conclusions Our results show a potential use for RNA structures in expressed coding as well as noncoding transcripts in the adult mouse brain, and describe the role of structured RNAs in the context of intracellular signaling pathways and regulatory networks. Based on this data we hypothesize that RNA structure is widely involved in transcriptional and translational regulatory mechanisms in the brain and ultimately plays a role in brain function. PMID:22651826

  7. The epitope of monoclonal antibodies blocking erythrocyte invasion by Plasmodium falciparum map to the dimerization and receptor glycan binding sites of EBA-175.

    PubMed

    Ambroggio, Xavier; Jiang, Lubin; Aebig, Joan; Obiakor, Harold; Lukszo, Jan; Narum, David L

    2013-01-01

    The malaria parasite, Plasmodium falciparum, and related parasites use a variety of proteins with Duffy-Binding Like (DBL) domains to bind glycoproteins on the surface of host cells. Among these proteins, the 175 kDa erythrocyte binding antigen, EBA-175, specifically binds to glycophorin A on the surface of human erythrocytes during the process of merozoite invasion. The domain responsible for glycophorin A binding was identified as region II (RII) which contains two DBL domains, F1 and F2. The crystal structure of this region revealed a dimer that is presumed to represent the glycophorin A binding conformation as sialic acid binding sites and large cavities are observed at the dimer interface. The dimer interface is largely composed of two loops from within each monomer, identified as the F1 and F2 β-fingers that contact depressions in the opposing monomers in a similar manner. Previous studies have identified a panel of five monoclonal antibodies (mAbs) termed R215 to R218 and R256 that bind to RII and inhibit invasion of erythrocytes to varying extents. In this study, we predict the F2 β-finger region as the conformational epitope for mAbs, R215, R217, and R256, and confirm binding for the most effective blocking mAb R217 and R215 to a synthetic peptide mimic of the F2 β-finger. Localization of the epitope to the dimerization and glycan binding sites of EBA-175 RII and site-directed mutagenesis within the predicted epitope are consistent with R215 and R217 blocking erythrocyte invasion by Plasmodium falciparum by preventing formation of the EBA-175- glycophorin A complex.

  8. Tuning the specificity of a Two-in-One Fab against three angiogenic antigens by fully utilizing the information of deep mutational scanning.

    PubMed

    Koenig, Patrick; Sanowar, Sarah; Lee, Chingwei V; Fuh, Germaine

    Monoclonal antibodies developed for therapeutic or diagnostic purposes need to demonstrate highly defined binding specificity profiles. Engineering of an antibody to enhance or reduce binding to related antigens is often needed to achieve the desired biologic activity without safety concern. Here, we describe a deep sequencing-aided engineering strategy to fine-tune the specificity of an angiopoietin-2 (Ang2)/vascular endothelial growth factor (VEGF) dual action Fab, 5A12.1, for the treatment of age-related macular degeneration. This antibody utilizes overlapping complementarity-determining region (CDR) sites for dual Ang2/VEGF interaction with KD in the sub-nanomolar range. However, it also exhibits significant (KD of 4 nM) binding to angiopoietin-1, which has high sequence identity with Ang2. We generated a large phage-displayed library of 5A12.1 Fab variants with all possible single mutations in the 6 CDRs. By tracking the change of prevalence of each mutation during various selection conditions, we identified 35 mutations predicted to decrease the affinity for Ang1 while maintaining the affinity for Ang2 and VEGF. We confirmed the specificity profiles for 25 of these single mutations as Fab protein. Structural analysis showed that some of the Fab mutations cluster near a potential Ang1/2 epitope residue that differs in the 2 proteins, while others are up to 15 Å away from the antigen-binding site and likely influence the binding interaction remotely. The approach presented here provides a robust and efficient method for specificity engineering that does not require prior knowledge of the antigen antibody interaction and can be broadly applied to antibody specificity engineering projects.

  9. Synthetic oligonucleotide antigens modified with locked nucleic acids detect disease specific antibodies

    NASA Astrophysics Data System (ADS)

    Samuelsen, Simone V.; Solov'Yov, Ilia A.; Balboni, Imelda M.; Mellins, Elizabeth; Nielsen, Christoffer Tandrup; Heegaard, Niels H. H.; Astakhova, Kira

    2016-10-01

    New techniques to detect and quantify antibodies to nucleic acids would provide a significant advance over current methods, which often lack specificity. We investigate the potential of novel antigens containing locked nucleic acids (LNAs) as targets for antibodies. Particularly, employing molecular dynamics we predict optimal nucleotide composition for targeting DNA-binding antibodies. As a proof of concept, we address a problem of detecting anti-DNA antibodies that are characteristic of systemic lupus erythematosus, a chronic autoimmune disease with multiple manifestations. We test the best oligonucleotide binders in surface plasmon resonance studies to analyze binding and kinetic aspects of interactions between antigens and target DNA. These DNA and LNA/DNA sequences showed improved binding in enzyme-linked immunosorbent assay using human samples of pediatric lupus patients. Our results suggest that the novel method is a promising tool to create antigens for research and point-of-care monitoring of anti-DNA antibodies.

  10. Coarse-graining, Electrostatics and pH effects in phospholipid systems

    NASA Astrophysics Data System (ADS)

    Travesset, Alex; Vangaveti, Sweta

    2010-03-01

    We introduce a minimal free energy describing the interaction of charged groups and counterions including both classical electrostatic and specific interactions. The predictions of the model are compared against the standard model for describing ions next to charged interfaces, consisting of Poisson-Boltzmann theory with additional constants describing ion binding, which are specific to the counterion and the interfacial charge ("chemical binding"). It is shown that the "chemical" model can be appropriately described by an underlying "physical" model over several decades in concentration, but the extracted binding constants are not uniquely defined, as they differ depending on the particular observable quantity being studied. It is also shown that electrostatic correlations for divalent (or higher valence) ions enhance the surface charge by increasing deprotonation, an effect not properly accounted for within chemical models. The model is applied to the charged phospholipids phosphatidylserine, phosphatidic acid and phosphoinositides, and implications for different biological processes are discussed.

  11. ETMB-RBF: discrimination of metal-binding sites in electron transporters based on RBF networks with PSSM profiles and significant amino acid pairs.

    PubMed

    Ou, Yu-Yen; Chen, Shu-An; Wu, Sheng-Cheng

    2013-01-01

    Cellular respiration is the process by which cells obtain energy from glucose and is a very important biological process in living cell. As cells do cellular respiration, they need a pathway to store and transport electrons, the electron transport chain. The function of the electron transport chain is to produce a trans-membrane proton electrochemical gradient as a result of oxidation-reduction reactions. In these oxidation-reduction reactions in electron transport chains, metal ions play very important role as electron donor and acceptor. For example, Fe ions are in complex I and complex II, and Cu ions are in complex IV. Therefore, to identify metal-binding sites in electron transporters is an important issue in helping biologists better understand the workings of the electron transport chain. We propose a method based on Position Specific Scoring Matrix (PSSM) profiles and significant amino acid pairs to identify metal-binding residues in electron transport proteins. We have selected a non-redundant set of 55 metal-binding electron transport proteins as our dataset. The proposed method can predict metal-binding sites in electron transport proteins with an average 10-fold cross-validation accuracy of 93.2% and 93.1% for metal-binding cysteine and histidine, respectively. Compared with the general metal-binding predictor from A. Passerini et al., the proposed method can improve over 9% of sensitivity, and 14% specificity on the independent dataset in identifying metal-binding cysteines. The proposed method can also improve almost 76% sensitivity with same specificity in metal-binding histidine, and MCC is also improved from 0.28 to 0.88. We have developed a novel approach based on PSSM profiles and significant amino acid pairs for identifying metal-binding sites from electron transport proteins. The proposed approach achieved a significant improvement with independent test set of metal-binding electron transport proteins.
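    The abstract above reports sensitivity, specificity, and the Matthews correlation coefficient (MCC). As a reminder of how these follow from a binary confusion matrix, here is a minimal sketch. The function name `binary_metrics` and the counts in the usage line are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts for illustration only.
sens, spec, mcc = binary_metrics(tp=45, fp=10, tn=40, fn=5)
```

    Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside sensitivity and specificity when binding and non-binding residues are unevenly represented.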

  12. ETMB-RBF: Discrimination of Metal-Binding Sites in Electron Transporters Based on RBF Networks with PSSM Profiles and Significant Amino Acid Pairs

    PubMed Central

    Ou, Yu-Yen; Chen, Shu-An; Wu, Sheng-Cheng

    2013-01-01

    Background Cellular respiration is the process by which cells obtain energy from glucose and is a very important biological process in living cell. As cells do cellular respiration, they need a pathway to store and transport electrons, the electron transport chain. The function of the electron transport chain is to produce a trans-membrane proton electrochemical gradient as a result of oxidation–reduction reactions. In these oxidation–reduction reactions in electron transport chains, metal ions play very important role as electron donor and acceptor. For example, Fe ions are in complex I and complex II, and Cu ions are in complex IV. Therefore, to identify metal-binding sites in electron transporters is an important issue in helping biologists better understand the workings of the electron transport chain. Methods We propose a method based on Position Specific Scoring Matrix (PSSM) profiles and significant amino acid pairs to identify metal-binding residues in electron transport proteins. Results We have selected a non-redundant set of 55 metal-binding electron transport proteins as our dataset. The proposed method can predict metal-binding sites in electron transport proteins with an average 10-fold cross-validation accuracy of 93.2% and 93.1% for metal-binding cysteine and histidine, respectively. Compared with the general metal-binding predictor from A. Passerini et al., the proposed method can improve over 9% of sensitivity, and 14% specificity on the independent dataset in identifying metal-binding cysteines. The proposed method can also improve almost 76% sensitivity with same specificity in metal-binding histidine, and MCC is also improved from 0.28 to 0.88. Conclusions We have developed a novel approach based on PSSM profiles and significant amino acid pairs for identifying metal-binding sites from electron transport proteins. The proposed approach achieved a significant improvement with independent test set of metal-binding electron transport proteins. 
    PMID:23405059

  13. modPDZpep: a web resource for structure based analysis of human PDZ-mediated interaction networks.

    PubMed

    Sain, Neetu; Mohanty, Debasisa

    2016-09-21

    PDZ domains recognize short sequence stretches usually present at the C-terminus of their interaction partners. Because of the involvement of PDZ domains in many important biological processes, several attempts have been made to develop bioinformatics tools for genome-wide identification of PDZ interaction networks. Currently available tools for prediction of interaction partners of PDZ domains utilize a machine learning approach. Since they have been trained using experimental substrate specificity data for specific PDZ families, their applicability is limited to PDZ families closely related to the training set. These tools also do not allow analysis of PDZ-peptide interaction interfaces. We have used a structure based approach to develop modPDZpep, a program to predict the interaction partners of human PDZ domains and analyze structural details of PDZ interaction interfaces. modPDZpep predicts interaction partners by using structural models of PDZ-peptide complexes and evaluating binding energy scores using residue based statistical pair potentials. Since it does not require training using experimental data on peptide binding affinity, it can predict substrates for diverse PDZ families. Because of the use of a simple scoring function for binding energy, it is also fast enough for genome scale structure based analysis of PDZ interaction networks. Benchmarking using artificial as well as real negative datasets indicates good predictive power with ROC-AUC values in the range of 0.7 to 0.9 for a large number of human PDZ domains. Another novel feature of modPDZpep is its ability to map novel PDZ mediated interactions in human protein-protein interaction networks, either by utilizing available experimental phage display data or by structure based predictions. In summary, we have developed modPDZpep, a web-server for structure based analysis of human PDZ domains. It is freely available at http://www.nii.ac.in/modPDZpep.html or http://202.54.226.235/modPDZpep.html. This article was reviewed by Michael Gromiha and Zoltán Gáspári.

  14. Dissecting the protein architecture of DNA-binding transcription factors in bacteria and archaea.

    PubMed

    Rivera-Gómez, Nancy; Martínez-Núñez, Mario Alberto; Pastor, Nina; Rodriguez-Vazquez, Katya; Perez-Rueda, Ernesto

    2017-08-01

    Gene regulation at the transcriptional level is a central process in all organisms where DNA-binding transcription factors play a fundamental role. This class of proteins binds specifically at DNA sequences, activating or repressing gene expression as a function of the cell's metabolic status, operator context and ligand-binding status, among other factors, through the DNA-binding domain (DBD). In addition, TFs may contain partner domains (PaDos), which are involved in ligand binding and protein-protein interactions. In this work, we systematically evaluated the distribution, abundance and domain organization of DNA-binding TFs in 799 non-redundant bacterial and archaeal genomes. We found that the distributions of the DBDs and their corresponding PaDos correlated with the size of the genome. We also identified specific combinations between the DBDs and their corresponding PaDos. Within each class of DBDs there are differences in the actual angle formed at the dimerization interface, responding to the presence/absence of ligands and/or crystallization conditions, setting the orientation of the resulting helices and wings facing the DNA. Our results highlight the importance of PaDos as central elements that enhance the diversity of regulatory functions in all bacterial and archaeal organisms, and our results also demonstrate the role of PaDos in sensing diverse signal compounds. The highly specific interactions between DBDs and PaDos observed in this work, together with our structural analysis highlighting the difficulty in predicting both inter-domain geometry and quaternary structure, suggest that these systems appeared once and evolved with diverse duplication events in all the analysed organisms.

  15. Comparison of S. cerevisiae F-BAR domain structures reveals a conserved inositol phosphate binding site

    PubMed Central

    Moravcevic, Katarina; Alvarado, Diego; Schmitz, Karl R.; Kenniston, Jon A.; Mendrola, Jeannine M.; Ferguson, Kathryn M.; Lemmon, Mark A.

    2015-01-01

    SUMMARY F-BAR domains control membrane interactions in endocytosis, cytokinesis, and cell signaling. Although generally thought to bind curved membranes containing negatively charged phospholipids, numerous functional studies argue that differences in lipid-binding selectivities of F-BAR domains are functionally important. Here, we compare membrane-binding properties of the S. cerevisiae F-BAR domains in vitro and in vivo. Whereas some F-BAR domains (such as Bzz1p and Hof1p F-BARs) bind equally well to all phospholipids, the F-BAR domain from the RhoGAP Rgd1p preferentially binds phosphoinositides. We determined X-ray crystal structures of F-BAR domains from Hof1p and Rgd1p, the latter bound to an inositol phosphate. The structures explain phospholipid-binding selectivity differences, and reveal an F-BAR phosphoinositide binding site that is fully conserved in a mammalian RhoGAP called Gmip, and is partly retained in certain other F-BAR domains. Our findings reveal previously unappreciated determinants of F-BAR domain lipid-binding specificity, and provide a basis for its prediction from sequence. PMID:25620000

  16. A search for specificity in DNA-drug interactions.

    PubMed

    Cruciani, G; Goodford, P J

    1994-06-01

    The GRID force field and a principal component analysis have been used in order to predict the interactions of small chemical groups with all 64 different triplet sequences of B-DNA. Factors that favor binding to guanine-cytosine base pairs have been identified, and a dictionary of ligand groups and their locations is presented as a guide to the design of specific DNA ligands.

  17. The DNA-recognition mode shared by archaeal feast/famine-regulatory proteins revealed by the DNA-binding specificities of TvFL3, FL10, FL11 and Ss-LrpB

    PubMed Central

    Yokoyama, Katsushi; Nogami, Hideki; Kabasawa, Mamiko; Ebihara, Sonomi; Shimowasa, Ai; Hashimoto, Keiko; Kawashima, Tsuyoshi; Ishijima, Sanae A.; Suzuki, Masashi

    2009-01-01

    The DNA-binding mode of archaeal feast/famine-regulatory proteins (FFRPs), i.e. paralogs of the Escherichia coli leucine-responsive regulatory protein (Lrp), was studied. Using the method of systematic evolution of ligands by exponential enrichment (SELEX), optimal DNA duplexes for interacting with TvFL3, FL10, FL11 and Ss-LrpB were identified as TACGA[AAT/ATT]TCGTA, GTTCGA[AAT/ATT]TCGAAC, CCGAAA[AAT/ATT]TTTCGG and TTGCAA[AAT/ATT]TTGCAA, respectively, all fitting into the form abcdeWWWedcba. Here W is A or T, and e.g. a and a are bases complementary to each other. Apparent equilibrium binding constants of the FFRPs and various DNA duplexes were determined, thereby confirming the DNA-binding specificities of the FFRPs. It is likely that these FFRPs recognize DNA in essentially the same way, since their DNA-binding specificities were all explained by the same pattern of relationship between amino-acid positions and base positions to form chemical interactions. As predicted from this relationship, when Gly36 of TvFL3 was replaced by Thr, the b base in the optimal DNA duplex changed from A to T, and, when Thr36 of FL10 was replaced by Ser, the b base changed from T to G/A. DNA-binding characteristics of other archaeal FFRPs, Ptr1, Ptr2, Ss-Lrp and LysM, are also consistent with the relationship. PMID:19468044
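    The form abcdeWWWedcba described above, two arms that are reverse complements of each other around a central A/T triplet, can be checked mechanically. The sketch below is an illustration written for this listing, not code from the paper; the function names and the `arm_len` parameter are assumptions. The default arm length of 5 matches the 13-mer form, and `arm_len=6` covers the 15-mers quoted in the abstract.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def fits_ffrp_form(seq, arm_len=5):
    """True if seq is <arm><WWW><reverse complement of arm>, with W in {A, T}."""
    seq = seq.upper()
    if len(seq) != 2 * arm_len + 3:
        return False
    left = seq[:arm_len]
    centre = seq[arm_len:arm_len + 3]
    right = seq[arm_len + 3:]
    return set(centre) <= {"A", "T"} and right == reverse_complement(left)


# The two TvFL3 variants (13-mers) and one FL10 variant (15-mer) from the abstract.
tvfl3_ok = fits_ffrp_form("TACGAAATTCGTA") and fits_ffrp_form("TACGAATTTCGTA")
fl10_ok = fits_ffrp_form("GTTCGAAATTCGAAC", arm_len=6)
```

    A sequence with a G or C in the central triplet, or with a mismatched arm, is rejected by the same check.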

  18. Characterization of the hupSL promoter activity in Nostoc punctiforme ATCC 29133

    PubMed Central

    2009-01-01

    Background In cyanobacteria three enzymes are directly involved in the hydrogen metabolism: a nitrogenase that produces molecular hydrogen, H2, as a by-product of nitrogen fixation, an uptake hydrogenase that recaptures H2 and oxidizes it, and a bidirectional hydrogenase that can both oxidize and produce H2. Nostoc punctiforme ATCC 29133 is a filamentous dinitrogen fixing cyanobacterium containing a nitrogenase and an uptake hydrogenase but no bidirectional hydrogenase. Generally, little is known about the transcriptional regulation of the cyanobacterial uptake hydrogenases. In this study gel shift assays showed that NtcA has a specific affinity to a region of the hupSL promoter containing a predicted NtcA binding site. The predicted NtcA binding site is centred at 258.5 bp upstream of the transcription start point (tsp). To further investigate the hupSL promoter, truncated versions of the hupSL promoter were fused to either gfp or luxAB, encoding the reporter proteins Green Fluorescent Protein and Luciferase, respectively. Results Interestingly, all hupSL promoter deletion constructs showed heterocyst specific expression. Unexpectedly the shortest promoter fragment, a fragment covering 57 bp upstream and 258 bp downstream of the tsp, exhibited the highest promoter activity. Deletion of the NtcA binding site neither affected the expression to any larger extent nor the heterocyst specificity. Conclusion Obtained data suggest that the hupSL promoter in N. punctiforme is not strictly dependent on the upstream NtcA cis element and that the shortest promoter fragment (-57 to tsp) is enough for a high and heterocyst specific expression of hupSL. This is highly interesting because it indicates that the information that determines heterocyst specific gene expression might be confined to this short sequence or in the downstream untranslated leader sequence. PMID:19284581

  19. Knowledge-based fragment binding prediction.

    PubMed

    Tang, Grace W; Altman, Russ B

    2014-04-01

    Target-based drug discovery must assess many drug-like compounds for potential activity. Focusing on low-molecular-weight compounds (fragments) can dramatically reduce the chemical search space. However, approaches for determining protein-fragment interactions have limitations. Experimental assays are time-consuming, expensive, and not always applicable. At the same time, computational approaches using physics-based methods have limited accuracy. With increasing high-resolution structural data for protein-ligand complexes, there is now an opportunity for data-driven approaches to fragment binding prediction. We present FragFEATURE, a machine learning approach to predict small molecule fragments preferred by a target protein structure. We first create a knowledge base of protein structural environments annotated with the small molecule substructures they bind. These substructures have low-molecular weight and serve as a proxy for fragments. FragFEATURE then compares the structural environments within a target protein to those in the knowledge base to retrieve statistically preferred fragments. It merges information across diverse ligands with shared substructures to generate predictions. Our results demonstrate FragFEATURE's ability to rediscover fragments corresponding to the ligand bound with 74% precision and 82% recall on average. For many protein targets, it identifies high scoring fragments that are substructures of known inhibitors. FragFEATURE thus predicts fragments that can serve as inputs to fragment-based drug design or serve as refinement criteria for creating target-specific compound libraries for experimental or computational screening.

  20. Knowledge-based Fragment Binding Prediction

    PubMed Central

    Tang, Grace W.; Altman, Russ B.

    2014-01-01

    Target-based drug discovery must assess many drug-like compounds for potential activity. Focusing on low-molecular-weight compounds (fragments) can dramatically reduce the chemical search space. However, approaches for determining protein-fragment interactions have limitations. Experimental assays are time-consuming, expensive, and not always applicable. At the same time, computational approaches using physics-based methods have limited accuracy. With increasing high-resolution structural data for protein-ligand complexes, there is now an opportunity for data-driven approaches to fragment binding prediction. We present FragFEATURE, a machine learning approach to predict small molecule fragments preferred by a target protein structure. We first create a knowledge base of protein structural environments annotated with the small molecule substructures they bind. These substructures have low-molecular weight and serve as a proxy for fragments. FragFEATURE then compares the structural environments within a target protein to those in the knowledge base to retrieve statistically preferred fragments. It merges information across diverse ligands with shared substructures to generate predictions. Our results demonstrate FragFEATURE's ability to rediscover fragments corresponding to the ligand bound with 74% precision and 82% recall on average. For many protein targets, it identifies high scoring fragments that are substructures of known inhibitors. FragFEATURE thus predicts fragments that can serve as inputs to fragment-based drug design or serve as refinement criteria for creating target-specific compound libraries for experimental or computational screening. PMID:24762971

  1. CaMELS: In silico prediction of calmodulin binding proteins and their binding sites.

    PubMed

    Abbasi, Wajid Arshad; Asif, Amina; Andleeb, Saiqa; Minhas, Fayyaz Ul Amir Afsar

    2017-09-01

    Due to Ca2+-dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet-lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet-lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state of the art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large-margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm which allows more effective learning over imprecisely annotated CaM-binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome-wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif-based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub-sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels. © 2017 Wiley Periodicals, Inc.

  2. MicroRNA Biomarkers to Generate Sensitivity to Abiraterone-Resistant Prostate Cancer

    DTIC Science & Technology

    2016-09-01

    approach, employing Abiraterone (Abi) plus RNA therapy. For this, we will use an aptamer specific for PSMA (aptPSMA) to specifically target CRPC...develop RNA aptamer therapy. We will test 8 of the recently identified Abi regulated miRNAs for therapeutic utility in vitro. We will design an...as an independent marker for predicting disease relapse. We will use an RNA aptamer which binds specifically to PCa cells to deliver the miRNA. miRNA

  3. T-Epitope Designer: A HLA-peptide binding prediction server.

    PubMed

    Kangueane, Pandjassarame; Sakharkar, Meena Kishore

    2005-05-15

    The current challenge in synthetic vaccine design is the development of a methodology to identify and test short antigen peptides as potential T-cell epitopes. Recently, we described a HLA-peptide binding model (using structural properties) capable of predicting peptides binding to any HLA allele. Consequently, we have developed a web server named T-EPITOPE DESIGNER to facilitate HLA-peptide binding prediction. The prediction server is based on a model that defines peptide binding pockets using information gleaned from X-ray crystal structures of HLA-peptide complexes, followed by the estimation of peptide binding to binding pockets. Thus, the prediction server enables the calculation of peptide binding to HLA alleles. This model is superior to many existing methods because of its potential application to any given HLA allele whose sequence is clearly defined. The web server finds potential application in T cell epitope vaccine design. http://www.bioinformation.net/ted/

  4. MotifMark: Finding regulatory motifs in DNA sequences.

    PubMed

    Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L; Wang, May D

    2017-07-01

    The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short in precise characterization of motifs, and as a result, require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning, that can find binding sites on candidate probes and rank their specificity in regard to the underlying transcription factor. We developed a pipeline to analyze experimental data derived from compact universal protein binding microarrays and benchmarked it against two of the most accurate motif search methods. Our results indicate that MotifMark can be a viable alternative technique for prediction of motif from protein binding microarrays and possibly other related high-throughput techniques.

  5. Demonstration and Characterization of Biomolecular Enrichment on Microfluidic Aptamer-Functionalized Surfaces

    PubMed Central

    Nguyen, Thai Huu; Pei, Renjun; Stojanovic, Milan; Lin, Qiao

    2010-01-01

    This paper demonstrates and systematically characterizes the enrichment of biomolecular compounds using aptamer-functionalized surfaces within a microfluidic device. The device consists of a microchamber packed with aptamer-functionalized microbeads and integrated with a microheater and temperature sensor to enable thermally controlled binding and release of biomolecules by the aptamer. We first present an equilibrium binding-based analytical model to understand the enrichment process. The characteristics of the aptamer-analyte binding and enrichment are then experimentally studied, using adenosine monophosphate (AMP) and a specific RNA aptamer as a model system. The temporal process of AMP binding to the aptamer is found to be primarily determined by the aptamer-AMP binding kinetics. The temporal process of aptamer-AMP dissociation at varying temperatures is also obtained and observed to occur relatively rapidly (< 2 s). The specificity of the enrichment is next confirmed by performing selective enrichment of AMP from a sample containing biomolecular impurities. Finally, we investigate the enrichment of AMP by either discrete or continuous introduction of a dilute sample into the microchamber, demonstrating enrichment factors ranging from 566 to 686×, which agree with predictions of the analytical model. PMID:21765612
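
    The abstract refers to an equilibrium binding-based analytical model without stating it. As a rough illustration of what such a model typically builds on, the sketch below evaluates a simple 1:1 equilibrium (Langmuir-type) isotherm; this form, the function name, and the dissociation constant used are assumptions, not the authors' model or measured values.

```python
def fraction_bound(analyte_conc, kd):
    """Fraction of aptamer sites occupied at equilibrium for 1:1 binding.

    Assumes theta = C / (Kd + C), with C and Kd in the same units.
    """
    if analyte_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return analyte_conc / (kd + analyte_conc)


# Hypothetical Kd, chosen only to show the shape of the curve.
kd_um = 10.0
for conc_um in (0.1, 1.0, 10.0, 100.0):
    print(f"C = {conc_um:6.1f} uM -> fraction bound = {fraction_bound(conc_um, kd_um):.3f}")
```

    Under this assumption, occupancy is half-maximal when the analyte concentration equals Kd, which is why a surface can keep capturing analyte from a dilute sample until its sites approach saturation.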

  6. Mutations altering the cleavage specificity of a homing endonuclease

    PubMed Central

    Seligman, Lenny M.; Chisholm, Karen M.; Chevalier, Brett S.; Chadsey, Meggen S.; Edwards, Samuel T.; Savage, Jeremiah H.; Veillet, Adeline L.

    2002-01-01

    The homing endonuclease I-CreI recognizes and cleaves a particular 22 bp DNA sequence. The crystal structure of I-CreI bound to homing site DNA has previously been determined, leading to a number of predictions about specific protein–DNA contacts. We test these predictions by analyzing a set of endonuclease mutants and a complementary set of homing site mutants. We find evidence that all structurally predicted I-CreI/DNA contacts contribute to DNA recognition and show that these contacts differ greatly in terms of their relative importance. We also describe the isolation of a collection of altered specificity I-CreI derivatives. The in vitro DNA-binding and cleavage properties of two such endonucleases demonstrate that our genetic approach is effective in identifying homing endonucleases that recognize and cleave novel target sequences. PMID:12202772

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ratto, T V; Rudd, R E; Langry, K C

    We present evidence of multivalent interactions between a single protein molecule and multiple carbohydrates at a pH where the protein can bind four ligands. The evidence is based not only on measurements of the force required to rupture the bonds formed between Concanavalin A (ConA) and α-D-mannose, but also on an analysis of the polymer-extension force curves to infer the polymer architecture that binds the protein to the cantilever and the ligands to the substrate. We find that although the rupture forces for multiple carbohydrate connections to a single protein are larger than the rupture force for a single connection, they do not scale additively with increasing number. Specifically, the most common rupture forces are approximately 46, 66, and 85 pN, which we argue correspond to 1, 2, and 3 ligands being pulled simultaneously from a single protein, as corroborated by an analysis of the linkage architecture. As in our previous work, polymer tethers allow us to discriminate between specific and non-specific binding. We analyze the binding configuration (i.e. serial versus parallel connections) through fitting the polymer stretching data with modified Worm-Like Chain (WLC) models that predict how the effective stiffness of the tethers is affected by multiple connections. This analysis establishes that the forces we measure are due to single proteins interacting with multiple ligands, the first force spectroscopy study that establishes single-molecule multivalent binding unambiguously.

  8. Radiation-induced oxidative damage to the DNA-binding domain of the lactose repressor

    PubMed Central

    Gillard, Nathalie; Goffinont, Stephane; Buré, Corinne; Davidkova, Marie; Maurizot, Jean-Claude; Cadene, Martine; Spotheim-Maurizot, Melanie

    2007-01-01

    Understanding the cellular effects of radiation-induced oxidation requires the unravelling of key molecular events, particularly damage to proteins with important cellular functions. The Escherichia coli lactose operon is a classical model of gene regulation systems. Its functional mechanism involves the specific binding of a protein, the repressor, to a specific DNA sequence, the operator. We have shown previously that upon irradiation with γ-rays in solution, the repressor loses its ability to bind the operator. Water radiolysis generates hydroxyl radicals (OH· radicals) which attack the protein. Damage of the repressor DNA-binding domain, called the headpiece, is most likely to be responsible for this loss of function. Using CD, fluorescence spectroscopy and a combination of proteolytic cleavage with MS, we have examined the state of the irradiated headpiece. CD measurements revealed a dose-dependent conformational change involving metastable intermediate states. Fluorescence measurements showed a gradual degradation of tyrosine residues. MS was used to count the number of oxidations in different regions of the headpiece and to narrow down the parts of the sequence bearing oxidized residues. By calculating the relative probabilities of reaction of each amino acid with OH· radicals, we can predict the most probable oxidation targets. By comparing the experimental results with the predictions we conclude that Tyr7, Tyr12, Tyr17, Met42 and Tyr47 are the most likely hotspots of oxidation. The loss of repressor function is thus correlated with chemical modifications and conformational changes of the headpiece. PMID:17263689

  9. Nucleotide Interdependency in Transcription Factor Binding Sites in the Drosophila Genome.

    PubMed

    Dresch, Jacqueline M; Zellers, Rowan G; Bork, Daniel K; Drewell, Robert A

    2016-01-01

    A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development.
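
    The "traditional models" contrasted above are typically position weight matrices (PWMs), which score each position of a candidate site independently. The sketch below is a generic PWM, not MARZ and not any of the 32 models in the study; the function names and the toy sites are hypothetical, and a uniform background of 0.25 per base is assumed.

```python
import math
from collections import Counter


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PWM from equal-length aligned sites (uniform background assumed)."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        pwm.append({
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score(pwm, seq):
    """Sum of per-position log-odds: each position contributes independently."""
    return sum(column[base] for column, base in zip(pwm, seq))


# Toy aligned sites, for illustration only.
sites = ["ACGT", "ACGT", "ACGA", "ACGT"]
pwm = build_pwm(sites)
print(score(pwm, "ACGT"), score(pwm, "TTTT"))
```

    Because the score is a plain sum over columns, a PWM cannot express that a base at one position is favoured only when a particular base occurs at another. Models with nucleotide interdependencies add terms for pairs of positions, including nonadjacent ones.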

  10. Nucleotide Interdependency in Transcription Factor Binding Sites in the Drosophila Genome

    PubMed Central

    Dresch, Jacqueline M.; Zellers, Rowan G.; Bork, Daniel K.; Drewell, Robert A.

    2016-01-01

    A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development. PMID:27330274

  11. Homology modelling of frequent HLA class-II alleles: A perspective to improve prediction of HLA binding peptide and understand the HLA associated disease susceptibility.

    PubMed

    Kashyap, Manju; Farooq, Umar; Jaiswal, Varun

    2016-10-01

    Human leukocyte antigen (HLA) plays a significant role in the regulation of the immune system and contributes to the progression of, and protection against, many diseases. HLA molecules bind and present peptides to T-cell receptors, which generates the immune response. Understanding HLA-peptide interaction and the molecular function of HLA molecules is key to predicting peptide binding and understanding their role in different diseases. The availability of accurate three-dimensional (3D) structures is the initial step in this direction. In the present work, homology modelling of important and frequent HLA-DRB1 alleles (07:01, 11:01 and 09:01) was done and acceptable models were generated. These modelled alleles were further refined and cross-validated using several methods including Ramachandran plot, Z-score, ERRAT analysis and root mean square deviation (RMSD) calculations. It is known that a number of allelic variants are related to susceptibility to, or protection against, various infectious diseases. Differences in the amino acid sequences and structures of alleles were also studied to understand the association of HLA with disease susceptibility and protection. Susceptible alleles showed more amino acid variations than protective alleles in three selected diseases caused by different pathogens. Amino acid variations at the binding site were found to be more numerous than in other parts of the alleles. RMSD values were also higher at variable positions within the binding site. Higher RMSD values indicate that mutations occurring at the peptide binding site alter protein structure more than those in the rest of the protein. Hence, these findings and modelled structures can be used to design HLA-DRB1 binding peptides to overcome the low prediction accuracy of HLA class II binding peptides. Furthermore, they may help to understand the allele-specific molecular mechanisms involved in susceptibility/resistance against pathogenic diseases. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Computational Framework for Prediction of Peptide Sequences That May Mediate Multiple Protein Interactions in Cancer-Associated Hub Proteins.

    PubMed

    Sarkar, Debasree; Patra, Piya; Ghosh, Abhirupa; Saha, Sudipto

    2016-01-01

    A considerable proportion of protein-protein interactions (PPIs) in the cell are estimated to be mediated by very short peptide segments that approximately conform to specific sequence patterns known as linear motifs (LMs), often present in the disordered regions in the eukaryotic proteins. These peptides have been found to interact with low affinity and are able to bind to multiple interactors, thus playing an important role in the PPI networks involving date hubs. In this work, PPI data and de novo motif identification based method (MEME) were used to identify such peptides in three cancer-associated hub proteins: MYC, APC and MDM2. The peptides corresponding to the significant LMs identified for each hub protein were aligned, the overlapping regions across these peptides being termed as overlapping linear peptides (OLPs). These OLPs were thus predicted to be responsible for multiple PPIs of the corresponding hub proteins and a scoring system was developed to rank them. We predicted six OLPs in MYC and five OLPs in MDM2 that scored higher than OLP predictions from randomly generated protein sets. Two OLP sequences from the C-terminal of MYC were predicted to bind with FBXW7, component of an E3 ubiquitin-protein ligase complex involved in proteasomal degradation of MYC. Similarly, we identified peptides in the C-terminal of MDM2 interacting with FKBP3, which has a specific role in auto-ubiquitinylation of MDM2. The peptide sequences predicted in MYC and MDM2 look promising for designing orthosteric inhibitors against possible disease-associated PPIs. Since these OLPs can interact with other proteins as well, these inhibitors should be specific to the targeted interactor to prevent undesired side-effects. This computational framework has been designed to predict and rank the peptide regions that may mediate multiple PPIs and can be applied to other disease-associated date hub proteins for prediction of novel therapeutic targets of small molecule PPI modulators.

  13. Dynamic Changes in Nucleosome Occupancy Are Not Predictive of Gene Expression Dynamics but Are Linked to Transcription and Chromatin Regulators

    PubMed Central

    Huebert, Dana J.; Kuan, Pei-Fen; Keleş, Sündüz

    2012-01-01

    The response to stressful stimuli requires rapid, precise, and dynamic gene expression changes that must be coordinated across the genome. To gain insight into the temporal ordering of genome reorganization, we investigated dynamic relationships between changing nucleosome occupancy, transcription factor binding, and gene expression in Saccharomyces cerevisiae yeast responding to oxidative stress. We applied deep sequencing to nucleosomal DNA at six time points before and after hydrogen peroxide treatment and revealed many distinct dynamic patterns of nucleosome gain and loss. The timing of nucleosome repositioning was not predictive of the dynamics of downstream gene expression change but instead was linked to nucleosome position relative to transcription start sites and specific cis-regulatory elements. We measured genome-wide binding of the stress-activated transcription factor Msn2p over time and found that Msn2p binds different loci with different dynamics. Nucleosome eviction from Msn2p binding sites was common across the genome; however, we show that, contrary to expectation, nucleosome loss occurred after Msn2p binding and in fact required Msn2p. This negates the prevailing model that nucleosomes obscuring Msn2p sites regulate DNA access and must be lost before Msn2p can bind DNA. Together, these results highlight the complexities of stress-dependent chromatin changes and their effects on gene expression. PMID:22354995

  14. UNC-18 Promotes Both the Anterograde Trafficking and Synaptic Function of Syntaxin

    PubMed Central

    McEwen, Jason M.

    2008-01-01

    The SM protein UNC-18 has been proposed to regulate several aspects of secretion, including synaptic vesicle docking, priming, and fusion. Here, we show that UNC-18 has a chaperone function in neurons, promoting anterograde transport of the plasma membrane soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) protein Syntaxin-1. In unc-18 mutants, UNC-64 (Caenorhabditis elegans Syntaxin-1) accumulates in neuronal cell bodies. Colocalization studies and analysis of carbohydrate modifications both suggest that this accumulation occurs in the endoplasmic reticulum. This trafficking defect is specific for UNC-64 Syntaxin-1, because 14 other SNARE proteins and two active zone markers were unaffected. UNC-18 binds to Syntaxin through at least two mechanisms: binding to closed Syntaxin, or to the N terminus of Syntaxin. It is unclear which of these binding modes mediates UNC-18 function in neurons. The chaperone function of UNC-18 was eliminated in double mutants predicted to disrupt both modes of Syntaxin binding, but it was unaffected in single mutants. By contrast, mutations predicted to disrupt UNC-18 binding to the N terminus of Syntaxin caused significant defects in locomotion behavior and responsiveness to cholinesterase inhibitors. Collectively, these results demonstrate that UNC-18 acts as a molecular chaperone for Syntaxin transport in neurons and that the two modes of UNC-18 binding to Syntaxin are involved in different aspects of UNC-18 function. PMID:18596236

  15. Analysis of the DNA-Binding Activities of the Arabidopsis R2R3-MYB Transcription Factor Family by One-Hybrid Experiments in Yeast

    PubMed Central

    Kelemen, Zsolt; Sebastian, Alvaro; Xu, Wenjia; Grain, Damaris; Salsac, Fabien; Avon, Alexandra; Berger, Nathalie; Tran, Joseph; Dubreucq, Bertrand; Lurin, Claire; Lepiniec, Loïc; Contreras-Moreira, Bruno; Dubos, Christian

    2015-01-01

    The control of growth and development of all living organisms is a complex and dynamic process that requires the harmonious expression of numerous genes. Gene expression is mainly controlled by the activity of sequence-specific DNA binding proteins called transcription factors (TFs). Amongst the various classes of eukaryotic TFs, the MYB superfamily is one of the largest and most diverse, and it has considerably expanded in the plant kingdom. R2R3-MYBs have been extensively studied over the last 15 years. However, DNA-binding specificity has been characterized for only a small subset of these proteins. Therefore, one of the remaining challenges is the exhaustive characterization of the DNA-binding specificity of all R2R3-MYB proteins. In this study, we have developed a library of Arabidopsis thaliana R2R3-MYB open reading frames, whose DNA-binding activities were assayed in vivo (yeast one-hybrid experiments) with a pool of selected cis-regulatory elements. Altogether 1904 interactions were assayed, leading to the discovery of specific patterns of interactions between the various R2R3-MYB subgroups and their DNA target sequences and to the identification of key features that govern these interactions. The present work provides a comprehensive in vivo analysis of R2R3-MYB binding activities that should help in predicting new DNA motifs and identifying new putative target genes for each member of this very large family of TFs. In a broader perspective, the generated data will help to better understand how TFs interact with their target DNA sequences. PMID:26484765

  16. The Distribution of Lectins across the Phylum Nematoda: A Genome-Wide Search

    PubMed Central

    Bauters, Lander; Naalden, Diana; Gheysen, Godelieve

    2017-01-01

    Nematodes are a very diverse phylum that has adapted to nearly every ecosystem. They have developed specialized lifestyles, dividing the phylum into free-living, animal, and plant parasitic species. Their sheer abundance in numbers and presence in nearly every ecosystem make them the most prevalent animals on earth. In this research nematode-specific profiles were designed to retrieve predicted lectin-like domains from the sequence data of nematode genomes and transcriptomes. Lectins are carbohydrate-binding proteins that play numerous roles inside and outside the cell depending on their sugar specificity and associated protein domains. The sugar-binding properties of the retrieved lectin-like proteins were predicted in silico. Although most research has focused on C-type lectin-like, galectin-like, and calreticulin-like proteins in nematodes, we show that the lectin-like repertoire in nematodes is far more diverse. We focused on C-type lectins, which are abundantly present in all investigated nematode species, but seem to be far more abundant in free-living species. Although C-type lectin-like proteins are omnipresent in nematodes, we have shown that only a small part possesses the residues that are thought to be essential for carbohydrate binding. Curiously, hevein, a typical plant lectin domain not reported in animals before, was found in some nematode species. PMID:28054982

  17. The Distribution of Lectins across the Phylum Nematoda: A Genome-Wide Search.

    PubMed

    Bauters, Lander; Naalden, Diana; Gheysen, Godelieve

    2017-01-04

    Nematodes are a very diverse phylum that has adapted to nearly every ecosystem. They have developed specialized lifestyles, dividing the phylum into free-living, animal, and plant parasitic species. Their sheer abundance in numbers and presence in nearly every ecosystem make them the most prevalent animals on earth. In this research nematode-specific profiles were designed to retrieve predicted lectin-like domains from the sequence data of nematode genomes and transcriptomes. Lectins are carbohydrate-binding proteins that play numerous roles inside and outside the cell depending on their sugar specificity and associated protein domains. The sugar-binding properties of the retrieved lectin-like proteins were predicted in silico. Although most research has focused on C-type lectin-like, galectin-like, and calreticulin-like proteins in nematodes, we show that the lectin-like repertoire in nematodes is far more diverse. We focused on C-type lectins, which are abundantly present in all investigated nematode species, but seem to be far more abundant in free-living species. Although C-type lectin-like proteins are omnipresent in nematodes, we have shown that only a small part possesses the residues that are thought to be essential for carbohydrate binding. Curiously, hevein, a typical plant lectin domain not reported in animals before, was found in some nematode species.

  18. Fragmentation cross sections and binding energies of neutron-rich nuclei

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tsang, M. B.; Lynch, W. G.; Mocko, M.

    An exponential dependence of the fragmentation cross section on the average binding energy is observed and reproduced with a statistical model. The observed functional dependence is robust and allows the extraction of binding energies from measured cross sections. From the systematics of Cu isotope cross sections, the binding energies of ⁷⁶Cu, ⁷⁷Cu, ⁷⁸Cu, and ⁷⁹Cu have been extracted. They are 636.94 ± 0.4, 647.1 ± 0.4, 651.6 ± 0.4, and 657.8 ± 0.5 MeV, respectively. Specifically, the uncertainty of the binding energy of ⁷⁵Cu is reduced from 980 keV, as listed in the 2003 mass table of Audi, Wapstra, and Thibault, to 400 keV. The predicted cross sections of two near drip-line nuclei, ³⁹Na and ⁴⁰Mg, from the fragmentation of ⁴⁸Ca are discussed.

  19. An Aryl Hydrocarbon Receptor from the Salamander Ambystoma mexicanum Exhibits Low Sensitivity to 2,3,7,8-Tetrachlorodibenzo-p-dioxin.

    PubMed

    Shoots, Jenny; Fraccalvieri, Domenico; Franks, Diana G; Denison, Michael S; Hahn, Mark E; Bonati, Laura; Powell, Wade H

    2015-06-02

    Structural features of the aryl hydrocarbon receptor (AHR) can underlie species- and population-specific differences in its affinity for 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). These differences often explain variations in TCDD toxicity. Frogs are relatively insensitive to dioxin, and Xenopus AHRs bind TCDD with low affinity. Weak TCDD binding results from the combination of three residues in the ligand-binding domain: A354, A370, and N325. Here we sought to determine whether this mechanism of weak TCDD binding is shared by other amphibian AHRs. We isolated an AHR cDNA from the Mexican axolotl (Ambystoma mexicanum). The encoded polypeptide contains identical residues at positions that confer low TCDD affinity to X. laevis AHRs (A364, A380, and N335), and homology modeling predicts they protrude into the binding cavity. Axolotl AHR bound one-tenth the TCDD of mouse AHR in velocity sedimentation analysis, and in transactivation assays, the EC50 for TCDD was 23 nM, similar to X. laevis AHR1β (27 nM) and greater than AHR containing the mouse ligand-binding domain (0.08 nM). Sequence, modeled structure, and function indicate that axolotl AHR binds TCDD weakly, predicting that A. mexicanum lacks sensitivity to TCDD toxicity. We hypothesize that this characteristic of axolotl and Xenopus AHRs arose in a common ancestor of the Caudata and Anura.

  20. Prospective evaluation of shape similarity based pose prediction method in D3R Grand Challenge 2015

    NASA Astrophysics Data System (ADS)

    Kumar, Ashutosh; Zhang, Kam Y. J.

    2016-09-01

    Evaluation of ligand three-dimensional (3D) shape similarity is one of the commonly used approaches to identify ligands similar to one or more known active compounds from a library of small molecules. Apart from using ligand shape similarity as a virtual screening tool, its role in pose prediction and pose scoring has also been reported. We have recently developed a method that utilizes ligand 3D shape similarity with known crystallographic ligands to predict binding poses of query ligands. Here, we report the prospective evaluation of our pose prediction method through participation in the drug design data resource (D3R) Grand Challenge 2015. Our pose prediction method was used to predict binding poses of heat shock protein 90 (HSP90) and mitogen activated protein kinase kinase kinase kinase (MAP4K4) ligands and it was able to predict the pose within 2 Å root mean square deviation (RMSD) either as the top pose or among the best of five poses in a majority of cases. Specifically, for the HSP90 protein, median RMSDs of 0.73 and 0.68 Å were obtained for the top and the best of five predictions respectively. For the MAP4K4 target, although the median RMSD for our top prediction was 2.87 Å, the median RMSD of 1.67 Å for the best of five predictions was well within the limit for successful prediction. Furthermore, the performance of our pose prediction method for HSP90 and MAP4K4 ligands was always among the top five groups. In particular, for the MAP4K4 protein our pose prediction method was ranked number one both in terms of mean and median RMSD when the best of five predictions were considered. Overall, our D3R Grand Challenge 2015 results demonstrated that ligand 3D shape similarity with the crystal ligand is sufficient to predict binding poses of new ligands with acceptable accuracy.
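
    The 2 Å success criterion above rests on the RMSD between a predicted pose and the crystallographic pose. The sketch below shows the basic calculation under simplifying assumptions: both poses list the same atoms in the same order, they share the receptor's coordinate frame (no superposition is applied), and symmetry-equivalent atoms are ignored. The function name and coordinates are hypothetical, and the challenge's own evaluation protocol may differ.

```python
import math


def pose_rmsd(predicted, reference):
    """RMSD in the input units between two poses with matched atom order."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (xp - xr) ** 2 + (yp - yr) ** 2 + (zp - zr) ** 2
        for (xp, yp, zp), (xr, yr, zr) in zip(predicted, reference)
    )
    return math.sqrt(squared / len(predicted))


# Hypothetical heavy-atom coordinates in angstroms.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
rmsd = pose_rmsd(predicted, crystal)
print(f"RMSD = {rmsd:.2f} A, within 2 A: {rmsd <= 2.0}")
```

    A "best of five" figure would then be the minimum of this value over the five submitted poses for a ligand.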

  1. Changes in signal transducer and activator of transcription 3 (STAT3) dynamics induced by complexation with pharmacological inhibitors of Src homology 2 (SH2) domain dimerization.

    PubMed

    Resetca, Diana; Haftchenary, Sina; Gunning, Patrick T; Wilson, Derek J

    2014-11-21

    The activity of the transcription factor signal transducer and activator of transcription 3 (STAT3) is dysregulated in a number of hematological and solid malignancies. Development of pharmacological STAT3 Src homology 2 (SH2) domain interaction inhibitors holds great promise for cancer therapy, and a novel class of salicylic acid-based STAT3 dimerization inhibitors that includes orally bioavailable drug candidates has been recently developed. The compounds SF-1-066 and BP-1-102 are predicted to bind to the STAT3 SH2 domain. However, given the highly unstructured and dynamic nature of the SH2 domain, experimental confirmation of this prediction was elusive. We have interrogated the protein-ligand interaction of STAT3 with these small molecule inhibitors by means of time-resolved electrospray ionization hydrogen-deuterium exchange mass spectrometry. Analysis of site-specific evolution of deuterium uptake induced by the complexation of STAT3 with SF-1-066 or BP-1-102 under physiological conditions enabled the mapping of the in silico predicted inhibitor binding site to the STAT3 SH2 domain. The binding of both inhibitors to the SH2 domain resulted in significant local decreases in dynamics, consistent with solvent exclusion at the inhibitor binding site and increased rigidity of the inhibitor-complexed SH2 domain. Interestingly, inhibitor binding induced hot spots of allosteric perturbations outside of the SH2 domain, manifesting mainly as increased deuterium uptake, in regions of STAT3 important for DNA binding and nuclear localization. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.

  2. Computational design of an endo-1,4-β-xylanase ligand binding site

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morin, Andrew; Kaufmann, Kristian W.; Fortenberry, Carie

    2012-09-05

    The field of computational protein design has experienced important recent success. However, the de novo computational design of high-affinity protein-ligand interfaces is still largely an open challenge. Using the Rosetta program, we attempted the in silico design of a high-affinity protein interface to a small peptide ligand. We chose the thermophilic endo-1,4-β-xylanase from Nonomuraea flexuosa as the protein scaffold on which to perform our designs. Over the course of the study, 12 proteins derived from this scaffold were produced and assayed for binding to the target ligand. Unfortunately, none of the designed proteins displayed evidence of high-affinity binding. Structural characterization of four designed proteins revealed that although the predicted structure of the protein model was highly accurate, this structural accuracy did not translate into accurate prediction of binding affinity. Crystallographic analyses indicate that the lack of binding affinity is possibly due to unaccounted for protein dynamics in the 'thumb' region of our design scaffold intrinsic to the family 11 β-xylanase fold. Further computational analysis revealed two specific, single amino acid substitutions responsible for an observed change in backbone conformation, and decreased dynamic stability of the catalytic cleft. These findings offer new insight into the dynamic and structural determinants of the β-xylanase proteins.

  3. A label-free approach to detect ligand binding to cell surface proteins in real time.

    PubMed

    Burtscher, Verena; Hotka, Matej; Li, Yang; Freissmuth, Michael; Sandtner, Walter

    2018-04-26

    Electrophysiological recordings allow for monitoring the operation of proteins with high temporal resolution down to the single molecule level. This technique has been exploited to track either ion flow arising from channel opening or the synchronized movement of charged residues and/or ions within the membrane electric field. Here, we describe a novel type of current by using the serotonin transporter (SERT) as a model. We examined transient currents elicited on rapid application of specific SERT inhibitors. Our analysis shows that these currents originate from ligand binding and not from a long-range conformational change. The Gouy-Chapman model predicts that adsorption of charged ligands to surface proteins must produce displacement currents and related apparent changes in membrane capacitance. Here we verified these predictions with SERT. Our observations demonstrate that ligand binding to a protein can be monitored in real time and in a label-free manner by recording the membrane capacitance. © 2018, Burtscher et al.

  4. Systematic characterization of the specificity of the SH2 domains of cytoplasmic tyrosine kinases.

    PubMed

    Zhao, Bing; Tan, Pauline H; Li, Shawn S C; Pei, Dehua

    2013-04-09

    Cytoplasmic tyrosine kinases (CTK) generally contain a Src-homology 2 (SH2) domain, whose role in the CTK family is not fully understood. Here we report the determination of the specificity of 25 CTK SH2 domains by screening one-bead-one-compound (OBOC) peptide libraries. Based on the peptide sequences selected by the SH2 domains, we built Support Vector Machine (SVM) models for the prediction of binding ligands for the SH2 domains. These models yielded support for the progressive phosphorylation model for CTKs in which the overlapping specificity of the CTK SH2 and kinase domains has been proposed to facilitate targeting of the CTK substrates with at least two potential phosphotyrosine (pTyr) sites. We curated 93 CTK substrates with at least two pTyr sites catalyzed by the same CTK, and showed that 71% of these substrates had at least two pTyr sites predicted to bind a common CTK SH2 domain. More importantly, we found 34 instances where there was at least one pTyr site predicted to be recognized by the SH2 domain of the same CTK, suggesting that the SH2 and kinase domains of the CTKs may cooperate to achieve progressive phosphorylation of a protein substrate. This article is part of a Special Issue entitled: From protein structures to clinical applications. Copyright © 2012 Elsevier B.V. All rights reserved.

  5. First Principles Predictions of the Structure and Function of G-Protein-Coupled Receptors: Validation for Bovine Rhodopsin

    PubMed Central

    Trabanino, Rene J.; Hall, Spencer E.; Vaidehi, Nagarajan; Floriano, Wely B.; Kam, Victor W. T.; Goddard, William A.

    2004-01-01

    G-protein-coupled receptors (GPCRs) are involved in cell communication processes and with mediating such senses as vision, smell, taste, and pain. They constitute a prominent superfamily of drug targets, but an atomic-level structure is available for only one GPCR, bovine rhodopsin, making it difficult to use structure-based methods to design receptor-specific drugs. We have developed the MembStruk first principles computational method for predicting the three-dimensional structure of GPCRs. In this article we validate the MembStruk procedure by comparing its predictions with the high-resolution crystal structure of bovine rhodopsin. The crystal structure of bovine rhodopsin has the second extracellular (EC-II) loop closed over the transmembrane regions by making a disulfide linkage between Cys-110 and Cys-187, but we speculate that opening this loop may play a role in the activation process of the receptor through the cysteine linkage with helix 3. Consequently we predicted two structures for bovine rhodopsin from the primary sequence (with no input from the crystal structure)—one with the EC-II loop closed as in the crystal structure, and the other with the EC-II loop open. The MembStruk-predicted structure of bovine rhodopsin with the closed EC-II loop deviates from the crystal by 2.84 Å coordinate root mean-square (CRMS) in the transmembrane region main-chain atoms. The predicted three-dimensional structures for other GPCRs can be validated only by predicting binding sites and energies for various ligands. For such predictions we developed the HierDock first principles computational method. We validate HierDock by predicting the binding site of 11-cis-retinal in the crystal structure of bovine rhodopsin. Scanning the whole protein without using any prior knowledge of the binding site, we find that the best scoring conformation in rhodopsin is 1.1 Å CRMS from the crystal structure for the ligand atoms. 
This predicted conformation has the carbonyl O only 2.82 Å from the N of Lys-296. Making this Schiff base bond and minimizing leads to a final conformation only 0.62 Å CRMS from the crystal structure. We also used HierDock to predict the binding site of 11-cis-retinal in the MembStruk-predicted structure of bovine rhodopsin (closed loop). Scanning the whole protein structure leads to a structure in which the carbonyl O is only 2.85 Å from the N of Lys-296. Making this Schiff base bond and minimizing leads to a final conformation only 2.92 Å CRMS from the crystal structure. The good agreement of the ab initio-predicted protein structures and ligand binding site with experiment validates the use of the MembStruk and HierDock first principles' methods. Since these methods are generic and applicable to any GPCR, they should be useful in predicting the structures of other GPCRs and the binding site of ligands to these proteins. PMID:15041637

  6. Widespread Site-Dependent Buffering of Human Regulatory Polymorphism

    PubMed Central

    Kutyavin, Tanya; Stamatoyannopoulos, John A.

    2012-01-01

    The average individual is expected to harbor thousands of variants within non-coding genomic regions involved in gene regulation. However, it is currently not possible to interpret reliably the functional consequences of genetic variation within any given transcription factor recognition sequence. To address this, we comprehensively analyzed heritable genome-wide binding patterns of a major sequence-specific regulator (CTCF) in relation to genetic variability in binding site sequences across a multi-generational pedigree. We localized and quantified CTCF occupancy by ChIP-seq in 12 related and unrelated individuals spanning three generations, followed by comprehensive targeted resequencing of the entire CTCF–binding landscape across all individuals. We identified hundreds of variants with reproducible quantitative effects on CTCF occupancy (both positive and negative). While these effects paralleled protein–DNA recognition energetics when averaged, they were extensively buffered by striking local context dependencies. In the significant majority of cases buffering was complete, resulting in silent variants spanning every position within the DNA recognition interface irrespective of level of binding energy or evolutionary constraint. The prevalence of complex partial or complete buffering effects severely constrained the ability to predict reliably the impact of variation within any given binding site instance. Surprisingly, 40% of variants that increased CTCF occupancy occurred at positions of human–chimp divergence, challenging the expectation that the vast majority of functional regulatory variants should be deleterious. Our results suggest that, even in the presence of “perfect” genetic information afforded by resequencing and parallel studies in multiple related individuals, genomic site-specific prediction of the consequences of individual variation in regulatory DNA will require systematic coupling with empirical functional genomic measurements. 
PMID:22457641

  7. Deciphering the Arginine-Binding Preferences at the Substrate-Binding Groove of Ser/Thr Kinases by Computational Surface Mapping

    PubMed Central

    Ben-Shimon, Avraham; Niv, Masha Y.

    2011-01-01

    Protein kinases are key signaling enzymes that catalyze the transfer of γ-phosphate from an ATP molecule to a phospho-accepting residue in the substrate. Unraveling the molecular features that govern the preference of kinases for particular residues flanking the phosphoacceptor is important for understanding kinase specificities toward their substrates and for designing substrate-like peptidic inhibitors. We applied ANCHORSmap, a new fragment-based computational approach for mapping amino acid side chains on protein surfaces, to predict and characterize the preference of kinases toward Arginine binding. We focus on positions P−2 and P−5, commonly occupied by Arginine (Arg) in substrates of basophilic Ser/Thr kinases. The method accurately identified all the P−2/P−5 Arg binding sites previously determined by X-ray crystallography and produced Arg preferences that corresponded to those experimentally found by peptide arrays. The predicted Arg-binding positions and their associated pockets were analyzed in terms of shape, physicochemical properties, amino acid composition, and in-silico mutagenesis, providing structural rationalization for previously unexplained trends in kinase preferences toward Arg moieties. This methodology sheds light on several kinases that were described in the literature as having non-trivial preferences for Arg, and provides some surprising departures from the prevailing views regarding residues that determine kinase specificity toward Arg. In particular, we found that the preference for a P−5 Arg is not necessarily governed by the 170/230 acidic pair, as was previously assumed, but by several different pairs of acidic residues, selected from positions 133, 169, and 230 (PKA numbering). The acidic residue at position 230 serves as a pivotal element in recognizing Arg from both the P−2 and P−5 positions. PMID:22125489

  8. Predicted RNA Binding Proteins Pes4 and Mip6 Regulate mRNA Levels, Translation, and Localization during Sporulation in Budding Yeast.

    PubMed

    Jin, Liang; Zhang, Kai; Sternglanz, Rolf; Neiman, Aaron M

    2017-05-01

    In response to starvation, diploid cells of Saccharomyces cerevisiae undergo meiosis and form haploid spores, a process collectively referred to as sporulation. The differentiation into spores requires extensive changes in gene expression. The transcriptional activator Ndt80 is a central regulator of this process, which controls many genes essential for sporulation. Ndt80 induces ∼300 genes coordinately during meiotic prophase, but different mRNAs within the NDT80 regulon are translated at different times during sporulation. The protein kinase Ime2 and RNA binding protein Rim4 are general regulators of meiotic translational delay, but how differential timing of individual transcripts is achieved was not known. This report describes the characterization of two related NDT80-induced genes, PES4 and MIP6, encoding predicted RNA binding proteins. These genes are necessary to regulate the steady-state expression, translational timing, and localization of a set of mRNAs that are transcribed by NDT80 but not translated until the end of meiosis II. Mutations in the predicted RNA binding domains within PES4 alter the stability of target mRNAs. PES4 and MIP6 affect only a small portion of the NDT80 regulon, indicating that they act as modulators of the general Ime2/Rim4 pathway for specific transcripts. Copyright © 2017 American Society for Microbiology.

  9. An evolution-based DNA-binding residue predictor using a dynamic query-driven learning scheme.

    PubMed

    Chai, H; Zhang, J; Yang, G; Ma, Z

    2016-11-15

    DNA-binding proteins play a pivotal role in various biological activities. Identification of DNA-binding residues (DBRs) is of great importance for understanding the mechanism of gene regulations and chromatin remodeling. Most traditional computational methods usually construct their predictors on static non-redundant datasets. They excluded many homologous DNA-binding proteins so as to guarantee the generalization capability of their models. However, those ignored samples may potentially provide useful clues when studying protein-DNA interactions, which have not obtained enough attention. In view of this, we propose a novel method, namely DQPred-DBR, to fill the gap of DBR predictions. First, a large-scale extensible sample pool was compiled. Second, evolution-based features in the form of a relative position specific score matrix and covariant evolutionary conservation descriptors were used to encode the feature space. Third, a dynamic query-driven learning scheme was designed to make more use of proteins with known structure and functions. In comparison with a traditional static model, the introduction of dynamic models could obviously improve the prediction performance. Experimental results from the benchmark and independent datasets proved that our DQPred-DBR had promising generalization capability. It was capable of producing decent predictions and outperforms many state-of-the-art methods. For the convenience of academic use, our proposed method was also implemented as a web server at .

  10. Novel Modeling of Combinatorial miRNA Targeting Identifies SNP with Potential Role in Bone Density

    PubMed Central

    Coronnello, Claudia; Hartmaier, Ryan; Arora, Arshi; Huleihel, Luai; Pandit, Kusum V.; Bais, Abha S.; Butterworth, Michael; Kaminski, Naftali; Stormo, Gary D.; Oesterreich, Steffi; Benos, Panayiotis V.

    2012-01-01

    MicroRNAs (miRNAs) are post-transcriptional regulators that bind to their target mRNAs through base complementarity. Predicting miRNA targets is a challenging task and various studies showed that existing algorithms suffer from high number of false predictions and low to moderate overlap in their predictions. Until recently, very few algorithms considered the dynamic nature of the interactions, including the effect of less specific interactions, the miRNA expression level, and the effect of combinatorial miRNA binding. Addressing these issues can result in a more accurate miRNA:mRNA modeling with many applications, including efficient miRNA-related SNP evaluation. We present a novel thermodynamic model based on the Fermi-Dirac equation that incorporates miRNA expression in the prediction of target occupancy and we show that it improves the performance of two popular single miRNA target finders. Modeling combinatorial miRNA targeting is a natural extension of this model. Two other algorithms show improved prediction efficiency when combinatorial binding models were considered. ComiR (Combinatorial miRNA targeting), a novel algorithm we developed, incorporates the improved predictions of the four target finders into a single probabilistic score using ensemble learning. Combining target scores of multiple miRNAs using ComiR improves predictions over the naïve method for target combination. ComiR scoring scheme can be used for identification of SNPs affecting miRNA binding. As proof of principle, ComiR identified rs17737058 as disruptive to the miR-488-5p:NCOA1 interaction, which we confirmed in vitro. We also found rs17737058 to be significantly associated with decreased bone mineral density (BMD) in two independent cohorts indicating that the miR-488-5p/NCOA1 regulatory axis is likely critical in maintaining BMD in women. 
With increasing availability of comprehensive high-throughput datasets from patients ComiR is expected to become an essential tool for miRNA-related studies. PMID:23284279
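
    The idea of an expression-dependent, Fermi-Dirac-type target occupancy mentioned in this record can be sketched as follows. This is a generic textbook form, not ComiR's actual parameterization: the function names, the use of a relative miRNA concentration, the mapping of concentration to a chemical potential (mu = RT ln c), and the summation over independent sites are all assumptions made for illustration.

```python
import math

# Thermal energy RT in kcal/mol at roughly 310 K (assumed temperature).
RT = 0.616


def occupancy(delta_g, concentration, rt=RT):
    """Fermi-Dirac-style probability that a site is bound.

    delta_g: binding free energy in kcal/mol (more negative = stronger).
    concentration: miRNA level relative to an assumed reference (> 0).
    The chemical potential is taken as mu = RT * ln(concentration),
    which makes the expression equal to c / (c + exp(delta_g / RT)).
    """
    if concentration <= 0:
        raise ValueError("concentration must be positive")
    mu = rt * math.log(concentration)
    return 1.0 / (1.0 + math.exp((delta_g - mu) / rt))


def expected_bound_sites(sites, rt=RT):
    """Sum of occupancies over (delta_g, concentration) pairs.

    Treats sites as independent, which is a simplifying assumption.
    """
    return sum(occupancy(dg, c, rt) for dg, c in sites)
```

    In this form a weak site can still be appreciably occupied when the miRNA is highly expressed, which is the qualitative behaviour the abstract attributes to including expression levels.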

  11. DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.

    PubMed

    Ma, Wenxiu; Yang, Lin; Rohs, Remo; Noble, William Stafford

    2017-10-01

    Transcription factors (TFs) bind to specific DNA sequence motifs. Several lines of evidence suggest that TF-DNA binding is mediated in part by properties of the local DNA shape: the width of the minor groove, the relative orientations of adjacent base pairs, etc. Several methods have been developed to jointly account for DNA sequence and shape properties in predicting TF binding affinity. However, a limitation of these methods is that they typically require a training set of aligned TF binding sites. We describe a sequence + shape kernel that leverages DNA sequence and shape information to better understand protein-DNA binding preference and affinity. This kernel extends an existing class of k-mer based sequence kernels, based on the recently described di-mismatch kernel. Using three in vitro benchmark datasets, derived from universal protein binding microarrays (uPBMs), genomic context PBMs (gcPBMs) and SELEX-seq data, we demonstrate that incorporating DNA shape information improves our ability to predict protein-DNA binding affinity. In particular, we observe that (i) the k-spectrum + shape model performs better than the classical k-spectrum kernel, particularly for small k values; (ii) the di-mismatch kernel performs better than the k-mer kernel, for larger k; and (iii) the di-mismatch + shape kernel performs better than the di-mismatch kernel for intermediate k values. The software is available at https://bitbucket.org/wenxiu/sequence-shape.git. Contact: rohs@usc.edu or william-noble@uw.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
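
    For orientation, the classical k-spectrum kernel that this record uses as its baseline can be sketched in a few lines. The sketch covers only the sequence part: it does not include DNA shape features or the di-mismatch kernel, and the function names are illustrative rather than taken from the authors' software.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count all overlapping k-mers in a sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k):
    """Inner product of the k-mer count vectors of two sequences."""
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


def normalized_spectrum_kernel(a, b, k):
    """Cosine-normalized spectrum kernel; 0.0 if either sequence has no k-mers."""
    denom = math.sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0
```

    Because the kernel compares k-mer content rather than aligned positions, it needs no pre-aligned binding sites, which is the alignment-free property the abstract emphasizes.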

  12. Raising an Antibody Specific to Breast Cancer Subpopulations Using Phage Display on Tissue Sections.

    PubMed

    Larsen, Simon Asbjørn; Meldgaard, Theresa; Fridriksdottir, Agla Jael Rubner; Lykkemark, Simon; Poulsen, Pi Camilla; Overgaard, Laura Falkensteen; Petersen, Helene Bundgaard; Petersen, Ole William; Kristensen, Peter

    2016-01-01

    Primary tumors display a great level of intra-tumor heterogeneity in breast cancer. The current lack of prognostic and predictive biomarkers limits accurate stratification and the ability to predict response to therapy. The aim of the present study was to select recombinant antibody fragments specific against breast cancer subpopulations, aiding the discovery of novel biomarkers. Recombinant antibody fragments were selected by phage display. A novel shadowstick technology enabled the direct selection using tissue sections of antibody fragments specific against small subpopulations of breast cancer cells. Selections were performed against a subpopulation of breast cancer cells expressing CD271+, as these previously have been indicated to be potential breast cancer stem cells. The selected antibody fragments were screened by phage ELISA on both breast cancer and myoepithelial cells. The antibody fragments were validated and evaluated by immunohistochemistry experiments. Our study revealed an antibody fragment, LH8, specific for breast cancer cells. Immunohistochemistry results indicate that this particular antibody fragment binds an antigen that exhibits differential expression in different breast cancer subpopulations. Further studies characterizing this antibody fragment, the subpopulation it binds and the cognate antigen may unearth novel biomarkers of clinical relevance. Copyright© 2016, International Institute of Anticancer Research (Dr. John G. Delinasios), All rights reserved.

  13. The Specificity of Innate Immune Responses Is Enforced by Repression of Interferon Response Elements by NF-κB p50

    PubMed Central

    Cheng, Christine S.; Feldman, Kristyn E.; Lee, James; Verma, Shilpi; Huang, De-Bin; Huynh, Kim; Chang, Mikyoung; Ponomarenko, Julia V.; Sun, Shao-Cong; Benedict, Chris A.; Ghosh, Gourisankar; Hoffmann, Alexander

    2011-01-01

    The specific binding of transcription factors to cognate sequence elements is thought to be critical for the generation of specific gene expression programs. Members of the nuclear factor κB (NF-κB) and interferon (IFN) regulatory factor (IRF) transcription factor families bind to the κB site and the IFN response element (IRE), respectively, of target genes, and they are activated in macrophages after exposure to pathogens. However, how these factors produce pathogen-specific inflammatory and immune responses remains poorly understood. Combining top-down and bottom-up systems biology approaches, we have identified the NF-κB p50 homodimer as a regulator of IRF responses. Unbiased genome-wide expression and biochemical and structural analyses revealed that the p50 homodimer repressed a subset of IFN-inducible genes through a previously uncharacterized subclass of guanine-rich IRE (G-IRE) sequences. Mathematical modeling predicted that the p50 homodimer might enforce the stimulus specificity of composite promoters. Indeed, the production of the antiviral regulator IFN-β was rendered stimulus-specific by the binding of the p50 homodimer to the G-IRE–containing IFNβ enhancer to suppress cytotoxic IFN signaling. Specifically, a deficiency in p50 resulted in the inappropriate production of IFN-β in response to bacterial DNA sensed by Toll-like receptor 9. This role for the NF-κB p50 homodimer in enforcing the specificity of the cellular response to pathogens by binding to a subset of IRE sequences alters our understanding of how the NF-κB and IRF signaling systems cooperate to regulate antimicrobial immunity. PMID:21343618

  14. Establishment of HLA-DR4 Transgenic Mice for the Identification of CD4+ T Cell Epitopes of Tumor-Associated Antigens

    PubMed Central

    Harada, Kumiko; Michibata, Yayoi; Tsukamoto, Hirotake; Senju, Satoru; Tomita, Yusuke; Yuno, Akira; Hirayama, Masatoshi; Abu Sayem, Mohammad; Takeda, Naoki; Shibuya, Isao; Sogo, Shinji; Fujiki, Fumihiro; Sugiyama, Haruo; Eto, Masatoshi; Nishimura, Yasuharu

    2013-01-01

    Reports have shown that activation of tumor-specific CD4+ helper T (Th) cells is crucial for effective anti-tumor immunity and identification of Th-cell epitopes is critical for peptide vaccine-based cancer immunotherapy. Although computer algorithms are available to predict peptides with high binding affinity to a specific HLA class II molecule, the ability of those peptides to induce Th-cell responses must be evaluated. We have established HLA-DR4 (HLA-DRA*01:01/HLA-DRB1*04:05) transgenic mice (Tgm), since this HLA-DR allele is most frequent (13.6%) in Japanese population, to evaluate HLA-DR4-restricted Th-cell responses to tumor-associated antigen (TAA)-derived peptides predicted to bind to HLA-DR4. To avoid weak binding between mouse CD4 and HLA-DR4, Tgm were designed to express chimeric HLA-DR4/I-Ed, where I-Ed α1 and β1 domains were replaced with those from HLA-DR4. Th cells isolated from Tgm immunized with adjuvant and HLA-DR4-binding cytomegalovirus-derived peptide proliferated when stimulated with peptide-pulsed HLA-DR4-transduced mouse L cells, indicating chimeric HLA-DR4/I-Ed has equivalent antigen presenting capacity to HLA-DR4. Immunization with CDCA1(55-78) peptide, a computer algorithm-predicted HLA-DR4-binding peptide derived from TAA CDCA1, successfully induced Th-cell responses in Tgm, while immunization of HLA-DR4-binding Wilms' tumor 1 antigen-derived peptide with identical amino acid sequence to mouse ortholog failed. This was overcome by using peptide-pulsed syngeneic bone marrow-derived dendritic cells (BM-DC) followed by immunization with peptide/CFA booster. BM-DC-based immunization of KIF20A(494-517) peptide from another TAA KIF20A, with an almost identical HLA-binding core amino acid sequence to mouse ortholog, successfully induced Th-cell responses in Tgm. Notably, both CDCA1(55-78) and KIF20A(494-517) peptides induced human Th-cell responses in PBMCs from HLA-DR4-positive donors. 
Finally, an HLA-DR4 binding DEPDC1(191-213) peptide from a new TAA DEPDC1 overexpressed in bladder cancer induced strong Th-cell responses both in Tgm and in PBMCs from an HLA-DR4-positive donor. Thus, the HLA-DR4 Tgm combined with computer algorithm was useful for preliminary screening of candidate peptides for vaccination. PMID:24386437

  15. Protein docking prediction using predicted protein-protein interface.

    PubMed

    Li, Bin; Kihara, Daisuke

    2012-01-10

    Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts by performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by a second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.

  16. De-novo discovery of differentially abundant transcription factor binding sites including their positional preference.

    PubMed

    Keilwagen, Jens; Grau, Jan; Paponov, Ivan A; Posch, Stefan; Strickert, Marc; Grosse, Ivo

    2011-02-10

    Transcription factors are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in promoters. The de-novo discovery of transcription factor binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not been fully solved yet. Here, we present a de-novo motif discovery tool called Dispom for finding differentially abundant transcription factor binding sites that models existing positional preferences of binding sites and adjusts the length of the motif in the learning process. Evaluating Dispom, we find that its prediction performance is superior to existing tools for de-novo motif discovery for 18 benchmark data sets with planted binding sites, and for a metazoan compendium based on experimental data from micro-array, ChIP-chip, ChIP-DSL, and DamID as well as Gene Ontology data. Finally, we apply Dispom to find binding sites differentially abundant in promoters of auxin-responsive genes extracted from Arabidopsis thaliana microarray data, and we find a motif that can be interpreted as a refined auxin responsive element predominately positioned in the 250-bp region upstream of the transcription start site. Using an independent data set of auxin-responsive genes, we find in genome-wide predictions that the refined motif is more specific for auxin-responsive genes than the canonical auxin-responsive element. In general, Dispom can be used to find differentially abundant motifs in sequences of any origin. However, the positional distribution learned by Dispom is especially beneficial if all sequences are aligned to some anchor point like the transcription start site in case of promoter sequences. We demonstrate that the combination of searching for differentially abundant motifs and inferring a position distribution from the data is beneficial for de-novo motif discovery. 
Hence, we make the tool freely available as a component of the open-source Java framework Jstacs and as a stand-alone application at http://www.jstacs.de/index.php/Dispom.

  17. Structure- and Modeling-based Identification of the Adenovirus E4orf4 Binding Site in the Protein Phosphatase 2A B55α Subunit*

    PubMed Central

    Horowitz, Ben; Sharf, Rakefet; Avital-Shacham, Meirav; Pechkovsky, Antonina; Kleinberger, Tamar

    2013-01-01

    The adenovirus E4orf4 protein regulates the progression of viral infection and when expressed outside the context of the virus it induces nonclassical, cancer cell-specific apoptosis. All E4orf4 functions known to date require an interaction between E4orf4 and protein phosphatase 2A (PP2A), which is mediated through PP2A regulatory B subunits. Specifically, an interaction with the B55α subunit is required for induction of cell death by E4orf4. To gain a better insight into the E4orf4-PP2A interaction, mapping of the E4orf4 interaction site in PP2A-B55α has been undertaken. To this end we used a combination of bioinformatics analyses of PP2A-B55α and of E4orf4, which led to the prediction of E4orf4 binding sites on the surface of PP2A-B55α. Mutation analysis, immunoprecipitation, and GST pulldown assays based on the theoretical predictions revealed that the E4orf4 binding site included the α1 and α2 helices described in the B55α structure and involved at least three residues located in these helices facing each other. Loss of E4orf4 binding was accompanied by reduced contribution of the B55α mutants to E4orf4-induced cell death. The identified E4orf4 binding domain lies above the previously described substrate binding site and does not overlap it, although its location could be consistent with direct or indirect effects on substrate binding. This work assigns for the first time a functional significance to the α1,α2 helices of B55α, and we suggest that the binding site defined by these helices could also contribute to interactions between PP2A and some of its cellular regulators. PMID:23530045

  18. miREE: miRNA recognition elements ensemble

    PubMed Central

    2011-01-01

    Background: Computational methods for microRNA target prediction are a fundamental step to understand the miRNA role in gene regulation, a key process in molecular biology. In this paper we present miREE, a novel microRNA target prediction tool. miREE is an ensemble of two parts entailing complementary but integrated roles in the prediction. The Ab-Initio module leverages upon a genetic algorithmic approach to generate a set of candidate sites on the basis of their microRNA-mRNA duplex stability properties. Then, a Support Vector Machine (SVM) learning module evaluates the impact of microRNA recognition elements on the target gene. As a result the prediction takes into account information regarding both miRNA-target structural stability and accessibility. Results: The proposed method significantly improves the state-of-the-art prediction tools in terms of accuracy with a better balance between specificity and sensitivity, as demonstrated by the experiments conducted on several large datasets across different species. miREE achieves this result by tackling two of the main challenges of current prediction tools: (1) the reduced number of false positives for the Ab-Initio part thanks to the integration of a machine learning module, and (2) the specificity of the machine learning part, obtained through an innovative technique for rich and representative negative records generation. The validation was conducted on experimental datasets where the miRNA:mRNA interactions had been obtained through (1) direct validation where even the binding site is provided, or through (2) indirect validation, based on gene expression variations obtained from high-throughput experiments where the specific interaction is not validated in detail and consequently the specific binding site is not provided. 
Conclusions: The coupling of two parts, a sensitive Ab-Initio module and a selective machine learning part capable of recognizing the false positives, leads to an improved balance between sensitivity and specificity. miREE obtains a reasonable trade-off between filtering false positives and identifying targets. The miREE tool is available online at http://didattica-online.polito.it/eda/miREE/ PMID:22115078

  19. Binding stability of peptides on major histocompatibility complex class I proteins: role of entropy and dynamics.

    PubMed

    Gul, Ahmet; Erman, Burak

    2018-01-16

    Prediction of peptide binding on specific human leukocyte antigens (HLA) has long been studied with successful results. We herein describe the effects of entropy and dynamics by investigating the binding stabilities of 10 nonapeptides on various HLA Class I alleles using a theoretical model based on molecular dynamics simulations. The fluctuational entropies of the peptides are estimated over a temperature range of 310-460 K. The estimated entropies correlate well with experimental binding affinities of the peptides: peptides that have higher binding affinities have lower entropies compared to non-binders, which have significantly larger entropies. The computation of the entropies is based on a simple model that requires short molecular dynamics trajectories and allows for approximate but rapid determination. The paper draws attention to the long neglected dynamic aspects of peptide binding, and provides a fast computation scheme that allows for rapid scanning of large numbers of peptides on selected HLA antigens, which may be useful in defining the right peptides for personal immunotherapy.

  20. Characterizing Active Pharmaceutical Ingredient Binding to Human Serum Albumin by Spin-Labeling and EPR Spectroscopy.

    PubMed

    Hauenschild, Till; Reichenwallner, Jörg; Enkelmann, Volker; Hinderberger, Dariush

    2016-08-26

    Drug binding to human serum albumin (HSA) has been characterized by a spin-labeling and continuous-wave (CW) EPR spectroscopic approach. Specifically, the contribution of functional groups (FGs) in a compound on its albumin-binding capabilities is quantitatively described. Molecules from different drug classes are labeled with EPR-active nitroxide radicals (spin-labeled pharmaceuticals (SLPs)) and in a screening approach CW-EPR spectroscopy is used to investigate HSA binding under physiological conditions and at varying ratios of SLP to protein. Spectral simulations of the CW-EPR spectra allow extraction of association constants (KA ) and the maximum number (n) of binding sites per protein. By comparison of data from 23 SLPs, the mechanisms of drug-protein association and the impact of chemical modifications at individual positions on drug uptake can be rationalized. Furthermore, new drug modifications with predictable protein binding tendency may be envisaged. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. Binding stability of peptides on major histocompatibility complex class I proteins: role of entropy and dynamics

    NASA Astrophysics Data System (ADS)

    Gul, Ahmet; Erman, Burak

    2018-03-01

    Prediction of peptide binding on specific human leukocyte antigens (HLA) has long been studied with successful results. We herein describe the effects of entropy and dynamics by investigating the binding stabilities of 10 nonapeptides on various HLA Class I alleles using a theoretical model based on molecular dynamics simulations. The fluctuational entropies of the peptides are estimated over a temperature range of 310-460 K. The estimated entropies correlate well with experimental binding affinities of the peptides: peptides that have higher binding affinities have lower entropies compared to non-binders, which have significantly larger entropies. The computation of the entropies is based on a simple model that requires short molecular dynamics trajectories and allows for approximate but rapid determination. The paper draws attention to the long neglected dynamic aspects of peptide binding, and provides a fast computation scheme that allows for rapid scanning of large numbers of peptides on selected HLA antigens, which may be useful in defining the right peptides for personal immunotherapy.

  2. Allele-Specific Transcription Factor Binding in Pig Calpastatin Promoter Regions

    USDA-ARS's Scientific Manuscript database

    The identification of predictive DNA markers for pork quality would allow U.S. pork producers and breeders to more quickly and efficiently select genetically superior animals for production of consistent, high quality meat. Genome scans have identified QTL for tenderness on pig chromosome 2 which ha...

  3. Specificity profiling of protein-binding domains using one-bead-one-compound Peptide libraries.

    PubMed

    Kunys, Andrew R; Lian, Wenlong; Pei, Dehua

    2012-12-01

    One-bead-one-compound (OBOC) libraries consist of structurally related compounds (e.g., peptides) covalently attached to a solid support, with each resin bead carrying a unique compound. OBOC libraries of high structural diversity can be rapidly synthesized and screened without the need for any special equipment, and therefore can be employed in any chemical or biochemical laboratory. OBOC peptide libraries have been widely used to map the ligand specificity of proteins, to determine the substrate specificity of enzymes, and to develop inhibitors against macromolecular targets. They have proven particularly useful in profiling the binding specificity of protein modular domains (e.g., SH2 domains, BIR domains, and PDZ domains); subsequently, the specificity information can be used to predict the protein targets of these domains. The protocols outlined in this article describe the methodologies for synthesizing and screening OBOC peptide libraries against SH2 and PDZ domains, and the related data analysis. Curr. Protoc. Chem. Biol. 4:331-355 © 2012 by John Wiley & Sons, Inc.

  4. Fast and automated functional classification with MED-SuMo: an application on purine-binding proteins.

    PubMed

    Doppelt-Azeroual, Olivia; Delfaud, François; Moriaud, Fabrice; de Brevern, Alexandre G

    2010-04-01

    Ligand-protein interactions are essential for biological processes, and precise characterization of protein binding sites is crucial to understand protein functions. MED-SuMo is a powerful technology to localize similar local regions on protein surfaces. Its heuristic is based on a 3D representation of macromolecules using specific surface chemical features associating chemical characteristics with geometrical properties. MED-SMA is an automated and fast method to classify binding sites. It is based on MED-SuMo technology, which builds a similarity graph, and it uses the Markov Clustering algorithm. Purine binding sites are well studied as drug targets. Here, purine binding sites of the Protein DataBank (PDB) are classified. Proteins potentially inhibited or activated through the same mechanism are gathered. Results are analyzed according to PROSITE annotations and to carefully refined functional annotations extracted from the PDB. As expected, binding sites associated with related mechanisms are gathered, for example, the Small GTPases. Nevertheless, protein kinases from different Kinome families are also found together, for example, Aurora-A and CDK2 proteins which are inhibited by the same drugs. Representative examples of different clusters are presented. The effectiveness of the MED-SMA approach is demonstrated as it gathers binding sites of proteins with similar structure-activity relationships. Moreover, an efficient new protocol associates structures absent of cocrystallized ligands to the purine clusters enabling those structures to be associated with a specific binding mechanism. Applications of this classification by binding mode similarity include target-based drug design and prediction of cross-reactivity and therefore potential toxic side effects.
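The clustering step mentioned in this abstract (Markov Clustering on a similarity graph) can be illustrated with a minimal sketch. This is not the MED-SMA implementation: the toy similarity matrix, the inflation value, the iteration count, and all function names below are assumptions chosen for illustration, and the code uses plain Python lists rather than any library.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: raise entries to a power, then renormalize columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def clusters_from_flow(m, threshold=1e-6):
    """Group nodes that are still linked by non-negligible flow."""
    n = len(m)
    label = list(range(n))

    def find(x):
        while label[x] != x:
            label[x] = label[label[x]]  # path compression
            x = label[x]
        return x

    for i in range(n):
        for j in range(n):
            if m[i][j] > threshold:
                label[find(i)] = find(j)
    groups = {}
    for node in range(n):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


def markov_cluster(similarity, inflation=2.0, iterations=30):
    """Alternate expansion and inflation on a similarity graph with self-loops."""
    n = len(similarity)
    m = [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    return clusters_from_flow(m)


# Hypothetical similarity graph: two tight triangles joined by one weak edge.
similarity = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
clusters = markov_cluster(similarity)
```

In this toy graph the weak edge between the two triangles loses its flow under repeated inflation, so the two groups of binding sites end up in separate clusters. A higher inflation value generally yields finer clusters.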

  5. Fast and automated functional classification with MED-SuMo: An application on purine-binding proteins

    PubMed Central

    Doppelt-Azeroual, Olivia; Delfaud, François; Moriaud, Fabrice; de Brevern, Alexandre G

    2010-01-01

    Ligand–protein interactions are essential for biological processes, and precise characterization of protein binding sites is crucial to understand protein functions. MED-SuMo is a powerful technology to localize similar local regions on protein surfaces. Its heuristic is based on a 3D representation of macromolecules using specific surface chemical features associating chemical characteristics with geometrical properties. MED-SMA is an automated and fast method to classify binding sites. It is based on MED-SuMo technology, which builds a similarity graph, and it uses the Markov Clustering algorithm. Purine binding sites are well studied as drug targets. Here, purine binding sites of the Protein DataBank (PDB) are classified. Proteins potentially inhibited or activated through the same mechanism are gathered. Results are analyzed according to PROSITE annotations and to carefully refined functional annotations extracted from the PDB. As expected, binding sites associated with related mechanisms are gathered, for example, the Small GTPases. Nevertheless, protein kinases from different Kinome families are also found together, for example, Aurora-A and CDK2 proteins which are inhibited by the same drugs. Representative examples of different clusters are presented. The effectiveness of the MED-SMA approach is demonstrated as it gathers binding sites of proteins with similar structure-activity relationships. Moreover, an efficient new protocol associates structures absent of cocrystallized ligands to the purine clusters enabling those structures to be associated with a specific binding mechanism. Applications of this classification by binding mode similarity include target-based drug design and prediction of cross-reactivity and therefore potential toxic side effects. PMID:20162627

  6. Prediction of the binding affinities of peptides to class II MHC using a regularized thermodynamic model

    PubMed Central

    2010-01-01

    Background: The binding of peptide fragments of extracellular peptides to class II MHC is a crucial event in the adaptive immune response. Each MHC allotype generally binds a distinct subset of peptides and the enormous number of possible peptide epitopes prevents their complete experimental characterization. Computational methods can utilize the limited experimental data to predict the binding affinities of peptides to class II MHC. Results: We have developed the Regularized Thermodynamic Average, or RTA, method for predicting the affinities of peptides binding to class II MHC. RTA accounts for all possible peptide binding conformations using a thermodynamic average and includes a parameter constraint for regularization to improve accuracy on novel data. RTA was shown to achieve higher accuracy, as measured by AUC, than SMM-align on the same data for all 17 MHC allotypes examined. RTA also gave the highest accuracy on all but three allotypes when compared with results from 9 different prediction methods applied to the same data. In addition, the method correctly predicted the peptide binding register of 17 out of 18 peptide-MHC complexes. Finally, we found that suboptimal peptide binding registers, which are often ignored in other prediction methods, made significant contributions of at least 50% of the total binding energy for approximately 20% of the peptides. Conclusions: The RTA method accurately predicts peptide binding affinities to class II MHC and accounts for multiple peptide binding registers while reducing overfitting through regularization. The method has potential applications in vaccine design and in understanding autoimmune disorders. A web server implementing the RTA prediction method is available at http://bordnerlab.org/RTA/. PMID:20089173
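The idea of averaging over all binding registers, rather than keeping only the best one, can be sketched as a Boltzmann-weighted free energy. The sketch below is an illustration under stated assumptions, not the RTA implementation: the per-register energies, the kT value and its units, and the function names are hypothetical, and the regularized fitting of the energy parameters is not shown.

```python
import math

KT = 0.593  # assumed thermal energy in kcal/mol (about 298 K), for illustration


def thermodynamic_average(register_energies, kt=KT):
    """Effective free energy over registers: -kT * ln(sum_r exp(-E_r / kT))."""
    # Shift by the lowest energy so the exponentials stay well scaled.
    e_min = min(register_energies)
    z = sum(math.exp(-(e - e_min) / kt) for e in register_energies)
    return e_min - kt * math.log(z)


def register_weights(register_energies, kt=KT):
    """Boltzmann probability of each binding register."""
    e_min = min(register_energies)
    boltz = [math.exp(-(e - e_min) / kt) for e in register_energies]
    z = sum(boltz)
    return [b / z for b in boltz]


# Hypothetical energies for three registers of one peptide.
energies = [-7.0, -6.8, -4.0]
effective_energy = thermodynamic_average(energies)
weights = register_weights(energies)
suboptimal_weight = 1.0 - max(weights)  # population outside the best register
```

Because every register adds a positive term to the sum, the effective energy is never higher than that of the best single register. When a second register lies close in energy, as in this made-up example, a sizable share of the population sits outside the optimal register, which is the kind of contribution the abstract says is often ignored.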

  7. Interolog interfaces in protein–protein docking

    PubMed Central

    Alsop, James D.

    2015-01-01

    Proteins are essential elements of biological systems, and their function typically relies on their ability to successfully bind to specific partners. Recently, an emphasis of study into protein interactions has been on hot spots, or residues in the binding interface that make a significant contribution to the binding energetics. In this study, we investigate how conservation of hot spots can be used to guide docking prediction. We show that the use of evolutionary data combined with hot spot prediction highlights near‐native structures across a range of benchmark examples. Our approach explores various strategies for using hot spots and evolutionary data to score protein complexes, using both absolute and chemical definitions of conservation along with refinements to these strategies that look at windowed conservation and filtering to ensure a minimum number of hot spots in each binding partner. Finally, structure‐based models of orthologs were generated for comparison with sequence‐based scoring. Using two data sets of 22 and 85 examples, a high rate of top 10 and top 1 predictions are observed, with up to 82% of examples returning a top 10 hit and 35% returning top 1 hit depending on the data set and strategy applied; upon inclusion of the native structure among the decoys, up to 55% of examples yielded a top 1 hit. The 20 common examples between data sets show that more carefully curated interolog data yields better predictions, particularly in achieving top 1 hits. Proteins 2015; 83:1940–1946. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc. PMID:25740680

  8. Dynamics simulations for engineering macromolecular interactions

    PubMed Central

    Robinson-Mosher, Avi; Shinar, Tamar; Silver, Pamela A.; Way, Jeffrey

    2013-01-01

    The predictable engineering of well-behaved transcriptional circuits is a central goal of synthetic biology. The artificial attachment of promoters to transcription factor genes usually results in noisy or chaotic behaviors, and such systems are unlikely to be useful in practical applications. Natural transcriptional regulation relies extensively on protein-protein interactions to ensure tightly controlled behavior, but such tight control has been elusive in engineered systems. To help engineer protein-protein interactions, we have developed a molecular dynamics simulation framework that simplifies features of proteins moving by constrained Brownian motion, with the goal of performing long simulations. The behavior of a simulated protein system is determined by summation of forces that include a Brownian force, a drag force, excluded volume constraints, relative position constraints, and binding constraints that relate to experimentally determined on-rates and off-rates for chosen protein elements in a system. Proteins are abstracted as spheres. Binding surfaces are defined radially within a protein. Peptide linkers are abstracted as small protein-like spheres with rigid connections. To address whether our framework could generate useful predictions, we simulated the behavior of an engineered fusion protein consisting of two 20 000 Da proteins attached by flexible glycine/serine-type linkers. The two protein elements remained closely associated, as if constrained by a random walk in three dimensions of the peptide linker, as opposed to showing a distribution of distances expected if movement were dominated by Brownian motion of the protein domains only. We also simulated the behavior of fluorescent proteins tethered by a linker of varying length, compared the predicted Förster resonance energy transfer with previous experimental observations, and obtained a good correspondence. 
Finally, we simulated the binding behavior of a fusion of two ligands that could simultaneously bind to distinct cell-surface receptors, and explored the landscape of linker lengths and stiffnesses that could enhance receptor binding of one ligand when the other ligand has already bound to its receptor, thus, addressing potential mechanisms for improving targeted signal transduction proteins. These specific results have implications for the design of targeted fusion proteins and artificial transcription factors involving fusion of natural domains. More broadly, the simulation framework described here could be extended to include more detailed system features such as non-spherical protein shapes and electrostatics, without requiring detailed, computationally expensive specifications. This framework should be useful in predicting behavior of engineered protein systems including binding and dissociation reactions. PMID:23822508

  9. Conserved and species-specific transcription factor co-binding patterns drive divergent gene regulation in human and mouse

    PubMed Central

    Diehl, Adam G

    2018-01-01

    The mouse is widely used as a system to study human genetic mechanisms. However, extensive rewiring of transcriptional regulatory networks often confounds translation of findings between human and mouse. Site-specific gain and loss of individual transcription factor binding sites (TFBS) has caused functional divergence of orthologous regulatory loci, and so we must look beyond this positional conservation to understand common themes of regulatory control. Fortunately, transcription factor co-binding patterns shared across species often perform conserved regulatory functions. These can be compared to ‘regulatory sentences’ that retain the same meanings regardless of sequence and species context. By analyzing TFBS co-occupancy patterns observed in four human and mouse cell types, we learned a regulatory grammar: the rules by which TFBS are combined into meaningful regulatory sentences. Different parts of this grammar associate with specific sets of functional annotations regardless of sequence conservation and predict functional signatures more accurately than positional conservation. We further show that both species-specific and conserved portions of this grammar are involved in gene expression divergence and human disease risk. These findings expand our understanding of transcriptional regulatory mechanisms, suggesting that phenotypic divergence and disease risk are driven by a complex interplay between deeply conserved and species-specific transcriptional regulatory pathways. PMID:29361190

  10. BindML/BindML+: Detecting Protein-Protein Interaction Interface Propensity from Amino Acid Substitution Patterns.

    PubMed

    Wei, Qing; La, David; Kihara, Daisuke

    2017-01-01

    Prediction of protein-protein interaction sites in a protein structure provides important information for elucidating the mechanism of protein function and can also be useful in guiding modeling or design procedures for protein complex structures. Since prediction methods essentially assess the propensity of amino acids that are likely to be part of a protein docking interface, they can help in designing protein-protein interactions. Here, we introduce BindML and BindML+, two methods for predicting protein-protein interaction sites. BindML predicts protein-protein interaction sites by identifying mutation patterns found in known protein-protein complexes using phylogenetic substitution models. BindML+ is an extension of BindML for distinguishing permanent and transient types of protein-protein interaction sites. We developed an interactive web-server that provides a convenient interface to assist in structural visualization of protein-protein interaction site predictions. The input to the web-server is a tertiary structure of interest. BindML and BindML+ are available at http://kiharalab.org/bindml/ and http://kiharalab.org/bindml/plus/.

  11. Text Mining Improves Prediction of Protein Functional Sites

    PubMed Central

    Cohn, Judith D.; Ravikumar, Komandur E.

    2012-01-01

    We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions. PMID:22393388
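The text-mining step described here, extracting residue mentions from literature, can be illustrated with a small pattern-matching sketch. This is an assumption-laden illustration, not the LEAP-FS pipeline: the regular expression only covers three-letter residue codes followed by a position (for example "Asp102", "Ser-195", "Ser 195"), and the function name and example sentence are hypothetical.

```python
import re

# Three-letter codes of the 20 standard amino acids.
THREE_LETTER = (
    "Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|"
    "Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val"
)

# Assumed mention formats: code, optional hyphen or space, then 1-4 digits.
RESIDUE_PATTERN = re.compile(r"\b(" + THREE_LETTER + r")[- ]?(\d{1,4})\b")


def extract_residue_mentions(text):
    """Return unique (residue code, position) pairs, ordered by position."""
    found = {(m.group(1), int(m.group(2))) for m in RESIDUE_PATTERN.finditer(text)}
    return sorted(found, key=lambda pair: pair[1])


sentence = (
    "The catalytic triad of His57, Asp102 and Ser-195 is conserved; "
    "Ser 195 attacks the substrate."
)
mentions = extract_residue_mentions(sentence)
```

A real system would also need to handle one-letter codes, mutation notation, and mapping of mentions onto a specific structure's numbering, none of which this sketch attempts.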

  12. Quantum-Mechanics Methodologies in Drug Discovery: Applications of Docking and Scoring in Lead Optimization.

    PubMed

    Crespo, Alejandro; Rodriguez-Granillo, Agustina; Lim, Victoria T

    2017-01-01

    The development and application of quantum mechanics (QM) methodologies in computer-aided drug design have flourished in the last 10 years. Despite the natural advantage of QM methods to predict binding affinities with a higher level of theory than those methods based on molecular mechanics (MM), there are only a few examples where diverse sets of protein-ligand targets have been evaluated simultaneously. In this work, we review recent advances in QM docking and scoring for those cases in which a systematic analysis has been performed. In addition, we introduce and validate a simplified QM/MM expression to compute protein-ligand binding energies. Overall, QM-based scoring functions are generally better at predicting ligand affinities than those based on classical mechanics. However, the agreement between experimental activities and calculated binding energies is highly dependent on the specific chemical series considered. The advantage of more accurate QM methods is evident in cases where charge transfer and polarization effects are important, for example when metals are involved in the binding process or when dispersion forces play a significant role as in the case of hydrophobic or stacking interactions. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  13. Template-Based Modeling of Protein-RNA Interactions.

    PubMed

    Zheng, Jinfang; Kundrotas, Petras J; Vakser, Ilya A; Liu, Shiyong

    2016-09-01

    Protein-RNA complexes formed by specific recognition between RNA and RNA-binding proteins play an important role in biological processes. More than a thousand such proteins in humans have been curated and many novel RNA-binding proteins remain to be discovered. Due to limitations of experimental approaches, computational techniques are needed for characterization of protein-RNA interactions. Although much progress has been made, adequate methodologies reliably providing atomic resolution structural details are still lacking. Although protein-RNA free docking approaches proved to be useful, in general, the template-based approaches provide higher quality of predictions. Templates are key to building a high quality model. Sequence/structure relationships were studied based on a representative set of binary protein-RNA complexes from PDB. Several approaches were tested for pairwise target/template alignment. The analysis revealed a transition point between random and correct binding modes. The results showed that structural alignment is better than sequence alignment in identifying good templates, suitable for generating protein-RNA complexes close to the native structure, and outperforms free docking, successfully predicting complexes where the free docking fails, including cases of significant conformational change upon binding. A template-based protein-RNA interaction modeling protocol PRIME was developed and benchmarked on a representative set of complexes.

  14. Using physics-based pose predictions and free energy perturbation calculations to predict binding poses and relative binding affinities for FXR ligands in the D3R Grand Challenge 2

    NASA Astrophysics Data System (ADS)

    Athanasiou, Christina; Vasilakaki, Sofia; Dellis, Dimitris; Cournia, Zoe

    2018-01-01

    Computer-aided drug design has become an integral part of drug discovery and development in the pharmaceutical and biotechnology industry, and is nowadays extensively used in the lead identification and lead optimization phases. The drug design data resource (D3R) organizes challenges against blinded experimental data to prospectively test computational methodologies as an opportunity for improved methods and algorithms to emerge. We participated in Grand Challenge 2 to predict the crystallographic poses of 36 Farnesoid X Receptor (FXR)-bound ligands and the relative binding affinities for two designated subsets of 18 and 15 FXR-bound ligands. Here, we present our methodology for pose and affinity predictions and its evaluation after the release of the experimental data. For predicting the crystallographic poses, we used docking and physics-based pose prediction methods guided by the binding poses of native ligands. For FXR ligands with known chemotypes in the PDB, we accurately predicted their binding modes, while for those with unknown chemotypes the predictions were more challenging. Our group ranked 1st (based on the median RMSD) out of the 46 groups that submitted complete entries for the binding pose prediction challenge. For the relative binding affinity prediction challenge, we performed free energy perturbation (FEP) calculations coupled with molecular dynamics (MD) simulations. FEP/MD calculations displayed a high success rate in identifying compounds with better or worse binding affinity than the reference (parent) compound. Our studies suggest that when ligands with chemical precedent are available in the literature, binding pose predictions using docking and physics-based methods are reliable; however, predictions are challenging for ligands with completely unknown chemotypes. 
We also show that FEP/MD calculations hold predictive value and can nowadays be used in a high throughput mode in a lead optimization project provided that crystal structures of sufficiently high quality are available.

  15. Lectin binding assays for in-process monitoring of sialylation in protein production.

    PubMed

    Xu, Weiduan; Chen, Jianmin; Yamasaki, Glenn; Murphy, John E; Mei, Baisong

    2010-07-01

    Many therapeutic proteins require appropriate glycosylation for their biological activities and plasma half life. Coagulation factor VIII (FVIII) is a glycoprotein which has extensive post-translational modification by N-linked glycosylation. The terminal sialic acid in the N-linked glycans of FVIII is required for maximal circulatory half life. The extent of FVIII sialylation can be determined by high pH anion-exchange chromatography coupled with a pulse electrochemical detector (HPAEC-PED), but this requires a large amount of purified protein. Using FVIII as a model, the objective of the present study was to develop assays that enable detection and prediction of sialylation deficiency at an early stage in the process and thus prevent downstream product quality excursions. Lectin ECA (Erythrina cristagalli) binds to unsialylated Galbeta1-4 GlcNAc and the ECA-binding level (i.e., terminal Gal(beta1-4) exposure) is inversely proportional to the level of sialylation. By using ECA, a cell-based assay was developed to measure the global sialylation profile in FVIII producing cells. To examine the Galbeta1-4 exposure on the FVIII molecule in bioreactor tissue culture fluid (TCF), an ELISA-based ECA-FVIII binding assay was developed. The ECA-binding specificity in both assays was assessed by ECA-specific sugar inhibitors and neuraminidase digestion. The ECA-binding specificity was also independently confirmed by a ST3GAL4 siRNA knockdown experiment. To establish the correlation between Galbeta1-4 exposure and the HPAEC-PED determined FVIII sialylation value, the FVIII containing bioreactor TCF and the purified FVIII samples were tested with the ECA ELISA binding assay. The results indicated an inverse correlation between ECA binding and the corresponding HPAEC-PED sialylation value. The ECA-binding assays are cost effective and can be rapidly performed, thereby making them effective for in-process monitoring of protein sialylation.

  16. In silico simulations of STAT1 and STAT3 inhibitors predict SH2 domain cross-binding specificity.

    PubMed

    Szelag, Malgorzata; Sikorski, Krzysztof; Czerwoniec, Anna; Szatkowska, Katarzyna; Wesoly, Joanna; Bluyssen, Hans A R

    2013-11-15

    Signal transducers and activators of transcription (STATs) comprise a family of transcription factors that are structurally related and which participate in signaling pathways activated by cytokines, growth factors and pathogens. Activation of STAT proteins is mediated by the highly conserved Src homology 2 (SH2) domain, which interacts with phosphotyrosine motifs for specific contacts between STATs and receptors and for STAT dimerization. By generating new models for human (h)STAT1, hSTAT2 and hSTAT3 we applied comparative in silico docking to determine SH2-binding specificity of the STAT3 inhibitor stattic, and of fludarabine (STAT1 inhibitor). Thus, we provide evidence that by primarily targeting the highly conserved phosphotyrosine (pY+0) SH2 binding pocket stattic is not a specific hSTAT3 inhibitor, but is equally effective towards hSTAT1 and hSTAT2. This was confirmed in Human Micro-vascular Endothelial Cells (HMECs) in vitro, in which stattic inhibited interferon-α-induced phosphorylation of all three STATs. Likewise, fludarabine inhibits both hSTAT1 and hSTAT3 phosphorylation, but not hSTAT2, by competing with the highly conserved pY+0 and pY-X binding sites, which are less well-preserved in hSTAT2. Moreover we observed that in HMECs in vitro fludarabine inhibits cytokine and lipopolysaccharide-induced phosphorylation of hSTAT1 and hSTAT3 but does not affect hSTAT2. Finally, multiple sequence alignment of STAT-SH2 domain sequences confirmed high conservation between hSTAT1 and hSTAT3, but not hSTAT2, with respect to stattic and fludarabine binding sites. Together our data offer a molecular basis that explains STAT cross-binding specificity of stattic and fludarabine, thereby questioning the present selection strategies of SH2 domain-based competitive small inhibitors. © 2013 Elsevier B.V. All rights reserved.

  17. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    PubMed Central

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
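
    The k-mer sequence features mentioned in this abstract can be illustrated with a minimal sketch. This is not the kmer-SVM implementation: the function name kmer_features, the default k, and the choice to normalize counts to frequencies are assumptions made for illustration only. The resulting vectors are the kind of input an SVM classifier could be trained on.

```python
from collections import Counter
from itertools import product


def kmer_features(seq, k=3):
    """Return k-mer frequencies of `seq` over the ACGT alphabet.

    Hypothetical helper for illustration; features are ordered
    lexicographically (AAA, AAC, ..., TTT for k=3). Windows that
    contain other characters (e.g. N) are ignored.
    """
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)
    # Normalize so that sequences of different length are comparable.
    return [counts[km] / total if total else 0.0 for km in kmers]


if __name__ == "__main__":
    vec = kmer_features("ACGTACGTTT", k=2)
    print(len(vec))  # 16 features for k=2
```

    Each genomic region (for example, a bound versus an unbound sequence) would be mapped to one such vector before training a classifier.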

  18. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets.

    PubMed

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S; Beer, Michael A

    2013-07-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167-80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org.

  19. Tactics for preclinical validation of receptor-binding radiotracers

    PubMed Central

    Lever, Susan Z.; Fan, Kuo-Hsien; Lever, John R.

    2016-01-01

    Introduction: Aspects of radiopharmaceutical development are illustrated through preclinical studies of [125I]-(E)-1-(2-(2,3-dihydrobenzofuran-5-yl)ethyl)-4-(iodoallyl)piperazine ([125I]-E-IA-BF-PE-PIPZE), a radioligand for sigma-1 (σ1) receptors, coupled with examples from the recent literature. Findings are compared to those previously observed for [125I]-(E)-1-(2-(2,3-dimethoxy-5-yl)ethyl)-4-(iodoallyl)piperazine ([125I]-E-IA-DM-PE-PIPZE). Methods: Syntheses of E-IA-BF-PE-PIPZE and [125I]-E-IA-BF-PE-PIPZE were accomplished by standard methods. In vitro receptor binding studies and autoradiography were performed, and binding potential was predicted. Measurements of lipophilicity and protein binding were obtained. In vivo studies were conducted in mice to evaluate radioligand stability, as well as specific binding to σ1 sites in brain, brain regions and peripheral organs in the presence and absence of potential blockers. Results: E-IA-BF-PE-PIPZE exhibited high affinity and selectivity for σ1 receptors (Ki = 0.43 ± 0.03 nM, σ2 / σ1 = 173). [125I]-E-IA-BF-PE-PIPZE was prepared in good yield and purity, with high specific activity. Radioligand binding provided dissociation (koff) and association (kon) rate constants, along with a measured Kd of 0.24 ± 0.01 nM and Bmax of 472 ± 13 fmol / mg protein. The radioligand proved suitable for quantitative autoradiography in vitro using brain sections. Moderate lipophilicity, Log D7.4 2.69 ± 0.28, was determined, and protein binding was 71 ± 0.3%. In vivo, high initial whole brain uptake, > 6% injected dose / g, cleared slowly over 24 h. Specific binding represented 75% to 93% of total binding from 15 min to 24 h. Findings were confirmed and extended by regional brain biodistribution. Radiometabolites were not observed in brain (1%). Conclusions: Substitution of dihydrobenzofuranylethyl for dimethoxyphenethyl increased radioligand affinity for σ1 receptors by 16-fold. While high specific binding to σ1 receptors was observed for both radioligands in vivo, [125I]-E-IA-BF-PE-PIPZE displayed much slower clearance kinetics than [125I]-E-IA-DM-PE-PIPZE. Thus, minor structural modifications of σ1 receptor radioligands lead to major differences in binding properties in vitro and in vivo. PMID:27755986
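
    To show how the reported Kd and Bmax relate to measured binding, the sketch below uses the standard one-site saturation model, B = Bmax·[L] / (Kd + [L]), together with the textbook relation Kd = koff / kon. The abstract does not state which model the authors fitted, so the model choice is an assumption, as are the function names and the rate constants used in the example; only the Kd (0.24 nM) and Bmax (472 fmol / mg protein) values come from the record.

```python
def specific_binding(ligand_nM, kd_nM=0.24, bmax=472.0):
    """One-site saturation binding (assumed model).

    Returns specific binding in the units of `bmax`
    (fmol / mg protein for the values quoted in the abstract).
    """
    return bmax * ligand_nM / (kd_nM + ligand_nM)


def kd_from_rates(koff, kon):
    """Equilibrium dissociation constant from kinetic rate constants.

    With koff in 1/min and kon in 1/(nM*min), the result is in nM.
    """
    return koff / kon


if __name__ == "__main__":
    # At a free ligand concentration equal to Kd, half of Bmax is occupied.
    print(specific_binding(0.24))
    # Hypothetical rate constants chosen only to illustrate the ratio.
    print(kd_from_rates(koff=0.012, kon=0.05))
```

    A sub-nanomolar Kd means that the receptor population is close to saturation at low nanomolar ligand concentrations under this model.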

  20. Comparison of Saccharomyces cerevisiae F-BAR domain structures reveals a conserved inositol phosphate binding site.

    PubMed

    Moravcevic, Katarina; Alvarado, Diego; Schmitz, Karl R; Kenniston, Jon A; Mendrola, Jeannine M; Ferguson, Kathryn M; Lemmon, Mark A

    2015-02-03

    F-BAR domains control membrane interactions in endocytosis, cytokinesis, and cell signaling. Although they are generally thought to bind curved membranes containing negatively charged phospholipids, numerous functional studies argue that differences in lipid-binding selectivities of F-BAR domains are functionally important. Here, we compare membrane-binding properties of the Saccharomyces cerevisiae F-BAR domains in vitro and in vivo. Whereas some F-BAR domains (such as Bzz1p and Hof1p F-BARs) bind equally well to all phospholipids, the F-BAR domain from the RhoGAP Rgd1p preferentially binds phosphoinositides. We determined X-ray crystal structures of F-BAR domains from Hof1p and Rgd1p, the latter bound to an inositol phosphate. The structures explain phospholipid-binding selectivity differences and reveal an F-BAR phosphoinositide binding site that is fully conserved in a mammalian RhoGAP called Gmip and is partly retained in certain other F-BAR domains. Our findings reveal previously unappreciated determinants of F-BAR domain lipid-binding specificity and provide a basis for its prediction from sequence. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Comparison of Saccharomyces cerevisiae F-BAR Domain Structures Reveals a Conserved Inositol Phosphate Binding Site

    DOE PAGES

    Moravcevic, Katarina; Alvarado, Diego; Schmitz, Karl R.; ...

    2015-01-22

    F-BAR domains control membrane interactions in endocytosis, cytokinesis, and cell signaling. Although they are generally thought to bind curved membranes containing negatively charged phospholipids, numerous functional studies argue that differences in lipid-binding selectivities of F-BAR domains are functionally important. In this paper, we compare membrane-binding properties of the Saccharomyces cerevisiae F-BAR domains in vitro and in vivo. Whereas some F-BAR domains (such as Bzz1p and Hof1p F-BARs) bind equally well to all phospholipids, the F-BAR domain from the RhoGAP Rgd1p preferentially binds phosphoinositides. We determined X-ray crystal structures of F-BAR domains from Hof1p and Rgd1p, the latter bound to an inositol phosphate. The structures explain phospholipid-binding selectivity differences and reveal an F-BAR phosphoinositide binding site that is fully conserved in a mammalian RhoGAP called Gmip and is partly retained in certain other F-BAR domains. In conclusion, our findings reveal previously unappreciated determinants of F-BAR domain lipid-binding specificity and provide a basis for its prediction from sequence.

  2. Fine epitope signature of antibody neutralization breadth at the HIV-1 envelope CD4-binding site.

    PubMed

    Cheng, Hao D; Grimm, Sebastian K; Gilman, Morgan Sa; Gwom, Luc Christian; Sok, Devin; Sundling, Christopher; Donofrio, Gina; Hedestam, Gunilla B Karlsson; Bonsignori, Mattia; Haynes, Barton F; Lahey, Timothy P; Maro, Isaac; von Reyn, C Fordham; Gorny, Miroslaw K; Zolla-Pazner, Susan; Walker, Bruce D; Alter, Galit; Burton, Dennis R; Robb, Merlin L; Krebs, Shelly J; Seaman, Michael S; Bailey-Kellogg, Chris; Ackerman, Margaret E

    2018-03-08

    Major advances in donor identification, antigen probe design, and experimental methods to clone pathogen-specific antibodies have led to an exponential growth in the number of newly characterized broadly neutralizing antibodies (bnAbs) that recognize the HIV-1 envelope glycoprotein. Characterization of these bnAbs has defined new epitopes and novel modes of recognition that can result in potent neutralization of HIV-1. However, the translation of envelope recognition profiles in biophysical assays into an understanding of in vivo activity has lagged behind, and identification of subjects and mAbs with potent antiviral activity has remained reliant on empirical evaluation of neutralization potency and breadth. To begin to address this discrepancy between recombinant protein recognition and virus neutralization, we studied the fine epitope specificity of a panel of CD4-binding site (CD4bs) antibodies to define the molecular recognition features of functionally potent humoral responses targeting the HIV-1 envelope site bound by CD4. Whereas previous studies have used neutralization data and machine-learning methods to provide epitope maps, here, this approach was reversed, demonstrating that simple binding assays of fine epitope specificity can prospectively identify broadly neutralizing CD4bs-specific mAbs. Building on this result, we show that epitope mapping and prediction of neutralization breadth can also be accomplished in the assessment of polyclonal serum responses. Thus, this study identifies a set of CD4bs bnAb signature amino acid residues and demonstrates that sensitivity to mutations at signature positions is sufficient to predict neutralization breadth of polyclonal sera with a high degree of accuracy across cohorts and across clades.

  3. RCK: accurate and efficient inference of sequence- and structure-based protein-RNA binding models from RNAcompete data.

    PubMed

    Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie

    2016-06-15

    Protein-RNA interactions, which play vital roles in many processes, are mediated through both RNA sequence and structure. CLIP-based methods, which measure protein-RNA binding in vivo, suffer from experimental noise and systematic biases, whereas in vitro experiments capture a clearer signal of protein RNA-binding. Among them, RNAcompete provides binding affinities of a specific protein to more than 240 000 unstructured RNA probes in one experiment. The computational challenge is to infer RNA structure- and sequence-based binding models from these data. The state-of-the-art in sequence models, Deepbind, does not model structural preferences. RNAcontext models both sequence and structure preferences, but is outperformed by GraphProt. Unfortunately, GraphProt cannot detect structural preferences from RNAcompete data due to the unstructured nature of the data, as noted by its developers, nor can it be tractably run on the full RNAcompete dataset. We develop RCK, an efficient, scalable algorithm that infers both sequence and structure preferences based on a new k-mer based model. Remarkably, even though RNAcompete data is designed to be unstructured, RCK can still learn structural preferences from it. RCK significantly outperforms both RNAcontext and Deepbind in in vitro binding prediction for 244 RNAcompete experiments. Moreover, RCK is also faster and uses less memory, which enables scalability. While currently on par with existing methods in in vivo binding prediction on a small scale test, we demonstrate that RCK will increasingly benefit from experimentally measured RNA structure profiles as compared to computationally predicted ones. By running RCK on the entire RNAcompete dataset, we generate and provide as a resource a set of protein-RNA structure-based models on an unprecedented scale. Software and models are freely available at http://rck.csail.mit.edu/. Contact: bab@mit.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  4. Electrostatically Biased Binding of Kinesin to Microtubules

    PubMed Central

    Zheng, Wenjun; Alonso, Maria; Huber, Gary; Dlugosz, Maciej; McCammon, J. Andrew; Cross, Robert A.

    2011-01-01

    The minimum motor domain of kinesin-1 is a single head. Recent evidence suggests that such minimal motor domains generate force by a biased binding mechanism, in which they preferentially select binding sites on the microtubule that lie ahead in the progress direction of the motor. A specific molecular mechanism for biased binding has, however, so far been lacking. Here we use atomistic Brownian dynamics simulations combined with experimental mutagenesis to show that incoming kinesin heads undergo electrostatically guided diffusion-to-capture by microtubules, and that this produces directionally biased binding. Kinesin-1 heads are initially rotated by the electrostatic field so that their tubulin-binding sites face inwards, and then steered towards a plus-endwards binding site. In tethered kinesin dimers, this bias is amplified. A 3-residue sequence (RAK) in kinesin helix alpha-6 is predicted to be important for electrostatic guidance. Real-world mutagenesis of this sequence powerfully influences kinesin-driven microtubule sliding, with one mutant producing a 5-fold acceleration over wild type. We conclude that electrostatic interactions play an important role in the kinesin stepping mechanism, by biasing the diffusional association of kinesin with microtubules. PMID:22140358

  5. The type III effector HsvG of the gall-forming Pantoea agglomerans mediates expression of the host gene HSVGT.

    PubMed

    Nissan, Gal; Manulis-Sasson, Shulamit; Chalupowicz, Laura; Teper, Doron; Yeheskel, Adva; Pasmanik-Chor, Metsada; Sessa, Guido; Barash, Isaac

    2012-02-01

    The type III effector HsvG of the gall-forming Pantoea agglomerans pv. gypsophilae is a DNA-binding protein that is imported to the host nucleus and involved in host specificity. The DNA-binding region of HsvG was delineated to 266 amino acids located within a secondary structure region near the N-terminus of the protein but did not display any homology to canonical DNA-binding motifs. A binding site selection procedure was used to isolate a target gene of HsvG, named HSVGT, in Gypsophila paniculata. HSVGT is a predicted acidic protein of the DnaJ family with 244 amino acids. It harbors characteristic conserved motifs of a eukaryotic transcription factor, including a bipartite nuclear localization signal, zinc finger, and leucine zipper DNA-binding motifs. Quantitative real-time polymerase chain reaction analysis demonstrated that HSVGT transcription is specifically induced in planta within 2 h after inoculation with the wild-type P. agglomerans pv. gypsophilae compared with the hsvG mutant. Induction of HSVGT reached a peak of sixfold at 4 h after inoculation and progressively declined thereafter. Gel-shift assay demonstrated that HsvG binds to the HSVGT promoter, indicating that HSVGT is a direct target of HsvG. Our results support the hypothesis that HsvG functions as a transcription factor in gypsophila.

  6. Quantifying the Effect of DNA Packaging on Gene Expression Level

    NASA Astrophysics Data System (ADS)

    Kim, Harold

    2010-10-01

    Gene expression, the process by which the genetic code comes alive in the form of proteins, is one of the most important biological processes in living cells, and begins when transcription factors bind to specific DNA sequences in the promoter region upstream of a gene. The relationship between gene expression output and transcription factor input which is termed the gene regulation function is specific to each promoter, and predicting this gene regulation function from the locations of transcription factor binding sites is one of the challenges in biology. In eukaryotic organisms (for example, animals, plants, fungi etc), DNA is highly compacted into nucleosomes, 147-bp segments of DNA tightly wrapped around histone protein core, and therefore, the accessibility of transcription factor binding sites depends on their locations with respect to nucleosomes - sites inside nucleosomes are less accessible than those outside nucleosomes. To understand how transcription factor binding sites contribute to gene expression in a quantitative manner, we obtain gene regulation functions of promoters with various configurations of transcription factor binding sites by using fluorescent protein reporters to measure transcription factor input and gene expression output in single yeast cells. In this talk, I will show that the affinity of a transcription factor binding site inside and outside the nucleosome controls different aspects of the gene regulation function, and explain this finding based on a mass-action kinetic model that includes competition between nucleosomes and transcription factors.

  7. Computational approach to analyze isolated ssDNA aptamers against angiotensin II.

    PubMed

    Heiat, Mohammad; Najafi, Ali; Ranjbar, Reza; Latifi, Ali Mohammad; Rasaee, Mohammad Javad

    2016-07-20

    Aptamers are oligonucleotides with highly structured molecules that can bind to their targets through specific 3-D conformation. Commonly, not all the nucleotides such as primer binding fixed region and some other sequences are vital for aptamers folding and interaction. Elimination of unnecessary regions needs trustworthy prediction tools to reduce experimental efforts and errors. Here we introduced a manipulated in-silico approach to predict the 3-D structure of aptamers and their target interactions. To design an approach for computational analysis of isolated ssDNA aptamers (FLC112, FLC125 and their truncated core region including CRC112 and CRC125), their secondary and tertiary structures were modeled by Mfold and RNA composer respectively. Output PDB files were modified from RNA to DNA in the discovery studio visualizer software. Using ZDOCK server, the aptamer-target interactions were predicted. Finally, the interaction scores were compared with the experimental results. In-silico interaction scores and the experimental outcomes were in the same descending arrangement of FLC112>CRC125>CRC112>FLC125 with similar intensity. The consistent results of innovative in-silico method with experimental outputs, affirmed that the present method may be a reliable approach. Also, it showed that the exact in-silico predictions can be utilized as a credible reference to find aptameric fragments binding potency. Copyright © 2016 Elsevier B.V. All rights reserved.

  8. Onco-Regulon: an integrated database and software suite for site specific targeting of transcription factors of cancer genes

    PubMed Central

    Tomar, Navneet; Mishra, Akhilesh; Mrinal, Nirotpal; Jayaram, B.

    2016-01-01

    Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an integrated database of regulatory motifs of cancer genes clubbed with Unique Sequence-Predictor (USP) a software suite that identifies unique sequences for each of these regulatory DNA motifs at the specified position in the genome. USP works by extending a given DNA motif, in 5′→3′, 3′ →5′ or both directions by adding one nucleotide at each step, and calculates the frequency of each extended motif in the genome by Frequency Counter programme. This step is iterated till the frequency of the extended motif becomes unity in the genome. Thus, for each given motif, we get three possible unique sequences. Closest Sequence Finder program predicts off-target drug binding in the genome. Inclusion of DNA-Protein structural information further makes Onco-Regulon a highly informative repository for gene specific drug development. We believe that Onco-Regulon will help researchers to design drugs which will bind to an exclusive site in the genome with no off-target effects, theoretically. Database URL: http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm PMID:27515825
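
    The motif-extension idea described for USP can be sketched as follows. This is an illustrative toy, not the Onco-Regulon code: the function names, the single-strand string search, and the reading of "5′→3′" as extension to the right of the motif on the given strand are all assumptions. A real genome scan would also need to consider the reverse strand and far more efficient counting.

```python
def count_occurrences(genome, pattern):
    """Count occurrences of `pattern` in `genome`, allowing overlaps."""
    count = 0
    start = genome.find(pattern)
    while start != -1:
        count += 1
        start = genome.find(pattern, start + 1)
    return count


def extend_until_unique(genome, start, end, direction="both"):
    """Grow the motif genome[start:end] until it occurs exactly once.

    direction: "right" (assumed 5'->3'), "left" (assumed 3'->5'),
    or "both". One nucleotide is added per step on each chosen side.
    Returns the unique sequence, or None if the sequence boundary is
    reached before the extended motif becomes unique.
    """
    if direction not in ("left", "right", "both"):
        raise ValueError("direction must be 'left', 'right' or 'both'")
    while count_occurrences(genome, genome[start:end]) > 1:
        new_start, new_end = start, end
        if direction in ("left", "both") and start > 0:
            new_start = start - 1
        if direction in ("right", "both") and end < len(genome):
            new_end = end + 1
        if (new_start, new_end) == (start, end):
            return None  # cannot extend any further
        start, end = new_start, new_end
    return genome[start:end]


if __name__ == "__main__":
    toy_genome = "ACGTTACGTAACGG"  # made-up sequence; "ACG" occurs 3 times
    for d in ("right", "left", "both"):
        # Extend the second occurrence of the motif, at positions 5..7.
        print(d, extend_until_unique(toy_genome, 5, 8, direction=d))
```

    Running the three directions for one motif occurrence yields three candidate unique sequences, which mirrors the "three possible unique sequences" mentioned in the abstract.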

  9. Limb-Enhancer Genie: An accessible resource of accurate enhancer predictions in the developing limb

    DOE PAGES

    Monti, Remo; Barozzi, Iros; Osterwalder, Marco; ...

    2017-08-21

    Epigenomic mapping of enhancer-associated chromatin modifications facilitates the genome-wide discovery of tissue-specific enhancers in vivo. However, reliance on single chromatin marks leads to high rates of false-positive predictions. More sophisticated, integrative methods have been described, but commonly suffer from limited accessibility to the resulting predictions and reduced biological interpretability. Here we present the Limb-Enhancer Genie (LEG), a collection of highly accurate, genome-wide predictions of enhancers in the developing limb, available through a user-friendly online interface. We predict limb enhancers using a combination of > 50 published limb-specific datasets and clusters of evolutionarily conserved transcription factor binding sites, taking advantage of the patterns observed at previously in vivo validated elements. By combining different statistical models, our approach outperforms current state-of-the-art methods and provides interpretable measures of feature importance. Our results indicate that including a previously unappreciated score that quantifies tissue-specific nuclease accessibility significantly improves prediction performance. We demonstrate the utility of our approach through in vivo validation of newly predicted elements. Moreover, we describe general features that can guide the type of datasets to include when predicting tissue-specific enhancers genome-wide, while providing an accessible resource to the general biological community and facilitating the functional interpretation of genetic studies of limb malformations.

  10. Computational estimation of rainbow trout estrogen receptor binding affinities for environmental estrogens

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shyu, Conrad; Cavileer, Timothy D.; Nagler, James J.

    2011-02-01

    Environmental estrogens have been the subject of intense research due to their documented detrimental effects on the health of fish and wildlife and their potential to negatively impact humans. A complete understanding of how these compounds affect health is complicated because environmental estrogens are a structurally heterogeneous group of compounds. In this work, computational molecular dynamics simulations were utilized to predict the binding affinity of different compounds using rainbow trout (Oncorhynchus mykiss) estrogen receptors (ERs) as a model. Specifically, this study presents a comparison of the binding affinity of the natural ligand estradiol-17β to the four rainbow trout ER isoforms with that of three known environmental estrogens 17α-ethinylestradiol, bisphenol A, and raloxifene. Two additional compounds, atrazine and testosterone, that are known to be very weak or non-binders to ERs were tested. The binding affinity of these compounds to the human ERα subtype is also included for comparison. The results of this study suggest that, when compared to estradiol-17β, bisphenol A binds less strongly to all four receptors, 17α-ethinylestradiol binds more strongly, and raloxifene has a high affinity for the α subtype only. The results also show that atrazine and testosterone are weak or non-binders to the ERs. All of the results are in excellent qualitative agreement with the known in vivo estrogenicity of these compounds in the rainbow trout and other fishes. Computational estimation of binding affinities could be a valuable tool for predicting the impact of environmental estrogens in fish and other animals.

  11. The activity of CouR, a MarR family transcriptional regulator, is modulated through a novel molecular mechanism

    DOE PAGES

    Otani, Hiroshi; Stogios, Peter J.; Xu, Xiaohui; ...

    2015-09-22

    CouR, a MarR-type transcriptional repressor, regulates the cou genes, encoding p-hydroxycinnamate catabolism in the soil bacterium Rhodococcus jostii RHA1. The CouR dimer bound two molecules of the catabolite p-coumaroyl–CoA (Kd = 11 ± 1 μM). The presence of p-coumaroyl–CoA, but neither p-coumarate nor CoASH, abrogated CouR's binding to its operator DNA in vitro. The crystal structures of ligand-free CouR and its p-coumaroyl–CoA-bound form showed no significant conformational differences, in contrast to other MarR regulators. The CouR–p-coumaroyl–CoA structure revealed two ligand molecules bound to the CouR dimer with their phenolic moieties occupying equivalent hydrophobic pockets in each protomer and their CoA moieties adopting non-equivalent positions to mask the regulator's predicted DNA-binding surface. More specifically, the CoA phosphates formed salt bridges with predicted DNA-binding residues Arg36 and Arg38, changing the overall charge of the DNA-binding surface. The substitution of either arginine with alanine completely abrogated the ability of CouR to bind DNA. By contrast, the R36A/R38A double variant retained a relatively high affinity for p-coumaroyl–CoA (Kd = 89 ± 6 μM). Altogether, our data point to a novel mechanism of action in which the ligand abrogates the repressor's ability to bind DNA by steric occlusion of key DNA-binding residues and charge repulsion of the DNA backbone.

  12. OSPREY Predicts Resistance Mutations Using Positive and Negative Computational Protein Design.

    PubMed

    Ojewole, Adegoke; Lowegard, Anna; Gainza, Pablo; Reeve, Stephanie M; Georgiev, Ivelin; Anderson, Amy C; Donald, Bruce R

    2017-01-01

    Drug resistance in protein targets is an increasingly common phenomenon that reduces the efficacy of both existing and new antibiotics. However, knowledge of future resistance mutations during pre-clinical phases of drug development would enable the design of novel antibiotics that are robust against not only known resistant mutants, but also against those that have not yet been clinically observed. Computational structure-based protein design (CSPD) is a transformative field that enables the prediction of protein sequences with desired biochemical properties such as binding affinity and specificity to a target. The use of CSPD to predict previously unseen resistance mutations represents one of the frontiers of computational protein design. In a recent study (Reeve et al. Proc Natl Acad Sci U S A 112(3):749-754, 2015), we used our OSPREY (Open Source Protein REdesign for You) suite of CSPD algorithms to prospectively predict resistance mutations that arise in the active site of the dihydrofolate reductase enzyme from methicillin-resistant Staphylococcus aureus (SaDHFR) in response to selective pressure from an experimental competitive inhibitor. We demonstrated that our top predicted candidates are indeed viable resistant mutants. Since that study, we have significantly enhanced the capabilities of OSPREY with not only improved modeling of backbone flexibility, but also efficient multi-state design, fast sparse approximations, partitioned continuous rotamers for more accurate energy bounds, and a computationally efficient representation of molecular-mechanics and quantum-mechanical energy functions. Here, using SaDHFR as an example, we present a protocol for resistance prediction using the latest version of OSPREY. Specifically, we show how to use a combination of positive and negative design to predict active site escape mutations that maintain the enzyme's catalytic function but selectively ablate binding of an inhibitor.

  13. sNebula, a network-based algorithm to predict binding between human leukocyte antigens and peptides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Luo, Heng; Ye, Hao; Ng, Hui Wen

    Understanding the binding between human leukocyte antigens (HLAs) and peptides is important to understand the functioning of the immune system. Since it is time-consuming and costly to measure the binding between large numbers of HLAs and peptides, computational methods including machine learning models and network approaches have been developed to predict HLA-peptide binding. However, there are several limitations for the existing methods. We developed a network-based algorithm called sNebula to address these limitations. We curated qualitative Class I HLA-peptide binding data and demonstrated the prediction performance of sNebula on this dataset using leave-one-out cross-validation and five-fold cross-validations. Furthermore, this algorithm can predict not only peptides of different lengths and different types of HLAs, but also the peptides or HLAs that have no existing binding data. We believe sNebula is an effective method to predict HLA-peptide binding and thus improve our understanding of the immune system.

  14. sNebula, a network-based algorithm to predict binding between human leukocyte antigens and peptides

    PubMed Central

    Luo, Heng; Ye, Hao; Ng, Hui Wen; Sakkiah, Sugunadevi; Mendrick, Donna L.; Hong, Huixiao

    2016-01-01

    Understanding the binding between human leukocyte antigens (HLAs) and peptides is important to understand the functioning of the immune system. Since it is time-consuming and costly to measure the binding between large numbers of HLAs and peptides, computational methods including machine learning models and network approaches have been developed to predict HLA-peptide binding. However, there are several limitations for the existing methods. We developed a network-based algorithm called sNebula to address these limitations. We curated qualitative Class I HLA-peptide binding data and demonstrated the prediction performance of sNebula on this dataset using leave-one-out cross-validation and five-fold cross-validations. This algorithm can predict not only peptides of different lengths and different types of HLAs, but also the peptides or HLAs that have no existing binding data. We believe sNebula is an effective method to predict HLA-peptide binding and thus improve our understanding of the immune system. PMID:27558848
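    The two evaluation schemes named in this abstract, leave-one-out and five-fold cross-validation, differ only in how the data are partitioned. The following is a minimal sketch of that partitioning in plain Python. It is not the sNebula code; the function names and the toy dataset size of 10 are assumptions made for illustration.

```python
import random


def k_fold_splits(n_items, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, test in enumerate(folds):
        # Training set is every fold except the held-out one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), sorted(test)


def leave_one_out_splits(n_items):
    """Leave-one-out is k-fold with k equal to the number of items."""
    return k_fold_splits(n_items, n_items)


# Hypothetical dataset of 10 HLA-peptide pairs, referred to by index only.
five_fold = list(k_fold_splits(10, 5))
loo = list(leave_one_out_splits(10))
```

    Each item is held out exactly once in both schemes; leave-one-out simply uses as many folds as there are items, which costs more model fits but leaves the largest possible training set per fit.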

  15. sNebula, a network-based algorithm to predict binding between human leukocyte antigens and peptides

    DOE PAGES

    Luo, Heng; Ye, Hao; Ng, Hui Wen; ...

    2016-08-25

    Understanding the binding between human leukocyte antigens (HLAs) and peptides is important to understand the functioning of the immune system. Since it is time-consuming and costly to measure the binding between large numbers of HLAs and peptides, computational methods including machine learning models and network approaches have been developed to predict HLA-peptide binding. However, there are several limitations for the existing methods. We developed a network-based algorithm called sNebula to address these limitations. We curated qualitative Class I HLA-peptide binding data and demonstrated the prediction performance of sNebula on this dataset using leave-one-out cross-validation and five-fold cross-validations. Furthermore, this algorithm can predict not only peptides of different lengths and different types of HLAs, but also the peptides or HLAs that have no existing binding data. We believe sNebula is an effective method to predict HLA-peptide binding and thus improve our understanding of the immune system.

  16. Steroid ligands bind human sex hormone-binding globulin in specific orientations and produce distinct changes in protein conformation.

    PubMed

    Grishkovskaya, Irina; Avvakumov, George V; Hammond, Geoffrey L; Catalano, Maria G; Muller, Yves A

    2002-08-30

    The amino-terminal laminin G-like domain of human sex hormone-binding globulin (SHBG) contains a single high affinity steroid-binding site. Crystal structures of this domain in complex with several different steroid ligands have revealed that estradiol occupies the SHBG steroid-binding site in an opposite orientation when compared with 5 alpha-dihydrotestosterone or C19 androgen metabolites (5 alpha-androstan-3 beta,17 beta-diol and 5 alpha-androstan-3 beta,17 alpha-diol) or the synthetic progestin levonorgestrel. Substitution of specific residues within the SHBG steroid-binding site confirmed that Ser(42) plays a key role in determining high affinity interactions by hydrogen bonding to functional groups at C3 of the androstanediols and levonorgestrel and the hydroxyl at C17 of estradiol. Among residues participating in the hydrogen bond network with hydroxy groups at C17 of C19 steroids or C3 of estradiol, Asp(65) appears to be the most important. The different binding mode of estradiol is associated with a difference in the position/orientation of residues (Leu(131) and Lys(134)) in the loop segment (Leu(131)-His(136)) that covers the steroid-binding site as well as others (Leu(171)-Lys(173) and Trp(84)) on the surface of human SHBG and may provide a basis for ligand-dependent interactions between SHBG and other macromolecules. These new crystal structures have also enabled us to construct a simple space-filling model that can be used to predict the characteristics of novel SHBG ligands.

  17. iTAK: A program for genome-wide prediction and classification of plant transcription factors, transcriptional regulators and protein kinases

    USDA-ARS's Scientific Manuscript database

    Transcription factors (TFs) are proteins that regulate the expression of target genes by binding to specific elements in their regulatory regions. Transcriptional regulators (TRs) also regulate the expression of target genes; however, they operate indirectly via interaction with the basal transcript...

  18. Bacillus subtilis RapA phosphatase domain interaction with its substrate, phosphorylated Spo0F, and its inhibitor, the PhrA peptide.

    PubMed

    Diaz, Alejandra R; Core, Leighton J; Jiang, Min; Morelli, Michela; Chiang, Christina H; Szurmant, Hendrik; Perego, Marta

    2012-03-01

    Rap proteins in Bacillus subtilis regulate the phosphorylation level or the DNA-binding activity of response regulators such as Spo0F, involved in sporulation initiation, or ComA, regulating competence development. Rap proteins can be inhibited by specific peptides generated by the export-import processing pathway of the Phr proteins. Rap proteins have a modular organization comprising an amino-terminal alpha-helical domain connected to a domain formed by six tetratricopeptide repeats (TPR). In this study, the molecular basis for the specificity of the RapA phosphatase for its substrate, phosphorylated Spo0F (Spo0F∼P), and its inhibitor pentapeptide, PhrA, was analyzed in part by generating chimeric proteins with RapC, which targets the DNA-binding domain of ComA, rather than Spo0F∼P, and is inhibited by the PhrC pentapeptide. In vivo analysis of sporulation efficiency or competence-induced gene expression, as well as in vitro biochemical assays, allowed the identification of the amino-terminal 60 amino acids as sufficient to determine Rap specificity for its substrate and the central TPR3 to TPR5 (TPR3-5) repeats as providing binding specificity toward the Phr peptide inhibitor. The results allowed the prediction and testing of key residues in RapA that are essential for PhrA binding and specificity, thus demonstrating how the widespread structural fold of the TPR is highly versatile, using a common interaction mechanism for a variety of functions in eukaryotic and prokaryotic organisms.

  19. Bacillus subtilis RapA Phosphatase Domain Interaction with Its Substrate, Phosphorylated Spo0F, and Its Inhibitor, the PhrA Peptide

    PubMed Central

    Diaz, Alejandra R.; Core, Leighton J.; Jiang, Min; Morelli, Michela; Chiang, Christina H.; Szurmant, Hendrik

    2012-01-01

    Rap proteins in Bacillus subtilis regulate the phosphorylation level or the DNA-binding activity of response regulators such as Spo0F, involved in sporulation initiation, or ComA, regulating competence development. Rap proteins can be inhibited by specific peptides generated by the export-import processing pathway of the Phr proteins. Rap proteins have a modular organization comprising an amino-terminal alpha-helical domain connected to a domain formed by six tetratricopeptide repeats (TPR). In this study, the molecular basis for the specificity of the RapA phosphatase for its substrate, phosphorylated Spo0F (Spo0F∼P), and its inhibitor pentapeptide, PhrA, was analyzed in part by generating chimeric proteins with RapC, which targets the DNA-binding domain of ComA, rather than Spo0F∼P, and is inhibited by the PhrC pentapeptide. In vivo analysis of sporulation efficiency or competence-induced gene expression, as well as in vitro biochemical assays, allowed the identification of the amino-terminal 60 amino acids as sufficient to determine Rap specificity for its substrate and the central TPR3 to TPR5 (TPR3-5) repeats as providing binding specificity toward the Phr peptide inhibitor. The results allowed the prediction and testing of key residues in RapA that are essential for PhrA binding and specificity, thus demonstrating how the widespread structural fold of the TPR is highly versatile, using a common interaction mechanism for a variety of functions in eukaryotic and prokaryotic organisms. PMID:22267516

  20. RBind: computational network method to predict RNA binding sites.

    PubMed

    Wang, Kaili; Jian, Yiren; Wang, Huiwen; Zeng, Chen; Zhao, Yunjie

    2018-04-26

    Non-coding RNA molecules play essential roles by interacting with other molecules to perform various biological functions. However, it is difficult to determine RNA structures due to their flexibility. At present, the number of experimentally solved RNA-ligand and RNA-protein structures is still insufficient. Therefore, binding site prediction for non-coding RNA is required to understand their functions. Current RNA binding site prediction algorithms produce many false positive nucleotides that are distant from the binding sites. Here, we present a network approach, RBind, to predict the RNA binding sites. We benchmarked RBind in RNA-ligand and RNA-protein datasets. The average accuracy of 0.82 in RNA-ligand and 0.63 in RNA-protein testing showed that this network strategy has a reliable accuracy for binding site prediction. The codes and datasets are available at https://zhaolab.com.cn/RBind. Contact: yjzhaowh@mail.ccnu.edu.cn. Supplementary data are available at Bioinformatics online.

  1. Regulation of the mouse Treacher Collins syndrome homolog (Tcof1) promoter through differential repression of constitutive expression.

    PubMed

    Shows, Kathryn H; Shiang, Rita

    2008-11-01

    Treacher Collins syndrome is an autosomal-dominant mandibulofacial dysostosis caused by haploinsufficiency of the TCOF1 gene product treacle. Mouse Tcof1 protein is approximately 61% identical and 71% similar to treacle, and heterozygous knockout of Tcof1 causes craniofacial malformation. Tcof1 expression is high in developing neural crest, but much lower in other tissues. To investigate this dual regulation, highly conserved regions upstream of TCOF1 homologs were tested through deletion and mutation reporter assays, and conserved predicted transcription factor binding sites were assessed through chromatin binding studies. Assays were performed in mouse P19 embryonic carcinoma cells and in HEK293 cells to determine differential activation in cell types at different stages of differentiation. Binding of Cebpb, Zfp161, and Sp1 transcription factors was specific to the Tcof1 regulatory region in P19 cells. The Zfp161 binding site demonstrated P19 cell-specific repression, while the Sp1/Sp3 candidate site demonstrated HEK293 cell-specific activation. Moreover, presence of c-myb and Zfp161 transcripts was specific to P19 cells. A minimal promoter fragment from -253 to +43 bp directs constitutive expression in both cell types, and dual regulation of Tcof1 appears to be through differential repression of this minimal promoter. The CpG island at the transcription start site remains unmethylated in P19 cells, 11.5 dpc mouse embryonic tissue, and adult mouse ear, which supports constitutive activation of the Tcof1 promoter.

  2. CORECLUST: identification of the conserved CRM grammar together with prediction of gene regulation.

    PubMed

    Nikulova, Anna A; Favorov, Alexander V; Sutormin, Roman A; Makeev, Vsevolod J; Mironov, Andrey A

    2012-07-01

    Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory 'grammar', or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method CORECLUST (COnservative REgulatory CLUster STructure) that predicts CRMs based on a set of positional weight matrices. Given regulatory regions of orthologous and/or co-regulated genes, CORECLUST constructs a CRM model by revealing the conserved rules that describe the relative location of binding sites. The constructed model may be consequently used for the genome-wide prediction of similar CRMs, and thus detection of co-regulated genes, and for the investigation of the regulatory grammar of the system. Compared with related methods, CORECLUST shows better performance at identification of CRMs conferring muscle-specific gene expression in vertebrates and early-developmental CRMs in Drosophila.
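    CORECLUST takes positional weight matrices as its input description of binding sites. The sketch below shows only that underlying building block: converting a count matrix to log-odds scores and scanning a sequence for the best-scoring window. It does not implement the CORECLUST grammar model. The count matrix, the uniform background and the pseudocount are hypothetical values chosen for illustration.

```python
import math

# Hypothetical 4-position count matrix; each dict is one motif position.
COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 1, "C": 8, "G": 1, "T": 0},
    {"A": 0, "C": 1, "G": 8, "T": 1},
    {"A": 1, "C": 0, "G": 1, "T": 8},
]


def to_log_odds(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts to log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; returns (start, score) pairs."""
    width = len(pwm)
    return [
        (start, sum(pwm[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


def best_hit(sequence, pwm):
    """Return the (start, score) pair with the highest score."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])


pwm = to_log_odds(COUNTS)
hit = best_hit("TTTACGTTT", pwm)
```

    In a real application the per-window scores would be thresholded, and the positions of hits for several matrices would then feed a model of their relative arrangement.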

  3. Toward the prediction of class I and II mouse major histocompatibility complex-peptide-binding affinity: in silico bioinformatic step-by-step guide using quantitative structure-activity relationships.

    PubMed

    Hattotuwagama, Channa K; Doytchinova, Irini A; Flower, Darren R

    2007-01-01

    Quantitative structure-activity relationship (QSAR) analysis is a cornerstone of modern informatics. Predictive computational models of peptide-major histocompatibility complex (MHC)-binding affinity based on QSAR technology have now become important components of modern computational immunovaccinology. Historically, such approaches have been built around semiqualitative, classification methods, but these are now giving way to quantitative regression methods. We review three methods. Two of them, a 2D-QSAR additive-partial least squares (PLS) method and a 3D-QSAR comparative molecular similarity index analysis (CoMSIA) method, can identify the sequence dependence of peptide-binding specificity for various class I MHC alleles from the reported binding affinities (IC50) of peptide sets. The third method is an iterative self-consistent (ISC) PLS-based additive method, which is a recently developed extension to the additive method for the affinity prediction of class II peptides. The QSAR methods presented here have established themselves as immunoinformatic techniques complementary to existing methodology, useful in the quantitative prediction of binding affinity: current methods for the in silico identification of T-cell epitopes (which form the basis of many vaccines, diagnostics, and reagents) rely on the accurate computational prediction of peptide-MHC affinity. We have reviewed various human and mouse class I and class II allele models. Studied alleles comprise HLA-A*0101, HLA-A*0201, HLA-A*0202, HLA-A*0203, HLA-A*0206, HLA-A*0301, HLA-A*1101, HLA-A*3101, HLA-A*6801, HLA-A*6802, HLA-B*3501, H2-K(k), H2-K(b), H2-D(b), HLA-DRB1*0101, HLA-DRB1*0401, HLA-DRB1*0701, I-A(b), I-A(d), I-A(k), I-A(S), I-E(d), and I-E(k). In this chapter we give a step-by-step guide to building the models and assessing their predictive reliability, and we show that the resulting models represent an advance on existing methods. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method is available commercially in the SYBYL molecular modeling software package. The resulting models, which can be used for accurate T-cell epitope prediction, are freely available online at the URL http://www.jenner.ac.uk/MHCPred.

  4. Convection, diffusion and reaction in a surface-based biosensor: modeling of cooperativity and binding site competition on the surface and in the hydrogel.

    PubMed

    Lebedev, Konstantin; Mafé, Salvador; Stroeve, Pieter

    2006-04-15

    We study theoretically the transport and kinetic processes underlying the operation of a biosensor (particularly the surface plasmon sensor "Biacore") used to study the surface binding kinetics of biomolecules in solution to immobilized receptors. Unlike previous studies, we concentrate mainly on the modeling of system-specific phenomena rather than on the influence of mass transport limitations on the intrinsic kinetic rate constants determined from binding data. In the first problem, the case of two-site binding where each receptor unit on the surface can accommodate two analyte molecules on two different sites is considered. One analyte molecule always binds first to a specific site. Subsequently, the second analyte molecule can bind to the adjacent unoccupied site. In the second problem, two different analytes compete for one binding site on the same surface receptor. Finally, the third problem considers the case of positive cooperativity among bound molecules in the hydrogel using a simple mean-field approach. The transport in both the flow channel and the hydrogel phases of the biosensor is taken into account in this case (with few exceptions, most previous studies assume a simpler model in which the hydrogel is treated as a planar surface with the receptors). We consider simultaneously diffusion and convection through the flow channel together with diffusion and cooperativity binding on the surface and in the hydrogel. In each case, typical results for the concentration contours of the free and bound molecules in the flow channel and hydrogel regions are presented together with the time-dependent association/dissociation curves and reaction rates. For binding site competition, the analysis predicts overshoot phenomena.
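    The second problem in this abstract, two analytes competing for one binding site, can be illustrated with a much simpler model than the one the authors solve. The sketch below assumes a well-mixed system with constant bulk concentrations, so it ignores convection, diffusion and the hydrogel entirely. The rate values are hypothetical, chosen so that a fast, weak binder is later displaced by a slow, tight binder, which is one simple way an overshoot in surface coverage can arise.

```python
def simulate_competition(on_a, off_a, on_b, off_b, t_end, dt=0.05):
    """Explicit Euler integration of two analytes competing for one site.

    on_x  : association rate constant times bulk concentration (1/s)
    off_x : dissociation rate constant (1/s)
    Returns a list of (time, coverage_a, coverage_b) tuples.
    """
    theta_a = theta_b = 0.0
    trajectory = [(0.0, theta_a, theta_b)]
    steps = int(round(t_end / dt))
    for step in range(1, steps + 1):
        free = 1.0 - theta_a - theta_b  # fraction of unoccupied sites
        d_a = on_a * free - off_a * theta_a
        d_b = on_b * free - off_b * theta_b
        theta_a += dt * d_a
        theta_b += dt * d_b
        trajectory.append((step * dt, theta_a, theta_b))
    return trajectory


# Hypothetical parameters: A binds fast but weakly, B binds slowly but tightly.
trajectory = simulate_competition(
    on_a=1.0, off_a=0.1, on_b=0.05, off_b=0.001, t_end=2000.0
)
peak_a = max(theta_a for _, theta_a, _ in trajectory)
final_a = trajectory[-1][1]
```

    With these values the coverage of A rises quickly, peaks, and then relaxes to its equilibrium value K_A / (1 + K_A + K_B), where K_x = on_x / off_x, as B slowly takes over the sites.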

  5. Computational assessment of the cooperativity between RNA binding proteins and MicroRNAs in Transcript Decay.

    PubMed

    Jiang, Peng; Singh, Mona; Coller, Hilary A

    2013-01-01

    Transcript degradation is a widespread and important mechanism for regulating protein abundance. Two major regulators of transcript degradation are RNA Binding Proteins (RBPs) and microRNAs (miRNAs). We computationally explored whether RBPs and miRNAs cooperate to promote transcript decay. We defined five RBP motifs based on the evolutionary conservation of their recognition sites in 3'UTRs as the binding motifs for Pumilio (PUM), U1A, Fox-1, Nova, and UAUUUAU. Recognition sites for some of these RBPs tended to localize at the end of long 3'UTRs. A specific group of miRNA recognition sites were enriched within 50 nts from the RBP recognition sites for PUM and UAUUUAU. The presence of both a PUM recognition site and a recognition site for preferentially co-occurring miRNAs was associated with faster decay of the associated transcripts. For PUM and its co-occurring miRNAs, binding of the RBP to its recognition sites was predicted to release nearby miRNA recognition sites from RNA secondary structures. The mammalian miRNAs that preferentially co-occur with PUM binding sites have recognition seeds that are reverse complements to the PUM recognition motif. Their binding sites have the potential to form hairpin secondary structures with proximal PUM binding sites that would normally limit RISC accessibility, but would be more accessible to miRNAs in response to the binding of PUM. In sum, our computational analyses suggest that a specific set of RBPs and miRNAs work together to affect transcript decay, with the rescue of miRNA recognition sites via RBP binding as one possible mechanism of cooperativity.
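    The co-occurrence analysis described here rests on locating motif sites in a 3'UTR and asking which miRNA sites fall within 50 nt of an RBP site. The sketch below shows that step only, under stated assumptions: distance is measured between site start positions, the UAUUUAU motif is taken from the abstract, and the miRNA site ACAUUCC and the toy UTR are arbitrary placeholders rather than data from the study.

```python
import re


def find_sites(sequence, motif):
    """Return start positions of all occurrences, including overlapping ones."""
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [match.start() for match in pattern.finditer(sequence)]


def sites_within(rbp_sites, mirna_sites, max_distance=50):
    """Pair each RBP site with every miRNA site at most max_distance nt away."""
    return [
        (rbp, mirna)
        for rbp in rbp_sites
        for mirna in mirna_sites
        if abs(rbp - mirna) <= max_distance
    ]


# Toy 3'UTR: one RBP motif, one nearby miRNA site, one distant miRNA site.
utr = "G" * 20 + "UAUUUAU" + "G" * 30 + "ACAUUCC" + "G" * 100 + "ACAUUCC"
rbp_sites = find_sites(utr, "UAUUUAU")
mirna_sites = find_sites(utr, "ACAUUCC")  # placeholder miRNA site
pairs = sites_within(rbp_sites, mirna_sites)
```

    An enrichment test would then compare the number of such pairs against a background expectation, for example from shuffled site positions.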

  6. Interactions between the R2R3-MYB Transcription Factor, AtMYB61, and Target DNA Binding Sites

    PubMed Central

    Prouse, Michael B.; Campbell, Malcolm M.

    2013-01-01

    Despite the prominent roles played by R2R3-MYB transcription factors in the regulation of plant gene expression, little is known about the details of how these proteins interact with their DNA targets. For example, while Arabidopsis thaliana R2R3-MYB protein AtMYB61 is known to alter transcript abundance of a specific set of target genes, little is known about the specific DNA sequences to which AtMYB61 binds. To address this gap in knowledge, DNA sequences bound by AtMYB61 were identified using cyclic amplification and selection of targets (CASTing). The DNA targets identified using this approach corresponded to AC elements, sequences enriched in adenosine and cytosine nucleotides. The preferred target sequence that bound with the greatest affinity to AtMYB61 recombinant protein was ACCTAC, the AC-I element. Mutational analyses based on the AC-I element showed that ACC nucleotides in the AC-I element served as the core recognition motif, critical for AtMYB61 binding. Molecular modelling predicted interactions between AtMYB61 amino acid residues and corresponding nucleotides in the DNA targets. The affinity between AtMYB61 and specific target DNA sequences did not correlate with AtMYB61-driven transcriptional activation with each of the target sequences. CASTing-selected motifs were found in the regulatory regions of genes previously shown to be regulated by AtMYB61. Taken together, these findings are consistent with the hypothesis that AtMYB61 regulates transcription from specific cis-acting AC elements in vivo. The results shed light on the specifics of DNA binding by an important family of plant-specific transcriptional regulators. PMID:23741471
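    The observation that CASTing-selected motifs occur in the regulatory regions of known target genes implies a motif search over promoter sequence. The sketch below scans both strands of a DNA string for the AC-I element ACCTAC named in the abstract. The promoter fragment is a made-up example, and the exact-match search is an assumption made for illustration, not the authors' procedure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an uppercase DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def find_element(sequence, element="ACCTAC"):
    """Return (start, strand) for exact matches of the element on either strand."""
    hits = []
    reverse = reverse_complement(element)
    width = len(element)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if window == element:
            hits.append((start, "+"))
        if window == reverse:
            hits.append((start, "-"))
    return hits


# Made-up promoter fragment with one site on each strand.
promoter = "GGACCTACGGGTAGGTCC"
hits = find_element(promoter)
```

    Positions are 0-based and refer to the forward strand; a minus-strand hit is reported where its reverse complement appears in the given sequence.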

  7. HLA Class I Binding 9mer Peptides from Influenza A Virus Induce CD4+ T Cell Responses

    PubMed Central

    Wang, Mingjun; Larsen, Mette V.; Nielsen, Morten; Harndahl, Mikkel; Justesen, Sune; Dziegiel, Morten H.; Buus, Søren; Tang, Sheila T.; Lund, Ole; Claesson, Mogens H.

    2010-01-01

    Background: Identification of human leukocyte antigen class I (HLA-I) restricted cytotoxic T cell (CTL) epitopes from influenza virus is of importance for the development of new effective peptide-based vaccines. Methodology/Principal Findings: In the present work, bioinformatics was used to predict 9mer peptides derived from available influenza A viral proteins with binding affinity for at least one of the 12 HLA-I supertypes. The predicted peptides were then selected in a way that ensured maximal coverage of the available influenza A strains. One hundred and thirty one peptides were synthesized and their binding affinities for the HLA-I supertypes were measured in a biochemical assay. Influenza-specific T cell responses towards the peptides were quantified using IFNγ ELISPOT assays with peripheral blood mononuclear cells (PBMC) from adult healthy HLA-I typed donors as responder cells. Of the 131 peptides, 21 were found to induce T cell responses in 19 donors. In the ELISPOT assay, five peptides induced responses that could be totally blocked by the pan-specific anti-HLA-I antibody W6/32, whereas 15 peptides induced responses that could be completely blocked in the presence of the pan-specific anti-HLA class II (HLA-II) antibody IVA12. Blocking of HLA-II subtype reactivity revealed that 8 and 6 peptide responses were blocked by anti-HLA-DR and -DP antibodies, respectively. Peptide reactivity of PBMC depleted of CD4+ or CD8+ T cells prior to the ELISPOT culture revealed that effectors are either CD4+ (the majority of reactivities) or CD8+ T cells, never a mixture of these subsets. Three of the peptides recognized by CD4+ T cells showed binding to recombinant DRA1*0101/DRB1*0401 or DRA1*0101/DRB5*0101 molecules in a recently developed biochemical assay. Conclusions/Significance: HLA-I binding 9mer influenza virus-derived peptides induce in many cases CD4+ T cell responses restricted by HLA-II molecules. PMID:20479886

  8. An RNA-Binding Multimer Specifies Nematode Sperm Fate.

    PubMed

    Aoki, Scott T; Porter, Douglas F; Prasad, Aman; Wickens, Marvin; Bingman, Craig A; Kimble, Judith

    2018-06-26

    FOG-3 is a master regulator of sperm fate in Caenorhabditis elegans and homologous to Tob/BTG proteins, which in mammals are monomeric adaptors that recruit enzymes to RNA binding proteins. Here, we determine the FOG-3 crystal structure and in vitro demonstrate that FOG-3 forms dimers that can multimerize. The FOG-3 multimeric structure has a basic surface potential, suggestive of binding nucleic acid. Consistent with that prediction, FOG-3 binds directly to nearly 1,000 RNAs in nematode spermatogenic germ cells. Most binding is to the 3' UTR, and most targets (94%) are oogenic mRNAs, even though assayed in spermatogenic cells. When tethered to a reporter mRNA, FOG-3 represses its expression. Together these findings elucidate the molecular mechanism of sperm fate specification and reveal the evolution of a protein from monomeric to multimeric form with acquisition of a distinct mode of mRNA repression.

  9. Elimination of a ligand gating site generates a supersensitive olfactory receptor.

    PubMed

    Sharma, Kanika; Ahuja, Gaurav; Hussain, Ashiq; Balfanz, Sabine; Baumann, Arnd; Korsching, Sigrun I

    2016-06-21

    Olfaction poses one of the most complex ligand-receptor matching problems in biology due to the unparalleled multitude of odor molecules facing a large number of cognate olfactory receptors. We have recently deorphanized an olfactory receptor, TAAR13c, as a specific receptor for the death-associated odor cadaverine. Here we have modeled the cadaverine/TAAR13c interaction, exchanged predicted binding residues by site-directed mutagenesis, and measured the activity of the mutant receptors. Unexpectedly we observed a binding site for cadaverine at the external surface of the receptor, in addition to an internal binding site, whose mutation resulted in complete loss of activity. In stark contrast, elimination of the external binding site generated supersensitive receptors. Modeling suggests this site to act as a gate, limiting access of the ligand to the internal binding site and thereby downregulating the affinity of the native receptor. This constitutes a novel mechanism to fine-tune physiological sensitivity to socially relevant odors.

  10. Elimination of a ligand gating site generates a supersensitive olfactory receptor

    PubMed Central

    Sharma, Kanika; Ahuja, Gaurav; Hussain, Ashiq; Balfanz, Sabine; Baumann, Arnd; Korsching, Sigrun I.

    2016-01-01

    Olfaction poses one of the most complex ligand-receptor matching problems in biology due to the unparalleled multitude of odor molecules facing a large number of cognate olfactory receptors. We have recently deorphanized an olfactory receptor, TAAR13c, as a specific receptor for the death-associated odor cadaverine. Here we have modeled the cadaverine/TAAR13c interaction, exchanged predicted binding residues by site-directed mutagenesis, and measured the activity of the mutant receptors. Unexpectedly we observed a binding site for cadaverine at the external surface of the receptor, in addition to an internal binding site, whose mutation resulted in complete loss of activity. In stark contrast, elimination of the external binding site generated supersensitive receptors. Modeling suggests this site to act as a gate, limiting access of the ligand to the internal binding site and thereby downregulating the affinity of the native receptor. This constitutes a novel mechanism to fine-tune physiological sensitivity to socially relevant odors. PMID:27323929

  11. Molecular tweezers modulate 14-3-3 protein-protein interactions

    NASA Astrophysics Data System (ADS)

    Bier, David; Rose, Rolf; Bravo-Rodriguez, Kenny; Bartel, Maria; Ramirez-Anguita, Juan Manuel; Dutt, Som; Wilch, Constanze; Klärner, Frank-Gerrit; Sanchez-Garcia, Elsa; Schrader, Thomas; Ottmann, Christian

    2013-03-01

    Supramolecular chemistry has recently emerged as a promising way to modulate protein functions, but devising molecules that will interact with a protein in the desired manner is difficult as many competing interactions exist in a biological environment (with solvents, salts or different sites for the target biomolecule). We now show that lysine-specific molecular tweezers bind to a 14-3-3 adapter protein and modulate its interaction with partner proteins. The tweezers inhibit binding between the 14-3-3 protein and two partner proteins—a phosphorylated (C-Raf) protein and an unphosphorylated one (ExoS)—in a concentration-dependent manner. Protein crystallography shows that this effect arises from the binding of the tweezers to a single surface-exposed lysine (Lys214) of the 14-3-3 protein in the proximity of its central channel, which normally binds the partner proteins. A combination of structural analysis and computer simulations provides rules for the tweezers' binding preferences, thus allowing us to predict their influence on this type of protein-protein interactions.

  12. Incorporating evolution of transcription factor binding sites into annotated alignments.

    PubMed

    Bais, Abha S; Grossmann, Stefen; Vingron, Martin

    2007-08-01

    Identifying transcription factor binding sites (TFBSs) is essential to elucidate putative regulatory mechanisms. A common strategy is to combine cross-species conservation with single sequence TFBS annotation to yield "conserved TFBSs". Most current methods in this field adopt a multi-step approach that segregates the two aspects. Again, it is widely accepted that the evolutionary dynamics of binding sites differ from those of the surrounding sequence. Hence, it is desirable to have an approach that explicitly takes this factor into account. Although a plethora of approaches have been proposed for the prediction of conserved TFBSs, very few explicitly model TFBS evolutionary properties, while additionally being multi-step. Recently, we introduced a novel approach to simultaneously align and annotate conserved TFBSs in a pair of sequences. Building upon the standard Smith-Waterman algorithm for local alignments, SimAnn introduces additional states for profiles to output extended alignments or annotated alignments. That is, alignments with parts annotated as gaplessly aligned TFBSs (pair-profile hits) are generated. Moreover, the pair-profile related parameters are derived in a sound statistical framework. In this article, we extend this approach to explicitly incorporate evolution of binding sites in the SimAnn framework. We demonstrate the extension in the theoretical derivations through two position-specific evolutionary models, previously used for modelling TFBS evolution. In a simulated setting, we provide a proof of concept that the approach works given the underlying assumptions, as compared to the original work. Finally, using a real dataset of experimentally verified binding sites in human-mouse sequence pairs, we compare the new approach (eSimAnn) to an existing multi-step tool that also considers TFBS evolution. Although it is widely accepted that binding sites evolve differently from the surrounding sequences, most comparative TFBS identification methods do not explicitly consider this. Additionally, prediction of conserved binding sites is carried out in a multi-step approach that segregates alignment from TFBS annotation. In this paper, we demonstrate how the simultaneous alignment and annotation approach of SimAnn can be further extended to incorporate TFBS evolutionary relationships. We study how alignments and binding site predictions interplay at varying evolutionary distances and for various profile qualities.
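    SimAnn is described as building on the standard Smith-Waterman algorithm. For orientation, the sketch below computes the plain Smith-Waterman local alignment score with a linear gap penalty. It contains none of the additional profile states that SimAnn or eSimAnn introduce, and the scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,                                   # start a new alignment
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
            best = max(best, score[i][j])
    return best


shared_core = smith_waterman_score("TTTACGTAA", "GGACGTCC")
unrelated = smith_waterman_score("AAAA", "CCCC")
```

    The zero floor in the recurrence is what makes the alignment local: a poorly matching prefix is discarded rather than carried forward. An annotated-alignment method would add further alternatives to the same maximum.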

  13. GBshape: a genome browser database for DNA shape annotations

    PubMed Central

    Chiu, Tsu-Pei; Yang, Lin; Zhou, Tianyin; Main, Bradley J.; Parker, Stephen C.J.; Nuzhdin, Sergey V.; Tullius, Thomas D.; Rohs, Remo

    2015-01-01

    Many regulatory mechanisms require a high degree of specificity in protein-DNA binding. Nucleotide sequence does not provide an answer to the question of why a protein binds only to a small subset of the many putative binding sites in the genome that share the same core motif. Whereas higher-order effects, such as chromatin accessibility, cooperativity and cofactors, have been described, DNA shape recently gained attention as another feature that fine-tunes the DNA binding specificities of some transcription factor families. Our Genome Browser for DNA shape annotations (GBshape; freely available at http://rohslab.cmb.usc.edu/GBshape/) provides minor groove width, propeller twist, roll, helix twist and hydroxyl radical cleavage predictions for the entire genomes of 94 organisms. Additional genomes can easily be added using the GBshape framework. GBshape can be used to visualize DNA shape annotations qualitatively in a genome browser track format, and to download quantitative values of DNA shape features as a function of genomic position at nucleotide resolution. As biological applications, we illustrate the periodicity of DNA shape features that are present in nucleosome-occupied sequences from human, fly and worm, and we demonstrate structural similarities between transcription start sites in the genomes of four Drosophila species. PMID:25326329

  14. Protein-protein interactions between SWCNT/chitosan/EGF and EGF receptor: a model of drug delivery system.

    PubMed

    Rungnim, Chompoonut; Rungrotmongkol, Thanyada; Kungwan, Nawee; Hannongbua, Supot

    2016-09-01

    Epidermal growth factor (EGF) was used as the targeting ligand to enhance the specificity of a cancer drug delivery system (DDS) via its specific interaction with the EGF receptor (EGFR) that is overexpressed on the surface of some cancer cells. To investigate the intermolecular interaction and binding affinity between the EGF-conjugated DDS and the EGFR, 50 ns molecular dynamics simulations were performed on the complex of tethered EGFR and EGF linked to single-wall carbon nanotube (SWCNT) through a biopolymer chitosan wrapping the tube outer surface (EGFR·EGF-CS-SWCNT-Drug complex), and compared to the EGFR·EGF complex and free EGFR. The binding pattern of the EGF-CS-SWCNT-Drug complex to the EGFR was broadly comparable to that for EGF, but the binding affinity of the EGF-CS-SWCNT-Drug complex was predicted to be somewhat better than that for EGF alone. Additionally, the chitosan chain could prevent undesired interactions of SWCNT at the binding pocket region. Therefore, EGF connected to SWCNT via a chitosan linker appears to be a good formulation for developing a smart DDS to serve as part of an alternative cancer therapy.

  15. Evaluation of Ochratoxin Recognition by Peptides Using Explicit Solvent Molecular Dynamics

    PubMed Central

    Thyparambil, Aby A.; Bazin, Ingrid; Guiseppi-Elie, Anthony

    2017-01-01

    Biosensing platforms based on peptide recognition provide a cost-effective and stable alternative to antibody-based capture and discrimination of ochratoxin-A (OTA) vs. ochratoxin-B (OTB) in monitoring bioassays. Attempts to engineer peptides with improved recognition efficacy require thorough structural and thermodynamic characterization of the binding-competent conformations. Classical molecular dynamics (MD) approaches alone do not provide a thorough assessment of a peptide’s recognition efficacy. In this study, in-solution binding properties of four different peptides, a hexamer (SNLHPK), an octamer (CSIVEDGK), NFO4 (VYMNRKYYKCCK), and a 13-mer (GPAGIDGPAGIRC), which were previously generated for OTA-specific recognition, were evaluated using an advanced MD simulation approach involving accelerated configurational search and predictive modeling. Peptide configurations relevant to ochratoxin binding were initially generated using biased exchange metadynamics and the dynamic properties associated with the in-solution peptide–ochratoxin binding were derived from Markov State Models. Among the various peptides, NFO4 shows superior in-solution OTA sensing and also shows superior selectivity for OTA vs. OTB due to the lower penalty associated with solvating its bound complex. Advanced MD approaches provide structural and energetic insights critical to the hapten-specific recognition to aid the engineering of peptides with better sensing efficacies. PMID:28505090

  16. Identifying mRNA sequence elements for target recognition by human Argonaute proteins

    PubMed Central

    Li, Jingjing; Kim, TaeHyung; Nutiu, Razvan; Ray, Debashish; Hughes, Timothy R.; Zhang, Zhaolei

    2014-01-01

    It is commonly known that mammalian microRNAs (miRNAs) guide the RNA-induced silencing complex (RISC) to target mRNAs through the seed-pairing rule. However, recent experiments that coimmunoprecipitate the Argonaute proteins (AGOs), the central catalytic component of RISC, have consistently revealed extensive AGO-associated mRNAs that lack seed complementarity with miRNAs. We herein test the hypothesis that AGO has its own binding preference within target mRNAs, independent of guide miRNAs. By systematically analyzing the data from in vivo cross-linking experiments with human AGOs, we have identified a structurally accessible and evolutionarily conserved region (∼10 nucleotides in length) that alone can accurately predict AGO–mRNA associations, independent of the presence of miRNA binding sites. Within this region, we further identified an enriched motif that was replicable on independent AGO-immunoprecipitation data sets. We used RNAcompete to enumerate the RNA-binding preference of human AGO2 to all possible 7-mer RNA sequences and validated the AGO motif in vitro. These findings reveal a novel function of AGOs as sequence-specific RNA-binding proteins, which may aid miRNAs in recognizing their targets with high specificity. PMID:24663241

  17. mRNA stability in mammalian cells.

    PubMed Central

    Ross, J

    1995-01-01

    This review concerns how cytoplasmic mRNA half-lives are regulated and how mRNA decay rates influence gene expression. mRNA stability influences gene expression in virtually all organisms, from bacteria to mammals, and the abundance of a particular mRNA can fluctuate manyfold following a change in the mRNA half-life, without any change in transcription. The processes that regulate mRNA half-lives can, in turn, affect how cells grow, differentiate, and respond to their environment. Three major questions are addressed. Which sequences in mRNAs determine their half-lives? Which enzymes degrade mRNAs? Which (trans-acting) factors regulate mRNA stability, and how do they function? The following specific topics are discussed: techniques for measuring eukaryotic mRNA stability and for calculating decay constants, mRNA decay pathways, mRNases, proteins that bind to sequences shared among many mRNAs [like poly(A)- and AU-rich-binding proteins] and proteins that bind to specific mRNAs (like the c-myc coding-region determinant-binding protein), how environmental factors like hormones and growth factors affect mRNA stability, and how translation and mRNA stability are linked. Some perspectives and predictions for future research directions are summarized at the end. PMID:7565413
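
    The relationship between half-life, decay constant and abundance mentioned in the abstract above can be illustrated with a minimal sketch. It assumes simple first-order decay and a constant synthesis rate, which is a textbook simplification and not necessarily the treatment used in the review; the function names are hypothetical.

```python
import math


def decay_constant(half_life):
    """First-order decay constant k = ln(2) / t_half (in 1 / time unit of half_life)."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return math.log(2) / half_life


def fraction_remaining(t, half_life):
    """Fraction of an mRNA pool left after time t, assuming first-order decay."""
    return math.exp(-decay_constant(half_life) * t)


def steady_state_level(synthesis_rate, half_life):
    """Steady-state abundance for constant synthesis and first-order decay: rate / k."""
    return synthesis_rate / decay_constant(half_life)


if __name__ == "__main__":
    # Assumed illustrative numbers: same transcription rate, half-life raised from 1 h to 4 h.
    before = steady_state_level(synthesis_rate=10.0, half_life=1.0)
    after = steady_state_level(synthesis_rate=10.0, half_life=4.0)
    print(f"fold change in steady-state abundance: {after / before:.1f}")
```

    Under these assumptions the steady-state level scales linearly with the half-life, which is one way to see how abundance can change manyfold without any change in transcription.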

  18. A report on emergent uranyl binding phenomena by an amidoxime phosphonic acid co-polymer

    DOE PAGES

    Abney, C. W.; Das, S.; Mayes, R. T.; ...

    2016-08-01

    Development of technology to harvest the uranium dissolved in seawater would enable access to vast quantities of this critical metal for nuclear power generation. Amidoxime polymers are the most promising platform for achieving this separation, yet design of advanced adsorbents is hindered by uncertainty regarding the uranium binding mode. In this work we use XAFS to investigate the uranium coordination environment in an amidoxime-phosphonic acid copolymer adsorbent. In contrast to the binding mode predicted computationally and from small molecule studies, a cooperative chelating model is favoured, attributable to emergent behavior resulting from inclusion of amidoxime in a polymer. Samples exposed to seawater also display a feature consistent with a 2-oxo-bridged transition metal, suggesting formation of an in situ specific binding site. As a result, these findings challenge long held assumptions and provide new opportunities for the design of advanced adsorbent materials.

  19. Performance of HADDOCK and a simple contact-based protein-ligand binding affinity predictor in the D3R Grand Challenge 2

    NASA Astrophysics Data System (ADS)

    Kurkcuoglu, Zeynep; Koukos, Panagiotis I.; Citro, Nevia; Trellet, Mikael E.; Rodrigues, J. P. G. L. M.; Moreira, Irina S.; Roel-Touris, Jorge; Melquiond, Adrien S. J.; Geng, Cunliang; Schaarschmidt, Jörg; Xue, Li C.; Vangone, Anna; Bonvin, A. M. J. J.

    2018-01-01

    We present the performance of HADDOCK, our information-driven docking software, in the second edition of the D3R Grand Challenge. In this blind experiment, participants were requested to predict the structures and binding affinities of complexes between the Farnesoid X nuclear receptor and 102 different ligands. The models obtained in Stage1 with HADDOCK and ligand-specific protocol show an average ligand RMSD of 5.1 Å from the crystal structure. Only 6/35 targets were within 2.5 Å RMSD from the reference, which prompted us to investigate the limiting factors and revise our protocol for Stage2. The choice of the receptor conformation appeared to have the strongest influence on the results. Our Stage2 models were of higher quality (13 out of 35 were within 2.5 Å), with an average RMSD of 4.1 Å. The docking protocol was applied to all 102 ligands to generate poses for binding affinity prediction. We developed a modified version of our contact-based binding affinity predictor PRODIGY, using the number of interatomic contacts classified by their type and the intermolecular electrostatic energy. This simple structure-based binding affinity predictor shows a Kendall's Tau correlation of 0.37 in ranking the ligands (7th best out of 77 methods, 5th/25 groups). Those results were obtained from the average prediction over the top10 poses, irrespective of their similarity/correctness, underscoring the robustness of our simple predictor. This results in an enrichment factor of 2.5 compared to a random predictor for ranking ligands within the top 25%, making it a promising approach to identify lead compounds in virtual screening.
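
    The two ranking metrics quoted in the abstract above, Kendall's Tau and the enrichment factor, can be sketched as follows. This is an illustrative implementation only: it uses the tau-a variant (ties count as neither concordant nor discordant) and one common definition of the enrichment factor (hit rate in the top fraction divided by the overall hit rate). Which variants the challenge organizers used is not stated here, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


def enrichment_factor(predicted, is_active, top_fraction=0.25):
    """Hit rate among the top-ranked fraction divided by the overall hit rate.

    Higher predicted values are assumed to mean stronger predicted binding.
    """
    n = len(predicted)
    total_hits = sum(is_active)
    if n == 0 or total_hits == 0:
        raise ValueError("need at least one compound and one active")
    k = max(1, int(round(n * top_fraction)))
    order = sorted(range(n), key=lambda i: predicted[i], reverse=True)
    top_hits = sum(is_active[i] for i in order[:k])
    return (top_hits / k) / (total_hits / n)


if __name__ == "__main__":
    # Toy data, invented for illustration.
    scores = [0.9, 0.8, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
    actives = [1, 1, 0, 0, 0, 0, 0, 0]
    print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
    print(enrichment_factor(scores, actives, top_fraction=0.25))
```

    An enrichment factor of 1 corresponds to a random ranking under this definition, so values above 1 indicate that actives are concentrated near the top of the list.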

  20. Discovering rules for protein-ligand specificity using support vector inductive logic programming.

    PubMed

    Kelley, Lawrence A; Shrimpton, Paul J; Muggleton, Stephen H; Sternberg, Michael J E

    2009-09-01

    Structural genomics initiatives are rapidly generating vast numbers of protein structures. Comparative modelling is also capable of producing accurate structural models for many protein sequences. However, for many of the known structures, functions are not yet determined, and in many modelling tasks, an accurate structural model does not necessarily tell us about function. Thus, there is a pressing need for high-throughput methods for determining function from structure. The spatial arrangement of key amino acids in a folded protein, on the surface or buried in clefts, is often the determinant of its biological function. A central aim of molecular biology is to understand the relationship between such substructures or surfaces and biological function, leading both to function prediction and to function design. We present a new general method for discovering the features of binding pockets that confer specificity for particular ligands. Using a recently developed machine-learning technique which couples the rule-discovery approach of inductive logic programming with the statistical learning power of support vector machines, we are able to discriminate, with high precision (90%) and recall (86%), between pockets that bind FAD and those that bind NAD on a large benchmark set given only the geometry and composition of the backbone of the binding pocket without the use of docking. In addition, we learn rules governing this specificity which can feed into protein functional design protocols. An analysis of the rules found suggests that key features of the binding pocket may be tied to conformational freedom in the ligand. The representation is sufficiently general to be applicable to any discriminatory binding problem. All programs and data sets are freely available to non-commercial users at http://www.sbg.bio.ic.ac.uk/svilp_ligand/.

  1. The Protein-DNA Interface database

    PubMed Central

    2010-01-01

    The Protein-DNA Interface database (PDIdb) is a repository containing relevant structural information of Protein-DNA complexes solved by X-ray crystallography and available at the Protein Data Bank. The database includes a simple functional classification of the protein-DNA complexes that consists of three hierarchical levels: Class, Type and Subtype. This classification has been defined and manually curated by humans based on the information gathered from several sources that include PDB, PubMed, CATH, SCOP and COPS. The current version of the database contains only structures with resolution of 2.5 Å or higher, accounting for a total of 922 entries. The major aim of this database is to contribute to the understanding of the main rules that underlie the molecular recognition process between DNA and proteins. To this end, the database is focused on each specific atomic interface rather than on the separated binding partners. Therefore, each entry in this database consists of a single and independent protein-DNA interface. We hope that PDIdb will be useful to many researchers working in fields such as the prediction of transcription factor binding sites in DNA, the study of specificity determinants that mediate enzyme recognition events, engineering and design of new DNA binding proteins with distinct binding specificity and affinity, among others. Finally, due to its friendly and easy-to-use web interface, we hope that PDIdb will also serve educational and teaching purposes. PMID:20482798

  2. The Protein-DNA Interface database.

    PubMed

    Norambuena, Tomás; Melo, Francisco

    2010-05-18

    The Protein-DNA Interface database (PDIdb) is a repository containing relevant structural information of Protein-DNA complexes solved by X-ray crystallography and available at the Protein Data Bank. The database includes a simple functional classification of the protein-DNA complexes that consists of three hierarchical levels: Class, Type and Subtype. This classification has been defined and manually curated by humans based on the information gathered from several sources that include PDB, PubMed, CATH, SCOP and COPS. The current version of the database contains only structures with resolution of 2.5 Å or higher, accounting for a total of 922 entries. The major aim of this database is to contribute to the understanding of the main rules that underlie the molecular recognition process between DNA and proteins. To this end, the database is focused on each specific atomic interface rather than on the separated binding partners. Therefore, each entry in this database consists of a single and independent protein-DNA interface. We hope that PDIdb will be useful to many researchers working in fields such as the prediction of transcription factor binding sites in DNA, the study of specificity determinants that mediate enzyme recognition events, engineering and design of new DNA binding proteins with distinct binding specificity and affinity, among others. Finally, due to its friendly and easy-to-use web interface, we hope that PDIdb will also serve educational and teaching purposes.

  3. NetMHC-3.0: accurate web accessible predictions of human, mouse and monkey MHC class I affinities for peptides of length 8-11.

    PubMed

    Lundegaard, Claus; Lamberth, Kasper; Harndahl, Mikkel; Buus, Søren; Lund, Ole; Nielsen, Morten

    2008-07-01

    NetMHC-3.0 is trained on a large number of quantitative peptide data using both affinity data from the Immune Epitope Database and Analysis Resource (IEDB) and elution data from SYFPEITHI. The method generates high-accuracy predictions of major histocompatibility complex (MHC): peptide binding. The predictions are based on artificial neural networks trained on data from 55 MHC alleles (43 Human and 12 non-human), and position-specific scoring matrices (PSSMs) for an additional 67 HLA alleles. As only the MHC class I prediction server is available, predictions are possible for peptides of length 8-11 for all 122 alleles. Artificial neural network predictions are given as actual IC(50) values whereas PSSM predictions are given as log-odds likelihood scores. The output is optionally available as download for easy post-processing. The training method underlying the server is the best available, and has been used to predict possible MHC-binding peptides in a series of pathogen viral proteomes including SARS, Influenza and HIV, resulting in an average of 75-80% confirmed MHC binders. Here, the performance is further validated and benchmarked using a large set of newly published affinity data, non-redundant to the training set. The server is free to use and available at: http://www.cbs.dtu.dk/services/NetMHC.
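
    To illustrate what a PSSM log-odds score of the kind mentioned above looks like, the following minimal sketch builds a position-specific scoring matrix from a handful of aligned peptides and scores a query peptide. It is a generic construction (uniform background, additive pseudocounts) and not the NetMHC training procedure; the peptides are invented placeholders, not known binders, and all names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """Return one dict per position: residue -> log2(frequency / uniform background)."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)  # missing residues count as 0
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm


def score_peptide(pssm, peptide):
    """Sum of per-position log-odds scores; higher means closer to the training motif."""
    if len(peptide) != len(pssm):
        raise ValueError("peptide length must match the matrix width")
    return sum(column[aa] for column, aa in zip(pssm, peptide))


if __name__ == "__main__":
    # Invented 9-mers used only to exercise the code.
    training = ["ALAKAAAAL", "ALAKAAAAV", "GLAKAAAAL"]
    matrix = build_pssm(training)
    print(score_peptide(matrix, "ALAKAAAAL"))
    print(score_peptide(matrix, "WWWWWWWWW"))
```

    A peptide resembling the training set scores above zero, while one composed of residues never seen at any position scores below zero under this scheme.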

  4. Many human accelerated regions are developmental enhancers

    PubMed Central

    Capra, John A.; Erwin, Genevieve D.; McKinsey, Gabriel; Rubenstein, John L. R.; Pollard, Katherine S.

    2013-01-01

    The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology. PMID:24218637

  5. TEMPLE: analysing population genetic variation at transcription factor binding sites.

    PubMed

    Litovchenko, Maria; Laurent, Stefan

    2016-11-01

    Genetic variation occurring at the level of regulatory sequences can affect phenotypes and fitness in natural populations. This variation can be analysed in a population genetic framework to study how genetic drift and selection affect the evolution of these functional elements. However, doing this requires a good understanding of the location and nature of regulatory regions, which has long been a major hurdle. The current proliferation of genomewide profiling experiments of transcription factor occupancies greatly improves our ability to identify genomic regions involved in specific DNA-protein interactions. Although software exists for predicting transcription factor binding sites (TFBS), and the effects of genetic variants on TFBS specificity, there are no tools currently available for inferring this information jointly with the genetic variation at TFBS in natural populations. We developed the software Transcription Elements Mapping at the Population LEvel (TEMPLE), which predicts TFBS, evaluates the effects of genetic variants on TFBS specificity and summarizes the genetic variation occurring at TFBS in intraspecific sequence alignments. We demonstrate that TEMPLE's TFBS prediction algorithm gives identical results to PATSER, a software distribution commonly used in the field. We also illustrate the unique features of TEMPLE by analysing TFBS diversity for the TF Senseless (SENS) in one ancestral and one cosmopolitan population of the fruit fly Drosophila melanogaster. TEMPLE can be used to localize TFBS that are characterized by strong genetic differentiation across natural populations. This will be particularly useful for studies aiming to identify adaptive mutations. TEMPLE is a java-based cross-platform software that easily maps the genetic diversity at predicted TFBSs using a graphical interface, or from the Unix command line. © 2016 John Wiley & Sons Ltd.
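
    The general idea of predicting a TFBS with a weight matrix and then asking how a variant changes the score can be sketched as below. This is a generic forward-strand scan with a made-up three-column matrix; it does not reproduce the TEMPLE or PATSER algorithms, and the matrix values, sequence and function names are assumptions chosen for illustration.

```python
def scan_best_site(seq, pwm):
    """Return (best_score, start) of the highest-scoring window on the forward strand."""
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the matrix")
    best = None
    for start in range(len(seq) - width + 1):
        score = sum(pwm[i][seq[start + i]] for i in range(width))
        if best is None or score > best[0]:
            best = (score, start)
    return best


def variant_effect(ref_seq, pos, alt_base, pwm):
    """Change in the best site score when ref_seq[pos] is replaced by alt_base."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return scan_best_site(alt_seq, pwm)[0] - scan_best_site(ref_seq, pwm)[0]


if __name__ == "__main__":
    # Hypothetical log-odds matrix that favours the motif "TGA".
    toy_pwm = [
        {"A": -1, "C": -1, "G": -1, "T": 2},
        {"A": -1, "C": -1, "G": 2, "T": -1},
        {"A": 2, "C": -1, "G": -1, "T": -1},
    ]
    reference = "CCTGACC"
    print(scan_best_site(reference, toy_pwm))
    print(variant_effect(reference, pos=3, alt_base="C", pwm=toy_pwm))
```

    A negative effect means the variant weakens the best predicted site; applying this per polymorphic position across an alignment is one simple way to summarize variation at predicted sites.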

  6. Development of a strategy and computational application to select candidate protein analogues with reduced HLA binding and immunogenicity.

    PubMed

    Dhanda, Sandeep Kumar; Grifoni, Alba; Pham, John; Vaughan, Kerrie; Sidney, John; Peters, Bjoern; Sette, Alessandro

    2018-01-01

    Unwanted immune responses against protein therapeutics can reduce efficacy or lead to adverse reactions. T-cell responses are key in the development of such responses, and are directed against immunodominant regions within the protein sequence, often associated with binding to several allelic variants of HLA class II molecules (promiscuous binders). Herein, we report a novel computational strategy to predict 'de-immunized' peptides, based on previous studies of erythropoietin protein immunogenicity. This algorithm (or method) first predicts promiscuous binding regions within the target protein sequence and then identifies residue substitutions predicted to reduce HLA binding. Further, this method anticipates the effect of any given substitution on flanking peptides, thereby circumventing the creation of nascent HLA-binding regions. As a proof-of-principle, the algorithm was applied to Vatreptacog α, an engineered Factor VII molecule associated with unintended immunogenicity. The algorithm correctly predicted the two immunogenic peptides containing the engineered residues. As a further validation, we selected and evaluated the immunogenicity of seven substitutions predicted to simultaneously reduce HLA binding for both peptides, five control substitutions with no predicted reduction in HLA-binding capacity, and additional flanking region controls. In vitro immunogenicity was detected in 21·4% of the cultures of peptides predicted to have reduced HLA binding and 11·4% of the flanking regions, compared with 46% for the cultures of the peptides predicted to be immunogenic. This method has been implemented as an interactive application, freely available online at http://tools.iedb.org/deimmunization/. © 2017 John Wiley & Sons Ltd.

  7. Template-Based Modeling of Protein-RNA Interactions

    PubMed Central

    Zheng, Jinfang; Kundrotas, Petras J.; Vakser, Ilya A.

    2016-01-01

    Protein-RNA complexes formed by specific recognition between RNA and RNA-binding proteins play an important role in biological processes. More than a thousand of such proteins in human are curated and many novel RNA-binding proteins are to be discovered. Due to limitations of experimental approaches, computational techniques are needed for characterization of protein-RNA interactions. Although much progress has been made, adequate methodologies reliably providing atomic resolution structural details are still lacking. Although protein-RNA free docking approaches proved to be useful, in general, the template-based approaches provide higher quality of predictions. Templates are key to building a high quality model. Sequence/structure relationships were studied based on a representative set of binary protein-RNA complexes from PDB. Several approaches were tested for pairwise target/template alignment. The analysis revealed a transition point between random and correct binding modes. The results showed that structural alignment is better than sequence alignment in identifying good templates, suitable for generating protein-RNA complexes close to the native structure, and outperforms free docking, successfully predicting complexes where the free docking fails, including cases of significant conformational change upon binding. A template-based protein-RNA interaction modeling protocol PRIME was developed and benchmarked on a representative set of complexes. PMID:27662342

  8. Novel cyclo-peptides inhibit Ebola pseudotyped virus entry by targeting primed GP protein.

    PubMed

    Li, Quanjie; Ma, Ling; Yi, Dongrong; Wang, Han; Wang, Jing; Zhang, Yongxin; Guo, Ying; Li, Xiaoyu; Zhou, Jinming; Shi, Yi; Gao, George F; Cen, Shan

    2018-07-01

    Ebola virus (EBOV) causes fatal hemorrhagic fever with high death rates in humans. Currently, there are no available clinically-approved prophylactic or therapeutic treatments. The recently solved crystal structure of cleavage-primed EBOV glycoprotein (GPcl) in complex with the C domain of endosomal protein Niemann-Pick C1 (NPC1) provides a new target for the development of EBOV entry inhibitors. In this work, a computational approach using docking and molecular dynamic simulations is carried out for the rational design of peptide inhibitors. A novel cyclo-peptide (Pep-3.3) was identified to target the late stage of EBOV entry and exhibit specific inhibitory activity against EBOV-GP pseudotyped viruses, with 50% inhibitory concentration (IC50) of 5.1 μM. In vitro binding assay and molecular simulations revealed that Pep-3.3 binds to GPcl with a KD value of 69.7 μM, through interacting with predicted residues in the hydrophobic binding pocket of GPcl. Mutation of the predicted residue T83 caused resistance to Pep-3.3 inhibition in viral infectivity, providing preliminary support for the model of the peptide binding to GPcl. This study demonstrates the feasibility of inhibiting EBOV entry by targeting GPcl with peptides. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. HLaffy: estimating peptide affinities for Class-1 HLA molecules by learning position-specific pair potentials.

    PubMed

    Mukherjee, Sumanta; Bhattacharyya, Chiranjib; Chandra, Nagasuma

    2016-08-01

    T-cell epitopes serve as molecular keys to initiate adaptive immune responses. Identification of T-cell epitopes is also a key step in rational vaccine design. Most available methods are driven by informatics and are critically dependent on experimentally obtained training data. Analysis of a training set from Immune Epitope Database (IEDB) for several alleles indicates that the sampling of the peptide space is extremely sparse covering a tiny fraction of the possible nonamer space, and also heavily skewed, thus restricting the range of epitope prediction. We present a new epitope prediction method that has four distinct computational modules: (i) structural modelling, estimating statistical pair-potentials and constraint derivation, (ii) implicit modelling and interaction profiling, (iii) feature representation and binding affinity prediction and (iv) use of graphical models to extract peptide sequence signatures to predict epitopes for HLA class I alleles. HLaffy is a novel and efficient epitope prediction method that predicts epitopes for any Class-1 HLA allele, by estimating the binding strengths of peptide-HLA complexes which is achieved through learning pair-potentials important for peptide binding. It relies on the strength of the mechanistic understanding of peptide-HLA recognition and provides an estimate of the total ligand space for each allele. The performance of HLaffy is seen to be superior to the currently available methods. The method is made accessible through a webserver at http://proline.biochem.iisc.ernet.in/HLaffy. Contact: nchandra@biochem.iisc.ernet.in. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  10. Urban Biomining: Biological Extraction of Metals and Materials from Electronics Waste Using a Synthetic Biology Approach

    NASA Astrophysics Data System (ADS)

    Urbina-Navarrete, J.; Rothschild, L.

    2016-12-01

    End-of-life electronics waste (e-waste) containing toxic and valuable materials is a rapidly progressing human health and environmental issue. Using synthetic biology tools, we have developed a recycling method for e-waste. Our innovation is to use a recombinant version of a naturally-occurring silica-degrading enzyme to depolymerize the silica in metal- and glass-containing e-waste components, and subsequently, to use engineered bacterial surfaces to bind and separate metals from a solution. The bacteria with bound metals can then be used as "bio-ink" to print new circuits using a novel plasma jet electronics printing technology. Here, we present the results from our initial studies that focus on the specificity of metal-binding motifs for a cognate metal. The candidate motifs that show high affinity and specificity will be engineered into bacterial surfaces for downstream applications in biologically-mediated metal recycling. Since the chemistry and role of Cu in metalloproteins is relatively well-characterized, we are using Cu as a proxy to elucidate metal and biological ligand interactions with various metals in e-waste. We assess the binding parameters of 3 representative classes of Cu-binding motifs using isothermal titration calorimetry; 1) natural motifs found in metalloproteins, 2) consensus motifs, and 3) rationally designed peptides that are predicted, in silico, to bind Cu. Our results indicate that naturally-occurring motifs have relatively high affinity and specificity for Cu (association constant for Cu Ka 10^4 M^-1, Zn Ka 10^3 M^-1) when competing ions are present in the aqueous milieu.
However, motifs developed through rational design by applying quantum mechanical methods that take into account complexation energies of the elemental binding partners and molecular geometry of the cognate metal, not only show high affinity for the cognate metal (Cu Ka 10^6 M^-1), but also show specificity and discrimination against other metal ions that would be competitors for the same binding sites. This is an initial proof-of-concept study that focuses on Cu-binding; however the overall objective of this research is to have peptides that selectively bind many metals from e-waste and this would allow for the separation of the metals from a solution, at ambient temperatures and under non-toxic conditions.

  11. Impact of germline and somatic missense variations on drug binding sites.

    PubMed

    Yan, C; Pattabiraman, N; Goecks, J; Lam, P; Nayak, A; Pan, Y; Torcivia-Rodriguez, J; Voskanian, A; Wan, Q; Mazumder, R

    2017-03-01

    Advancements in next-generation sequencing (NGS) technologies are generating a vast amount of data. This exacerbates the current challenge of translating NGS data into actionable clinical interpretations. We have comprehensively combined germline and somatic nonsynonymous single-nucleotide variations (nsSNVs) that affect drug binding sites in order to investigate their prevalence. The integrated data thus generated in conjunction with exome or whole-genome sequencing can be used to identify patients who may not respond to a specific drug because of alterations in drug binding efficacy due to nsSNVs in the target protein's gene. To identify the nsSNVs that may affect drug binding, protein-drug complex structures were retrieved from Protein Data Bank (PDB) followed by identification of amino acids in the protein-drug binding sites using an occluded surface method. Then, the germline and somatic mutations were mapped to these amino acids to identify which of these alter protein-drug binding sites. Using this method we identified 12 993 amino acid-drug binding sites across 253 unique proteins bound to 235 unique drugs. The integration of amino acid-drug binding sites data with both germline and somatic nsSNVs data sets revealed 3133 nsSNVs affecting amino acid-drug binding sites. In addition, a comprehensive drug target discovery was conducted based on protein structure similarity and conservation of amino acid-drug binding sites. Using this method, 81 paralogs were identified that could serve as alternative drug targets. In addition, non-human mammalian proteins bound to drugs were used to identify 142 homologs in humans that can potentially bind to drugs. In the current protein-drug pairs that contain somatic mutations within their binding site, we identified 85 proteins with significant differential gene expression changes associated with specific cancer types. 
Information on protein-drug binding, predicted drug target proteins, and the prevalence of both somatic and germline nsSNVs that disrupt these binding sites can provide valuable knowledge for personalized medicine. A web portal is available where nsSNVs from an individual patient can be checked by scanning against DrugVar to determine whether any of the SNVs affect the binding of any drug in the database.

  12. Modeling alternative binding registers of a minimal immunogenic peptide on two class II major histocompatibility complex (MHC II) molecules predicts polarized T-cell receptor (TCR) contact positions.

    PubMed

    Murray, J S; Fois, S D S; Schountz, T; Ford, S R; Tawde, M D; Brown, J C; Siahaan, T J

    2002-03-01

    Several major histocompatibility complex class II (MHC II) complexes with known minimal immunogenic peptides have now been solved by X-ray crystallography. Specificity pockets within the MHC II binding groove provide distinct peptide contacts that influence peptide conformation and define the binding register within different allelic MHC II molecules. Altering peptide ligands with respect to the residues that contact the T-cell receptor (TCR) can drastically change the nature of the ensuing immune response. Here, we provide an example of how MHC II (I-A) molecules may indirectly affect TCR contacts with a peptide and drive functionally distinct immune responses. We modeled the same immunogenic 12-amino acid peptide into the binding grooves of two allelic MHC II molecules linked to distinct cytokine responses against the peptide. Surprisingly, the favored conformation of the peptide in each molecule was distinct with respect to the exposure of the N- or C-terminus of the peptide above the MHC II binding groove. T-cell clones derived from each allelic MHC II genotype were found to be allele-restricted with respect to the recognition of these N- vs. C-terminal residues on the bound peptide. Taken together, these data suggest that MHC II alleles may influence T-cell functions by restricting TCR access to specific residues of the I-A-bound peptide. Thus, these data are of significance to diseases that display genetic linkage to specific MHC II alleles, e.g. type 1 diabetes and rheumatoid arthritis.

  13. The L7Ae protein binds to two kink-turns in the Pyrococcus furiosus RNase P RNA

    PubMed Central

    Lai, Stella M.; Lai, Lien B.; Foster, Mark P.; Gopalan, Venkat

    2014-01-01

    The RNA-binding protein L7Ae, known for its role in translation (as part of ribosomes) and RNA modification (as part of sn/oRNPs), has also been identified as a subunit of archaeal RNase P, a ribonucleoprotein complex that employs an RNA catalyst for the Mg2+-dependent 5′ maturation of tRNAs. To better understand the assembly and catalysis of archaeal RNase P, we used a site-specific hydroxyl radical-mediated footprinting strategy to pinpoint the binding sites of Pyrococcus furiosus (Pfu) L7Ae on its cognate RNase P RNA (RPR). L7Ae derivatives with single-Cys substitutions at residues in the predicted RNA-binding interface (K42C/C71V, R46C/C71V, V95C/C71V) were modified with an iron complex of EDTA-2-aminoethyl 2-pyridyl disulfide. Upon addition of hydrogen peroxide and ascorbate, these L7Ae-tethered nucleases were expected to cleave the RPR at nucleotides proximal to the EDTA-Fe–modified residues. Indeed, footprinting experiments with an enzyme assembled with the Pfu RPR and five protein cofactors (POP5, RPP21, RPP29, RPP30 and L7Ae–EDTA-Fe) revealed specific RNA cleavages, localizing the binding sites of L7Ae to the RPR's catalytic and specificity domains. These results support the presence of two kink-turns, the structural motifs recognized by L7Ae, in distinct functional domains of the RPR and suggest testable mechanisms by which L7Ae contributes to RNase P catalysis. PMID:25361963

  14. Identification and application of self-binding zipper-like sequences in SARS-CoV spike protein.

    PubMed

    Zhang, Si Min; Liao, Ying; Neo, Tuan Ling; Lu, Yanning; Liu, Ding Xiang; Vahlne, Anders; Tam, James P

    2018-05-22

    Self-binding peptides containing zipper-like sequences, such as the Leu/Ile zipper sequence within the coiled coil regions of proteins and the cross-β spine steric zippers within the amyloid-like fibrils, could bind to the protein-of-origin through homophilic sequence-specific zipper motifs. These self-binding sequences represent opportunities for the development of biochemical tools and/or therapeutics. Here, we report on the identification of a putative self-binding β-zipper-forming peptide within the severe acute respiratory syndrome-associated coronavirus spike (S) protein and its application in viral detection. Peptide array scanning of overlapping peptides covering the entire length of S protein identified 34 putative self-binding peptides of six clusters, five of which contained octapeptide core consensus sequences. The Cluster I consensus octapeptide sequence GINITNFR was predicted by Eisenberg's 3D profile method to have high amyloid-like fibrillation potential through steric β-zipper formation. Peptide C6 containing the Cluster I consensus sequence was shown to oligomerize and form amyloid-like fibrils. Taking advantage of this, C6 was further applied to detect the S protein expression in vitro by fluorescence staining. Meanwhile, the coiled-coil-forming Leu/Ile heptad repeat sequences within the S protein were under-represented during peptide array scanning, consistent with the requirement for long peptide lengths to attain high helix-mediated interaction avidity. The data suggest that short β-zipper-like self-binding peptides within the S protein could be identified through combining the peptide scanning and predictive methods, and could be exploited as biochemical detection reagents for viral infection.

  15. The development of real-time stability supports visual working memory performance: Young children's feature binding can be improved through perceptual structure.

    PubMed

    Simmering, Vanessa R; Wood, Chelsey M

    2017-08-01

    Working memory is a basic cognitive process that predicts higher-level skills. A central question in theories of working memory development is the generality of the mechanisms proposed to explain improvements in performance. Prior theories have been closely tied to particular tasks and/or age groups, limiting their generalizability. The cognitive dynamics theory of visual working memory development has been proposed to overcome this limitation. From this perspective, developmental improvements arise through the coordination of cognitive processes to meet demands of different behavioral tasks. This notion is described as real-time stability, and can be probed through experiments that assess how changing task demands impact children's performance. The current studies test this account by probing visual working memory for colors and shapes in a change detection task that compares detection of changes to new features versus swaps in color-shape binding. In Experiment 1, 3- to 4-year-old children showed impairments specific to binding swaps, as predicted by decreased real-time stability early in development; 5- to 6-year-old children showed a slight advantage on binding swaps, but 7- to 8-year-old children and adults showed no difference across trial types. Experiment 2 tested the proposed explanation of young children's binding impairment through added perceptual structure, which supported the stability and precision of feature localization in memory, a process key to detecting binding swaps. This additional structure improved young children's binding swap detection, but not new-feature detection or adults' performance. These results provide further evidence for the cognitive dynamics and real-time stability explanation of visual working memory development.

  16. SPEER-SERVER: a web server for prediction of protein specificity determining sites

    PubMed Central

    Chakraborty, Abhijit; Mandloi, Sapan; Lanczycki, Christopher J.; Panchenko, Anna R.; Chakrabarti, Saikat

    2012-01-01

    Sites that show specific conservation patterns within subsets of proteins in a protein family are likely to be involved in the development of functional specificity. These sites, generally termed specificity determining sites (SDS), might play a crucial role in binding to a specific substrate or proteins. Identification of SDS through experimental techniques is a slow, difficult and tedious job. Hence, it is very important to develop efficient computational methods that can more expediently identify SDS. Herein, we present Specificity prediction using amino acids’ Properties, Entropy and Evolution Rate (SPEER)-SERVER, a web server that predicts SDS by analyzing quantitative measures of the conservation patterns of protein sites based on their physico-chemical properties and the heterogeneity of evolutionary changes between and within the protein subfamilies. This web server provides an improved representation of results, adds useful input and output options and integrates a wide range of analysis and data visualization tools when compared with the original standalone version of the SPEER algorithm. Extensive benchmarking finds that SPEER-SERVER exhibits sensitivity and precision performance that, on average, meets or exceeds that of other currently available methods. SPEER-SERVER is available at http://www.hpppi.iicb.res.in/ss/. PMID:22689646

  17. SPEER-SERVER: a web server for prediction of protein specificity determining sites.

    PubMed

    Chakraborty, Abhijit; Mandloi, Sapan; Lanczycki, Christopher J; Panchenko, Anna R; Chakrabarti, Saikat

    2012-07-01

    Sites that show specific conservation patterns within subsets of proteins in a protein family are likely to be involved in the development of functional specificity. These sites, generally termed specificity determining sites (SDS), might play a crucial role in binding to a specific substrate or proteins. Identification of SDS through experimental techniques is a slow, difficult and tedious job. Hence, it is very important to develop efficient computational methods that can more expediently identify SDS. Herein, we present Specificity prediction using amino acids' Properties, Entropy and Evolution Rate (SPEER)-SERVER, a web server that predicts SDS by analyzing quantitative measures of the conservation patterns of protein sites based on their physico-chemical properties and the heterogeneity of evolutionary changes between and within the protein subfamilies. This web server provides an improved representation of results, adds useful input and output options and integrates a wide range of analysis and data visualization tools when compared with the original standalone version of the SPEER algorithm. Extensive benchmarking finds that SPEER-SERVER exhibits sensitivity and precision performance that, on average, meets or exceeds that of other currently available methods. SPEER-SERVER is available at http://www.hpppi.iicb.res.in/ss/.

  18. Discovering ligands for a microRNA precursor with peptoid microarrays

    PubMed Central

    Chirayil, Sara; Chirayil, Rachel; Luebke, Kevin J.

    2009-01-01

    We have screened peptoid microarrays to identify specific ligands for the RNA hairpin precursor of miR-21, a microRNA involved in cancer and heart disease. Microarrays were printed by spotting a library of 7680 N-substituted oligoglycines (peptoids) onto glass slides. Two compounds on the array specifically bind RNA having the sequence and predicted secondary structure of the miR-21 precursor hairpin and have specific affinity for the target in solution. Their binding induces a conformational change around the hairpin loop, and the most specific compound recognizes the loop sequence and a bulged uridine in the proximal duplex. Functional groups contributing affinity and specificity were identified, and by varying a critical methylpyridine group, a compound with a dissociation constant of 1.9 μM for the miR-21 precursor hairpin and a 20-fold discrimination against a closely-related hairpin was created. This work describes a systematic approach to discovery of ligands for specific pre-defined novel RNA structures. It demonstrates discovery of new ligands for an RNA for which no specific lead compounds were previously known by screening a microarray of small molecules. PMID:19561197

  19. pUL34 binding near the human cytomegalovirus origin of lytic replication enhances DNA replication and viral growth.

    PubMed

    Slayton, Mark; Hossain, Tanvir; Biegalke, Bonita J

    2018-05-01

    The human cytomegalovirus (HCMV) UL34 gene encodes sequence-specific DNA-binding proteins (pUL34) which are required for viral replication. Interactions of pUL34 with DNA binding sites repress transcription of two viral immune evasion genes, US3 and US9. Twelve additional predicted pUL34-binding sites are present in the HCMV genome (strain AD169), with three binding sites concentrated near the HCMV origin of lytic replication (oriLyt). We used ChIP-seq analysis of pUL34-DNA interactions to confirm that pUL34 binds to the oriLyt region during infection. Mutagenesis of the UL34-binding sites in an oriLyt-containing plasmid significantly reduced viral-mediated oriLyt-dependent DNA replication. Mutagenesis of these sites in the HCMV genome reduced the replication efficiencies of the resulting viruses. Protein-protein interaction analyses demonstrated that pUL34 interacts with the viral proteins IE2, UL44, and UL84, which are essential for viral DNA replication, suggesting that pUL34-DNA interactions in the oriLyt region are involved in the DNA replication cascade.

  20. Molecular characterization and expression analysis of Lily-type lectin (SmLTL) in turbot Scophthalmus maximus, and its response to Vibrio anguillarum

    NASA Astrophysics Data System (ADS)

    Xia, Dandan; Ma, Aijun; Huang, Zhihui; Shang, Xiaomei; Cui, Wenxiao; Yang, Zhi; Qu, Jiangbo

    2018-03-01

    A full-length lily-type lectin (SmLTL) was identified from turbot (Scophthalmus maximus) in this study. The identity of SmLTL was confirmed by database searches for protein identification and function prediction. The full-length cDNA of SmLTL is composed of 569 bp and contains a 339 bp ORF that encodes 112 amino acid residues. The SmLTL peptide is characterized by a specific β-prism architecture and contains three mannose binding sites in a three-fold internal repeat between amino acids 30-99; two of the repeats share the classical mannose binding domain (QxDxNxVxY) while the third binding site was similar to other fish-specific binding motifs (TxTxGxRxV). The primary, secondary, and tertiary structures of SmLTL were predicted and analyzed, indicating that the SmLTL protein was hydrophilic, contained 5.36% α-helices, 39.29% extended strands, 16.07% β-folds, and 39.29% random coils, and had three β-folds. Quantitative real-time polymerase chain reaction (qPCR) analysis revealed that the SmLTL mRNA was abundantly expressed in skin, gill, and intestine. Low levels of SmLTL expression were observed in other tissues. The expression of SmLTL in gill, skin and intestine increased at the mRNA level after stimulation with Vibrio anguillarum. Our results suggest that SmLTL serves as the first line of defence against microbial infections and plays a pivotal role in the innate mucosal immune system. The current study indicates that SmLTL is a member of the lily-type lectin family, and the information reported here will provide an important foundation for future research on the role of this protein.

  1. Docking analysis targeted to the whole enzyme: an application to the prediction of inhibition of PTP1B by thiomorpholine and thiazolyl derivatives.

    PubMed

    Ganou, C A; Eleftheriou, P Th; Theodosis-Nobelos, P; Fesatidou, M; Geronikaki, A A; Lialiaris, T; Rekka, E A

    2018-02-01

    PTP1b is a protein tyrosine phosphatase involved in the inactivation of the insulin receptor. Since inhibition of PTP1b may prolong the action of the receptor, PTP1b has become a drug target for the treatment of type II diabetes. In the present study, prediction of inhibition using docking analysis targeted specifically to the active or allosteric site was performed on 87 compounds structurally belonging to 10 different groups. Two groups, consisting of 15 thiomorpholine and 10 thiazolyl derivatives exhibiting the best prediction results, were selected for in vitro evaluation. All thiomorpholines showed inhibitory action (with IC50 = 4-45 μM, Ki = 2-23 μM), while only three thiazolyl derivatives showed low inhibition (best IC50 = 18 μM, Ki = 9 μM). However, free binding energy (E) was in accordance with the IC50 values only for some compounds. Docking analysis targeted to the whole enzyme revealed that the compounds exhibiting IC50 values higher than expected could bind to other peripheral sites with lower free energy, Eo, than when bound to the active/allosteric site. A prediction factor, E − (Σ Eo × 0.16), which takes into account lower energy binding to peripheral sites, was proposed and was found to correlate well with the IC50 values following an asymmetrical sigmoidal equation with r² = 0.9692.
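
    The prediction factor quoted in this abstract, E − (Σ Eo × 0.16), can be illustrated with a minimal sketch. The function name, the example energies, and the choice to sum only peripheral sites whose energy is lower than E are assumptions made for illustration; the abstract does not specify units or how the sites were selected.

```python
def prediction_factor(e_site, peripheral_energies, weight=0.16):
    """Compute E - (sum(Eo) * weight) as quoted in the abstract.

    e_site: docking free energy E at the active/allosteric site.
    peripheral_energies: docking free energies Eo at peripheral sites.
    Assumption: only peripheral sites with Eo lower than E are summed.
    """
    lower = [eo for eo in peripheral_energies if eo < e_site]
    return e_site - sum(lower) * weight


# Hypothetical energies, not taken from the paper.
factor = prediction_factor(-8.0, [-9.0, -7.0])
print(round(factor, 2))
```

    With these made-up values only the −9.0 site is counted, so the factor is less negative than E itself, reflecting competition from the peripheral site.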

  2. Binding Modes of Ligands Using Enhanced Sampling (BLUES): Rapid Decorrelation of Ligand Binding Modes via Nonequilibrium Candidate Monte Carlo.

    PubMed

    Gill, Samuel C; Lim, Nathan M; Grinaway, Patrick B; Rustenburg, Ariën S; Fass, Josh; Ross, Gregory A; Chodera, John D; Mobley, David L

    2018-05-31

    Accurately predicting protein-ligand binding affinities and binding modes is a major goal in computational chemistry, but even the prediction of ligand binding modes in proteins poses major challenges. Here, we focus on solving the binding mode prediction problem for rigid fragments. That is, we focus on computing the dominant placement, conformation, and orientations of a relatively rigid, fragment-like ligand in a receptor, and the populations of the multiple binding modes which may be relevant. This problem is important in its own right, but is even more timely given the recent success of alchemical free energy calculations. Alchemical calculations are increasingly used to predict binding free energies of ligands to receptors. However, the accuracy of these calculations is dependent on proper sampling of the relevant ligand binding modes. Unfortunately, ligand binding modes may often be uncertain, hard to predict, and/or slow to interconvert on simulation time scales, so proper sampling with current techniques can require prohibitively long simulations. We need new methods which dramatically improve sampling of ligand binding modes. Here, we develop and apply a nonequilibrium candidate Monte Carlo (NCMC) method to improve sampling of ligand binding modes. In this technique, the ligand is rotated and subsequently allowed to relax in its new position through alchemical perturbation before accepting or rejecting the rotation and relaxation as a nonequilibrium Monte Carlo move. When applied to a T4 lysozyme model binding system, this NCMC method shows over 2 orders of magnitude improvement in binding mode sampling efficiency compared to a brute force molecular dynamics simulation. This is a first step toward applying this methodology to pharmaceutically relevant binding of fragments and, eventually, drug-like molecules. 
We are making this approach available via our new Binding modes of ligands using enhanced sampling (BLUES) package which is freely available on GitHub.
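
    The accept/reject step described in this abstract can be sketched as a Metropolis-style test on the work accumulated during the rotate-and-relax protocol. This is a generic illustration assuming a symmetric proposal; it is not the BLUES API, and the function and parameter names are hypothetical.

```python
import math
import random


def ncmc_accept(protocol_work, kT, rng=random.random):
    """Accept a nonequilibrium move with probability min(1, exp(-work / kT)).

    protocol_work: work accumulated over the rotation and alchemical relaxation.
    kT: thermal energy in the same units as protocol_work.
    rng: callable returning a uniform random number in [0, 1).
    """
    if protocol_work <= 0.0:
        # Moves that cost no work are always accepted.
        return True
    return rng() < math.exp(-protocol_work / kT)


# Hypothetical work values, in the same units as kT.
print(ncmc_accept(-1.0, 0.6))
print(ncmc_accept(5.0, 0.6))
```

    A rejected move returns the ligand to its pre-rotation pose, so only rotations that relax into a plausible binding mode change the sampled state.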

  3. Modeling the evolution of regulatory elements by simultaneous detection and alignment with phylogenetic pair HMMs.

    PubMed

    Majoros, William H; Ohler, Uwe

    2010-12-16

    The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.

  4. CNGA3 is expressed in inner ear hair cells and binds to an intracellular C-terminus domain of EMILIN1.

    PubMed

    Selvakumar, Dakshnamurthy; Drescher, Marian J; Dowdall, Jayme R; Khan, Khalid M; Hatfield, James S; Ramakrishnan, Neeliyath A; Drescher, Dennis G

    2012-04-15

    The molecular characteristics of CNG (cyclic nucleotide-gated) channels in auditory/vestibular hair cells are largely unknown, unlike those of CNG mediating sensory transduction in vision and olfaction. In the present study we report the full-length sequence for three CNGA3 variants in a hair cell preparation from the trout saccule with high identity to CNGA3 in olfactory receptor neurons/cone photoreceptors. A custom antibody targeting the N-terminal sequence immunolocalized CNGA3 to the stereocilia and subcuticular plate region of saccular hair cells. The cytoplasmic C-terminus of CNGA3 was found by yeast two-hybrid analysis to bind the C-terminus of EMILIN1 (elastin microfibril interface-located protein 1) in both the vestibular hair cell model and rat organ of Corti. Specific binding between CNGA3 and EMILIN1 was confirmed with surface plasmon resonance analysis, predicting dependence on Ca2+ with Kd=1.6×10-6 M for trout hair cell proteins and Kd=2.7×10-7 M for organ of Corti proteins at 68 μM Ca2+. Pull-down assays indicated that the binding to organ of Corti CNGA3 was attributable to the EMILIN1 intracellular sequence that follows a predicted transmembrane domain in the C-terminus. Saccular hair cells also express the transcript for PDE6C (phosphodiesterase 6C), which in cone photoreceptors regulates the degradation of cGMP used to gate CNGA3 in phototransduction. Taken together, the evidence supports the existence in saccular hair cells of a molecular pathway linking CNGA3, its binding partner EMILIN1 (and β1 integrin) and cGMP-specific PDE6C, which is potentially replicated in cochlear outer hair cells, given stereociliary immunolocalizations of CNGA3, EMILIN1 and PDE6C.

  5. A Prediction Method of Binding Free Energy of Protein and Ligand

    NASA Astrophysics Data System (ADS)

    Yang, Kun; Wang, Xicheng

    2010-05-01

    Predicting the binding free energy is an important problem in biomolecular simulation. Such prediction would be of great benefit in understanding protein functions, and may be useful for computational prediction of ligand binding strengths, e.g., in discovering pharmaceutical drugs. Free energy perturbation (FEP)/thermodynamic integration (TI) is a classical method to explicitly predict free energy. However, this method needs plenty of time to collect data, and it is limited to simple systems and small changes of molecular structure. Another method for estimating ligand binding affinities is the linear interaction energy (LIE) method. This method employs averages of interaction potential energy terms from molecular dynamics simulations or other thermal conformational sampling techniques. Incorporation of systematic deviations from electrostatic linear response, derived from free energy perturbation studies, into the absolute binding free energy expression significantly enhances the accuracy of the approach. However, it is also time-consuming. In this paper, a new prediction method based on steered molecular dynamics (SMD) with direction optimization is developed to compute binding free energy. Jarzynski's equality is used to derive the potential of mean force (PMF), or free energy. The results for two numerical examples are presented, showing that the method has good accuracy and efficiency. The novel method can also simulate the whole binding process and give some important structural information for the development of new drugs.
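
    Jarzynski's equality, mentioned in this abstract, relates the free energy difference to an exponential average of nonequilibrium work: ΔF = −kT ln⟨exp(−W/kT)⟩. The sketch below applies that estimator to a list of work values, such as one value per steered pulling trajectory. The function name and the sample numbers are hypothetical, and the direction optimization described in the abstract is not reproduced here.

```python
import math


def jarzynski_free_energy(works, kT):
    """Estimate dF = -kT * ln(mean(exp(-W / kT))) from nonequilibrium work values.

    works: work values W from repeated pulling trajectories.
    kT: thermal energy in the same units as the work values.
    """
    if not works:
        raise ValueError("at least one work value is required")
    scaled = [-w / kT for w in works]
    # Shift by the maximum exponent before exponentiating to avoid overflow.
    shift = max(scaled)
    log_mean = shift + math.log(sum(math.exp(s - shift) for s in scaled) / len(scaled))
    return -kT * log_mean


# Hypothetical work values in units of kT.
works = [1.0, 3.0]
print(jarzynski_free_energy(works, kT=1.0))
```

    Because the average is exponential, the estimate never exceeds the mean work and is dominated by the low-work trajectories, which is why many pulling runs are usually needed.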

  6. Autoantibodies in Serum of Systemic Scleroderma Patients: Peptide-Based Epitope Mapping Indicates Increased Binding to Cytoplasmic Domains of CXCR3.

    PubMed

    Recke, Andreas; Regensburger, Ann-Katrin; Weigold, Florian; Müller, Antje; Heidecke, Harald; Marschner, Gabriele; Hammers, Christoph M; Ludwig, Ralf J; Riemekasten, Gabriela

    2018-01-01

    Systemic sclerosis (SSc) is a severe chronic autoimmune disease with high morbidity and mortality. Sera of patients with SSc contain a large variety of autoantibody (aab) reactivities. Among these are functionally active aab that bind to G protein-coupled receptors (GPCR) such as C-X-C motif chemokine receptor 3 (CXCR3) and 4 (CXCR4). Aab binding to the N-terminal portion of these two GPCRs have been shown to be associated with slower disease progression in SSc, especially deterioration of lung function. Aabs binding to GPCRs exhibit functional activities by stimulating or inhibiting GPCR signaling. The specific functional activity of aabs crucially depends on the epitopes they bind to. To identify the location of important epitopes on CXCR3 recognized by aabs from SSc patients, we applied an array of 36 overlapping 18-20mer peptides covering the entire CXCR3 sequence, comparing epitope specificity of SSc patient sera (N = 32, with positive reactivity with CXCR3) to healthy controls (N = 30). Binding of SSc patient and control sera to these peptides was determined by ELISA. Using a Bayesian model approach, we found increased binding of SSc patient sera to peptides corresponding to intracellular epitopes within CXCR3, while the binding signal to extracellular portions of CXCR3 was found to be reduced. Experimentally determined epitopes showed a good correspondence to those predicted by the ABCpred tool. To verify these results and to translate them into a novel diagnostic ELISA, we combined the peptides that represent SSc-associated epitopes into a single ELISA and evaluated its potential to discriminate SSc patients (N = 31) from normal healthy controls (N = 47). This ELISA had a sensitivity of 0.61 and a specificity of 0.85. Our data reveals that SSc sera preferentially bind intracellular epitopes of CXCR3, while an extracellular epitope in the N-terminal domain that appears to be target of aabs in healthy individuals is not bound by SSc sera. Based upon our results, we could devise a novel ELISA concept that may be helpful for monitoring of SSc patients.

  7. Autoantibodies in Serum of Systemic Scleroderma Patients: Peptide-Based Epitope Mapping Indicates Increased Binding to Cytoplasmic Domains of CXCR3

    PubMed Central

    Recke, Andreas; Regensburger, Ann-Katrin; Weigold, Florian; Müller, Antje; Heidecke, Harald; Marschner, Gabriele; Hammers, Christoph M.; Ludwig, Ralf J.; Riemekasten, Gabriela

    2018-01-01

    Systemic sclerosis (SSc) is a severe chronic autoimmune disease with high morbidity and mortality. Sera of patients with SSc contain a large variety of autoantibody (aab) reactivities. Among these are functionally active aab that bind to G protein-coupled receptors (GPCR) such as C-X-C motif chemokine receptor 3 (CXCR3) and 4 (CXCR4). Aab binding to the N-terminal portion of these two GPCRs have been shown to be associated with slower disease progression in SSc, especially deterioration of lung function. Aabs binding to GPCRs exhibit functional activities by stimulating or inhibiting GPCR signaling. The specific functional activity of aabs crucially depends on the epitopes they bind to. To identify the location of important epitopes on CXCR3 recognized by aabs from SSc patients, we applied an array of 36 overlapping 18-20mer peptides covering the entire CXCR3 sequence, comparing epitope specificity of SSc patient sera (N = 32, with positive reactivity with CXCR3) to healthy controls (N = 30). Binding of SSc patient and control sera to these peptides was determined by ELISA. Using a Bayesian model approach, we found increased binding of SSc patient sera to peptides corresponding to intracellular epitopes within CXCR3, while the binding signal to extracellular portions of CXCR3 was found to be reduced. Experimentally determined epitopes showed a good correspondence to those predicted by the ABCpred tool. To verify these results and to translate them into a novel diagnostic ELISA, we combined the peptides that represent SSc-associated epitopes into a single ELISA and evaluated its potential to discriminate SSc patients (N = 31) from normal healthy controls (N = 47). This ELISA had a sensitivity of 0.61 and a specificity of 0.85. Our data reveals that SSc sera preferentially bind intracellular epitopes of CXCR3, while an extracellular epitope in the N-terminal domain that appears to be target of aabs in healthy individuals is not bound by SSc sera. 
Based upon our results, we could devise a novel ELISA concept that may be helpful for monitoring of SSc patients. PMID:29623076

  8. Measuring specific receptor binding of a PET radioligand in human brain without pharmacological blockade: The genomic plot.

    PubMed

    Veronese, Mattia; Zanotti-Fregonara, Paolo; Rizzo, Gaia; Bertoldo, Alessandra; Innis, Robert B; Turkheimer, Federico E

    2016-04-15

    PET studies allow in vivo imaging of the density of brain receptor species. The PET signal, however, is the sum of the fraction of radioligand that is specifically bound to the target receptor and the non-displaceable fraction (i.e. the non-specifically bound radioligand plus the free ligand in tissue). Therefore, measuring the non-displaceable fraction, which is generally assumed to be constant across the brain, is a necessary step to obtain regional estimates of the specific fractions. The nondisplaceable binding can be directly measured if a reference region, i.e. a region devoid of any specific binding, is available. Many receptors are however widely expressed across the brain, and a true reference region is rarely available. In these cases, the nonspecific binding can be obtained after competitive pharmacological blockade, which is often contraindicated in humans. In this work we introduce the genomic plot for estimating the nondisplaceable fraction using baseline scans only. The genomic plot is a transformation of the Lassen graphical method in which the brain maps of mRNA transcripts of the target receptor obtained from the Allen brain atlas are used as a surrogate measure of the specific binding. Thus, the genomic plot allows the calculation of the specific and nondisplaceable components of radioligand uptake without the need of pharmacological blockade. We first assessed the statistical properties of the method with computer simulations. Then we sought ground-truth validation using human PET datasets of seven different neuroreceptor radioligands, where nonspecific fractions were either obtained separately using drug displacement or available from a true reference region. The population nondisplaceable fractions estimated by the genomic plot were very close to those measured by actual human blocking studies (mean relative difference between 2% and 7%). However, these estimates were valid only when mRNA expressions were predictive of protein levels (i.e. there were no significant post-transcriptional changes). This condition can be readily established a priori by assessing the correlation between PET and mRNA expression.

  9. Measuring specific receptor binding of a PET radioligand in human brain without pharmacological blockade: The genomic plot

    PubMed Central

    Veronese, Mattia; Zanotti-Fregonara, Paolo; Rizzo, Gaia; Bertoldo, Alessandra; Innis, Robert B.; Turkheimer, Federico E.

    2016-01-01

    PET studies allow in vivo imaging of the density of brain receptor species. The PET signal, however, is the sum of the fraction of radioligand that is specifically bound to the target receptor and the non-displaceable fraction (i.e. the non-specifically bound radioligand plus the free ligand in tissue). Therefore, measuring the non-displaceable fraction, which is generally assumed to be constant across the brain, is a necessary step to obtain regional estimates of the specific fractions. The nondisplaceable binding can be directly measured if a reference region, i.e. a region devoid of any specific binding, is available. Many receptors are however widely expressed across the brain, and a true reference region is rarely available. In these cases, the nonspecific binding can be obtained after competitive pharmacological blockade, which is often contraindicated in humans. In this work we introduce the genomic plot for estimating the nondisplaceable fraction using baseline scans only. The genomic plot is a transformation of the Lassen graphical method in which the brain maps of mRNA transcripts of the target receptor obtained from the Allen brain atlas are used as a surrogate measure of the specific binding. Thus, the genomic plot allows the calculation of the specific and nondisplaceable components of radioligand uptake without the need of pharmacological blockade. We first assessed the statistical properties of the method with computer simulations. Then we sought ground-truth validation using human PET datasets of seven different neuroreceptor radioligands, where nonspecific fractions were either obtained separately using drug displacement or available from a true reference region. The population nondisplaceable fractions estimated by the genomic plot were very close to those measured by actual human blocking studies (mean relative difference between 2% and 7%). However, these estimates were valid only when mRNA expressions were predictive of protein levels (i.e. there were no significant post-transcriptional changes). This condition can be readily established a priori by assessing the correlation between PET and mRNA expression. PMID:26850512
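
    The abstract describes the genomic plot only in words. As a rough illustration, the sketch below assumes the simplest reading of that description: regional mRNA expression is taken to be proportional to specific binding, so regional total uptake is modelled as a nondisplaceable term plus a multiple of mRNA expression, and the nondisplaceable term is read off as the x-intercept of a straight-line fit of mRNA against uptake (by analogy with the Lassen plot). The function name, the variable names, and the synthetic numbers are all hypothetical and are not taken from the paper.

```python
def nondisplaceable_from_genomic_plot(uptake, mrna):
    """Estimate the nondisplaceable uptake as the x-intercept of mRNA vs. uptake.

    Assumed model (not from the paper): uptake = v_nd + k * mrna per region,
    so a line fitted to (uptake, mrna) crosses mrna = 0 at uptake = v_nd.
    """
    n = len(uptake)
    mean_x = sum(uptake) / n
    mean_y = sum(mrna) / n
    # Ordinary least-squares slope and intercept of mrna regressed on uptake.
    sxx = sum((x - mean_x) ** 2 for x in uptake)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(uptake, mrna))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope  # x-intercept of the fitted line


# Synthetic, noise-free regions generated with v_nd = 2.0 and k = 1.5.
mrna_levels = [0.5, 1.0, 2.0, 3.0, 4.0]
regional_uptake = [2.0 + 1.5 * m for m in mrna_levels]
v_nd_estimate = nondisplaceable_from_genomic_plot(regional_uptake, mrna_levels)
print(round(v_nd_estimate, 6))
```

    With real data the fit would be noisy, and, as the abstract cautions, the estimate is only meaningful when mRNA expression actually tracks protein levels.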

  10. The transcription factor titration effect dictates level of gene expression.

    PubMed

    Brewster, Robert C; Weinert, Franz M; Garcia, Hernan G; Song, Dan; Rydenfelt, Mattias; Phillips, Rob

    2014-03-13

    Models of transcription are often built around a picture of RNA polymerase and transcription factors (TFs) acting on a single copy of a promoter. However, most TFs are shared between multiple genes with varying binding affinities. Beyond that, genes often exist at high copy number, in multiple identical copies on the chromosome or on plasmids or viral vectors with copy numbers in the hundreds. Using a thermodynamic model, we characterize the interplay between TF copy number and the demand for that TF. We demonstrate the parameter-free predictive power of this model as a function of the copy number of the TF and the number and affinities of the available specific binding sites; such predictive control is important for the understanding of transcription and the desire to quantitatively design the output of genetic circuits. Finally, we use these experiments to dynamically measure plasmid copy number through the cell cycle. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. Evolving serodiagnostics by rationally designed peptide arrays: the Burkholderia paradigm in Cystic Fibrosis

    NASA Astrophysics Data System (ADS)

    Peri, Claudio; Gori, Alessandro; Gagni, Paola; Sola, Laura; Girelli, Daniela; Sottotetti, Samantha; Cariani, Lisa; Chiari, Marcella; Cretich, Marina; Colombo, Giorgio

    2016-09-01

    Efficient diagnosis of emerging and novel bacterial infections is fundamental to guide decisions on therapeutic treatments. Here, we engineered a novel rational strategy to design peptide microarray platforms, which combines structural and genomic analyses to predict the binding interfaces between diverse protein antigens and antibodies against Burkholderia cepacia complex infections present in the sera of Cystic Fibrosis (CF) patients. The predicted binding interfaces on the antigens are synthesized in the form of isolated peptides and chemically optimized for controlled orientation on the surface. Our platform displays multiple Burkholderia-related epitopes and is shown to diagnose infected individuals even in the presence of superinfections caused by other prevalent CF pathogens, with limited cost and time requirements. Moreover, our data indicate that the specific patterns determined by combined probe responses might provide a characterization of Burkholderia infections even at the subtype level (genomovars). The method is general and immediately applicable to other bacteria.

  12. Intercalation of XR5944 with the estrogen response element is modulated by the tri-nucleotide spacer sequence between half-sites

    PubMed Central

    Sidell, Neil; Mathad, Raveendra I.; Shu, Feng-jue; Zhang, Zhenjiang; Kallen, Caleb B.; Yang, Danzhou

    2011-01-01

    DNA-intercalating molecules can impair DNA replication, DNA repair, and gene transcription. We previously demonstrated that XR5944, a DNA bis-intercalator, specifically blocks binding of estrogen receptor-α (ERα) to the consensus estrogen response element (ERE). The consensus ERE sequence is AGGTCAnnnTGACCT, where nnn is known as the tri-nucleotide spacer. Recent work has shown that the tri-nucleotide spacer can modulate ERα-ERE binding affinity and ligand-mediated transcriptional responses. To further understand the mechanism by which XR5944 inhibits ERα-ERE binding, we tested its ability to interact with consensus EREs with variable tri-nucleotide spacer sequences and with natural but non-consensus ERE sequences using one dimensional nuclear magnetic resonance (1D 1H NMR) titration studies. We found that the tri-nucleotide spacer sequence significantly modulates the binding of XR5944 to EREs. Of the sequences that were tested, EREs with CGG and AGG spacers showed the best binding specificity with XR5944, while those spaced with TTT demonstrated the least specific binding. The binding stoichiometry of XR5944 with EREs was 2:1, which can explain why the spacer influences the drug-DNA interaction; each XR5944 spans four nucleotides (including portions of the spacer) when intercalating with DNA. To validate our NMR results, we conducted functional studies using reporter constructs containing consensus EREs with tri-nucleotide spacers CGG, CTG, and TTT. Results of reporter assays in MCF-7 cells indicated that XR5944 was significantly more potent in inhibiting the activity of CGG- than TTT-spaced EREs, consistent with our NMR results. Taken together, these findings predict that the anti-estrogenic effects of XR5944 will depend not only on ERE half-site composition but also on the tri-nucleotide spacer sequence of EREs located in the promoters of estrogen-responsive genes. PMID:21333738
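
    The consensus motif quoted above (AGGTCAnnnTGACCT) lends itself to a simple pattern search. The sketch below is only an illustration of how consensus EREs and their tri-nucleotide spacers could be located in a DNA string; the function name and the test sequence are hypothetical, and natural non-consensus EREs, which the study also examined, would not be matched by this exact pattern.

```python
import re

# Consensus ERE from the abstract: AGGTCA + 3-nt spacer + TGACCT.
# A lookahead is used so that overlapping occurrences are all reported.
ERE_PATTERN = re.compile(r"(?=(AGGTCA([ACGT]{3})TGACCT))")


def find_consensus_eres(sequence):
    """Return a list of (start_index, spacer) for each consensus ERE found."""
    seq = sequence.upper()
    return [(m.start(), m.group(2)) for m in ERE_PATTERN.finditer(seq)]


# Hypothetical sequence containing one CGG-spaced and one TTT-spaced ERE.
demo = "ccAGGTCACGGTGACCTttgAGGTCATTTTGACCTaa"
hits = find_consensus_eres(demo)
print(hits)
```

    Because the two half-sites are reverse complements of each other, the same 15-nt window also matches on the opposite strand, with the spacer reverse-complemented, so a single-strand scan is sufficient for the consensus form.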

  13. Sequence-Based Prediction of RNA-Binding Proteins Using Random Forest with Minimum Redundancy Maximum Relevance Feature Selection.

    PubMed

    Ma, Xin; Guo, Jing; Sun, Xiao

    2015-01-01

    The prediction of RNA-binding proteins is one of the most challenging problems in computational biology. Although some studies have investigated this problem, the accuracy of prediction is still not sufficient. In this study, a highly accurate method was developed to predict RNA-binding proteins from amino acid sequences using random forests with the minimum redundancy maximum relevance (mRMR) method, followed by incremental feature selection (IFS). We incorporated conjoint triad features and three novel features: binding propensity (BP), nonbinding propensity (NBP), and evolutionary information combined with physicochemical properties (EIPP). The results showed that these novel features have important roles in improving the performance of the predictor. Using the mRMR-IFS method, our predictor achieved the best performance (86.62% accuracy and 0.737 Matthews correlation coefficient). High prediction accuracy and successful prediction performance suggested that our method can be a useful approach to identify RNA-binding proteins from sequence information.

  14. Regulation of the Mouse Treacher Collins Syndrome Homolog (Tcof1) Promoter Through Differential Repression of Constitutive Expression

    PubMed Central

    Shiang, Rita

    2008-01-01

    Treacher Collins syndrome is an autosomal-dominant mandibulofacial dysostosis caused by haploinsufficiency of the TCOF1 gene product treacle. Mouse Tcof1 protein is approximately 61% identical and 71% similar to treacle, and heterozygous knockout of Tcof1 causes craniofacial malformation. Tcof1 expression is high in developing neural crest, but much lower in other tissues. To investigate this dual regulation, highly conserved regions upstream of TCOF1 homologs were tested through deletion and mutation reporter assays, and conserved predicted transcription factor binding sites were assessed through chromatin binding studies. Assays were performed in mouse P19 embryonic carcinoma cells and in HEK293 cells to determine differential activation in cell types at different stages of differentiation. Binding of Cebpb, Zfp161, and Sp1 transcription factors was specific to the Tcof1 regulatory region in P19 cells. The Zfp161 binding site demonstrated P19 cell–specific repression, while the Sp1/Sp3 candidate site demonstrated HEK293 cell–specific activation. Moreover, presence of c-myb and Zfp161 transcripts was specific to P19 cells. A minimal promoter fragment from −253 to +43 bp directs constitutive expression in both cell types, and dual regulation of Tcof1 appears to be through differential repression of this minimal promoter. The CpG island at the transcription start site remains unmethylated in P19 cells, 11.5 dpc mouse embryonic tissue, and adult mouse ear, which supports constitutive activation of the Tcof1 promoter. PMID:18771418

  15. Maternal Binding and Neutralizing IgG Responses Targeting the C-Terminal Region of the V3 Loop Are Predictive of Reduced Peripartum HIV-1 Transmission Risk.

    PubMed

    Martinez, David R; Vandergrift, Nathan; Douglas, Ayooluwa O; McGuire, Erin; Bainbridge, John; Nicely, Nathan I; Montefiori, David C; Tomaras, Georgia D; Fouda, Genevieve G; Permar, Sallie R

    2017-05-01

    The development of an effective maternal HIV-1 vaccine that could synergize with antiretroviral therapy (ART) to eliminate pediatric HIV-1 infection will require the characterization of maternal immune responses capable of blocking transmission of autologous HIV to the infant. We previously determined that maternal plasma antibody binding to linear epitopes within the variable loop 3 (V3) region of HIV envelope (Env) and neutralizing responses against easy-to-neutralize tier 1 viruses were associated with reduced risk of peripartum HIV infection in the historic U.S. Woman and Infant Transmission Study (WITS) cohort. Here, we defined the fine specificity and function of the potentially protective maternal V3-specific IgG antibodies associated with reduced peripartum HIV transmission risk in this cohort. The V3-specific IgG binding that predicted low risk of mother-to-child-transmission (MTCT) was dependent on the C-terminal flank of the V3 crown and particularly on amino acid position 317, a residue that has also been associated with breakthrough transmission in the RV144 vaccine trial. Remarkably, the fine specificity of potentially protective maternal plasma V3-specific tier 1 virus-neutralizing responses was dependent on the same region in the V3 loop. Our findings suggest that MTCT risk is associated with neutralizing maternal IgG that targets amino acid residues in the C-terminal region of the V3 loop crown, suggesting the importance of the region in immunogen design for maternal vaccines to prevent MTCT. IMPORTANCE Efforts to curb HIV-1 transmission in pediatric populations by antiretroviral therapy (ART) have been highly successful in both developed and developing countries. However, more than 150,000 infants continue to be infected each year, likely due to a combination of late maternal HIV diagnosis, lack of ART access or adherence, and drug-resistant viral strains. 
Defining the fine specificity of maternal humoral responses that partially protect against MTCT of HIV is required to inform the development of a maternal HIV vaccine that will enhance these responses during pregnancy. In this study, we identified amino acid residues targeted by potentially protective maternal V3-specific IgG binding and neutralizing responses, localizing the potentially protective response in the C-terminal region of the V3 loop crown. Our findings have important implications for the design of maternal vaccination strategies that could synergize with ART during pregnancy to achieve the elimination of pediatric HIV infections. Copyright © 2017 American Society for Microbiology.

  16. New Paradigm for Translational Modeling to Predict Long‐term Tuberculosis Treatment Response

    PubMed Central

    Bartelink, IH; Zhang, N; Keizer, RJ; Strydom, N; Converse, PJ; Dooley, KE; Nuermberger, EL

    2017-01-01

    Disappointing results of recent tuberculosis chemotherapy trials suggest that knowledge gained from preclinical investigations was not utilized to maximal effect. A mouse‐to‐human translational pharmacokinetics (PKs) – pharmacodynamics (PDs) model built on a rich mouse database may improve clinical trial outcome predictions. The model included Mycobacterium tuberculosis growth function in mice, adaptive immune response effect on bacterial growth, relationships among moxifloxacin, rifapentine, and rifampin concentrations accelerating bacterial death, clinical PK data, species‐specific protein binding, drug‐drug interactions, and patient‐specific pathology. Simulations of recent trials testing 4‐month regimens predicted 65% (95% confidence interval [CI], 55–74) relapse‐free patients vs. 80% observed in the REMox‐TB trial, and 79% (95% CI, 72–87) vs. 82% observed in the Rifaquin trial. Simulation of 6‐month regimens predicted 97% (95% CI, 93–99) vs. 92% and 95% observed in 2RHZE/4RH control arms, and 100% predicted and observed in the 35 mg/kg rifampin arm of PanACEA MAMS. These results suggest that the model can inform regimen optimization and predict outcomes of ongoing trials. PMID:28561946

  17. The feasibility of an efficient drug design method with high-performance computers.

    PubMed

    Yamashita, Takefumi; Ueda, Akihiko; Mitsui, Takashi; Tomonaga, Atsushi; Matsumoto, Shunji; Kodama, Tatsuhiko; Fujitani, Hideaki

    2015-01-01

    In this study, we propose a supercomputer-assisted drug design approach involving all-atom molecular dynamics (MD)-based binding free energy prediction after the traditional design/selection step. Because this prediction is more accurate than the empirical binding affinity scoring of the traditional approach, the compounds selected by the MD-based prediction should be better drug candidates. In this study, we discuss the applicability of the new approach using two examples. Although the MD-based binding free energy prediction has a huge computational cost, it is feasible with the latest 10 petaflop-scale computer. The supercomputer-assisted drug design approach also involves two important feedback procedures: The first feedback is generated from the MD-based binding free energy prediction step to the drug design step. While the experimental feedback usually provides binding affinities of tens of compounds at one time, the supercomputer allows us to simultaneously obtain the binding free energies of hundreds of compounds. Because the number of calculated binding free energies is sufficiently large, the compounds can be classified into different categories whose properties will aid in the design of the next generation of drug candidates. The second feedback, which occurs from the experiments to the MD simulations, is important to validate the simulation parameters. To demonstrate this, we compare the binding free energies calculated with various force fields to the experimental ones. The results indicate that the prediction will not be very successful, if we use an inaccurate force field. By improving/validating such simulation parameters, the next prediction can be made more accurate.
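
    The abstract compares computed binding free energies with experimental binding affinities. For readers who want to move between the two quantities, the standard thermodynamic relation between the binding free energy and the dissociation constant (ΔG = RT ln K_d, with a 1 M standard state) can be sketched as follows. The function names and the example value of -10 kcal/mol are illustrative assumptions, not figures from the study.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def kd_from_delta_g(delta_g_kcal, temperature_k=298.15):
    """Dissociation constant (in M) from a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from a dissociation constant in M."""
    return R_KCAL * temperature_k * math.log(kd_molar)


# Illustrative value only: a binding free energy of -10 kcal/mol at 25 °C
# corresponds to a dissociation constant in the tens-of-nanomolar range.
example_kd = kd_from_delta_g(-10.0)
print(f"{example_kd:.2e} M")
```

    Because the relation is exponential, an error of about 1.4 kcal/mol in the computed free energy at room temperature shifts the predicted dissociation constant by roughly a factor of ten, which is one reason force-field accuracy matters for this kind of prediction.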

  18. Facet-Specific Adsorption of Tripeptides at Aqueous Au Interfaces: Open Questions in Reconciling Experiment and Simulation.

    PubMed

    Hughes, Zak E; Kochandra, Raji; Walsh, Tiffany R

    2017-04-18

    The adsorption of three homo-tripeptides, HHH, YYY, and SSS, at the aqueous Au interface is investigated, using molecular dynamics simulations. We find that consideration of surface facet effects, relevant to experimental conditions, opens up new questions regarding interpretations of current experimental findings. Our well-tempered metadynamics simulations predict the rank ordering of the tripeptide binding affinities at aqueous Au(111) to be YYY > HHH > SSS. This ranking differs from that obtained from existing experimental data, which used surface-immobilized Au nanoparticles as the target substrate. The influence of Au facet on these experimental findings is then considered, via our binding strength predictions of the relevant amino acids at aqueous Au(111) and Au(100)(1 × 1). The Au(111) interface supports an amino acid ranking of Tyr > HisA ≃ HisH > Ser, matching that of the tripeptides on Au(111), while the ranking on Au(100) is HisA > Ser ≃ Tyr ≃ HisH, with only HisA showing non-negligible binding. The substantial reduction in Tyr amino acid affinity for Au(100) vs Au(111) offers one possible explanation for the experimentally observed weaker adsorption of YYY on the nanoparticle-immobilized substrate compared with HHH. In a separate set of simulations, we predict the structures of the adsorbed tripeptides at the two aqueous Au facets, revealing facet-dependent differences in the adsorbed conformations. Our findings suggest that Au facet effects, where relevant, may influence the adsorption structures and energetics of biomolecules, highlighting the possible influence of the structural model used to interpret experimental binding data.

  19. Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

    PubMed

    Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

    2014-11-01

    As many structures of protein-DNA complexes have become known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3%, and an MCC of 0.324. The other SVM model, which uses both DNA and protein sequences, achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5%, and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data only.
To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
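
    The record above reports accuracy, F-measure, and the Matthews correlation coefficient (MCC). As a reminder of how these three figures relate to a binary confusion matrix, a minimal sketch is given below. The function name and the counts used in the example are hypothetical and do not come from the study; the formulas are the standard definitions.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, F-measure and MCC from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # MCC is conventionally set to 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f_measure": f_measure, "mcc": mcc}


# Hypothetical counts for binding / non-binding nucleotide predictions.
metrics = classification_metrics(tp=70, fp=30, tn=65, fn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Unlike accuracy, the MCC stays near zero for a predictor that performs at chance level even on unbalanced data, which is why an accuracy of about 67% can coexist with an MCC of about 0.34.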

  20. Identification of human microRNA targets from isolated argonaute protein complexes.

    PubMed

    Beitzinger, Michaela; Peters, Lasse; Zhu, Jia Yun; Kremmer, Elisabeth; Meister, Gunter

    2007-06-01

    MicroRNAs (miRNAs) constitute a class of small non-coding RNAs that regulate gene expression on the level of translation and/or mRNA stability. Mammalian miRNAs associate with members of the Argonaute (Ago) protein family and bind to partially complementary sequences in the 3' untranslated region (UTR) of specific target mRNAs. Computer algorithms based on factors such as free binding energy or sequence conservation have been used to predict miRNA target mRNAs. Based on such predictions, up to one third of all mammalian mRNAs seem to be under miRNA regulation. However, due to the low degree of complementarity between the miRNA and its target, such computer programs are often imprecise and therefore not very reliable. Here we report the first biochemical identification approach of miRNA targets from human cells. Using highly specific monoclonal antibodies against members of the Ago protein family, we co-immunoprecipitate Ago-bound mRNAs and identify them by cloning. Interestingly, most of the identified targets are also predicted by different computer programs. Moreover, we randomly analyzed six different target candidates and were able to experimentally validate five as miRNA targets. Our data clearly indicate that miRNA targets can be experimentally identified from Ago complexes and therefore provide a new tool to directly analyze miRNA function.

  1. Specific heat and Knight shift of cuprates within the van Hove scenario

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sarkar, S.; Das, A.N.

    1996-12-01

    The jump in the specific heat at T_c, the specific heat in both the superconducting and normal states, and the Knight shift in the superconducting state are studied within the van Hove singularity scenario considering density of states for a two-dimensional tight-binding system and with an extended saddle-point singularity. The role of the electron-phonon interaction strength, band narrowing, second-nearest-neighbor hopping, and orthorhombic distortion on such properties is investigated. The experimental results on the specific heat and Knight shift of the Y-123 system are compared with the theoretical predictions. © 1996 The American Physical Society.

  2. Utilizing Fibronectin Integrin-Binding Specificity to Control Cellular Responses

    PubMed Central

    Bachman, Haylee; Nicosia, John; Dysart, Marilyn; Barker, Thomas H.

    2015-01-01

    Significance: Cells communicate with the extracellular matrix (ECM) protein fibronectin (Fn) through integrin receptors on the cell surface. Controlling integrin–Fn interactions offers a promising approach to directing cell behavior, such as adhesion, migration, and differentiation, as well as coordinated tissue behaviors such as morphogenesis and wound healing. Recent Advances: Several different groups have developed recombinant fragments of Fn that can control epithelial to mesenchymal transition, sequester growth factors, and promote bone and wound healing. It is thought that these physiological responses are, in part, due to specific integrin engagement. Furthermore, it has been postulated that the integrin-binding domain of Fn is a mechanically sensitive switch that drives binding of one integrin heterodimer over another. Critical Issues: Although computational simulations have predicted the mechano-switch hypothesis and recent evidence supports the existence of varying strain states of Fn in vivo, experimental evidence of the Fn integrin switch is still lacking. Future Directions: Evidence of the integrin mechano-switch will enable the development of new Fn-based peptides in tissue engineering and wound healing, as well as deepen our understanding of ECM pathologies, such as fibrosis. PMID:26244106

  3. Electrostatic correlations at the Stern layer: Physics or chemistry?

    NASA Astrophysics Data System (ADS)

    Travesset, A.; Vangaveti, S.

    2009-11-01

    We introduce a minimal free energy describing the interaction of charged groups and counterions including both classical electrostatic and specific interactions. The predictions of the model are compared against the standard model for describing ions next to charged interfaces, consisting of Poisson-Boltzmann theory with additional constants describing ion binding, which are specific to the counterion and the interfacial charge ("chemical binding"). It is shown that the "chemical" model can be appropriately described by an underlying "physical" model over several decades in concentration, but the extracted binding constants are not uniquely defined, as they differ depending on the particular observable quantity being studied. It is also shown that electrostatic correlations for divalent (or higher valence) ions enhance the surface charge by increasing deprotonation, an effect not properly accounted within chemical models. The charged phospholipid phosphatidylserine is analyzed as a concrete example with good agreement with experimental results. We conclude with a detailed discussion on the limitations of chemical or physical models for describing the rich phenomenology of charged interfaces in aqueous media and its relevance to different systems with a particular emphasis on phospholipids.

  4. Copper binding to soil fulvic and humic acids: NICA-Donnan modeling and conditional affinity spectra.

    PubMed

    Xu, Jinling; Tan, Wenfeng; Xiong, Juan; Wang, Mingxia; Fang, Linchuan; Koopal, Luuk K

    2016-07-01

    Binding of Cu(II) to soil fulvic acid (JGFA), soil humic acids (JGHA, JLHA), and lignite-based humic acid (PAHA) was investigated through NICA-Donnan modeling and conditional affinity spectrum (CAS). The aim was to extend the knowledge of copper binding by soil humic substances (HS), both by enlarging the database of metal ion binding to HS and by gaining insight into Cu binding to the functional groups of FA and HA, using the NICA-Donnan model to unravel the intrinsic and conditional affinity spectra. Results showed that Cu binding to HS increased with increasing pH and decreasing ionic strength. The amount of Cu bound to the HAs was larger than the amount bound to JGFA. Milne's generic parameters did not provide satisfactory predictions for the present soil HS samples, while material-specific NICA-Donnan model parameters described and predicted Cu binding to the HS well. Both the 'low' and 'high' concentration fitting procedures indicated a substantial bidentate structure of the Cu complexes with HS. By means of the CAS underlying the NICA isotherm, which has scarcely been used, the nature of the binding at different solution conditions for a given sample and the differences in binding mode were illustrated. The carboxylic groups played an indispensable role in Cu binding to HS, in that the carboxylic CAS had stronger conditional affinity than the phenolic distribution due to its large degree of proton dissociation. This was especially true for JGFA and JLHA, which contain a much larger amount of carboxylic groups and for which the occupation of phenolic sites by Cu was negligible. Comparable amounts of carboxylic and phenolic groups on PAHA and JGHA increased the occupation of phenolic-type sites by Cu. The binding strength of PAHA-Cu and JGHA-Cu was stronger than that of JGFA-Cu and JLHA-Cu. The presence of phenolic groups increased the chance of forming more stable complexes, such as the salicylate-Cu or catechol-Cu type structures.
Copyright © 2016. Published by Elsevier Inc.

  5. PatchSurfers: Two methods for local molecular property-based binding ligand prediction.

    PubMed

    Shin, Woong-Hee; Bures, Mark Gregory; Kihara, Daisuke

    2016-01-15

    Protein function prediction is an active area of research in computational biology. Function prediction can help biologists make hypotheses for characterization of genes and help interpret biological assays, and thus is a productive area for collaboration between experimental and computational biologists. Among various function prediction methods, predicting binding ligand molecules for a target protein is an important class because ligand binding events for a protein are usually closely intertwined with the protein's biological function, and also because predicted binding ligands can often be directly tested by biochemical assays. Binding ligand prediction methods can be classified into two types: those that are based on protein-protein (or pocket-pocket) comparison, and those that compare a target pocket directly to ligands. Recently, our group proposed two computational binding ligand prediction methods, Patch-Surfer, which is a pocket-pocket comparison method, and PL-PatchSurfer, which compares a pocket to ligand molecules. The two programs apply surface patch-based descriptions to calculate similarity or complementarity between molecules. A surface patch is characterized by physicochemical properties such as shape, hydrophobicity, and electrostatic potentials. These properties on the surface are represented using three-dimensional Zernike descriptors (3DZD), which are based on a series expansion of a three-dimensional function. Utilizing 3DZD for describing the physicochemical properties has two main advantages: (1) rotational invariance and (2) fast comparison. Here, we introduce Patch-Surfer and PL-PatchSurfer with an emphasis on PL-PatchSurfer, which is more recently developed. Illustrative examples of PL-PatchSurfer performance on binding ligand prediction as well as virtual drug screening are also provided. Copyright © 2015 Elsevier Inc. All rights reserved.

  6. Binding site and affinity prediction of general anesthetics to protein targets using docking.

    PubMed

    Liu, Renyu; Perez-Aguilar, Jose Manuel; Liang, David; Saven, Jeffery G

    2012-05-01

    The protein targets for general anesthetics remain unclear. A tool to predict anesthetic binding for potential binding targets is needed. In this study, we explored whether a computational method, AutoDock, could serve as such a tool. High-resolution crystal data of water-soluble proteins (cytochrome C, apoferritin, and human serum albumin), and a membrane protein (a pentameric ligand-gated ion channel from Gloeobacter violaceus [GLIC]) were used. Isothermal titration calorimetry (ITC) experiments were performed to determine anesthetic affinity in solution conditions for apoferritin. Docking calculations were performed using DockingServer with the Lamarckian genetic algorithm and the Solis and Wets local search method (http://www.dockingserver.com/web). Twenty general anesthetics were docked into apoferritin. The predicted binding constants were compared with those obtained from ITC experiments for potential correlations. In the case of apoferritin, details of the binding site and their interactions were compared with recent cocrystallization data. Docking calculations for 6 general anesthetics currently used in clinical settings (isoflurane, sevoflurane, desflurane, halothane, propofol, and etomidate) with known 50% effective concentration (EC(50)) values were also performed in all tested proteins. The binding constants derived from docking experiments were compared with known EC(50) values and octanol/water partition coefficients for the 6 general anesthetics. All 20 general anesthetics docked unambiguously into the anesthetic binding site identified in the crystal structure of apoferritin. The binding constants for 20 anesthetics obtained from the docking calculations correlate significantly with those obtained from ITC experiments (P = 0.04). In the case of GLIC, the identified anesthetic binding sites in the crystal structure are among the docking predicted binding sites, but not the top ranked site. 
Docking calculations suggest a most probable binding site located in the extracellular domain of GLIC. The predicted affinities correlated significantly with the known EC(50) values for the 6 frequently used anesthetics in GLIC for the site identified in the experimental crystal data (P = 0.006). However, predicted affinities in apoferritin, human serum albumin, and cytochrome C did not correlate with these 6 anesthetics' known experimental EC(50) values. A weak correlation between the predicted affinities and the octanol/water partition coefficients was observed for the sites in GLIC. We demonstrated that anesthetic binding sites and relative affinities can be predicted using docking calculations in an automatic docking server (AutoDock) for both water-soluble and membrane proteins. Correlation of predicted affinity and EC(50) for 6 frequently used general anesthetics was only observed in GLIC, a member of a protein family relevant to anesthetic mechanism.

  7. Binding Site and Affinity Prediction of General Anesthetics to Protein Targets Using Docking

    PubMed Central

    Liu, Renyu; Perez-Aguilar, Jose Manuel; Liang, David; Saven, Jeffery G.

    2012-01-01

    Background The protein targets for general anesthetics remain unclear. A tool to predict anesthetic binding for potential binding targets is needed. In this study, we explore whether a computational method, AutoDock, could serve as such a tool. Methods High-resolution crystal data of water soluble proteins (cytochrome C, apoferritin and human serum albumin), and a membrane protein (a pentameric ligand-gated ion channel from Gloeobacter violaceus, GLIC) were used. Isothermal titration calorimetry (ITC) experiments were performed to determine anesthetic affinity in solution conditions for apoferritin. Docking calculations were performed using DockingServer with the Lamarckian genetic algorithm and the Solis and Wets local search method (https://www.dockingserver.com/web). Twenty general anesthetics were docked into apoferritin. The predicted binding constants are compared with those obtained from ITC experiments for potential correlations. In the case of apoferritin, details of the binding site and their interactions were compared with recent co-crystallization data. Docking calculations for six general anesthetics currently used in clinical settings (isoflurane, sevoflurane, desflurane, halothane, propofol, and etomidate) with known EC50 were also performed in all tested proteins. The binding constants derived from docking experiments were compared with known EC50s and octanol/water partition coefficients for the six general anesthetics. Results All 20 general anesthetics docked unambiguously into the anesthetic binding site identified in the crystal structure of apoferritin. The binding constants for 20 anesthetics obtained from the docking calculations correlate significantly with those obtained from ITC experiments (p=0.04). In the case of GLIC, the identified anesthetic binding sites in the crystal structure are among the docking predicted binding sites, but not the top ranked site. 
Docking calculations suggest a most probable binding site located in the extracellular domain of GLIC. The predicted affinities correlated significantly with the known EC50s for the six commonly used anesthetics in GLIC for the site identified in the experimental crystal data (p=0.006). However, predicted affinities in apoferritin, human serum albumin, and cytochrome C did not correlate with these six anesthetics’ known experimental EC50s. A weak correlation between the predicted affinities and the octanol/water partition coefficients was observed for the sites in GLIC. Conclusion We demonstrated that anesthetic binding sites and relative affinities can be predicted using docking calculations in an automatic docking server (Autodock) for both water soluble and membrane proteins. Correlation of predicted affinity and EC50 for six commonly used general anesthetics was only observed in GLIC, a member of a protein family relevant to anesthetic mechanism. PMID:22392968

  8. The Role of Telomeric Repeat Binding Factor 1 (TRF1) in Telomere Maintenance and as a Potential Prognostic Indicator in Human Breast Cancer

    DTIC Science & Technology

    2006-04-01

    for Specific Aim #3 have yet been initiated, and are proceeding on schedule. The PhD candidate has completed her educational goals. 13 Appendix A ... LEVELS OF TELOMERE PROTEIN MRNAS ARE PREDICTIVE OF TELOMERE CONTENT IN HUMAN BREAST TUMORS Kimberly S. Butler, William C. Hines, Diana Roberts

  9. Contribution of Molecular Allergen Analysis in Diagnosis of Milk Allergy.

    PubMed

    Bartuzi, Zbigniew; Cocco, Renata Rodrigues; Muraro, Antonella; Nowak-Węgrzyn, Anna

    2017-07-01

    We sought to describe the available evidence supporting the utilization of the molecular allergen analysis (MAA) for diagnosis and management of cow milk protein allergy (CMPA). Cow milk proteins are among the most common food allergens in IgE- and non-IgE-mediated food allergic disorders in children. Most individuals with CMPA are sensitized to both caseins and whey proteins. Caseins are more resistant to high temperatures compared to whey proteins. MAA is not superior to the conventional diagnostic tests based on the whole allergen extracts for diagnosis of CMPA. However, MAA can be useful in diagnosing tolerance to extensively heated milk proteins in baked foods. Children with CMPA and high levels of casein IgE are less likely to tolerate baked milk compared to children with low levels of casein IgE. Specific IgE-binding patterns to casein and beta-lactoglobulin peptides may predict the natural course of CMPA and differentiate subjects who are more likely to develop CMPA at a younger age versus those with a more persistent CMPA. Specific IgE-binding patterns to casein and beta-lactoglobulin peptides may also predict response to milk OIT and identify patients most likely to benefit from OIT.

  10. A Novel Predicted Bromodomain-Related Protein Affects Coordination Between Meiosis and Spermiogenesis in Drosophila and Is Required for Male Meiotic Cytokinesis

    PubMed Central

    Bergner, Laura M.; Hickman, F. Edward; Wood, Kathleen H.; Wakeman, Carolyn M.; Stone, Hunter H.; Campbell, Tessa J.; Lightcap, Samantha B.; Favors, Sheena M.; Aldridge, Amanda C.

    2010-01-01

    Temporal coordination of meiosis with spermatid morphogenesis is crucial for successful generation of mature sperm cells. We identified a recessive male sterile Drosophila melanogaster mutant, mitoshell, in which events of spermatid morphogenesis are initiated too early, before meiotic onset. Premature mitochondrial aggregation and fusion lead to an aberrant mitochondrial shell around premeiotic nuclei. Despite successful meiotic karyokinesis, improper mitochondrial localization in mitoshell testes is associated with defective astral central spindles and a lack of contractile rings, leading to meiotic cytokinesis failure. We mapped and cloned the mitoshell gene and found that it encodes a novel protein with a bromodomain-related region. It is conserved in some insect lineages. Bromodomains typically bind to histone acetyl-lysine residues and therefore are often associated with chromatin. The Mitoshell bromodomain-related region is predicted to have an alpha helical structure similar to that of bromodomains, but not all the crucial residues in the ligand-binding loops are conserved. We speculate that Mitoshell may participate in transcriptional regulation of spermatogenesis-specific genes, though perhaps with different ligand specificity compared to traditional bromodomains. PMID:20491580

  11. MusTRD can regulate postnatal fiber-specific expression.

    PubMed

    Issa, Laura L; Palmer, Stephen J; Guven, Kim L; Santucci, Nicole; Hodgson, Vanessa R M; Popovic, Kata; Joya, Josephine E; Hardeman, Edna C

    2006-05-01

    Human MusTRD1alpha1 was isolated as a result of its ability to bind a critical element within the Troponin I slow upstream enhancer (TnIslow USE) and was predicted to be a regulator of slow fiber-specific genes. To test this hypothesis in vivo, we generated transgenic mice expressing hMusTRD1alpha1 in skeletal muscle. Adult transgenic mice show a complete loss of slow fibers and a concomitant replacement by fast IIA fibers, resulting in postural muscle weakness. However, developmental analysis demonstrates that transgene expression has no impact on embryonic patterning of slow fibers but causes a gradual postnatal slow to fast fiber conversion. This conversion was underpinned by a demonstrable repression of many slow fiber-specific genes, whereas fast fiber-specific gene expression was either unchanged or enhanced. These data are consistent with our initial predictions for hMusTRD1alpha1 and suggest that slow fiber genes contain a specific common regulatory element that can be targeted by MusTRD proteins.

  12. Importance of ligand reorganization free energy in protein-ligand binding-affinity prediction.

    PubMed

    Yang, Chao-Yie; Sun, Haiying; Chen, Jianyong; Nikolovska-Coleska, Zaneta; Wang, Shaomeng

    2009-09-30

    Accurate prediction of the binding affinities of small-molecule ligands to their biological targets is fundamental for structure-based drug design but remains a very challenging task. In this paper, we have performed computational studies to predict the binding models of 31 small-molecule Smac (the second mitochondria-derived activator of caspase) mimetics to their target, the XIAP (X-linked inhibitor of apoptosis) protein, and their binding affinities. Our results showed that computational docking was able to reliably predict the binding models, as confirmed by experimentally determined crystal structures of some Smac mimetics complexed with XIAP. However, all the computational methods we have tested, including an empirical scoring function, two knowledge-based scoring functions, and MM-GBSA (molecular mechanics and generalized Born surface area), yield poor to modest prediction for binding affinities. The linear correlation coefficient (r(2)) value between the predicted affinities and the experimentally determined affinities was found to be between 0.21 and 0.36. Inclusion of ensemble protein-ligand conformations obtained from molecular dynamic simulations did not significantly improve the prediction. However, major improvement was achieved when the free-energy change for ligands between their free- and bound-states, or "ligand-reorganization free energy", was included in the MM-GBSA calculation, and the r(2) value increased from 0.36 to 0.66. The prediction was validated using 10 additional Smac mimetics designed and evaluated by an independent group. This study demonstrates that ligand reorganization free energy plays an important role in the overall binding free energy between Smac mimetics and XIAP. This term should be evaluated for other ligand-protein systems and included in the development of new scoring functions. 
To the best of our knowledge, this is the first computational study to demonstrate the importance of ligand reorganization free energy for the prediction of protein-ligand binding free energy.
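
    The comparison reported in the abstract above (an r(2) value between predicted and experimental affinities, before and after a per-ligand correction term is added) can be illustrated with a minimal sketch. This is not the authors' code: the function names `r_squared` and `add_correction` are hypothetical, and r(2) is assumed here to be the squared Pearson correlation coefficient.

```python
def r_squared(predicted, experimental):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    # Covariance and variances (unnormalised; the factors of n cancel).
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_p * var_e)


def add_correction(scores, corrections):
    """Add a per-ligand correction term (hypothetical stand-in for a
    ligand-reorganization free energy) to each base score."""
    if len(scores) != len(corrections):
        raise ValueError("scores and corrections must have the same length")
    return [s + c for s, c in zip(scores, corrections)]
```

    A caller would compute `r_squared(base_scores, measured)` and `r_squared(add_correction(base_scores, reorganization_terms), measured)` and compare the two values; all three input lists are placeholders that would have to come from the user's own calculations and assays.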

  13. Accurate Binding Free Energy Predictions in Fragment Optimization.

    PubMed

    Steinbrecher, Thomas B; Dahlgren, Markus; Cappel, Daniel; Lin, Teng; Wang, Lingle; Krilov, Goran; Abel, Robert; Friesner, Richard; Sherman, Woody

    2015-11-23

    Predicting protein-ligand binding free energies is a central aim of computational structure-based drug design (SBDD)--improved accuracy in binding free energy predictions could significantly reduce costs and accelerate project timelines in lead discovery and optimization. The recent development and validation of advanced free energy calculation methods represents a major step toward this goal. Accurately predicting the relative binding free energy changes of modifications to ligands is especially valuable in the field of fragment-based drug design, since fragment screens tend to deliver initial hits of low binding affinity that require multiple rounds of synthesis to gain the requisite potency for a project. In this study, we show that a free energy perturbation protocol, FEP+, which was previously validated on drug-like lead compounds, is suitable for the calculation of relative binding strengths of fragment-sized compounds as well. We study several pharmaceutically relevant targets with a total of more than 90 fragments and find that the FEP+ methodology, which uses explicit solvent molecular dynamics and physics-based scoring with no parameters adjusted, can accurately predict relative fragment binding affinities. The calculations afford R(2)-values on average greater than 0.5 compared to experimental data and RMS errors of ca. 1.1 kcal/mol overall, demonstrating significant improvements over the docking and MM-GBSA methods tested in this work and indicating that FEP+ has the requisite predictive power to impact fragment-based affinity optimization projects.

  14. Prediction of small molecule binding property of protein domains with Bayesian classifiers based on Markov chains.

    PubMed

    Bulashevska, Alla; Stein, Martin; Jackson, David; Eils, Roland

    2009-12-01

    Accurate computational methods that can help to predict the biological function of a protein from its sequence are of great interest to research biologists and pharmaceutical companies. One approach to inferring the function of proteins is to predict the interactions between proteins and other molecules. In this work, we propose a machine learning method that uses the primary sequence of a domain to predict its propensity for interaction with small molecules. By curating the Pfam database with respect to the small molecule binding ability of its component domains, we have constructed a dataset of small molecule binding and non-binding domains. This dataset was then used as a training set to learn a Bayesian classifier intended to distinguish members of each class. The domain sequences of both classes are modelled with Markov chains. In a jackknife test, our classification procedure achieved predictive accuracies of 77.2% and 66.7% for the binding and non-binding classes, respectively. We demonstrate the applicability of our classifier by using it to identify previously unknown small molecule binding domains. Our predictions are available as supplementary material and can provide very useful information to drug discovery specialists. Given the ubiquitous and essential role small molecules play in biological processes, our method is important for identifying pharmaceutically relevant components of complete proteomes. The software is available from the author upon request.
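
    The general idea described above (one Markov chain per class, combined with class priors in a Bayesian decision rule) can be sketched as follows. This is an illustrative assumption, not the published software: the chain order (first order), the add-one smoothing, the omission of an initial-state term, and the names `train_markov_chain`, `log_likelihood` and `classify` are all choices made here for the example.

```python
from math import log

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train_markov_chain(sequences, pseudocount=1.0):
    """Estimate first-order transition log-probabilities with additive smoothing."""
    counts = {a: {b: pseudocount for b in AMINO_ACIDS} for a in AMINO_ACIDS}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            # Transitions involving non-standard residues are skipped.
            if a in counts and b in counts[a]:
                counts[a][b] += 1
    model = {}
    for a, row in counts.items():
        total = sum(row.values())
        model[a] = {b: log(c / total) for b, c in row.items()}
    return model


def log_likelihood(seq, model):
    """Sum of transition log-probabilities along the sequence."""
    return sum(model[a][b] for a, b in zip(seq, seq[1:])
               if a in model and b in model[a])


def classify(seq, models, priors):
    """Return the class label with the highest log-posterior score."""
    scores = {label: log(priors[label]) + log_likelihood(seq, models[label])
              for label in models}
    return max(scores, key=scores.get)
```

    With `models = {"binding": train_markov_chain(binding_seqs), "non-binding": train_markov_chain(other_seqs)}` and priors that sum to one, `classify(domain_sequence, models, priors)` returns one of the two labels. The training lists are placeholders for a curated domain set such as the one the abstract describes.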

  15. The cellular transcription factor CREB corresponds to activating transcription factor 47 (ATF-47) and forms complexes with a group of polypeptides related to ATF-43.

    PubMed

    Hurst, H C; Masson, N; Jones, N C; Lee, K A

    1990-12-01

    Promoter elements containing the sequence motif CGTCA are important for a variety of inducible responses at the transcriptional level. Multiple cellular factors specifically bind to these elements and are encoded by a multigene family. Among these factors, polypeptides termed activating transcription factor 43 (ATF-43) and ATF-47 have been purified from HeLa cells and a factor referred to as cyclic AMP response element-binding protein (CREB) has been isolated from PC12 cells and rat brain. We demonstrated that CREB and ATF-47 are identical and that CREB and ATF-43 form protein-protein complexes. We also found that the cis requirements for stable DNA binding by ATF-43 and CREB are different. Using antibodies to ATF-43 we have identified a group of polypeptides (ATF-43) in the size range from 40 to 43 kDa. ATF-43 polypeptides are related by their reactivity with anti-ATF-43, DNA-binding specificity, complex formation with CREB, heat stability, and phosphorylation by protein kinase A. Certain cell types vary in their ATF-43 complement, suggesting that CREB activity is modulated in a cell-type-specific manner through interaction with ATF-43. ATF-43 polypeptides do not appear simply to correspond to the gene products of the ATF multigene family, suggesting that the size of the ATF family at the protein level is even larger than predicted from cDNA-cloning studies.

  16. Computational identification of epitopes in the glycoproteins of novel bunyavirus (SFTS virus) recognized by a human monoclonal antibody (MAb 4-5)

    NASA Astrophysics Data System (ADS)

    Zhang, Wenshuai; Zeng, Xiaoyan; Zhang, Li; Peng, Haiyan; Jiao, Yongjun; Zeng, Jun; Treutlein, Herbert R.

    2013-06-01

    In this work, we have developed a new approach to predict the epitopes of antigens that are recognized by a specific antibody. Our method is based on the "multiple copy simultaneous search" (MCSS) approach, which identifies optimal locations of small chemical functional groups on the surfaces of the antibody, and on identifying sequence patterns of peptides that can bind to the surface of the antibody. The identified sequence patterns are then used to search the amino-acid sequence of the antigen protein. The approach was validated by reproducing the binding epitope of HIV gp120 envelope glycoprotein for the human neutralizing antibody as revealed in the available crystal structure. Our method was then applied to predict the epitopes of two glycoproteins of a newly discovered bunyavirus recognized by an antibody named MAb 4-5. These predicted epitopes can be verified by experimental methods. We also discuss the involvement of different amino acids in the antigen-antibody recognition based on the distributions of MCSS minima of different functional groups.

  17. Toward a structure determination method for biomineral-associated protein using combined solid-state NMR and computational structure prediction.

    PubMed

    Masica, David L; Ash, Jason T; Ndao, Moise; Drobny, Gary P; Gray, Jeffrey J

    2010-12-08

    Protein-biomineral interactions are paramount to materials production in biology, including the mineral phase of hard tissue. Unfortunately, the structure of biomineral-associated proteins cannot be determined by X-ray crystallography or solution nuclear magnetic resonance (NMR). Here we report a method for determining the structure of biomineral-associated proteins. The method combines solid-state NMR (ssNMR) and ssNMR-biased computational structure prediction. In addition, the algorithm is able to identify lattice geometries most compatible with ssNMR constraints, representing a quantitative, novel method for investigating crystal-face binding specificity. We use this method to determine most of the structure of human salivary statherin interacting with the mineral phase of tooth enamel. Computation and experiment converge on an ensemble of related structures and identify preferential binding at three crystal surfaces. The work represents a significant advance toward determining structure of biomineral-adsorbed protein using experimentally biased structure prediction. This method is generally applicable to proteins that can be chemically synthesized.

  18. TopologyNet: Topology based deep convolutional and multi-task neural networks for biomolecular property predictions

    PubMed Central

    2017-01-01

    Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the geometric and biological complexity. To address this problem we introduce the element-specific persistent homology (ESPH) method. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains important biological information via a multichannel image-like representation. This representation reveals hidden structure-function relationships in biomolecules. We further integrate ESPH and deep convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the predictions of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the deep learning limitations from small and noisy training sets, we propose a multi-task multichannel topological convolutional neural network (MM-TCNN). We demonstrate that TopologyNet outperforms the latest methods in the prediction of protein-ligand binding affinities, mutation induced globular protein folding free energy changes, and mutation induced membrane protein folding free energy changes. Availability: weilab.math.msu.edu/TDL/ PMID:28749969

  19. GBshape: a genome browser database for DNA shape annotations.

    PubMed

    Chiu, Tsu-Pei; Yang, Lin; Zhou, Tianyin; Main, Bradley J; Parker, Stephen C J; Nuzhdin, Sergey V; Tullius, Thomas D; Rohs, Remo

    2015-01-01

    Many regulatory mechanisms require a high degree of specificity in protein-DNA binding. Nucleotide sequence does not provide an answer to the question of why a protein binds only to a small subset of the many putative binding sites in the genome that share the same core motif. Whereas higher-order effects, such as chromatin accessibility, cooperativity and cofactors, have been described, DNA shape recently gained attention as another feature that fine-tunes the DNA binding specificities of some transcription factor families. Our Genome Browser for DNA shape annotations (GBshape; freely available at http://rohslab.cmb.usc.edu/GBshape/) provides minor groove width, propeller twist, roll, helix twist and hydroxyl radical cleavage predictions for the entire genomes of 94 organisms. Additional genomes can easily be added using the GBshape framework. GBshape can be used to visualize DNA shape annotations qualitatively in a genome browser track format, and to download quantitative values of DNA shape features as a function of genomic position at nucleotide resolution. As biological applications, we illustrate the periodicity of DNA shape features that are present in nucleosome-occupied sequences from human, fly and worm, and we demonstrate structural similarities between transcription start sites in the genomes of four Drosophila species. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Real-Time Ligand Binding Pocket Database Search Using Local Surface Descriptors

    PubMed Central

    Chikhi, Rayan; Sael, Lee; Kihara, Daisuke

    2010-01-01

    Due to the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of particular interest is prediction of ligand binding to a protein, as ligand molecule recognition is a major part of molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two-dimensional pseudo-Zernike moments or the 3D Zernike descriptors. These compact representations allow a fast real-time pocket searching against a database. Thorough benchmark studies employing two different datasets show that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed. PMID:20455259

  1. Real-time ligand binding pocket database search using local surface descriptors.

    PubMed

    Chikhi, Rayan; Sael, Lee; Kihara, Daisuke

    2010-07-01

    Because of the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of particular interest is prediction of ligand binding to a protein, as ligand molecule recognition is a major part of molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two-dimensional pseudo-Zernike moments or the three-dimensional Zernike descriptors. These compact representations allow a fast real-time pocket searching against a database. Thorough benchmark studies employing two different datasets show that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed.

  2. A Modeling and Experimental Investigation of the Effects of Antigen Density, Binding Affinity, and Antigen Expression Ratio on Bispecific Antibody Binding to Cell Surface Targets*

    PubMed Central

    Rhoden, John J.; Dyas, Gregory L.

    2016-01-01

    Despite the increasing number of multivalent antibodies, bispecific antibodies, fusion proteins, and targeted nanoparticles that have been generated and studied, the mechanism of multivalent binding to cell surface targets is not well understood. Here, we describe a conceptual and mathematical model of multivalent antibody binding to cell surface antigens. Our model predicts that properties beyond 1:1 antibody:antigen affinity to target antigens have a strong influence on multivalent binding. Predicted crucial properties include the structure and flexibility of the antibody construct, the target antigen(s) and binding epitope(s), and the density of antigens on the cell surface. For bispecific antibodies, the ratio of the expression levels of the two target antigens is predicted to be critical to target binding, particularly for the lower expressed of the antigens. Using bispecific antibodies of different valencies to cell surface antigens including MET and EGF receptor, we have experimentally validated our modeling approach and its predictions and observed several nonintuitive effects of avidity related to antigen density, target ratio, and antibody affinity. In some biological circumstances, the effect we have predicted and measured varied from the monovalent binding interaction by several orders of magnitude. Moreover, our mathematical framework affords us a mechanistic interpretation of our observations and suggests strategies to achieve the desired antibody-antigen binding goals. These mechanistic insights have implications in antibody engineering and structure/activity relationship determination in a variety of biological contexts. PMID:27022022

  3. Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method

    PubMed Central

    Nielsen, Morten; Lundegaard, Claus; Lund, Ole

    2007-01-01

    Background Antigen presenting cells (APCs) sample the extra cellular space and present peptides from here to T helper cells, which can be activated if the peptides are of foreign origin. The peptides are presented on the surface of the cells in complex with major histocompatibility class II (MHC II) molecules. Identification of peptides that bind MHC II molecules is thus a key step in rational vaccine design and developing methods for accurate prediction of the peptide:MHC interactions play a central role in epitope discovery. The MHC class II binding groove is open at both ends making the correct alignment of a peptide in the binding groove a crucial part of identifying the core of an MHC class II binding motif. Here, we present a novel stabilization matrix alignment method, SMM-align, that allows for direct prediction of peptide:MHC binding affinities. The predictive performance of the method is validated on a large MHC class II benchmark data set covering 14 HLA-DR (human MHC) and three mouse H2-IA alleles. Results The predictive performance of the SMM-align method was demonstrated to be superior to that of the Gibbs sampler, TEPITOPE, SVRMHC, and MHCpred methods. Cross validation between peptide data set obtained from different sources demonstrated that direct incorporation of peptide length potentially results in over-fitting of the binding prediction method. Focusing on amino terminal peptide flanking residues (PFR), we demonstrate a consistent gain in predictive performance by favoring binding registers with a minimum PFR length of two amino acids. Visualizing the binding motif as obtained by the SMM-align and TEPITOPE methods highlights a series of fundamental discrepancies between the two predicted motifs. For the DRB1*1302 allele for instance, the TEPITOPE method favors basic amino acids at most anchor positions, whereas the SMM-align method identifies a preference for hydrophobic or neutral amino acids at the anchors. 
Conclusion The SMM-align method was shown to outperform other state-of-the-art MHC class II prediction methods. The method predicts quantitative peptide:MHC binding affinity values, making it ideally suited for rational epitope discovery. The method has been trained and evaluated on what is, to our knowledge, the largest publicly available benchmark data set and covers the nine suggested HLA-DR supertypes as well as three mouse H2-IA alleles. Both the peptide benchmark data set and the SMM-align prediction method (NetMHCII) are made publicly available. PMID:17608956

  4. Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method.

    PubMed

    Nielsen, Morten; Lundegaard, Claus; Lund, Ole

    2007-07-04

    Antigen presenting cells (APCs) sample the extra cellular space and present peptides from here to T helper cells, which can be activated if the peptides are of foreign origin. The peptides are presented on the surface of the cells in complex with major histocompatibility class II (MHC II) molecules. Identification of peptides that bind MHC II molecules is thus a key step in rational vaccine design and developing methods for accurate prediction of the peptide:MHC interactions play a central role in epitope discovery. The MHC class II binding groove is open at both ends making the correct alignment of a peptide in the binding groove a crucial part of identifying the core of an MHC class II binding motif. Here, we present a novel stabilization matrix alignment method, SMM-align, that allows for direct prediction of peptide:MHC binding affinities. The predictive performance of the method is validated on a large MHC class II benchmark data set covering 14 HLA-DR (human MHC) and three mouse H2-IA alleles. The predictive performance of the SMM-align method was demonstrated to be superior to that of the Gibbs sampler, TEPITOPE, SVRMHC, and MHCpred methods. Cross validation between peptide data set obtained from different sources demonstrated that direct incorporation of peptide length potentially results in over-fitting of the binding prediction method. Focusing on amino terminal peptide flanking residues (PFR), we demonstrate a consistent gain in predictive performance by favoring binding registers with a minimum PFR length of two amino acids. Visualizing the binding motif as obtained by the SMM-align and TEPITOPE methods highlights a series of fundamental discrepancies between the two predicted motifs. For the DRB1*1302 allele for instance, the TEPITOPE method favors basic amino acids at most anchor positions, whereas the SMM-align method identifies a preference for hydrophobic or neutral amino acids at the anchors. 
The SMM-align method was shown to outperform other state-of-the-art MHC class II prediction methods. The method predicts quantitative peptide:MHC binding affinity values, making it ideally suited for rational epitope discovery. The method has been trained and evaluated on the, to our knowledge, largest benchmark data set publicly available and covers the nine HLA-DR supertypes suggested as well as three mouse H2-IA alleles. Both the peptide benchmark data set and the SMM-align prediction method (NetMHCII) are made publicly available.
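
The register search described in this abstract can be illustrated with a minimal Python sketch. It is not the SMM-align implementation: the scoring matrix, the convention that a higher score means stronger predicted binding, and the `short_pfr_penalty` parameter are all hypothetical choices made for illustration. It slides a 9-residue core window along a peptide, sums per-position weights, and disfavors registers whose N-terminal flank is shorter than two residues.

```python
from typing import Dict, List, Tuple

CORE_LEN = 9  # length of the binding core window assumed in this sketch


def score_core(core: str, matrix: List[Dict[str, float]]) -> float:
    """Sum per-position weights for one candidate core (unknown residues score 0)."""
    return sum(matrix[i].get(aa, 0.0) for i, aa in enumerate(core))


def best_register(
    peptide: str,
    matrix: List[Dict[str, float]],
    min_pfr: int = 2,
    short_pfr_penalty: float = 0.5,  # hypothetical value, not taken from the paper
) -> Tuple[int, float]:
    """Return (offset, score) of the highest-scoring core window.

    Registers whose N-terminal flank is shorter than min_pfr are penalized,
    which favors (but does not force) longer flanks.
    """
    if len(peptide) < CORE_LEN:
        raise ValueError("peptide is shorter than the binding core")
    best = None
    for offset in range(len(peptide) - CORE_LEN + 1):
        score = score_core(peptide[offset:offset + CORE_LEN], matrix)
        if offset < min_pfr:
            score -= short_pfr_penalty
        if best is None or score > best[1]:
            best = (offset, score)
    return best


# Toy matrix: only a leucine at core position 1 contributes to the score.
toy_matrix: List[Dict[str, float]] = [{"L": 1.0}] + [{} for _ in range(CORE_LEN - 1)]
offset, score = best_register("LAALAAAAAAAAA", toy_matrix)
```

With the toy matrix, the two leucines give equally good raw cores, and the flank penalty decides between them; setting `min_pfr=0` removes that preference.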

  5. Structural motif screening reveals a novel, conserved carbohydrate-binding surface in the pathogenesis-related protein PR-5d.

    PubMed

    Doxey, Andrew C; Cheng, Zhenyu; Moffatt, Barbara A; McConkey, Brendan J

    2010-08-03

    Aromatic amino acids play a critical role in protein-glycan interactions. Clusters of surface aromatic residues and their features may therefore be useful in distinguishing glycan-binding sites as well as predicting novel glycan-binding proteins. In this work, a structural bioinformatics approach was used to screen the Protein Data Bank (PDB) for coplanar aromatic motifs similar to those found in known glycan-binding proteins. The proteins identified in the screen were significantly associated with carbohydrate-related functions according to gene ontology (GO) enrichment analysis, and predicted motifs were found frequently within novel folds and glycan-binding sites not included in the training set. In addition to numerous binding sites predicted in structural genomics proteins of unknown function, one novel prediction was a surface motif (W34/W36/W192) in the tobacco pathogenesis-related protein, PR-5d. Phylogenetic analysis revealed that the surface motif is exclusive to a subfamily of PR-5 proteins from the Solanaceae family of plants, and is absent completely in more distant homologs. To confirm PR-5d's insoluble-polysaccharide binding activity, a cellulose-pulldown assay of tobacco proteins was performed and PR-5d was identified in the cellulose-binding fraction by mass spectrometry. Based on the combined results, we propose that the putative binding site in PR-5d may be an evolutionary adaptation of Solanaceae plants including potato, tomato, and tobacco, towards defense against cellulose-containing pathogens such as species of the deadly oomycete genus, Phytophthora. More generally, the results demonstrate that coplanar aromatic clusters on protein surfaces are a structural signature of glycan-binding proteins, and can be used to computationally predict novel glycan-binding proteins from 3D structure.

  6. Determinants of RNA binding and translational repression by the Bicaudal-C regulatory protein.

    PubMed

    Zhang, Yan; Park, Sookhee; Blaser, Susanne; Sheets, Michael D

    2014-03-14

    Bicaudal-C (Bic-C) RNA binding proteins function as important translational repressors in multiple biological contexts within metazoans. However, their RNA binding sites are unknown. We recently demonstrated that Bic-C functions in spatially regulated translational repression of the xCR1 mRNA during Xenopus development. This repression contributes to normal development by confining the xCR1 protein, a regulator of key signaling pathways, to specific cells of the embryo. In this report, we combined biochemical approaches with in vivo mRNA reporter assays to define the minimal Bic-C target site within the xCR1 mRNA. This 32-nucleotide Bic-C target site is predicted to fold into a stem-loop secondary structure. Mutational analyses provided evidence that this stem-loop structure is important for Bic-C binding. The Bic-C target site was sufficient for Bic-C mediated repression in vivo. Thus, we describe the first RNA binding site for a Bic-C protein. This identification provides an important step toward understanding the mechanisms by which evolutionarily conserved Bic-C proteins control cellular function in metazoans.

  7. SuperPain—a resource on pain-relieving compounds targeting ion channels

    PubMed Central

    Gohlke, Björn O.; Preissner, Robert; Preissner, Saskia

    2014-01-01

    Pain is more than an unpleasant sensory experience associated with actual or potential tissue damage: it is the most common reason for physician consultation and often dramatically affects quality of life. The management of pain is often difficult and new targets are required for more effective and specific treatment. SuperPain (http://bioinformatics.charite.de/superpain/) is a freely available database for pain-stimulating and pain-relieving compounds, which bind or potentially bind to ion channels that are involved in the transmission of pain signals to the central nervous system, such as TRPV1, TRPM8, TRPA1, TREK1, TRESK, hERG, ASIC, P2X and voltage-gated sodium channels. The database consists of ∼8700 ligands, which are characterized by experimentally measured binding affinities. Additionally, 100 000 putative ligands are included. Moreover, the database provides 3D structures of receptors and predicted ligand-binding poses. These binding poses and a structural classification scheme provide hints for the design of new analgesic compounds. A user-friendly graphical interface allows similarity searching, visualization of ligands docked into the receptor, etc. PMID:24271391

  8. SuperPain--a resource on pain-relieving compounds targeting ion channels.

    PubMed

    Gohlke, Björn O; Preissner, Robert; Preissner, Saskia

    2014-01-01

    Pain is more than an unpleasant sensory experience associated with actual or potential tissue damage: it is the most common reason for physician consultation and often dramatically affects quality of life. The management of pain is often difficult and new targets are required for more effective and specific treatment. SuperPain (http://bioinformatics.charite.de/superpain/) is a freely available database for pain-stimulating and pain-relieving compounds, which bind or potentially bind to ion channels that are involved in the transmission of pain signals to the central nervous system, such as TRPV1, TRPM8, TRPA1, TREK1, TRESK, hERG, ASIC, P2X and voltage-gated sodium channels. The database consists of ∼8700 ligands, which are characterized by experimentally measured binding affinities. Additionally, 100 000 putative ligands are included. Moreover, the database provides 3D structures of receptors and predicted ligand-binding poses. These binding poses and a structural classification scheme provide hints for the design of new analgesic compounds. A user-friendly graphical interface allows similarity searching, visualization of ligands docked into the receptor, etc.

  9. Displacement of disordered water molecules from hydrophobic pocket creates enthalpic signature: binding of phosphonamidate to the S₁'-pocket of thermolysin.

    PubMed

    Englert, L; Biela, A; Zayed, M; Heine, A; Hangauer, D; Klebe, G

    2010-11-01

    Prerequisite for the design of tight binding protein inhibitors and prediction of their properties is an in-depth understanding of the structural and thermodynamic details of the binding process. A series of closely related phosphonamidates was studied to elucidate the forces underlying their binding affinity to thermolysin. The investigated inhibitors are identical except for the parts penetrating into the hydrophobic S₁'-pocket. A correlation of structural, kinetic and thermodynamic data was carried out by X-ray crystallography, kinetic inhibition assay and isothermal titration calorimetry. Binding affinity increases with larger ligand hydrophobic P₁'-moieties accommodating the S₁'-pocket. Surprisingly, larger P₁'-side chain modifications are accompanied by an increase in the enthalpic contribution to binding. In agreement with other studies, it is suggested that the release of largely disordered waters from an imperfectly hydrated pocket results in an enthalpically favourable integration of these water molecules into bulk water upon inhibitor binding. This enthalpically favourable process contributes more strongly to the binding energetics than the entropy increase resulting from the release of water molecules from the S₁'-pocket or the formation of apolar interactions between protein and inhibitor. Displacement of highly disordered water molecules from a rather imperfectly hydrated and hydrophobic specificity pocket can reveal an enthalpic signature of inhibitor binding.

  10. Improved accuracy of supervised CRM discovery with interpolated Markov models and cross-species comparison

    PubMed Central

    Kazemian, Majid; Zhu, Qiyun; Halfon, Marc S.; Sinha, Saurabh

    2011-01-01

    Despite recent advances in experimental approaches for identifying transcriptional cis-regulatory modules (CRMs, ‘enhancers’), direct empirical discovery of CRMs for all genes in all cell types and environmental conditions is likely to remain an elusive goal. Effective methods for computational CRM discovery are thus a critically needed complement to empirical approaches. However, existing computational methods that search for clusters of putative binding sites are ineffective if the relevant TFs and/or their binding specificities are unknown. Here, we provide a significantly improved method for ‘motif-blind’ CRM discovery that does not depend on knowledge or accurate prediction of TF-binding motifs and is effective when limited knowledge of functional CRMs is available to ‘supervise’ the search. We propose a new statistical method, based on ‘Interpolated Markov Models’, for motif-blind, genome-wide CRM discovery. It captures the statistical profile of variable length words in known CRMs of a regulatory network and finds candidate CRMs that match this profile. The method also uses orthologs of the known CRMs from closely related genomes. We perform in silico evaluation of predicted CRMs by assessing whether their neighboring genes are enriched for the expected expression patterns. This assessment uses a novel statistical test that extends the widely used Hypergeometric test of gene set enrichment to account for variability in intergenic lengths. We find that the new CRM prediction method is superior to existing methods. Finally, we experimentally validate 12 new CRM predictions by examining their regulatory activity in vivo in Drosophila; 10 of the tested CRMs were found to be functional, while 6 of the top 7 predictions showed the expected activity patterns. We make our program available as downloadable source code, and as a plugin for a genome browser installed on our servers. PMID:21821659
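
For orientation, the standard hypergeometric enrichment test that this abstract says it extends can be sketched in Python as below. This is the textbook one-sided test only; the authors' extension for variable intergenic lengths is not reproduced here, and the function and argument names are hypothetical.

```python
from math import comb


def hypergeom_pvalue(population: int, successes: int, draws: int, observed: int) -> float:
    """One-sided enrichment p-value P(X >= observed).

    population: total number of genes considered
    successes:  genes in the population with the expected expression pattern
    draws:      genes neighboring predicted CRMs
    observed:   genes that are in both sets (assumed to be >= 0)
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes, 5 with the pattern, 5 drawn, all 5 drawn genes have the pattern.
p_value = hypergeom_pvalue(population=10, successes=5, draws=5, observed=5)
```

A small p-value indicates that the overlap is larger than expected if the neighboring genes were drawn at random from the population.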

  11. Novel, customizable scoring functions, parameterized using N-PLS, for structure-based drug discovery.

    PubMed

    Catana, Cornel; Stouten, Pieter F W

    2007-01-01

    The ability to accurately predict biological affinity on the basis of in silico docking to a protein target remains a challenging goal in the CADD arena. Typically, "standard" scoring functions have been employed that use the calculated docking result and a set of empirical parameters to calculate a predicted binding affinity. To improve on this, we are exploring novel strategies for rapidly developing and tuning "customized" scoring functions tailored to a specific need. In the present work, three such customized scoring functions were developed using a set of 129 high-resolution protein-ligand crystal structures with measured Ki values. The functions were parametrized using N-PLS (N-way partial least squares), a multivariate technique well-known in the 3D quantitative structure-activity relationship field. A modest correlation between observed and calculated pKi values using a standard scoring function (r² = 0.5) could be improved to 0.8 when a customized scoring function was applied. To mimic a more realistic scenario, a second scoring function was developed, not based on crystal structures but exclusively on several binding poses generated with the Flo+ docking program. Finally, a validation study was conducted by generating a third scoring function with 99 randomly selected complexes from the 129 as a training set and predicting pKi values for a test set that comprised the remaining 30 complexes. Training and test set r² values were 0.77 and 0.78, respectively. These results indicate that, even without direct structural information, predictive customized scoring functions can be developed using N-PLS, and this approach holds significant potential as a general procedure for predicting binding affinity on the basis of in silico docking.
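
The validation design in this abstract (99 randomly chosen training complexes, 30 held out, agreement reported as r²) can be sketched as follows. This is an illustrative Python outline only: it assumes r² means the squared Pearson correlation between observed and predicted pKi values, the function names are hypothetical, and no N-PLS fitting is shown.

```python
import random
from typing import List, Sequence, Tuple


def pearson_r2(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def split_indices(n_total: int, n_train: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly partition indices 0..n_total-1 into training and test sets."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


# 129 complexes split into 99 for training and 30 for testing, as in the abstract.
train_idx, test_idx = split_indices(129, 99)
```

In such a design, a scoring function would be fitted on the complexes in `train_idx` and `pearson_r2` evaluated separately on each subset.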

  12. Automated docking of ligands to an artificial active site: augmenting crystallographic analysis with computer modeling

    NASA Astrophysics Data System (ADS)

    Rosenfeld, Robin J.; Goodsell, David S.; Musah, Rabi A.; Morris, Garrett M.; Goodin, David B.; Olson, Arthur J.

    2003-08-01

    The W191G cavity of cytochrome c peroxidase is useful as a model system for introducing small molecule oxidation in an artificially created cavity. A set of small, cyclic, organic cations was previously shown to bind in the buried, solvent-filled pocket created by the W191G mutation. We docked these ligands and a set of non-binders in the W191G cavity using AutoDock 3.0. For the ligands, we compared docking predictions with experimentally determined binding energies and X-ray crystal structure complexes. For the ligands, predicted binding energies differed from measured values by ± 0.8 kcal/mol. For most ligands, the docking simulation clearly predicted a single binding mode that matched the crystallographic binding mode within 1.0 Å RMSD. For 2 ligands, where the docking procedure yielded an ambiguous result, solutions matching the crystallographic result could be obtained by including an additional crystallographically observed water molecule in the protein model. For the remaining 2 ligands, docking indicated multiple binding modes, consistent with the original electron density, suggesting disordered binding of these ligands. Visual inspection of the atomic affinity grid maps used in docking calculations revealed two patches of high affinity for hydrogen bond donating groups. Multiple solutions are predicted as these two sites compete for polar hydrogens in the ligand during the docking simulation. Ligands could be distinguished, to some extent, from non-binders using a combination of two trends: predicted binding energy and level of clustering. In summary, AutoDock 3.0 appears to be useful in predicting key structural and energetic features of ligand binding in the W191G cavity.
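
The pose comparison reported here ("within 1.0 Å RMSD") rests on the root-mean-square deviation between matched atoms. A minimal Python sketch, assuming both poses are expressed in the same receptor coordinate frame and list the same atoms in the same order (so no superposition step is applied), is given below; the coordinates are made up for illustration.

```python
from math import sqrt
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(pose_a: Sequence[Coord], pose_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two equally ordered sets of atom coordinates."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return sqrt(squared / len(pose_a))


# Toy example: every atom of the docked pose is shifted by 1 Å along z.
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(crystal, docked)
```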

  13. Literature-based condition-specific miRNA-mRNA target prediction.

    PubMed

    Oh, Minsik; Rhee, Sungmin; Moon, Ji Hwan; Chae, Heejoon; Lee, Sunwon; Kang, Jaewoo; Kim, Sun

    2017-01-01

    miRNAs are small non-coding RNAs that regulate gene expression by binding to the 3'-UTR of genes. Many recent studies have reported that miRNAs play important biological roles by regulating specific mRNAs or genes. Many sequence-based target prediction algorithms have been developed to predict miRNA targets. However, these methods are not designed for condition-specific target predictions and produce many false positives; thus, expression-based target prediction algorithms have been developed for condition-specific target predictions. A typical strategy to utilize expression data is to leverage the negative control roles of miRNAs on genes. To control false positives, a stringent cutoff value is typically set, but in this case, these methods tend to reject many true target relationships, i.e., false negatives. To overcome these limitations, additional information should be utilized. The literature is probably the best resource that we can utilize. Recent literature mining systems compile millions of articles with experiments designed for specific biological questions, and the systems provide a function to search for specific information. To utilize the literature information, we used a literature mining system, BEST, that automatically extracts information from the literature in PubMed and that allows the user to perform searches of the literature with any English words. By integrating omics data analysis methods and BEST, we developed Context-MMIA, a miRNA-mRNA target prediction method that combines expression data analysis results and the literature information extracted based on the user-specified context. In the pathway enrichment analysis using genes included in the top 200 miRNA-targets, Context-MMIA outperformed the four existing target prediction methods that we tested. In another test on whether prediction methods can re-produce experimentally validated target relationships, Context-MMIA outperformed the four existing target prediction methods. 
In summary, Context-MMIA allows the user to specify a context of the experimental data to predict miRNA targets, and we believe that Context-MMIA is very useful for predicting condition-specific miRNA targets.

  14. Molecular Hybridization of Potent and Selective γ-Hydroxybutyric Acid (GHB) Ligands: Design, Synthesis, Binding Studies, and Molecular Modeling of Novel 3-Hydroxycyclopent-1-enecarboxylic Acid (HOCPCA) and trans-γ-Hydroxycrotonic Acid (T-HCA) Analogs.

    PubMed

    Krall, Jacob; Jensen, Claus Hatt; Bavo, Francesco; Falk-Petersen, Christina Birkedahl; Haugaard, Anne Stæhr; Vogensen, Stine Byskov; Tian, Yongsong; Nittegaard-Nielsen, Mia; Sigurdardóttir, Sara Björk; Kehler, Jan; Kongstad, Kenneth Thermann; Gloriam, David E; Clausen, Rasmus Prætorius; Harpsøe, Kasper; Wellendorph, Petrine; Frølund, Bente

    2017-11-09

    γ-Hydroxybutyric acid (GHB) is a neuroactive substance with specific high-affinity binding sites. To facilitate target identification and ligand optimization, we herein report a comprehensive structure-affinity relationship study for novel ligands targeting these binding sites. A molecular hybridization strategy was used based on the conformationally restricted 3-hydroxycyclopent-1-enecarboxylic acid (HOCPCA) and the linear GHB analog trans-4-hydroxycrotonic acid (T-HCA). In general, all structural modifications performed on HOCPCA led to reduced affinity. In contrast, introduction of diaromatic substituents into the 4-position of T-HCA led to high-affinity analogs (medium nanomolar Ki) for the GHB high-affinity binding sites as the most high-affinity analogs reported to date. The SAR data formed the basis for a three-dimensional pharmacophore model for GHB ligands, which identified molecular features important for high-affinity binding, with high predictive validity. These findings will be valuable in the further processes of both target characterization and ligand identification for the high-affinity GHB binding sites.

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bianchetti, Christopher M.; Bingman, Craig A.; Phillips, Jr., George N.

    The thanatos (the Greek god of death)-associated protein (THAP) domain is a sequence-specific DNA-binding domain that contains a C2-CH (Cys-Xaa₂₋₄-Cys-Xaa₃₅₋₅₀-Cys-Xaa₂-His) zinc finger that is similar to the DNA domain of the P element transposase from Drosophila. THAP-containing proteins have been observed in the proteome of humans, pigs, cows, chickens, zebrafish, Drosophila, C. elegans, and Xenopus. To date, there are no known THAP domain proteins in plants, yeast, or bacteria. There are 12 identified human THAP domain-containing proteins (THAP0-11). In all human THAP proteins, the THAP domain is located at the N-terminus and is ~90 residues in length. Although all of the human THAP-containing proteins have a homologous N-terminus, there is extensive variation in both the predicted structure and length of the remaining protein. Even though the exact function of these THAP proteins is not well defined, there is evidence that they play a role in cell proliferation, apoptosis, cell cycle modulation, chromatin modification, and transcriptional regulation. THAP-containing proteins have also been implicated in a number of human disease states including heart disease, neurological defects, and several types of cancers. Human THAP4 is a 577-residue protein of unknown function that is proposed to bind DNA in a sequence-specific manner similar to THAP1 and has been found to be upregulated in response to heat shock. THAP4 is expressed in a relatively uniform manner in a broad range of tissues and appears to be upregulated in lymphoma cells and highly expressed in heart cells. The C-terminal domain of THAP4 (residues 415-577), designated here as cTHAP4, is evolutionarily conserved and is observed in all known THAP4 orthologs. Several single-domain proteins lacking a THAP domain are found in plants and bacteria and show significant levels of homology to cTHAP4.
It appears that cTHAP4 belongs to a large class of proteins that have yet to be fully functionally characterized. On the basis of prior work, we predicted that cTHAP4 is composed of a heme-binding nitrobindin domain, making THAP4 the only human THAP protein predicted to bind a cofactor. Nitrobindin, a recently characterized protein from Arabidopsis thaliana, is structurally similar and exhibits nitric oxide (NO)-binding properties that resemble the heme-binding nitrophorins. Nitrophorins use a heme moiety to store, transport, and release NO in a pH-specific manner. Although the exact function of nitrobindin is not fully known, the similarities between the well-characterized nitrophorins imply a role in NO transport, sensing, or metabolism. To better elucidate the possible function of THAP4, we solved the heme-bound structure of cTHAP4 to a resolution of 1.79 Å.

  16. Integrating water exclusion theory into β contacts to predict binding free energy changes and binding hot spots

    PubMed Central

    2014-01-01

    Background: Binding free energy and binding hot spots at protein-protein interfaces are two important research areas for understanding protein interactions. Computational methods have been developed previously for accurate prediction of binding free energy change upon mutation for interfacial residues. However, a large number of interrupted and unimportant atomic contacts are used in the training phase which caused accuracy loss. Results: This work proposes a new method, βACV ASA, to predict the change of binding free energy after alanine mutations. βACV ASA integrates accessible surface area (ASA) and our newly defined β contacts together into an atomic contact vector (ACV). A β contact between two atoms is a direct contact without being interrupted by any other atom between them. A β contact’s potential contribution to protein binding is also supposed to be inversely proportional to its ASA to follow the water exclusion hypothesis of binding hot spots. Tested on a dataset of 396 alanine mutations, our method is found to be superior in classification performance to many other methods, including Robetta, FoldX, HotPOINT, an ACV method of β contacts without ASA integration, and ACV ASA methods (similar to βACV ASA but based on distance-cutoff contacts). Based on our data analysis and results, we can draw conclusions that: (i) our method is powerful in the prediction of binding free energy change after alanine mutation; (ii) β contacts are better than distance-cutoff contacts for modeling the well-organized protein-binding interfaces; (iii) β contacts usually are only a small fraction of the distance-based contacts; and (iv) water exclusion is a necessary condition for a residue to become a binding hot spot. Conclusions: βACV ASA is designed using the advantages of both β contacts and water exclusion. It is an excellent tool to predict binding free energy changes and binding hot spots after alanine mutation. PMID:24568581

  17. NFI Transcription Factors Interact with FOXA1 to Regulate Prostate-Specific Gene Expression

    PubMed Central

    Elliott, Amicia D.; DeGraff, David J.; Anderson, Philip D.; Anumanthan, Govindaraj; Yamashita, Hironobu; Sun, Qian; Friedman, David B.; Hachey, David L.; Yu, Xiuping; Sheehan, Jonathan H.; Ahn, Jung-Mo; Raj, Ganesh V.; Piston, David W.; Gronostajski, Richard M.; Matusik, Robert J.

    2014-01-01

    Androgen receptor (AR) action throughout prostate development and in maintenance of the prostatic epithelium is partly controlled by interactions between AR and forkhead box (FOX) transcription factors, particularly FOXA1. We sought to identify additional FOXA1 binding partners that may mediate prostate-specific gene expression. Here we identify the nuclear factor I (NFI) family of transcription factors as novel FOXA1 binding proteins. All four family members (NFIA, NFIB, NFIC, and NFIX) can interact with FOXA1, and knockdown studies in androgen-dependent LNCaP cells determined that modulating expression of NFI family members results in changes in AR target gene expression. This effect is probably mediated by binding of NFI family members to AR target gene promoters, because chromatin immunoprecipitation (ChIP) studies found that NFIB bound to the prostate-specific antigen enhancer. Förster resonance energy transfer studies revealed that FOXA1 is capable of bringing AR and NFIX into proximity, indicating that FOXA1 facilitates the AR and NFI interaction by bridging the complex. To determine the extent to which NFI family members regulate AR/FOXA1 target genes, motif analysis of publicly available data for ChIP followed by sequencing was undertaken. This analysis revealed that 34.4% of peaks bound by AR and FOXA1 contain NFI binding sites. Validation of 8 of these peaks by ChIP revealed that NFI family members can bind 6 of these predicted genomic elements, and 4 of the 8 associated genes undergo gene expression changes as a result of individual NFI knockdown. These observations suggest that NFI regulation of FOXA1/AR action is a frequent event, with individual family members playing distinct roles in AR target gene expression. PMID:24801505

  18. Peptide interfaces with graphene: an emerging intersection of analytical chemistry, theory, and materials.

    PubMed

    Russell, Shane R; Claridge, Shelley A

    2016-04-01

    Because noncovalent interface functionalization is frequently required in graphene-based devices, biomolecular self-assembly has begun to emerge as a route for controlling substrate electronic structure or binding specificity for soluble analytes. The remarkable diversity of structures that arise in biological self-assembly hints at the possibility of equally diverse and well-controlled surface chemistry at graphene interfaces. However, predicting and analyzing adsorbed monolayer structures at such interfaces raises substantial experimental and theoretical challenges. In contrast with the relatively well-developed monolayer chemistry and characterization methods applied at coinage metal surfaces, monolayers on graphene are both less robust and more structurally complex, levying more stringent requirements on characterization techniques. Theory presents opportunities to understand early binding events that lay the groundwork for full monolayer structure. However, predicting interactions between complex biomolecules, solvent, and substrate is necessitating a suite of new force fields and algorithms to assess likely binding configurations, solvent effects, and modulations to substrate electronic properties. This article briefly discusses emerging analytical and theoretical methods used to develop a rigorous chemical understanding of the self-assembly of peptide-graphene interfaces and prospects for future advances in the field.

  19. Conservation of Matrix Attachment Region-Binding Filament-Like Protein 1 among Higher Plants

    PubMed Central

    Harder, Patricia A.; Silverstein, Rebecca A.; Meier, Iris

    2000-01-01

    The interaction of chromatin with the nuclear matrix via matrix attachment regions (MARs) on the DNA is considered to be of fundamental importance for higher-order chromatin organization and the regulation of gene expression. We have previously isolated a novel nuclear matrix-localized protein (MFP1) from tomato (Lycopersicon esculentum) that preferentially binds to MAR DNA. Tomato MFP1 has a predicted filament-protein-like structure and is associated with the nuclear envelope via an N-terminal targeting domain. Based on the antigenic relationship, we report here that MFP1 is conserved in a large number of dicot and monocot species. Several cDNAs were cloned from tobacco (Nicotiana tabacum) and shown to correspond to two tobacco MFP1 genes. Comparison of the primary and predicted secondary structures of MFP1 from tomato, tobacco, and Arabidopsis indicates a high degree of conservation of the N-terminal targeting domain, the overall putative coiled-coil structure of the protein, and the C-terminal DNA-binding domain. In addition, we show that tobacco MFP1 is regulated in an organ-specific and developmental fashion, and that this regulation occurs at the level of transcription or RNA stability. PMID:10631266

  20. Binding free energy prediction in strongly hydrophobic biomolecular systems.

    PubMed

    Charlier, Landry; Nespoulous, Claude; Fiorucci, Sébastien; Antonczak, Serge; Golebiowski, Jérome

    2007-11-21

    We present a comparison of various computational approaches aiming at predicting the binding free energy in ligand-protein systems where the ligand is located within a highly hydrophobic cavity. The relative binding free energy between similar ligands is obtained by means of the thermodynamic integration (TI) method and compared to experimental data obtained through isothermal titration calorimetry measurements. The absolute free energy of binding prediction was obtained on a similar system (a pyrazine derivative bound to a lipocalin) by TI, potential of mean force (PMF) and also by means of the MMPBSA protocols. Although the TI protocol performs poorly either with an explicit or an implicit solvation scheme, the PMF calculation using an implicit solvation scheme leads to encouraging results, with a prediction of the binding affinity being 2 kcal mol(-1) lower than the experimental value. The use of an implicit solvation scheme appears to be well suited for the study of such hydrophobic systems, due to the lack of water molecules within the binding site.
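
Thermodynamic integration, mentioned in this abstract, estimates a free energy difference as the integral over a coupling parameter λ (from 0 to 1) of the ensemble-averaged derivative of the potential energy with respect to λ. The Python sketch below shows only the final numerical integration step by the trapezoidal rule; the λ grid and the averaged derivatives are invented for illustration and are not data from the study, and the function name is hypothetical.

```python
from typing import Sequence


def ti_free_energy(lambdas: Sequence[float], dudl_means: Sequence[float]) -> float:
    """Trapezoidal estimate of the integral of <dU/dlambda> over the sampled lambda values."""
    if len(lambdas) != len(dudl_means) or len(lambdas) < 2:
        raise ValueError("need at least two lambda windows with matching averages")
    if any(b <= a for a, b in zip(lambdas, lambdas[1:])):
        raise ValueError("lambda values must be strictly increasing")
    total = 0.0
    for i in range(1, len(lambdas)):
        width = lambdas[i] - lambdas[i - 1]
        total += 0.5 * (dudl_means[i - 1] + dudl_means[i]) * width
    return total


# Made-up averages following <dU/dlambda> = 2 * lambda, whose exact integral is 1.
delta_g = ti_free_energy([0.0, 0.5, 1.0], [0.0, 1.0, 2.0])
```

In practice each average would come from a separate simulation at fixed λ, and the difference between two such integrals (ligand in the protein and ligand in solvent) gives the relative binding free energy.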

  1. The Extracytoplasmic Domain of the Mycobacterium tuberculosis Ser/Thr Kinase PknB Binds Specific Muropeptides and Is Required for PknB Localization

    PubMed Central

    Mir, Mushtaq; Asong, Jinkeng; Li, Xiuru; Cardot, Jessica; Boons, Geert-Jan; Husson, Robert N.

    2011-01-01

    The Mycobacterium tuberculosis Ser/Thr kinase PknB has been implicated in the regulation of cell growth and morphology in this organism. The extracytoplasmic domain of this membrane protein comprises four penicillin binding protein and Ser/Thr kinase associated (PASTA) domains, which are predicted to bind stem peptides of peptidoglycan. Using a comprehensive library of synthetic muropeptides, we demonstrate that the extracytoplasmic domain of PknB binds muropeptides in a manner dependent on the presence of specific amino acids at the second and third positions of the stem peptide, and on the presence of the sugar moiety N-acetylmuramic acid linked to the peptide. We further show that PknB localizes strongly to the mid-cell and also to the cell poles, and that the extracytoplasmic domain is required for PknB localization. In contrast to strong growth stimulation by conditioned medium, we observe no growth stimulation of M. tuberculosis by a synthetic muropeptide with high affinity for the PknB PASTAs. We do find a moderate effect of a high affinity peptide on resuscitation of dormant cells. While the PASTA domains of PknB may play a role in stimulating growth by binding exogenous peptidoglycan fragments, our data indicate that a major function of these domains is for proper PknB localization, likely through binding of peptidoglycan fragments produced locally at the mid-cell and the cell poles. These data suggest a model in which PknB is targeted to the sites of peptidoglycan turnover to regulate cell growth and cell division. PMID:21829358

  2. The extracytoplasmic domain of the Mycobacterium tuberculosis Ser/Thr kinase PknB binds specific muropeptides and is required for PknB localization.

    PubMed

    Mir, Mushtaq; Asong, Jinkeng; Li, Xiuru; Cardot, Jessica; Boons, Geert-Jan; Husson, Robert N

    2011-07-01

    The Mycobacterium tuberculosis Ser/Thr kinase PknB has been implicated in the regulation of cell growth and morphology in this organism. The extracytoplasmic domain of this membrane protein comprises four penicillin binding protein and Ser/Thr kinase associated (PASTA) domains, which are predicted to bind stem peptides of peptidoglycan. Using a comprehensive library of synthetic muropeptides, we demonstrate that the extracytoplasmic domain of PknB binds muropeptides in a manner dependent on the presence of specific amino acids at the second and third positions of the stem peptide, and on the presence of the sugar moiety N-acetylmuramic acid linked to the peptide. We further show that PknB localizes strongly to the mid-cell and also to the cell poles, and that the extracytoplasmic domain is required for PknB localization. In contrast to strong growth stimulation by conditioned medium, we observe no growth stimulation of M. tuberculosis by a synthetic muropeptide with high affinity for the PknB PASTAs. We do find a moderate effect of a high affinity peptide on resuscitation of dormant cells. While the PASTA domains of PknB may play a role in stimulating growth by binding exogenous peptidoglycan fragments, our data indicate that a major function of these domains is for proper PknB localization, likely through binding of peptidoglycan fragments produced locally at the mid-cell and the cell poles. These data suggest a model in which PknB is targeted to the sites of peptidoglycan turnover to regulate cell growth and cell division.

  3. Ku must load directly onto the chromosome end in order to mediate its telomeric functions.

    PubMed

    Lopez, Christopher R; Ribes-Zamora, Albert; Indiviglio, Sandra M; Williams, Christopher L; Haricharan, Svasti; Bertuch, Alison A

    2011-08-01

    The Ku heterodimer associates with the Saccharomyces cerevisiae telomere, where it impacts several aspects of telomere structure and function. Although Ku avidly binds DNA ends via a preformed channel, its ability to associate with telomeres via this mechanism could be challenged by factors known to bind directly to the chromosome terminus. This has led to uncertainty as to whether Ku itself binds directly to telomeric ends and whether end association is crucial for Ku's telomeric functions. To address these questions, we constructed DNA end binding-defective Ku heterodimers by altering amino acid residues in Ku70 and Ku80 that were predicted to contact DNA. These mutants continued to associate with their known telomere-related partners, such as Sir4, a factor required for telomeric silencing, and TLC1, the RNA component of telomerase. Despite these interactions, we found that the Ku mutants had markedly reduced association with telomeric chromatin and null-like deficiencies for telomere end protection, length regulation, and silencing functions. In contrast to Ku null strains, the DNA end binding defective Ku mutants resulted in increased, rather than markedly decreased, imprecise end-joining proficiency at an induced double-strand break. This result further supports that it was the specific loss of Ku's telomere end binding that resulted in telomeric defects rather than global loss of Ku's functions. The extensive telomere defects observed in these mutants lead us to propose that Ku is an integral component of the terminal telomeric cap, where it promotes a specific architecture that is central to telomere function and maintenance.

  4. Deciphering complex patterns of class-I HLA-peptide cross-reactivity via hierarchical grouping.

    PubMed

    Mukherjee, Sumanta; Warwicker, Jim; Chandra, Nagasuma

    2015-07-01

    T-cell responses in humans are initiated by the binding of a peptide antigen to a human leukocyte antigen (HLA) molecule. The peptide-HLA complex then recruits an appropriate T cell, leading to cell-mediated immunity. More than 2000 HLA class-I alleles are known in humans, and they vary only in their peptide-binding grooves. The polymorphism they exhibit enables them to bind a wide range of peptide antigens from diverse sources. HLA molecules and peptides present a complex molecular recognition pattern, as many peptides bind to a given allele and a given peptide can be recognized by many alleles. A powerful grouping scheme that not only provides an insightful classification, but is also capable of dissecting the physicochemical basis of recognition specificity is necessary to address this complexity. We present a hierarchical classification of 2010 class-I alleles by using a systematic divisive clustering method. All-pair distances of alleles were obtained by comparing binding pockets in the structural models. By varying the similarity thresholds, a multilevel classification was obtained, with 7 supergroups, each further subclassifying to yield 72 groups. An independent clustering performed based only on similarities in their epitope pools correlated highly with pocket-based clustering. Physicochemical feature combinations that best explain the basis of clustering are identified. Mutual information calculated for the set of peptide ligands enables identification of binding site residues contributing to peptide specificity. The grouping of HLA molecules achieved here will be useful for rational vaccine design, understanding disease susceptibilities and predicting risk of organ transplants.

  5. Computational Investigation of Glycosylation Effects on a Family 1 Carbohydrate-binding Module*

    PubMed Central

    Taylor, Courtney B.; Talib, M. Faiz; McCabe, Clare; Bu, Lintao; Adney, William S.; Himmel, Michael E.; Crowley, Michael F.; Beckham, Gregg T.

    2012-01-01

    Carbohydrate-binding modules (CBMs) are ubiquitous components of glycoside hydrolases, which degrade polysaccharides in nature. CBMs target specific polysaccharides, and CBM binding affinity to cellulose is known to be proportional to cellulase activity, such that increasing binding affinity is an important component of performance improvement. To ascertain the impact of protein and glycan engineering on CBM binding, we use molecular simulation to quantify cellulose binding of a natively glycosylated Family 1 CBM. To validate our approach, we first examine aromatic-carbohydrate interactions on binding, and our predictions are consistent with previous experiments, showing that a tyrosine to tryptophan mutation yields a 2-fold improvement in binding affinity. We then demonstrate that enhanced binding of 3–6-fold over a nonglycosylated CBM is achieved by the addition of a single, native mannose or a mannose dimer, respectively, which has not been considered previously. Furthermore, we show that the addition of a single, artificial glycan on the anterior of the CBM, with the native, posterior glycans also present, can have a dramatic impact on binding affinity in our model, increasing it up to 140-fold relative to the nonglycosylated CBM. These results suggest new directions in protein engineering, in that modifying glycosylation patterns via heterologous expression, manipulation of culture conditions, or introduction of artificial glycosylation sites, can alter CBM binding affinity to carbohydrates and may thus be a general strategy to enhance cellulase performance. Our results also suggest that CBM binding studies should consider the effects of glycosylation on binding and function. PMID:22147693
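
    To relate the fold changes quoted in this record to free-energy differences, the sketch below applies the standard relation ΔΔG = −RT ln(fold). It assumes that "fold improvement" means a ratio of binding (association) constants and that the temperature is 298.15 K; neither assumption is stated in the abstract, and the function name is hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
GAS_CONSTANT_KCAL = 0.0019872041


def fold_change_to_ddg(fold: float, temperature_k: float = 298.15) -> float:
    """Convert a fold change in binding constant to ddG in kcal/mol.

    A fold change greater than 1 (tighter binding) gives a negative ddG.
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(fold)


if __name__ == "__main__":
    for fold in (2.0, 3.0, 6.0, 140.0):
        print(f"{fold:6.1f}-fold -> {fold_change_to_ddg(fold):6.2f} kcal/mol")
```

    Under these assumptions a 2-fold improvement corresponds to roughly −0.41 kcal/mol and a 140-fold improvement to roughly −2.9 kcal/mol, which shows how modest free-energy shifts translate into large affinity ratios.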

  6. A web server for analysis, comparison and prediction of protein ligand binding sites.

    PubMed

    Singh, Harinder; Srivastava, Hemant Kumar; Raghava, Gajendra P S

    2016-03-25

    One of the major challenges in the field of system biology is to understand the interaction between a wide range of proteins and ligands. In the past, methods have been developed for predicting binding sites in a protein for a limited number of ligands. In order to address this problem, we developed a web server named 'LPIcom' to facilitate users in understanding protein-ligand interaction. Analysis, comparison and prediction modules are available in the 'LPIcom' server to predict protein-ligand interacting residues for 824 ligands. Each ligand must have at least 30 protein binding sites in PDB. Analysis module of the server can identify residues preferred in interaction and binding motif for a given ligand; for example residues glycine, lysine and arginine are preferred in ATP binding sites. Comparison module of the server allows comparing protein-binding sites of multiple ligands to understand the similarity between ligands based on their binding site. This module indicates that ATP, ADP and GTP ligands are in the same cluster and thus their binding sites or interacting residues exhibit a high level of similarity. Propensity-based prediction module has been developed for predicting ligand-interacting residues in a protein for more than 800 ligands. In addition, a number of web-based tools have been integrated to facilitate users in creating web logo and two-sample between ligand interacting and non-interacting residues. In summary, this manuscript presents a web-server for analysis of ligand interacting residue. This server is available for public use from URL http://crdd.osdd.net/raghava/lpicom .

  7. Assessing the performance of MM/PBSA and MM/GBSA methods. 8. Predicting binding free energies and poses of protein-RNA complexes.

    PubMed

    Chen, Fu; Sun, Huiyong; Wang, Junmei; Zhu, Feng; Liu, Hui; Wang, Zhe; Lei, Tailong; Li, Youyong; Hou, Tingjun

    2018-06-21

    Molecular docking provides a computationally efficient way to predict the atomic structural details of protein-RNA interactions (PRI), but accurate prediction of the three-dimensional structures and binding affinities for PRI is still notoriously difficult, partly due to the unreliability of the existing scoring functions for PRI. MM/PBSA and MM/GBSA are more theoretically rigorous than most scoring functions for protein-RNA docking, but their prediction performance for protein-RNA systems remains unclear. Here, we systematically evaluated the capability of MM/PBSA and MM/GBSA to predict the binding affinities and recognize the near-native binding structures for protein-RNA systems with different solvent models and interior dielectric constants (ϵ_in). For predicting the binding affinities, the predictions given by MM/GBSA based on the minimized structures in explicit solvent and the GBGBn1 model with ϵ_in = 2 yielded the highest correlation with the experimental data. Moreover, the MM/GBSA calculations based on the minimized structures in implicit solvent and the GBGBn1 model distinguished the near-native binding structures within the top 10 decoys for 118 out of the 149 protein-RNA systems (79.2%). This performance is better than all docking scoring functions studied here. Therefore, the MM/GBSA rescoring is an efficient way to improve the prediction capability of scoring functions for protein-RNA systems. Published by Cold Spring Harbor Laboratory Press for the RNA Society.
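
    The "top 10" figure quoted in this record is a success rate over systems. The sketch below shows one common way to compute such a rate, assuming each system is summarized by the rank of its best near-native decoy after rescoring; this bookkeeping, the function name, and the synthetic rank list are illustrative assumptions rather than details taken from the paper.

```python
from typing import Optional, Sequence


def top_n_success_rate(best_native_ranks: Sequence[Optional[int]], n: int = 10) -> float:
    """Fraction of systems whose best-ranked near-native decoy is within the top n.

    best_native_ranks -- one entry per system: the 1-based rank of the best
                         near-native decoy, or None if no near-native decoy exists.
    """
    if not best_native_ranks:
        raise ValueError("need at least one system")
    hits = sum(1 for rank in best_native_ranks if rank is not None and rank <= n)
    return hits / len(best_native_ranks)


if __name__ == "__main__":
    # Synthetic ranks chosen only to reproduce the counts quoted in the abstract:
    # 118 systems with a near-native decoy in the top 10, 31 without.
    ranks = [1] * 118 + [11] * 31
    print(f"{100 * top_n_success_rate(ranks):.1f}%")
```

    With 118 hits out of 149 systems the rate is 79.2%, matching the percentage given in the abstract.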

  8. Prediction of glycolipid-binding domains from the amino acid sequence of lipid raft-associated proteins: application to HpaA, a protein involved in the adhesion of Helicobacter pylori to gastrointestinal cells.

    PubMed

    Fantini, Jacques; Garmy, Nicolas; Yahi, Nouara

    2006-09-12

    Protein-glycolipid interactions mediate the attachment of various pathogens to the host cell surface as well as the association of numerous cellular proteins with lipid rafts. Thus, it is of primary importance to identify the protein domains involved in glycolipid recognition. Using structure similarity searches, we could identify a common glycolipid-binding domain in the three-dimensional structure of several proteins known to interact with lipid rafts. Yet the three-dimensional structure of most raft-targeted proteins is still unknown. In the present study, we have identified a glycolipid-binding domain in the amino acid sequence of a bacterial adhesin (Helicobacter pylori adhesin A, HpaA). The prediction was based on the major properties of the glycolipid-binding domains previously characterized by structural searches. A short (15-mer) synthetic peptide corresponding to this putative glycolipid-binding domain was synthesized, and we studied its interaction with glycolipid monolayers at the air-water interface. The synthetic HpaA peptide recognized LacCer but not Gb3. This glycolipid specificity was in line with that of the whole bacterium. Molecular modeling studies gave some insights into this high selectivity of interaction. It also suggested that Phe147 in HpaA played a key role in LacCer recognition, through sugar-aromatic CH-pi stacking interactions with the hydrophobic side of the galactose ring of LacCer. Correspondingly, the replacement of Phe147 with Ala strongly affected LacCer recognition, whereas substitution with Trp did not. Our method could be used to identify glycolipid-binding domains in microbial and cellular proteins interacting with lipid shells, rafts, and other specialized membrane microdomains.

  9. Topology-based modeling of intrinsically disordered proteins: balancing intrinsic folding and intermolecular interactions.

    PubMed

    Ganguly, Debabani; Chen, Jianhan

    2011-04-01

    Coupled binding and folding is frequently involved in specific recognition of so-called intrinsically disordered proteins (IDPs), a newly recognized class of proteins that rely on a lack of stable tertiary fold for function. Here, we exploit topology-based Gō-like modeling as an effective tool for the mechanism of IDP recognition within the theoretical framework of minimally frustrated energy landscape. Importantly, substantial differences exist between IDPs and globular proteins in both amino acid sequence and binding interface characteristics. We demonstrate that established Gō-like models designed for folded proteins tend to over-estimate the level of residual structures in unbound IDPs, whereas under-estimating the strength of intermolecular interactions. Such systematic biases have important consequences in the predicted mechanism of interaction. A strategy is proposed to recalibrate topology-derived models to balance intrinsic folding propensities and intermolecular interactions, based on experimental knowledge of the overall residual structure level and binding affinity. Applied to pKID/KIX, the calibrated Gō-like model predicts a dominant multistep sequential pathway for binding-induced folding of pKID that is initiated by KIX binding via the C-terminus in disordered conformations, followed by binding and folding of the rest of C-terminal helix and finally the N-terminal helix. This novel mechanism is consistent with key observations derived from a recent NMR titration and relaxation dispersion study and provides a molecular-level interpretation of kinetic rates derived from dispersion curve analysis. These case studies provide important insight into the applicability and potential pitfalls of topology-based modeling for studying IDP folding and interaction in general. Copyright © 2011 Wiley-Liss, Inc.

  10. Identification and Characterization of Noncovalent Interactions That Drive Binding and Specificity in DD-Peptidases and β-Lactamases.

    PubMed

    Hargis, Jacqueline C; Vankayala, Sai Lakshmana; White, Justin K; Woodcock, H Lee

    2014-02-11

    Bacterial resistance to standard (i.e., β-lactam-based) antibiotics has become a global pandemic. Simultaneously, research into the underlying causes of resistance has slowed substantially, although its importance is universally recognized. Key to unraveling critical details is characterization of the noncovalent interactions that govern binding and specificity (DD-peptidases, antibiotic targets, versus β-lactamases, the evolutionarily derived enzymes that play a major role in resistance) and ultimately resistance as a whole. Herein, we describe a detailed investigation that elicits new chemical insights into these underlying intermolecular interactions. Benzylpenicillin and a novel β-lactam peptidomimetic complexed to the Streptomyces R61 peptidase are examined using an arsenal of computational techniques: MD simulations, QM/MM calculations, charge perturbation analysis, QM/MM orbital analysis, bioinformatics, flexible receptor/flexible ligand docking, and computational ADME predictions. Several key molecular level interactions are identified that not only shed light onto fundamental resistance mechanisms, but also offer explanations for observed specificity. Specifically, an extended π-π network is elucidated that suggests antibacterial resistance has evolved, in part, due to stabilizing aromatic interactions. Additionally, interactions between the protein and peptidomimetic substrate are identified and characterized. Of particular interest is a water-mediated salt bridge between Asp217 and the positively charged N-terminus of the peptidomimetic, revealing an interaction that may significantly contribute to β-lactam specificity. Finally, interaction information is used to suggest modifications to current β-lactam compounds that should both improve binding and specificity in DD-peptidases and their physiochemical properties.

  11. Sequence-Based Prediction of RNA-Binding Residues in Proteins.

    PubMed

    Walia, Rasna R; El-Manzalawy, Yasser; Honavar, Vasant G; Dobbs, Drena

    2017-01-01

    Identifying individual residues in the interfaces of protein-RNA complexes is important for understanding the molecular determinants of protein-RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein-RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein-RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner.

  12. Sequence-Based Prediction of RNA-Binding Residues in Proteins

    PubMed Central

    Walia, Rasna R.; EL-Manzalawy, Yasser; Honavar, Vasant G.; Dobbs, Drena

    2017-01-01

    Identifying individual residues in the interfaces of protein–RNA complexes is important for understanding the molecular determinants of protein–RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein–RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein–RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner. PMID:27787829

  13. Modeling and simulation studies of human β3 adrenergic receptor and its interactions with agonists.

    PubMed

    Sahi, Shakti; Tewatia, Parul; Malik, Balwant K

    2012-12-01

    β3 adrenergic receptor (β3AR) is known to mediate various pharmacological and physiological effects such as thermogenesis in brown adipocytes, lipolysis in white adipocytes, glucose homeostasis and intestinal smooth muscle relaxation. Several efforts have been made in this field to understand their function and regulation in different human tissues and they have emerged as potential attractive targets in drug discovery for the treatment of diabetes, depression, obesity etc. Although the crystal structures of Bovine Rhodopsin and β2 adrenergic receptor have been resolved, to date there is no three dimensional structural information on β3AR. Our aim in this study was to model 3D structure of β3AR by various molecular modeling and simulation techniques. In this paper, we describe a refined predicted model of β3AR using different algorithms for structure prediction. The structural refinement and minimization of the generated 3D model of β3AR were done by Schrodinger suite 9.1. Docking studies of β3AR model with the known agonists enabled us to identify specific residues, viz, Asp 117, Ser 208, Ser 209, Ser 212, Arg 315, Asn 332, within the β3AR binding pocket, which might play an important role in ligand binding. Receptor ligand interaction studies clearly indicated that these five residues showed strong hydrogen bonding interactions with the ligands. The results have been correlated with the experimental data available. The predicted ligand binding interactions and the simulation studies validate the methods used to predict the 3D-structure.

  14. How to deal with multiple binding poses in alchemical relative protein-ligand binding free energy calculations.

    PubMed

    Kaus, Joseph W; Harder, Edward; Lin, Teng; Abel, Robert; McCammon, J Andrew; Wang, Lingle

    2015-06-09

    Recent advances in improved force fields and sampling methods have made it possible for the accurate calculation of protein–ligand binding free energies. Alchemical free energy perturbation (FEP) using an explicit solvent model is one of the most rigorous methods to calculate relative binding free energies. However, for cases where there are high energy barriers separating the relevant conformations that are important for ligand binding, the calculated free energy may depend on the initial conformation used in the simulation due to the lack of complete sampling of all the important regions in phase space. This is particularly true for ligands with multiple possible binding modes separated by high energy barriers, making it difficult to sample all relevant binding modes even with modern enhanced sampling methods. In this paper, we apply a previously developed method that provides a corrected binding free energy for ligands with multiple binding modes by combining the free energy results from multiple alchemical FEP calculations starting from all enumerated poses, and the results are compared with Glide docking and MM-GBSA calculations. From these calculations, the dominant ligand binding mode can also be predicted. We apply this method to a series of ligands that bind to c-Jun N-terminal kinase-1 (JNK1) and obtain improved free energy results. The dominant ligand binding modes predicted by this method agree with the available crystallography, while both Glide docking and MM-GBSA calculations incorrectly predict the binding modes for some ligands. The method also helps separate the force field error from the ligand sampling error, such that deviations in the predicted binding free energy from the experimental values likely indicate possible inaccuracies in the force field. 
    An error in the force field for a subset of the ligands studied was identified using this method, and improved free energy results were obtained by correcting the partial charges assigned to the ligands. This improved the root-mean-square error (RMSE) for the predicted binding free energy from 1.9 kcal/mol with the original partial charges to 1.3 kcal/mol with the corrected partial charges.
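
    The idea of combining per-pose free energies into one corrected value, and of scoring predictions by RMSE, can be sketched as follows. The combination rule used here is the standard Boltzmann sum over independent binding modes, ΔG = −kT ln Σᵢ exp(−ΔGᵢ/kT); it is offered as an assumption about the general approach, not as the exact expression used in the paper. All function names, the temperature, and the sample values are hypothetical.

```python
import math
from typing import List, Sequence

# Boltzmann constant per mole (gas constant) in kcal/(mol*K).
KB_KCAL = 0.0019872041


def combine_pose_free_energies(pose_dgs: Sequence[float], temperature_k: float = 298.15) -> float:
    """Combine binding free energies of separate poses into one effective value."""
    if not pose_dgs:
        raise ValueError("need at least one pose")
    kt = KB_KCAL * temperature_k
    lowest = min(pose_dgs)
    # Shift by the lowest value so the exponentials stay numerically well behaved.
    boltzmann_sum = sum(math.exp(-(dg - lowest) / kt) for dg in pose_dgs)
    return lowest - kt * math.log(boltzmann_sum)


def pose_populations(pose_dgs: Sequence[float], temperature_k: float = 298.15) -> List[float]:
    """Relative population of each pose; the largest entry marks the dominant pose."""
    if not pose_dgs:
        raise ValueError("need at least one pose")
    kt = KB_KCAL * temperature_k
    lowest = min(pose_dgs)
    weights = [math.exp(-(dg - lowest) / kt) for dg in pose_dgs]
    total = sum(weights)
    return [w / total for w in weights]


def rmse(predicted: Sequence[float], experimental: Sequence[float]) -> float:
    """Root-mean-square error between predicted and experimental values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    poses = [-8.0, -6.5]  # hypothetical per-pose binding free energies, kcal/mol
    print(combine_pose_free_energies(poses))
    print(pose_populations(poses))
```

    The combined value is always at least as favorable as the best single pose, and the populations indicate which pose dominates; when one pose is much more stable than the others, the correction becomes negligible.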

  15. How To Deal with Multiple Binding Poses in Alchemical Relative Protein–Ligand Binding Free Energy Calculations

    PubMed Central

    2016-01-01

    Recent advances in improved force fields and sampling methods have made it possible for the accurate calculation of protein–ligand binding free energies. Alchemical free energy perturbation (FEP) using an explicit solvent model is one of the most rigorous methods to calculate relative binding free energies. However, for cases where there are high energy barriers separating the relevant conformations that are important for ligand binding, the calculated free energy may depend on the initial conformation used in the simulation due to the lack of complete sampling of all the important regions in phase space. This is particularly true for ligands with multiple possible binding modes separated by high energy barriers, making it difficult to sample all relevant binding modes even with modern enhanced sampling methods. In this paper, we apply a previously developed method that provides a corrected binding free energy for ligands with multiple binding modes by combining the free energy results from multiple alchemical FEP calculations starting from all enumerated poses, and the results are compared with Glide docking and MM-GBSA calculations. From these calculations, the dominant ligand binding mode can also be predicted. We apply this method to a series of ligands that bind to c-Jun N-terminal kinase-1 (JNK1) and obtain improved free energy results. The dominant ligand binding modes predicted by this method agree with the available crystallography, while both Glide docking and MM-GBSA calculations incorrectly predict the binding modes for some ligands. The method also helps separate the force field error from the ligand sampling error, such that deviations in the predicted binding free energy from the experimental values likely indicate possible inaccuracies in the force field. 
    An error in the force field for a subset of the ligands studied was identified using this method, and improved free energy results were obtained by correcting the partial charges assigned to the ligands. This improved the root-mean-square error (RMSE) for the predicted binding free energy from 1.9 kcal/mol with the original partial charges to 1.3 kcal/mol with the corrected partial charges. PMID:26085821

  16. Prediction of Surface and pH-Specific Binding of Peptides to Metal and Oxide Nanoparticles

    NASA Astrophysics Data System (ADS)

    Heinz, Hendrik; Lin, Tzu-Jen; Emami, Fateme Sadat; Ramezani-Dakhel, Hadi; Naik, Rajesh; Knecht, Marc; Perry, Carole C.; Huang, Yu

    2015-03-01

    The mechanism of specific peptide adsorption onto metallic and oxidic nanostructures has been elucidated in atomic resolution using novel force fields and surface models in comparison to measurements. As an example, variations in peptide adsorption on Pd and Pt nanoparticles depending on shape, size, and location of peptides on specific bounding facets are explained. Accurate computational predictions of reaction rates in C-C coupling reactions using particle models derived from HE-XRD and PDF data illustrate the utility of computational methods for the rational design of new catalysts. On oxidic nanoparticles such as silica and apatites, it is revealed how changes in pH lead to similarity scores of attracted peptides lower than 20%, supported by appropriate model surfaces and data from adsorption isotherms. The results demonstrate how new computational methods can support the design of nanoparticle carriers for drug release and the understanding of calcification mechanisms in the human body.

  17. Lipopolysaccharide-specific binding C-type lectin with one CRD domain from Fenneropenaeus merguiensis (FmLC4) functions as a pattern recognition receptor in shrimp innate immunity.

    PubMed

    Utarabhand, Prapaporn; Thepnarong, Supattra; Runsaeng, Phanthipha

    2017-10-01

    In crustaceans, an innate immune system is solely required because they lack an adaptive immunity. One kind of pattern recognition receptors (PRRs) that plays a particular role in the innate immunity of aquatic shrimp is lectin. A new diverse C-type lectin (FmLC4) was cloned from the hepatopancreas of Fenneropenaeus merguiensis by using RT-PCR and 5' and 3' rapid amplification of cDNA ends approaches. A full-length FmLC4 cDNA comprises 706 bp with an open reading frame of 552 bp, encoding a peptide of 184 amino acids. The predicted primary sequence of FmLC4 consists of a signal peptide of 19 amino acids, a molecular mass of 20.4 kDa, an isoelectric point of 5.13, one carbohydrate recognition domain with a QPD motif and a Ca2+ binding site as well as a double-loop characteristic supported by two conserved disulfide bonds. The FmLC4 mRNA expression was found only in the hepatopancreas of normal shrimp and significantly up-regulated upon challenging the shrimp with Vibrio harveyi or white spot syndrome virus (WSSV). Recombinant FmLC4 (rFmLC4) could agglutinate various bacterial strains with Ca2+-dependence. Lipopolysaccharide (LPS) could specifically inhibit the agglutinating activity and potently bind to rFmLC4, indicating that FmLC4 was LPS-specific binding C-type lectin. Moreover, rFmLC4 itself displayed the in vivo effective clearance of the pathogenic bacterium V. harveyi. Altogether, FmLC4 may serve as LPS-specific PRR to recognize opportunistic bacterial and viral pathogens, and thus to play a role in the immune defense of aquatic shrimp via the binding and agglutination. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Accurate Prediction of Inducible Transcription Factor Binding Intensities In Vivo

    PubMed Central

    Siepel, Adam; Lis, John T.

    2012-01-01

    DNA sequence and local chromatin landscape act jointly to determine transcription factor (TF) binding intensity profiles. To disentangle these influences, we developed an experimental approach, called protein/DNA binding followed by high-throughput sequencing (PB–seq), that allows the binding energy landscape to be characterized genome-wide in the absence of chromatin. We applied our methods to the Drosophila Heat Shock Factor (HSF), which inducibly binds a target DNA sequence element (HSE) following heat shock stress. PB–seq involves incubating sheared naked genomic DNA with recombinant HSF, partitioning the HSF–bound and HSF–free DNA, and then detecting HSF–bound DNA by high-throughput sequencing. We compared PB–seq binding profiles with ones observed in vivo by ChIP–seq and developed statistical models to predict the observed departures from idealized binding patterns based on covariates describing the local chromatin environment. We found that DNase I hypersensitivity and tetra-acetylation of H4 were the most influential covariates in predicting changes in HSF binding affinity. We also investigated the extent to which DNA accessibility, as measured by digital DNase I footprinting data, could be predicted from MNase–seq data and the ChIP–chip profiles for many histone modifications and TFs, and found GAGA element associated factor (GAF), tetra-acetylation of H4, and H4K16 acetylation to be the most predictive covariates. Lastly, we generated an unbiased model of HSF binding sequences, which revealed distinct biophysical properties of the HSF/HSE interaction and a previously unrecognized substructure within the HSE. These findings provide new insights into the interplay between the genomic sequence and the chromatin landscape in determining transcription factor binding intensity. PMID:22479205

  19. Analysis of a Clonal Lineage of HIV-1 Envelope V2/V3 Conformational Epitope-Specific Broadly Neutralizing Antibodies and Their Inferred Unmutated Common Ancestors

    PubMed Central

    Bonsignori, Mattia; Hwang, Kwan-Ki; Chen, Xi; Tsao, Chun-Yen; Morris, Lynn; Gray, Elin; Marshall, Dawn J.; Crump, John A.; Kapiga, Saidi H.; Sam, Noel E.; Sinangil, Faruk; Pancera, Marie; Yongping, Yang; Zhang, Baoshan; Zhu, Jiang; Kwong, Peter D.; O'Dell, Sijy; Mascola, John R.; Wu, Lan; Nabel, Gary J.; Phogat, Sanjay; Seaman, Michael S.; Whitesides, John F.; Moody, M. Anthony; Kelsoe, Garnett; Yang, Xinzhen; Sodroski, Joseph; Shaw, George M.; Montefiori, David C.; Kepler, Thomas B.; Tomaras, Georgia D.; Alam, S. Munir; Liao, Hua-Xin; Haynes, Barton F.

    2011-01-01

    V2/V3 conformational epitope antibodies that broadly neutralize HIV-1 (PG9 and PG16) have been recently described. Since an elicitation of previously known broadly neutralizing antibodies has proven elusive, the induction of antibodies with such specificity is an important goal for HIV-1 vaccine development. A critical question is which immunogens and vaccine formulations might be used to trigger and drive the development of memory B cell precursors with V2/V3 conformational epitope specificity. In this paper we identified a clonal lineage of four V2/V3 conformational epitope broadly neutralizing antibodies (CH01 to CH04) from an African HIV-1-infected broad neutralizer and inferred their common reverted unmutated ancestor (RUA) antibodies. While conformational epitope antibodies rarely bind recombinant Env monomers, a screen of 32 recombinant envelopes for binding to the CH01 to CH04 antibodies showed monoclonal antibody (MAb) binding to the E.A244 gp120 Env and to chronic Env AE.CM243; MAbs CH01 and CH02 also bound to transmitted/founder Env B.9021. CH01 to CH04 neutralized 38% to 49% of a panel of 91 HIV-1 tier 2 pseudoviruses, while the RUAs neutralized only 16% of HIV-1 isolates. Although the reverted unmutated ancestors showed restricted neutralizing activity, they retained the ability to bind to the E.A244 gp120 HIV-1 envelope with an affinity predicted to trigger B cell development. Thus, E.A244, B.9021, and AE.CM243 Envs are three potential immunogen candidates for studies aimed at defining strategies to induce V2/V3 conformational epitope-specific antibodies. PMID:21795340

  20. Astrocytic autoantibody of neuromyelitis optica (NMO-IgG) binds to aquaporin-4 extracellular loops, monomers, tetramers and high order arrays

    PubMed Central

    Iorio, Raffaele; Fryer, James P.; Hinson, Shannon R.; Fallier-Becker, Petra; Wolburg, Hartwig; Pittock, Sean J.; Lennon, Vanda A.

    2012-01-01

    The principal central nervous system (CNS) water channel, aquaporin-4 (AQP4), is confined to astrocytic and ependymal membranes and is the target of a pathogenic autoantibody, neuromyelitis optica (NMO)-IgG. This disease-specific autoantibody unifies a spectrum of relapsing CNS autoimmune inflammatory disorders of which NMO exemplifies the classic phenotype. Multiple sclerosis and other immune-mediated demyelinating disorders of the CNS lack a distinctive biomarker. Two AQP4 isoforms, M1 and M23, exist as homotetrameric and heterotetrameric intramembranous particles (IMPs). Orthogonal arrays of predominantly M23 particles (OAPs) are an ultrastructural characteristic of astrocytic membranes. We used high-titered serum from 32 AQP4-IgG-seropositive patients and 85 controls to investigate the nature and molecular location of AQP4 epitopes that bind NMO-IgG, and the influence of supramolecular structure. NMO-IgG bound to denatured AQP4 monomers (68% of cases), to native tetramers and high order arrays (90% of cases), and to AQP4 in live cell membranes (100% of cases). Disease-specific epitopes reside in extracellular loop C more than in loops A or E. IgG binding to intracellular epitopes lacks disease specificity. These observations predict greater disease specificity and sensitivity for tissue-based and cell-based serological assays employing “native” AQP4 than assays employing denatured AQP4 and fragments. NMO-IgG binds most avidly to plasma membrane surface AQP4 epitopes formed by loop interactions within tetramers and by intermolecular interactions within high order structures. The relative abundance and localization of AQP4 high order arrays in distinct CNS regions may explain the variability in clinical phenotype of NMO spectrum disorders. PMID:22906356

  1. Generation of Novel Single-Chain Antibodies by Phage-Display Technology to Direct Imaging Agents Highly Selective to Pancreatic β- or α-Cells In Vivo

    PubMed Central

    Ueberberg, Sandra; Meier, Juris J.; Waengler, Carmen; Schechinger, Wolfgang; Dietrich, Johannes W.; Tannapfel, Andrea; Schmitz, Inge; Schirrmacher, Ralf; Köller, Manfred; Klein, Harald H.; Schneider, Stephan

    2009-01-01

    OBJECTIVE Noninvasive determination of pancreatic β-cell mass in vivo has been hampered by the lack of suitable β-cell–specific imaging agents. This report outlines an approach for the development of novel ligands homing selectively to islet cells in vivo. RESEARCH DESIGN AND METHODS To generate agents specifically binding to pancreatic islets, a phage library was screened for single-chain antibodies (SCAs) on rat islets using two different approaches. 1) The library was injected into rats in vivo, and islets were isolated after a circulation time of 5 min. 2) Pancreatic islets were directly isolated, and the library was panned in the islets in vitro. Subsequently, the identified SCAs were extensively characterized in vitro and in vivo. RESULTS We report the generation of SCAs that bind highly selectively to either β- or α-cells. These SCAs are internalized by target cells, disappear rapidly from the vasculature, and exert no toxicity in vivo. Specific binding to β- or α-cells was detected in cell lines in vitro, in rats in vivo, and in human tissue in situ. Electron microscopy demonstrated binding of SCAs to the endoplasmic reticulum and the secretory granules. Finally, in a biodistribution study the labeling intensity derived from [125I]-labeled SCAs after intravenous administration in rats strongly predicted the β-cell mass and was inversely related to the glucose excursions during an intraperitoneal glucose tolerance test. CONCLUSIONS Our data provide strong evidence that the presented SCAs are highly specific for pancreatic β-cells and enable imaging and quantification in vivo. PMID:19592622

  2. Interleukin-11 binds specific EF-hand proteins via their conserved structural motifs.

    PubMed

    Kazakov, Alexei S; Sokolov, Andrei S; Vologzhannikova, Alisa A; Permyakova, Maria E; Khorn, Polina A; Ismailov, Ramis G; Denessiouk, Konstantin A; Denesyuk, Alexander I; Rastrygina, Victoria A; Baksheeva, Viktoriia E; Zernii, Evgeni Yu; Zinchenko, Dmitry V; Glazatov, Vladimir V; Uversky, Vladimir N; Mirzabekov, Tajib A; Permyakov, Eugene A; Permyakov, Sergei E

    2017-01-01

    Interleukin-11 (IL-11) is a hematopoietic cytokine engaged in numerous biological processes and validated as a target for treatment of various cancers. IL-11 contains intrinsically disordered regions that might recognize multiple targets. Recently we found that aside from IL-11RA and gp130 receptors, IL-11 interacts with calcium sensor protein S100P. Strict calcium dependence of this interaction suggests a possibility of IL-11 interaction with other calcium sensor proteins. Here we probed specificity of IL-11 to calcium-binding proteins of various types: calcium sensors of the EF-hand family (calmodulin, S100B and neuronal calcium sensors: recoverin, NCS-1, GCAP-1, GCAP-2), calcium buffers of the EF-hand family (S100G, oncomodulin), and a non-EF-hand calcium buffer (α-lactalbumin). A specific subset of the calcium sensor proteins (calmodulin, S100B, NCS-1, GCAP-1/2) exhibits metal-dependent binding of IL-11 with dissociation constants of 1-19 μM. These proteins share several amino acid residues belonging to conserved structural motifs of the EF-hand proteins, the 'black' and 'gray' clusters. Replacements of the respective S100P residues by alanine drastically decrease its affinity to IL-11, suggesting their involvement in the association process. Secondary structure and accessibility of the hinge region of the EF-hand proteins studied are predicted to control specificity and selectivity of their binding to IL-11. The IL-11 interaction with the EF-hand proteins is expected to occur under numerous pathological conditions, accompanied by disintegration of plasma membrane and efflux of cellular components into the extracellular milieu.
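
    To give a feel for what dissociation constants of 1-19 μM mean in practice, the following minimal sketch computes equilibrium occupancy for a simple 1:1 binding model. The 1:1 model, the assumption that the ligand is in excess, and the 5 μM concentration are illustrative assumptions and are not taken from the record above; the function name is hypothetical.

```python
def fraction_bound(ligand_conc_um: float, kd_um: float) -> float:
    """Equilibrium fraction of binding sites occupied for simple 1:1 binding.

    Assumes the free ligand concentration is approximately the total ligand
    concentration (ligand in excess over binding sites).
    """
    if ligand_conc_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_conc_um / (kd_um + ligand_conc_um)


if __name__ == "__main__":
    # 1 and 19 uM are the Kd range endpoints quoted in the abstract;
    # the 5 uM ligand concentration is a hypothetical value.
    for kd in (1.0, 19.0):
        occupancy = fraction_bound(5.0, kd)
        print(f"Kd = {kd:4.1f} uM, fraction bound = {occupancy:.3f}")
```

    At a ligand concentration equal to Kd the occupancy is one half, so a 19-fold spread in Kd translates into a large spread in occupancy at any fixed concentration in this range.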

  3. Genome-wide DNA methylation measurements in prostate tissues uncovers novel prostate cancer diagnostic biomarkers and transcription factor binding patterns.

    PubMed

    Kirby, Marie K; Ramaker, Ryne C; Roberts, Brian S; Lasseigne, Brittany N; Gunther, David S; Burwell, Todd C; Davis, Nicholas S; Gulzar, Zulfiqar G; Absher, Devin M; Cooper, Sara J; Brooks, James D; Myers, Richard M

    2017-04-17

    Current diagnostic tools for prostate cancer lack specificity and sensitivity for detecting very early lesions. DNA methylation is a stable genomic modification that is detectable in peripheral patient fluids such as urine and blood plasma that could serve as a non-invasive diagnostic biomarker for prostate cancer. We measured genome-wide DNA methylation patterns in 73 clinically annotated fresh-frozen prostate cancers and 63 benign-adjacent prostate tissues using the Illumina Infinium HumanMethylation450 BeadChip array. We overlaid the most significantly differentially methylated sites in the genome with transcription factor binding sites measured by the Encyclopedia of DNA Elements consortium. We used logistic regression and receiver operating characteristic curves to assess the performance of candidate diagnostic models. We identified methylation patterns that have a high predictive power for distinguishing malignant prostate tissue from benign-adjacent prostate tissue, and these methylation signatures were validated using data from The Cancer Genome Atlas Project. Furthermore, by overlaying ENCODE transcription factor binding data, we observed an enrichment of enhancer of zeste homolog 2 binding in gene regulatory regions with higher DNA methylation in malignant prostate tissues. DNA methylation patterns are greatly altered in prostate cancer tissue in comparison to benign-adjacent tissue. We have discovered patterns of DNA methylation marks that can distinguish prostate cancers with high specificity and sensitivity in multiple patient tissue cohorts, and we have identified transcription factors binding in these differentially methylated regions that may play important roles in prostate cancer development.
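
    The record above mentions receiver operating characteristic curves for assessing candidate diagnostic models. As a minimal sketch of the underlying quantity, the area under the ROC curve can be computed as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The labels and scores below are hypothetical toy values, not data from the study, and the function is an illustration rather than the authors' pipeline.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    sample receives the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical model scores: 1 = malignant, 0 = benign-adjacent.
    toy_labels = [1, 1, 0, 0]
    toy_scores = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(toy_labels, toy_scores))
```

    The pairwise form is quadratic in the number of samples, which is fine for a sketch; a sort-based rank formulation is the usual choice for large cohorts.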

  4. A Modeling and Experimental Investigation of the Effects of Antigen Density, Binding Affinity, and Antigen Expression Ratio on Bispecific Antibody Binding to Cell Surface Targets.

    PubMed

    Rhoden, John J; Dyas, Gregory L; Wroblewski, Victor J

    2016-05-20

    Despite the increasing number of multivalent antibodies, bispecific antibodies, fusion proteins, and targeted nanoparticles that have been generated and studied, the mechanism of multivalent binding to cell surface targets is not well understood. Here, we describe a conceptual and mathematical model of multivalent antibody binding to cell surface antigens. Our model predicts that properties beyond 1:1 antibody:antigen affinity to target antigens have a strong influence on multivalent binding. Predicted crucial properties include the structure and flexibility of the antibody construct, the target antigen(s) and binding epitope(s), and the density of antigens on the cell surface. For bispecific antibodies, the ratio of the expression levels of the two target antigens is predicted to be critical to target binding, particularly for the lower expressed of the antigens. Using bispecific antibodies of different valencies to cell surface antigens including MET and EGF receptor, we have experimentally validated our modeling approach and its predictions and observed several nonintuitive effects of avidity related to antigen density, target ratio, and antibody affinity. In some biological circumstances, the effect we have predicted and measured varied from the monovalent binding interaction by several orders of magnitude. Moreover, our mathematical framework affords us a mechanistic interpretation of our observations and suggests strategies to achieve the desired antibody-antigen binding goals. These mechanistic insights have implications in antibody engineering and structure/activity relationship determination in a variety of biological contexts. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  5. Predicting and analyzing DNA-binding domains using a systematic approach to identifying a set of informative physicochemical and biochemical properties

    PubMed Central

    2011-01-01

    Background Existing methods of predicting DNA-binding proteins used valuable features of physicochemical properties to design support vector machine (SVM) based classifiers. Generally, selection of physicochemical properties and determination of their corresponding feature vectors rely mainly on known properties of binding mechanism and experience of designers. However, there exists a troublesome problem for designers that some different physicochemical properties have similar vectors of representing 20 amino acids and some closely related physicochemical properties have dissimilar vectors. Results This study proposes a systematic approach (named Auto-IDPCPs) to automatically identify a set of physicochemical and biochemical properties in the AAindex database to design SVM-based classifiers for predicting and analyzing DNA-binding domains/proteins. Auto-IDPCPs consists of 1) clustering 531 amino acid indices in AAindex into 20 clusters using a fuzzy c-means algorithm, 2) utilizing an efficient genetic algorithm based optimization method IBCGA to select an informative feature set of size m to represent sequences, and 3) analyzing the selected features to identify related physicochemical properties which may affect the binding mechanism of DNA-binding domains/proteins. The proposed Auto-IDPCPs identified m=22 features of properties belonging to five clusters for predicting DNA-binding domains with a five-fold cross-validation accuracy of 87.12%, which is promising compared with the accuracy of 86.62% of the existing method PSSM-400. For predicting DNA-binding sequences, the accuracy of 75.50% was obtained using m=28 features, where PSSM-400 has an accuracy of 74.22%. Auto-IDPCPs and PSSM-400 have accuracies of 80.73% and 82.81%, respectively, applied to an independent test data set of DNA-binding domains. Some typical physicochemical properties discovered are hydrophobicity, secondary structure, charge, solvent accessibility, polarity, flexibility, normalized Van Der Waals volume, pK (pK-C, pK-N, pK-COOH and pK-a(RCOOH)), etc. Conclusions The proposed approach Auto-IDPCPs would help designers to investigate informative physicochemical and biochemical properties by considering both prediction accuracy and analysis of binding mechanism simultaneously. The approach Auto-IDPCPs is also applicable to predicting and analyzing other protein functions from sequences. PMID:21342579
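
    Step 1 of the approach above clusters amino acid indices with a fuzzy c-means algorithm. The following is a minimal, generic sketch of fuzzy c-means on toy vectors, included only to illustrate the technique. The fuzzifier m = 2, the fixed iteration count, the random initialisation, and the toy data are all assumptions; nothing here reproduces the authors' settings or the AAindex data.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def _distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def memberships(points: Sequence[Vector], centers: Sequence[Vector],
                m: float = 2.0) -> List[List[float]]:
    """Membership u[i][j] of point i in cluster j; each row sums to 1."""
    exponent = 2.0 / (m - 1.0)
    table = []
    for p in points:
        dists = [_distance(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # Point sits exactly on one or more centers: share full
            # membership among those centers to avoid dividing by zero.
            zeros = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(zeros)
            table.append([z / total for z in zeros])
            continue
        inverse = [d ** -exponent for d in dists]
        total = sum(inverse)
        table.append([v / total for v in inverse])
    return table


def update_centers(points: Sequence[Vector], u: List[List[float]],
                   m: float = 2.0) -> List[List[float]]:
    """Each center is the mean of all points weighted by membership**m."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append([
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(dim)
        ])
    return centers


def fuzzy_c_means(points: Sequence[Vector], n_clusters: int, m: float = 2.0,
                  iterations: int = 100, seed: int = 0
                  ) -> Tuple[List[List[float]], List[List[float]]]:
    """Alternate membership and center updates for a fixed number of rounds."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(list(points), n_clusters)]
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)


if __name__ == "__main__":
    # Toy one-dimensional data with two obvious groups (hypothetical values).
    toy = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
    final_centers, final_u = fuzzy_c_means(toy, n_clusters=2)
    print(final_centers)
```

    Unlike hard k-means, every point keeps a graded membership in every cluster, which is what allows closely related properties to be compared by their membership profiles rather than by a single label.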

  6. PREDICTING ER BINDING AFFINITY FOR EDC RANKING AND PRIORITIZATION: A COMPARISON OF THREE MODELS

    EPA Science Inventory

    A comparative analysis of how three COREPA models for ER binding affinity performed when used to predict potential estrogen receptor (ER) ligands is presented. Models I and II were developed based on training sets of 232 and 279 rat ER binding affinity measurements, respectively....

  7. High-resolution crystal structure and IgE recognition of the major grass pollen allergen Phl p 3.

    PubMed

    Devanaboyina, S C; Cornelius, C; Lupinek, C; Fauland, K; Dall'Antonia, F; Nandy, A; Hagen, S; Flicker, S; Valenta, R; Keller, W

    2014-12-01

    Group 2 and 3 grass pollen allergens are major allergens with high allergenic activity and exhibit structural similarity with the C-terminal portion of major group 1 allergens. In this study, we aimed to determine the crystal structure of timothy grass pollen allergen, Phl p 3, and to study its IgE recognition and cross-reactivity with group 2 and group 1 allergens. The three-dimensional structure of Phl p 3 was solved by X-ray crystallography and compared with the structures of group 1 and 2 grass pollen allergens. Cross-reactivity was studied using a human monoclonal antibody which inhibits allergic patients' IgE binding and by IgE inhibition experiments with patients' sera. Conformational Phl p 3 IgE epitopes were predicted with the algorithm SPADE, and Phl p 3 variants containing single point mutations in the predicted IgE binding sites were produced to analyze allergic patients' IgE binding. Phl p 3 is a globular β-sandwich protein showing structural similarity to Phl p 2 and the Phl p 1-C-terminal domain. Phl p 3 showed IgE cross-reactivity with group 2 allergens but not with group 1 allergens. SPADE identified two conformational IgE epitope-containing areas, of which one overlaps with the epitope defined by the monoclonal antibody. The mutation of arginine 68 to alanine completely abolished binding of the blocking antibody. This mutation and a mutation of D13 in the predicted second IgE epitope area also reduced allergic patients' IgE binding. Group 3 and group 2 grass pollen allergens are cross-reactive allergens containing conformational IgE epitopes. They lack relevant IgE cross-reactivity with group 1 allergens and therefore need to be included in diagnostic tests and allergen-specific treatments in addition to group 1 allergens. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  8. A novel structure-based multimode QSAR method affords predictive models for phosphodiesterase inhibitors.

    PubMed

    Dong, Xialan; Ebalunode, Jerry O; Cho, Sung Jin; Zheng, Weifan

    2010-02-22

    Quantitative structure-activity relationship (QSAR) methods aim to build quantitatively predictive models for the discovery of new molecules. They have been widely used in medicinal chemistry for drug discovery. Many QSAR techniques have been developed since Hansch's seminal work, and more are still being developed. Motivated by Hopfinger's receptor-dependent QSAR (RD-QSAR) formalism and the Lukacova-Balaz scheme to treat multimode issues, we have initiated studies that focus on a structure-based multimode QSAR (SBMM QSAR) method, where the structure of the target protein is used in characterizing the ligand, and the multimode issue of ligand binding is systematically treated with a modified Lukacova-Balaz scheme. All ligand molecules are first docked to the target binding pocket to obtain a set of aligned ligand poses. A structure-based pharmacophore concept is adopted to characterize the binding pocket. Specifically, we represent the binding pocket as a geometric grid labeled by pharmacophoric features. Each pose of the ligand is also represented as a labeled grid, where each grid point is labeled according to the atom types of nearby ligand atoms. These labeled grids or three-dimensional (3D) maps (both the receptor map (R-map) and the ligand map (L-map)) are compared to each other to derive descriptors for each pose of the ligand, resulting in a multimode structure-activity relationship (SAR) table. Iterative partial least-squares (PLS) is employed to build the QSAR models. When we applied this method to PDE-4 inhibitors, we obtained predictive models with excellent training correlation (r² = 0.65-0.66) as well as test correlation (R² = 0.64-0.65). A comparative analysis with 4 other QSAR techniques demonstrates that this new method affords better models, in terms of the prediction power for the test set.

  9. From Binding-Induced Dynamic Effects in SH3 Structures to Evolutionary Conserved Sectors.

    PubMed

    Zafra Ruano, Ana; Cilia, Elisa; Couceiro, José R; Ruiz Sanz, Javier; Schymkowitz, Joost; Rousseau, Frederic; Luque, Irene; Lenaerts, Tom

    2016-05-01

    Src Homology 3 domains are ubiquitous small interaction modules known to act as docking sites and regulatory elements in a wide range of proteins. Prior experimental NMR work on the SH3 domain of Src showed that ligand binding induces long-range dynamic changes consistent with an induced fit mechanism. The identification of the residues that participate in this mechanism produces a chart that allows for the exploration of the regulatory role of such domains in the activity of the encompassing protein. Here we show that a computational approach focusing on the changes in side chain dynamics through ligand binding identifies equivalent long-range effects in the Src SH3 domain. Mutation of a subset of the predicted residues elicits long-range effects on the binding energetics, emphasizing the relevance of these positions in the definition of intramolecular cooperative networks of signal transduction in this domain. We find further support for this mechanism through the analysis of seven other publicly available SH3 domain structures of which the sequences represent diverse SH3 classes. By comparing the eight predictions, we find that, in addition to a dynamic pathway that is relatively conserved throughout all SH3 domains, there are dynamic aspects specific to each domain and homologous subgroups. Our work shows for the first time, from a structural perspective, which transduction mechanisms are common between a subset of closely related and distal SH3 domains, while at the same time highlighting the differences in signal transduction that make each family member unique. These results resolve the missing link between structural predictions of dynamic changes and the domain sectors recently identified for SH3 domains through sequence analysis.

  10. Molecular Dynamics in Mixed Solvents Reveals Protein-Ligand Interactions, Improves Docking, and Allows Accurate Binding Free Energy Predictions.

    PubMed

    Arcon, Juan Pablo; Defelipe, Lucas A; Modenutti, Carlos P; López, Elias D; Alvarez-Garcia, Daniel; Barril, Xavier; Turjanski, Adrián G; Martí, Marcelo A

    2017-04-24

    One of the most important biological processes at the molecular level is the formation of protein-ligand complexes. Therefore, determining their structure and underlying key interactions is of paramount relevance and has direct applications in drug development. Because of its low cost relative to its experimental sibling, molecular dynamics (MD) simulations in the presence of different solvent probes mimicking specific types of interactions have been increasingly used to analyze protein binding sites and reveal protein-ligand interaction hot spots. However, a systematic comparison of different probes and their real predictive power from a quantitative and thermodynamic point of view is still missing. In the present work, we have performed MD simulations of 18 different proteins in pure water as well as water mixtures of ethanol, acetamide, acetonitrile and methylammonium acetate, leading to a total of 5.4 μs simulation time. For each system, we determined the corresponding solvent sites, defined as space regions adjacent to the protein surface where the probability of finding a probe atom is higher than that in the bulk solvent. Finally, we compared the identified solvent sites with 121 different protein-ligand complexes and used them to perform molecular docking and ligand binding free energy estimates. Our results show that combining solely water and ethanol sites allows sampling over 70% of all possible protein-ligand interactions, especially those that coincide with ligand-based pharmacophoric points. Most important, we also show how the solvent sites can be used to significantly improve ligand docking in terms of both accuracy and precision, and that accurate predictions of ligand binding free energies, along with relative ranking of ligand affinity, can be performed.
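
    The record above defines a solvent site as a region next to the protein surface where a probe atom is more likely to be found than in the bulk solvent. A minimal sketch of that criterion on a voxel grid follows. The grid representation, the `min_ratio` threshold, the conversion of the density ratio into a free energy via -RT ln(ratio), and all numeric inputs are assumptions made for illustration; they are not taken from the paper.

```python
import math
from typing import Dict, Tuple

Voxel = Tuple[int, int, int]
GAS_CONSTANT_KCAL = 0.0019872041  # kcal/(mol*K)


def solvent_sites(counts: Dict[Voxel, int], n_frames: int,
                  bulk_density: float, voxel_volume: float,
                  min_ratio: float = 1.0, temperature: float = 300.0
                  ) -> Dict[Voxel, Tuple[float, float]]:
    """Return voxels whose probe density exceeds the bulk density.

    counts        probe-atom observations per voxel summed over all frames
    bulk_density  probe atoms per unit volume in bulk solvent
    voxel_volume  volume of one voxel, in the same length units
    Each selected voxel maps to (density ratio, estimated free energy in
    kcal/mol), where the free energy is -RT * ln(ratio).
    """
    expected = bulk_density * voxel_volume * n_frames
    if expected <= 0:
        raise ValueError("bulk density, voxel volume and frame count must be > 0")
    sites = {}
    for voxel, observed in counts.items():
        ratio = observed / expected
        if ratio > min_ratio:
            free_energy = -GAS_CONSTANT_KCAL * temperature * math.log(ratio)
            sites[voxel] = (ratio, free_energy)
    return sites


if __name__ == "__main__":
    # Hypothetical counts for two voxels over 100 trajectory frames.
    toy_counts = {(0, 0, 0): 50, (1, 0, 0): 5}
    print(solvent_sites(toy_counts, n_frames=100,
                        bulk_density=0.01, voxel_volume=10.0))
```

    In practice a threshold well above 1 and some spatial clustering of neighbouring voxels would likely be needed to separate genuine sites from sampling noise.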

  11. From Binding-Induced Dynamic Effects in SH3 Structures to Evolutionary Conserved Sectors

    PubMed Central

    Ruiz Sanz, Javier; Schymkowitz, Joost; Rousseau, Frederic

    2016-01-01

    Src Homology 3 domains are ubiquitous small interaction modules known to act as docking sites and regulatory elements in a wide range of proteins. Prior experimental NMR work on the SH3 domain of Src showed that ligand binding induces long-range dynamic changes consistent with an induced fit mechanism. The identification of the residues that participate in this mechanism produces a chart that allows for the exploration of the regulatory role of such domains in the activity of the encompassing protein. Here we show that a computational approach focusing on the changes in side chain dynamics through ligand binding identifies equivalent long-range effects in the Src SH3 domain. Mutation of a subset of the predicted residues elicits long-range effects on the binding energetics, emphasizing the relevance of these positions in the definition of intramolecular cooperative networks of signal transduction in this domain. We find further support for this mechanism through the analysis of seven other publicly available SH3 domain structures of which the sequences represent diverse SH3 classes. By comparing the eight predictions, we find that, in addition to a dynamic pathway that is relatively conserved throughout all SH3 domains, there are dynamic aspects specific to each domain and homologous subgroups. Our work shows for the first time, from a structural perspective, which transduction mechanisms are common between a subset of closely related and distal SH3 domains, while at the same time highlighting the differences in signal transduction that make each family member unique. These results resolve the missing link between structural predictions of dynamic changes and the domain sectors recently identified for SH3 domains through sequence analysis. PMID:27213566

  12. Dispersion-correcting potentials can significantly improve the bond dissociation enthalpies and noncovalent binding energies predicted by density-functional theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DiLabio, Gino A.; Koleini, Mohammad

    2014-05-14

    Dispersion-correcting potentials (DCPs) are atom-centered Gaussian functions that are applied in a manner that is similar to effective core potentials. Previous work on DCPs has focussed on their use as a simple means of improving the ability of conventional density-functional theory methods to predict the binding energies of noncovalently bonded molecular dimers. We show in this work that DCPs developed for use with the LC-ωPBE functional along with 6-31+G(2d,2p) basis sets are capable of simultaneously improving predicted noncovalent binding energies of van der Waals dimer complexes and covalent bond dissociation enthalpies in molecules. Specifically, the DCPs developed herein for the C, H, N, and O atoms provide binding energies for a set of 66 noncovalently bonded molecular dimers (the “S66” set) with a mean absolute error (MAE) of 0.21 kcal/mol, which represents an improvement of more than a factor of 10 over unadorned LC-ωPBE/6-31+G(2d,2p) and almost a factor of two improvement over LC-ωPBE/6-31+G(2d,2p) used in conjunction with the “D3” pairwise dispersion energy corrections. In addition, the DCPs reduce the MAE of calculated X-H and X-Y (X,Y = C, H, N, O) bond dissociation enthalpies for a set of 40 species from 3.2 kcal/mol obtained with unadorned LC-ωPBE/6-31+G(2d,2p) to 1.6 kcal/mol. Our findings demonstrate that broad improvements to the performance of DFT methods may be achievable through the use of DCPs.
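
    The figures in the record above are mean absolute errors (MAE) against reference values. As a minimal sketch, the metric and the resulting improvement factor can be computed as follows. The energy lists in the demo are hypothetical placeholders, not values from the S66 set; only the 3.2 and 1.6 kcal/mol MAEs are taken from the abstract.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float],
                        reference: Sequence[float]) -> float:
    """Average of |predicted - reference| over paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def improvement_factor(mae_before: float, mae_after: float) -> float:
    """How many times smaller the error is after a correction."""
    if mae_after <= 0:
        raise ValueError("mae_after must be > 0")
    return mae_before / mae_after


if __name__ == "__main__":
    # Hypothetical binding energies in kcal/mol, for illustration only.
    reference = [-5.0, -3.2, -1.1]
    uncorrected = [-3.1, -1.0, 0.4]
    print(mean_absolute_error(uncorrected, reference))
    # MAEs quoted in the abstract for bond dissociation enthalpies.
    print(improvement_factor(3.2, 1.6))
```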

  13. Functional annotation from the genome sequence of the giant panda.

    PubMed

    Huo, Tong; Zhang, Yinjie; Lin, Jianping

    2012-08-01

    The giant panda is one of the most critically endangered species due to the fragmentation and loss of its habitat. Studying the functions of proteins in this animal, especially specific trait-related proteins, is therefore necessary to protect the species. In this work, the functions of these proteins were investigated using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database, in which the proteins were divided into two groups: 20,179 proteins whose functions can be predicted by GeneScan formed the known-function group, whereas 822 proteins whose functions cannot be predicted by GeneScan comprised the unknown-function group. For the known-function group, we further classified the proteins by molecular function, biological process, cellular component, and tissue specificity. For the unknown-function group, we developed a strategy in which the proteins were filtered by cross-Blast to identify panda-specific proteins under the assumption that proteins related to the panda-specific traits in the unknown-function group exist. After this filtering procedure, we identified 32 proteins (2 of which are membrane proteins) specific to the giant panda genome as compared against the dog and horse genomes. Based on their amino acid sequences, these 32 proteins were further analyzed by functional classification using SVM-Prot, motif prediction using MyHits, and interacting protein prediction using the Database of Interacting Proteins. Nineteen proteins were predicted to be zinc-binding proteins, thus affecting the activities of nucleic acids. The 32 panda-specific proteins will be further investigated by structural and functional analysis.

  14. DockRank: Ranking docked conformations using partner-specific sequence homology-based protein interface prediction

    PubMed Central

    Xue, Li C.; Jordan, Rafael A.; EL-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant

    2015-01-01

    Selecting near-native conformations from the immense number of conformations generated by docking programs remains a major challenge in molecular docking. We introduce DockRank, a novel approach to scoring docked conformations based on the degree to which the interface residues of the docked conformation match a set of predicted interface residues. DockRank uses interface residues predicted by partner-specific sequence homology-based protein–protein interface predictor (PS-HomPPI), which predicts the interface residues of a query protein with a specific interaction partner. We compared the performance of DockRank with several state-of-the-art docking scoring functions using Success Rate (the percentage of cases that have at least one near-native conformation among the top m conformations) and Hit Rate (the percentage of near-native conformations that are included among the top m conformations). In cases where it is possible to obtain partner-specific (PS) interface predictions from PS-HomPPI, DockRank consistently outperforms both (i) ZRank and IRAD, two state-of-the-art energy-based scoring functions (improving Success Rate by up to 4-fold); and (ii) Variants of DockRank that use predicted interface residues obtained from several protein interface predictors that do not take into account the binding partner in making interface predictions (improving success rate by up to 39-fold). The latter result underscores the importance of using partner-specific interface residues in scoring docked conformations. We show that DockRank, when used to re-rank the conformations returned by ClusPro, improves upon the original ClusPro rankings in terms of both Success Rate and Hit Rate. DockRank is available as a server at http://einstein.cs.iastate.edu/DockRank/. PMID:23873600
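
    The two evaluation measures defined in the record above, Success Rate and Hit Rate, can be sketched directly from their definitions. Each docking case is represented here as a list of booleans in rank order, where True marks a near-native conformation. This representation, the function names, and the toy rankings are assumptions for illustration; computing Hit Rate per case (rather than pooled over cases) is also an assumption, since the abstract does not specify it.

```python
from typing import Sequence


def success_rate(cases: Sequence[Sequence[bool]], m: int) -> float:
    """Percentage of cases with at least one near-native pose in the top m."""
    if not cases:
        raise ValueError("need at least one docking case")
    successes = sum(1 for ranked in cases if any(ranked[:m]))
    return 100.0 * successes / len(cases)


def hit_rate(ranked: Sequence[bool], m: int) -> float:
    """Percentage of a case's near-native poses that appear in the top m."""
    total_near_native = sum(ranked)
    if total_near_native == 0:
        raise ValueError("case has no near-native conformations")
    return 100.0 * sum(ranked[:m]) / total_near_native


if __name__ == "__main__":
    # Hypothetical rankings for three docking cases (True = near-native).
    toy_cases = [
        [False, True, False, True],
        [False, False, False, True],
        [True, False, False, False],
    ]
    print(success_rate(toy_cases, m=2))
    print(hit_rate(toy_cases[0], m=2))
```

    Success Rate rewards placing any one near-native pose near the top, while Hit Rate rewards recovering many of them, so a scoring function can do well on one and poorly on the other.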

  15. DockRank: ranking docked conformations using partner-specific sequence homology-based protein interface prediction.

    PubMed

    Xue, Li C; Jordan, Rafael A; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant

    2014-02-01

    Selecting near-native conformations from the immense number of conformations generated by docking programs remains a major challenge in molecular docking. We introduce DockRank, a novel approach to scoring docked conformations based on the degree to which the interface residues of the docked conformation match a set of predicted interface residues. DockRank uses interface residues predicted by partner-specific sequence homology-based protein-protein interface predictor (PS-HomPPI), which predicts the interface residues of a query protein with a specific interaction partner. We compared the performance of DockRank with several state-of-the-art docking scoring functions using Success Rate (the percentage of cases that have at least one near-native conformation among the top m conformations) and Hit Rate (the percentage of near-native conformations that are included among the top m conformations). In cases where it is possible to obtain partner-specific (PS) interface predictions from PS-HomPPI, DockRank consistently outperforms both (i) ZRank and IRAD, two state-of-the-art energy-based scoring functions (improving Success Rate by up to 4-fold); and (ii) Variants of DockRank that use predicted interface residues obtained from several protein interface predictors that do not take into account the binding partner in making interface predictions (improving success rate by up to 39-fold). The latter result underscores the importance of using partner-specific interface residues in scoring docked conformations. We show that DockRank, when used to re-rank the conformations returned by ClusPro, improves upon the original ClusPro rankings in terms of both Success Rate and Hit Rate. DockRank is available as a server at http://einstein.cs.iastate.edu/DockRank/. Copyright © 2013 Wiley Periodicals, Inc.

  16. Quantitative and predictive model of kinetic regulation by E. coli TPP riboswitches

    PubMed Central

    Guedich, Sondés; Puffer-Enders, Barbara; Baltzinger, Mireille; Hoffmann, Guillaume; Da Veiga, Cyrielle; Jossinet, Fabrice; Thore, Stéphane; Bec, Guillaume; Ennifar, Eric; Burnouf, Dominique; Dumas, Philippe

    2016-01-01

    Riboswitches are non-coding elements upstream or downstream of mRNAs that, upon binding of a specific ligand, regulate transcription and/or translation initiation in bacteria, or alternative splicing in plants and fungi. We have studied thiamine pyrophosphate (TPP) riboswitches regulating translation of thiM operon and transcription and translation of thiC operon in E. coli, and that of THIC in the plant A. thaliana. For all, we ascertained an induced-fit mechanism involving initial binding of the TPP followed by a conformational change leading to a higher-affinity complex. The experimental values obtained for all kinetic and thermodynamic parameters of TPP binding imply that the regulation by A. thaliana riboswitch is governed by mass-action law, whereas it is of kinetic nature for the two bacterial riboswitches. Kinetic regulation requires that the RNA polymerase pauses after synthesis of each riboswitch aptamer to leave time for TPP binding, but only when its concentration is sufficient. A quantitative model of regulation highlighted how the pausing time has to be linked to the kinetic rates of initial TPP binding to obtain an ON/OFF switch in the correct concentration range of TPP. We verified the existence of these pauses and the model prediction on their duration. Our analysis also led to quantitative estimates of the respective efficiency of kinetic and thermodynamic regulations, which shows that kinetically regulated riboswitches react more sharply to concentration variation of their ligand than thermodynamically regulated riboswitches. This rationalizes the interest of kinetic regulation and confirms empirical observations that were obtained by numerical simulations. PMID:26932506

  17. Quantitative and predictive model of kinetic regulation by E. coli TPP riboswitches.

    PubMed

    Guedich, Sondés; Puffer-Enders, Barbara; Baltzinger, Mireille; Hoffmann, Guillaume; Da Veiga, Cyrielle; Jossinet, Fabrice; Thore, Stéphane; Bec, Guillaume; Ennifar, Eric; Burnouf, Dominique; Dumas, Philippe

    2016-01-01

    Riboswitches are non-coding elements upstream or downstream of mRNAs that, upon binding of a specific ligand, regulate transcription and/or translation initiation in bacteria, or alternative splicing in plants and fungi. We have studied thiamine pyrophosphate (TPP) riboswitches regulating translation of thiM operon and transcription and translation of thiC operon in E. coli, and that of THIC in the plant A. thaliana. For all, we ascertained an induced-fit mechanism involving initial binding of the TPP followed by a conformational change leading to a higher-affinity complex. The experimental values obtained for all kinetic and thermodynamic parameters of TPP binding imply that the regulation by A. thaliana riboswitch is governed by mass-action law, whereas it is of kinetic nature for the two bacterial riboswitches. Kinetic regulation requires that the RNA polymerase pauses after synthesis of each riboswitch aptamer to leave time for TPP binding, but only when its concentration is sufficient. A quantitative model of regulation highlighted how the pausing time has to be linked to the kinetic rates of initial TPP binding to obtain an ON/OFF switch in the correct concentration range of TPP. We verified the existence of these pauses and the model prediction on their duration. Our analysis also led to quantitative estimates of the respective efficiency of kinetic and thermodynamic regulations, which shows that kinetically regulated riboswitches react more sharply to concentration variation of their ligand than thermodynamically regulated riboswitches. This rationalizes the interest of kinetic regulation and confirms empirical observations that were obtained by numerical simulations.
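
    The contrast between mass-action (thermodynamic) and kinetic regulation can be illustrated with a toy calculation. This is not the authors' quantitative model: it assumes simple equilibrium binding for the thermodynamic case and irreversible pseudo-first-order binding during a fixed polymerase pause for the kinetic case. All parameter values and function names are hypothetical.

```python
import math
from typing import Callable


def thermodynamic_occupancy(c: float, kd: float) -> float:
    """Equilibrium (mass-action) fraction of riboswitch bound at ligand concentration c."""
    return c / (c + kd)


def kinetic_occupancy(c: float, k_on: float, pause: float) -> float:
    """Fraction bound at the end of a pause, assuming irreversible binding
    at pseudo-first-order rate k_on * c (an illustrative simplification)."""
    return 1.0 - math.exp(-k_on * c * pause)


def log_slope(f: Callable[[float], float], c: float, rel_step: float = 1e-6) -> float:
    """Numerical d f / d ln(c): sharpness of the response to concentration changes."""
    hi, lo = c * (1.0 + rel_step), c * (1.0 - rel_step)
    return (f(hi) - f(lo)) / (math.log(hi) - math.log(lo))


# Hypothetical parameters in arbitrary, mutually consistent units.
kd, k_on, pause = 1.0, 1.0, 1.0

# Concentrations at which each response is half-maximal.
c_half_thermo = kd
c_half_kinetic = math.log(2.0) / (k_on * pause)

slope_thermo = log_slope(lambda c: thermodynamic_occupancy(c, kd), c_half_thermo)
slope_kinetic = log_slope(lambda c: kinetic_occupancy(c, k_on, pause), c_half_kinetic)
```

    In this sketch the kinetic midpoint is ln 2 / (k_on * pause), so the pause duration and the binding rate jointly set the concentration range of the switch, and the kinetic response has the steeper slope at its midpoint (ln 2 / 2 versus 1/4).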

  18. Accuracy of binding mode prediction with a cascadic stochastic tunneling method.

    PubMed

    Fischer, Bernhard; Basili, Serena; Merlitz, Holger; Wenzel, Wolfgang

    2007-07-01

    We investigate the accuracy of the binding modes predicted for 83 complexes of the high-resolution subset of the ASTEX/CCDC receptor-ligand database using the atomistic FlexScreen approach with a simple forcefield-based scoring function. The median RMS deviation between experimental and predicted binding mode was just 0.83 Å. Over 80% of the ligands dock within 2 Å of the experimental binding mode; for 60 complexes the docking protocol locates the correct binding mode in all of ten independent simulations. Most docking failures arise because (a) the experimental structure clashed in our forcefield and is thus unattainable in the docking process or (b) the ligand is stabilized by crystal water. Copyright © 2007 Wiley-Liss, Inc.
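
    The accuracy figures quoted here (median RMS deviation, fraction of ligands within 2 Å) rest on a simple coordinate comparison. The sketch below assumes that predicted and experimental ligand atoms are listed in the same order and in the same receptor frame, so no superposition is performed. The function names and the numbers are hypothetical, not data from the study.

```python
import math
from statistics import median
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(pred: Sequence[Coord], ref: Sequence[Coord]) -> float:
    """Root-mean-square deviation between matched atom positions (same units as input)."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((p[i] - r[i]) ** 2 for p, r in zip(pred, ref) for i in range(3))
    return math.sqrt(squared / len(pred))


def fraction_within(rmsds: Sequence[float], cutoff: float = 2.0) -> float:
    """Fraction of complexes whose predicted pose lies within the cutoff."""
    return sum(1 for value in rmsds if value <= cutoff) / len(rmsds)


# Hypothetical two-atom ligand displaced by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
pose_rmsd = rmsd(predicted, reference)

# Hypothetical per-complex RMSD values for a small benchmark.
benchmark = [0.5, 1.9, 3.2, 0.83]
benchmark_median = median(benchmark)
benchmark_success = fraction_within(benchmark, cutoff=2.0)
```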

  19. Urate is a ligand for the transcriptional regulator PecS.

    PubMed

    Perera, Inoka C; Grove, Anne

    2010-09-24

    PecS is a member of the MarR (multiple antibiotic resistance regulator) family, which has been shown in Erwinia to regulate the expression of virulence genes. MarR homologs typically bind a small molecule ligand, resulting in attenuated DNA binding. For PecS, the natural ligand has not been identified. We have previously shown that urate is a ligand for the Deinococcus radiodurans-encoded MarR homolog HucR (hypothetical uricase regulator) and identified residues responsible for ligand binding. We show here that all four residues involved in urate binding and propagation of conformational changes to DNA recognition helices are conserved in PecS homologs, suggesting that urate is the ligand for PecS. Consistent with this prediction, Agrobacterium tumefaciens PecS specifically binds urate, and urate attenuates DNA binding in vitro. PecS binds two operator sites in the intergenic region between the divergent pecS and pecM genes, one of which features two partially overlapping repeats to which PecS binds as a dimer on opposite faces of the duplex. Notably, urate dissociates PecS from cognate DNA, allowing transcription of both genes in vivo. Taken together, our data show that urate is a ligand for PecS and suggest that urate serves a novel function in signaling the colonization of a host plant. Copyright © 2010 Elsevier Ltd. All rights reserved.

  20. Modeling the Binding of Neurotransmitter Transporter Inhibitors with Molecular Dynamics and Free Energy Calculations

    NASA Astrophysics Data System (ADS)

    Jean, Bernandie

    The monoamine transporter (MAT) proteins responsible for the reuptake of the neurotransmitter substrates, dopamine, serotonin, and norepinephrine, are drug targets for the treatment of psychiatric disorders including depression, anxiety, and attention deficit hyperactivity disorder. Small molecules that inhibit these proteins can serve as useful therapeutic agents. However, some dopamine transporter (DAT) inhibitors, such as cocaine and methamphetamine, are highly addictive and abusable. Efforts have been made to develop small molecules that will inhibit the transporters and elucidate specific binding site interactions. This work provides knowledge of molecular interactions associated with MAT inhibitors by offering an atomistic perspective that can guide designs of new pharmacotherapeutics with enhanced activity. The work described herein evaluates intermolecular interactions using computational methods to reveal the mechanistic detail of inhibitors binding in the DAT. Because cocaine recognizes the extracellular-facing or outward-facing (OF) DAT conformation and benztropine recognizes the intracellular-facing or inward-facing (IF) conformation, it was postulated that behaviorally "typical" (abusable, locomotor psychostimulant) inhibitors stabilize the OF DAT and "atypical" (little or no abuse potential) inhibitors favor IF DAT. Indeed, behaviorally-atypical cocaine analogs have now been shown to prefer the OF DAT conformation. Specifically, the binding interactions of two cocaine analogs, LX10 and LX11, were studied in the OF DAT using molecular dynamics simulations. LX11 was able to interact with residues of transmembrane helix 8 and bind in a fashion that allowed for hydration of the primary binding site (S1) from the intracellular space, thus impacting the intracellular interaction network capable of regulating conformational transitions in DAT. 
    Additionally, a novel serotonin transporter (SERT) inhibitor previously discovered through virtual screening at the SERT secondary binding site (S2) was studied. Intermolecular interactions between SM11 and SERT have been assessed using binding free energy calculations to predict the ligand-binding site and optimize ligand-binding interactions. Results indicate the addition of atoms to the 4-chlorobenzyl moiety was most energetically favorable. The simulations carried out in DAT and SERT were supported by experimental results. Furthermore, the co-crystal structures of DAT and SERT share similar ligand-binding interactions with the homology models used in this study.

  1. Large-scale binding ligand prediction by improved patch-based method Patch-Surfer2.0

    PubMed Central

    Zhu, Xiaolei; Xiong, Yi; Kihara, Daisuke

    2015-01-01

    Motivation: Ligand binding is a key aspect of the function of many proteins. Thus, binding ligand prediction provides important insight in understanding the biological function of proteins. Binding ligand prediction is also useful for drug design and examining potential drug side effects. Results: We present a computational method named Patch-Surfer2.0, which predicts binding ligands for a protein pocket. By representing and comparing pockets at the level of small local surface patches that characterize physicochemical properties of the local regions, the method can identify binding pockets of the same ligand even if they do not share globally similar shapes. Properties of local patches are represented by an efficient mathematical representation, 3D Zernike Descriptor. Patch-Surfer2.0 has significant technical improvements over our previous prototype, which includes a new feature that captures approximate patch position with a geodesic distance histogram. Moreover, we constructed a large comprehensive database of ligand binding pockets that will be searched against by a query. The benchmark shows better performance of Patch-Surfer2.0 over existing methods. Availability and implementation: http://kiharalab.org/patchsurfer2.0/ Contact: dkihara@purdue.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25359888

  2. Absence of serum growth hormone binding protein in patients with growth hormone receptor deficiency (Laron dwarfism).

    PubMed

    Daughaday, W H; Trivedi, B

    1987-07-01

    It has recently been recognized that human serum contains a protein that specifically binds human growth hormone (hGH). This protein has the same restricted specificity for hGH as the membrane-bound GH receptor. To determine whether the GH-binding protein is a derivative of, or otherwise related to, the GH receptor, we have examined the serum of three patients with Laron-type dwarfism, a condition in which GH refractoriness has been attributed to a defect in the GH receptor. The binding of 125I-labeled hGH incubated with serum has been measured after gel filtration of the serum through an Ultrogel AcA 44 minicolumn. Nonspecific binding was determined when 125I-hGH was incubated with serum in the presence of an excess of GH. Results are expressed as percent of specifically bound 125I-hGH and as specific binding relative to that of a reference serum after correction is made for endogenous GH. The mean +/- SEM of specific binding of sera from eight normal adults (26-46 years of age) was 21.6 +/- 0.45%, and the relative specific binding was 101.1 +/- 8.6%. Sera from 11 normal children had lower specific binding of 12.5 +/- 1.95% and relative specific binding of 56.6 +/- 9.1%. Sera from three children with Laron-type dwarfism lacked any demonstrable GH binding, whereas sera from 10 other children with other types of nonpituitary short stature had normal relative specific binding. We suggest that the serum GH-binding protein is a soluble derivative of the GH receptor. Measurement of the serum GH-binding protein may permit recognition of other abnormalities of the GH receptor.

  3. Absence of serum growth hormone binding protein in patients with growth hormone receptor deficiency (Laron dwarfism).

    PubMed Central

    Daughaday, W H; Trivedi, B

    1987-01-01

    It has recently been recognized that human serum contains a protein that specifically binds human growth hormone (hGH). This protein has the same restricted specificity for hGH as the membrane-bound GH receptor. To determine whether the GH-binding protein is a derivative of, or otherwise related to, the GH receptor, we have examined the serum of three patients with Laron-type dwarfism, a condition in which GH refractoriness has been attributed to a defect in the GH receptor. The binding of 125I-labeled hGH incubated with serum has been measured after gel filtration of the serum through an Ultrogel AcA 44 minicolumn. Nonspecific binding was determined when 125I-hGH was incubated with serum in the presence of an excess of GH. Results are expressed as percent of specifically bound 125I-hGH and as specific binding relative to that of a reference serum after correction is made for endogenous GH. The mean +/- SEM of specific binding of sera from eight normal adults (26-46 years of age) was 21.6 +/- 0.45%, and the relative specific binding was 101.1 +/- 8.6%. Sera from 11 normal children had lower specific binding of 12.5 +/- 1.95% and relative specific binding of 56.6 +/- 9.1%. Sera from three children with Laron-type dwarfism lacked any demonstrable GH binding, whereas sera from 10 other children with other types of nonpituitary short stature had normal relative specific binding. We suggest that the serum GH-binding protein is a soluble derivative of the GH receptor. Measurement of the serum GH-binding protein may permit recognition of other abnormalities of the GH receptor. PMID:3474620

  4. Assessing the performance of the MM/PBSA and MM/GBSA methods. 1. The accuracy of binding free energy calculations based on molecular dynamics simulations.

    PubMed

    Hou, Tingjun; Wang, Junmei; Li, Youyong; Wang, Wei

    2011-01-24

    The Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) and the Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) methods calculate binding free energies for macromolecules by combining molecular mechanics calculations and continuum solvation models. To systematically evaluate the performance of these methods, we report here an extensive study of 59 ligands interacting with six different proteins. First, we explored the effects of the length of the molecular dynamics (MD) simulation, ranging from 400 to 4800 ps, and the solute dielectric constant (1, 2, or 4) on the binding free energies predicted by MM/PBSA. The following three important conclusions could be observed: (1) MD simulation length has an obvious impact on the predictions, and longer MD simulation is not always necessary to achieve better predictions. (2) The predictions are quite sensitive to the solute dielectric constant, and this parameter should be carefully determined according to the characteristics of the protein/ligand binding interface. (3) Conformational entropy often shows large fluctuations in MD trajectories, and a large number of snapshots are necessary to achieve stable predictions. Next, we evaluated the accuracy of the binding free energies calculated by three Generalized Born (GB) models. We found that the GB model developed by Onufriev and Case was the most successful model in ranking the binding affinities of the studied inhibitors. Finally, we evaluated the performance of MM/GBSA and MM/PBSA in predicting binding free energies. Our results showed that MM/PBSA performed better in calculating absolute, but not necessarily relative, binding free energies than MM/GBSA. Considering its computational efficiency, MM/GBSA can serve as a powerful tool in drug design, where correct ranking of inhibitors is often emphasized.
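
    The dependence on the number of snapshots can be made concrete with a small averaging sketch. It assumes the usual end-point bookkeeping, in which each snapshot contributes G(complex) - G(receptor) - G(ligand) and the estimate is the mean over snapshots. The energies are invented for illustration, the class and function names are hypothetical, and the standard error treats snapshots as uncorrelated, which real MD frames often are not.

```python
import math
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Sequence, Tuple


@dataclass
class Snapshot:
    """Per-frame free energy estimates in kcal/mol (MM energy + solvation - TS)."""
    g_complex: float
    g_receptor: float
    g_ligand: float


def binding_free_energy(snapshots: Sequence[Snapshot]) -> Tuple[float, float]:
    """Return (mean binding free energy, standard error of the mean) over snapshots."""
    if len(snapshots) < 2:
        raise ValueError("need at least two snapshots to estimate an error")
    per_frame = [s.g_complex - s.g_receptor - s.g_ligand for s in snapshots]
    return mean(per_frame), stdev(per_frame) / math.sqrt(len(per_frame))


# Hypothetical per-snapshot energies.
frames = [
    Snapshot(-1000.0, -900.0, -90.0),
    Snapshot(-1004.0, -901.0, -91.0),
    Snapshot(-998.0, -899.0, -91.0),
]
dg_mean, dg_sem = binding_free_energy(frames)
```

    Because the standard error shrinks only with the square root of the number of snapshots, strongly fluctuating terms such as the conformational entropy need many frames before the mean settles.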

  5. PHOENIX: a scoring function for affinity prediction derived using high-resolution crystal structures and calorimetry measurements.

    PubMed

    Tang, Yat T; Marshall, Garland R

    2011-02-28

    Binding affinity prediction is one of the most critical components of computer-aided structure-based drug design. Despite advances in first-principles methods for predicting binding affinity, empirical scoring functions that are fast and only relatively accurate are still widely used in structure-based drug design. With the increasing availability of X-ray crystallographic structures in the Protein Data Bank and continuing application of biophysical methods such as isothermal titration calorimetry to measure thermodynamic parameters contributing to binding free energy, sufficient experimental data exists that scoring functions can now be derived by separating enthalpic (ΔH) and entropic (TΔS) contributions to binding free energy (ΔG). PHOENIX, a scoring function to predict binding affinities of protein-ligand complexes, utilizes the increasing availability of experimental data to improve binding affinity predictions by the following: model training and testing using high-resolution crystallographic data to minimize structural noise, independent models of enthalpic and entropic contributions fitted to thermodynamic parameters assumed to be thermodynamically biased to calculate binding free energy, and use of shape and volume descriptors to better capture entropic contributions. A set of 42 descriptors and 112 protein-ligand complexes were used to derive functions using partial least-squares for change of enthalpy (ΔH) and change of entropy (TΔS) to calculate change of binding free energy (ΔG), resulting in a predictive r² (r²_pred) of 0.55 and a standard error (SE) of 1.34 kcal/mol. External validation using the 2009 version of the PDBbind "refined set" (n = 1612) resulted in a Pearson correlation coefficient (R_p) of 0.575 and a mean error (ME) of 1.41 pKd units. Enthalpy and entropy predictions were of limited accuracy individually. However, their difference resulted in a relatively accurate binding free energy.
    While the development of an accurate and applicable scoring function was an objective of this study, the main focus was evaluation of the use of high-resolution X-ray crystal structures with high-quality thermodynamic parameters from isothermal titration calorimetry for scoring function development. With the increasing application of structure-based methods in molecular design, this study suggests that using high-resolution crystal structures, separating enthalpy and entropy contributions to binding free energy, and including descriptors to better capture entropic contributions may prove to be effective strategies toward rapid and accurate calculation of binding affinity.
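
    The relation between the separately modeled terms and the reported affinity scale can be written out as a short worked example. It uses the standard thermodynamic identities ΔG = ΔH - TΔS and pKd = -ΔG / (RT ln 10), assuming a 1 M standard state and 298.15 K. The input values and function names are hypothetical and are not PHOENIX outputs.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(delta_h: float, t_delta_s: float) -> float:
    """Combine enthalpic and entropic terms: dG = dH - T*dS (all in kcal/mol)."""
    return delta_h - t_delta_s


def pkd_from_dg(delta_g: float, temperature: float = 298.15) -> float:
    """Convert a binding free energy (kcal/mol) to pKd, assuming a 1 M standard state."""
    return -delta_g / (R_KCAL * temperature * math.log(10.0))


# Hypothetical predicted contributions for one complex (kcal/mol).
delta_h = -12.0
t_delta_s = -4.0

delta_g = binding_free_energy(delta_h, t_delta_s)
pkd = pkd_from_dg(delta_g)
```

    At this temperature one pKd unit corresponds to roughly 1.36 kcal/mol, which helps when comparing errors quoted in kcal/mol with errors quoted in pKd units. Because ΔG is a difference of two larger terms, errors in ΔH and TΔS that move in the same direction partly cancel.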

  6. Predicting nucleic acid binding interfaces from structural models of proteins

    PubMed Central

    Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael

    2011-01-01

    The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared to patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. PMID:22086767

  7. Understanding and Manipulating Electrostatic Fields at the Protein-Protein Interface Using Vibrational Spectroscopy and Continuum Electrostatics Calculations.

    PubMed

    Ritchie, Andrew W; Webb, Lauren J

    2015-11-05

    Biological function emerges in large part from the interactions of biomacromolecules in the complex and dynamic environment of the living cell. For this reason, macromolecular interactions in biological systems are now a major focus of interest throughout the biochemical and biophysical communities. The affinity and specificity of macromolecular interactions are the result of both structural and electrostatic factors. Significant advances have been made in characterizing structural features of stable protein-protein interfaces through the techniques of modern structural biology, but much less is understood about how electrostatic factors promote and stabilize specific functional macromolecular interactions over all possible choices presented to a given molecule in a crowded environment. In this Feature Article, we describe how vibrational Stark effect (VSE) spectroscopy is being applied to measure electrostatic fields at protein-protein interfaces, focusing on measurements of guanosine triphosphate (GTP)-binding proteins of the Ras superfamily binding with structurally related but functionally distinct downstream effector proteins. In VSE spectroscopy, spectral shifts of a probe oscillator's energy are related directly to that probe's local electrostatic environment. By performing this experiment repeatedly throughout a protein-protein interface, an experimental map of measured electrostatic fields generated at that interface is determined. These data can be used to rationalize selective binding of similarly structured proteins in both in vitro and in vivo environments. Furthermore, these data can be used to compare to computational predictions of electrostatic fields to explore the level of simulation detail that is necessary to accurately predict our experimental findings.

  8. The homologue of mannose-binding lectin in the carp family Cyprinidae is expressed at high level in spleen, and the deduced primary structure predicts affinity for galactose.

    PubMed

    Vitved, L; Holmskov, U; Koch, C; Teisner, B; Hansen, S; Salomonsen, J; Skjødt, K

    2000-09-01

    Mannose-binding lectin (MBL) participates in the innate immune system as an activator of the complement system and as an opsonin after binding to certain carbohydrate structures on microorganisms. We isolated and characterized cDNA transcripts encoding an MBL homologue from three members of the carp family Cyprinidae, the zebrafish Danio rerio, the goldfish Carassius auratus, and the carp Cyprinus carpio. The carp and zebrafish transcripts contain two polyadenylation sites and RT-PCR on mRNA from carp tissues revealed the carp transcript to be most prominently expressed in the spleen. The deduced mature proteins contain 228 or 233 amino acids with a short N-terminal segment containing a single conserved cysteine expected to form interchain disulfide bridges, a collagen domain interrupted by four amino acids between two glycine residues, a neck region predicted to form an alpha-helical coiled-coil structure, and a C-terminal carbohydrate recognition domain (CRD). Several of the structurally important residues in the CRD are conserved, but the residues known to interact with the calcium ion and hydroxyl groups of the carbohydrate ligand are different. The amino acid motif EPN, important for mannose specificity, was QPD in the Cyprinidae homologue, suggesting specificity for galactose instead. The identity between the deduced amino acid sequences is more than 90% between the carp and the goldfish and 68% and 65% between these two species, respectively, and the zebrafish. The identity with bird and mammalian MBLs ranges from 28 to 33%.

  9. Accurate prediction of RNA-binding protein residues with two discriminative structural descriptors.

    PubMed

    Sun, Meijian; Wang, Xia; Zou, Chuanxin; He, Zenghui; Liu, Wei; Li, Honglin

    2016-06-07

    RNA-binding proteins participate in many important biological processes concerning RNA-mediated gene regulation, and several computational methods have been recently developed to predict the protein-RNA interactions of RNA-binding proteins. Newly developed discriminative descriptors will help to improve the prediction accuracy of these prediction methods and provide further meaningful information for researchers. In this work, we designed two structural features (residue electrostatic surface potential and triplet interface propensity). According to the statistical and structural analysis of protein-RNA complexes, the two features were powerful for identifying RNA-binding protein residues. Using these two features and other excellent structure- and sequence-based features, a random forest classifier was constructed to predict RNA-binding residues. The area under the receiver operating characteristic curve (AUC) of five-fold cross-validation for our method on training set RBP195 was 0.900, and when applied to the test set RBP68, the prediction accuracy (ACC) was 0.868, and the F-score was 0.631. The good prediction performance of our method revealed that the two newly designed descriptors could be discriminative for inferring protein residues interacting with RNAs. To facilitate the use of our method, a web-server called RNAProSite, which implements the proposed method, was constructed and is freely available at http://lilab.ecust.edu.cn/NABind.

  10. Identification, characterization and leucocyte expression of Siglec-10, a novel human sialic acid-binding receptor.

    PubMed Central

    Munday, J; Kerr, S; Ni, J; Cornish, A L; Zhang, J Q; Nicoll, G; Floyd, H; Mattei, M G; Moore, P; Liu, D; Crocker, P R

    2001-01-01

    Here we characterize Siglec-10 as a new member of the Siglec family of sialic acid-binding Ig-like lectins. A full-length cDNA was isolated from a human spleen library and the corresponding gene identified. Siglec-10 is predicted to contain five extracellular Ig-like domains and a cytoplasmic tail containing three putative tyrosine-based signalling motifs. Siglec-10 exhibited a high degree of sequence similarity to CD33-related Siglecs and mapped to the same region, on chromosome 19q13.3. The expressed protein was able to mediate sialic acid-dependent binding to human erythrocytes and soluble sialoglycoconjugates. Using specific antibodies, Siglec-10 was detected on subsets of human leucocytes including eosinophils, monocytes and a minor population of natural killer-like cells. The molecular properties and expression pattern suggest that Siglec-10 may function as an inhibitory receptor within the innate immune system. PMID:11284738

  11. SAAMBE: Webserver to Predict the Charge of Binding Free Energy Caused by Amino Acids Mutations.

    PubMed

    Petukh, Marharyta; Dai, Luogeng; Alexov, Emil

    2016-04-12

    Predicting the effect of amino acid substitutions on protein-protein affinity (typically evaluated via the change of protein binding free energy) is important for both understanding the disease-causing mechanism of missense mutations and guiding protein engineering. In addition, researchers are also interested in understanding which energy components are mostly affected by the mutation and how the mutation affects the overall structure of the corresponding protein. Here we report a webserver, the Single Amino Acid Mutation based change in Binding free Energy (SAAMBE) webserver, which addresses the demand for tools for predicting the change of protein binding free energy. SAAMBE is an easy to use webserver, which only requires that a coordinate file be inputted and the user is provided with various, but easy to navigate, options. The user specifies the mutation position, wild type residue and type of mutation to be made. The server predicts the binding free energy change, the changes of the corresponding energy components and provides the energy minimized 3D structure of the wild type and mutant proteins for download. The SAAMBE protocol performance was tested by benchmarking the predictions against over 1300 experimentally determined changes of binding free energy and a Pearson correlation coefficient of 0.62 was obtained. How the predictions can be used for discriminating disease-causing from harmless mutations is discussed. The webserver can be accessed via http://compbio.clemson.edu/saambe_webserver/.

  12. Computational Exploration of a Protein Receptor Binding Space with Student Proposed Peptide Ligands

    PubMed Central

    King, Matthew D.; Phillips, Paul; Turner, Matthew W.; Katz, Michael; Lew, Sarah; Bradburn, Sarah; Andersen, Tim; Mcdougal, Owen M.

    2017-01-01

    Computational molecular docking is a fast and effective in silico method for the analysis of binding between a protein receptor model and a ligand. The visualization and manipulation of protein to ligand binding in three-dimensional space represents a powerful tool in the biochemistry curriculum to enhance student learning. The DockoMatic tutorial described herein provides a framework by which instructors can guide students through a drug screening exercise. Using receptor models derived from readily available protein crystal structures, docking programs have the ability to predict ligand binding properties, such as preferential binding orientations and binding affinities. The use of computational studies can significantly enhance complementary wet chemical experimentation by providing insight into the important molecular interactions within the system of interest, as well as guide the design of new candidate ligands based on observed binding motifs and energetics. In this laboratory tutorial, the graphical user interface, DockoMatic, facilitates docking job submissions to the docking engine, AutoDock 4.2. The purpose of this exercise is to successfully dock a 17-amino acid peptide, α-conotoxin TxIA, to the acetylcholine binding protein (AChBP) from Aplysia californica to determine the most stable binding configuration. Each student will then propose two specific amino acid substitutions of α-conotoxin TxIA to enhance peptide binding affinity, create the mutant in DockoMatic, and perform docking calculations to compare their results with the class. Students will also compare intermolecular forces, binding energy, and geometric orientation of their prepared analog to their initial α-conotoxin TxIA docking results. PMID:26537635

  13. Deep Motif Dashboard: Visualizing and Understanding Genomic Sequences Using Deep Neural Networks.

    PubMed

    Lanchantin, Jack; Singh, Ritambhara; Wang, Beilun; Qi, Yanjun

    2017-01-01

    Deep neural network (DNN) models have recently obtained state-of-the-art prediction accuracy for the transcription factor binding site (TFBS) classification task. However, it remains unclear how these approaches identify meaningful DNA sequence signals and give insights as to why TFs bind to certain locations. In this paper, we propose a toolkit called the Deep Motif Dashboard (DeMo Dashboard) which provides a suite of visualization strategies to extract motifs, or sequence patterns, from deep neural network models for TFBS classification. We demonstrate how to visualize and understand three important DNN models: convolutional, recurrent, and convolutional-recurrent networks. Our first visualization method is finding a test sequence's saliency map, which uses first-order derivatives to describe the importance of each nucleotide in making the final prediction. Second, considering that recurrent models make predictions in a temporal manner (from one end of a TFBS sequence to the other), we introduce temporal output scores, indicating the prediction score of a model over time for a sequential input. Lastly, a class-specific visualization strategy finds the optimal input sequence for a given TFBS positive class via stochastic gradient optimization. Our experimental results indicate that a convolutional-recurrent architecture performs the best among the three architectures. The visualization techniques indicate that CNN-RNN makes predictions by modeling both motifs as well as dependencies among them.
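
    The saliency map described above scores each nucleotide by a first-order derivative of the prediction with respect to the input. The sketch below illustrates the idea on a deliberately tiny stand-in model, a logistic function over a one-hot encoded sequence, whose gradient can be written by hand. The model, the weights and every function name here are assumptions made for illustration; the toolkit itself works with convolutional and recurrent networks, and its exact saliency definition may differ from the gradient-at-the-observed-base convention used here.

```python
import math

NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot rows."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq]


def predict(x, weights, bias):
    """Toy binding-site classifier: sigmoid of a position-specific linear score."""
    z = bias + sum(w * v
                   for w_row, x_row in zip(weights, x)
                   for w, v in zip(w_row, x_row))
    return 1.0 / (1.0 + math.exp(-z))


def saliency(seq, weights, bias):
    """Per-position saliency: |d(prediction)/d(input)| at the observed base."""
    x = one_hot(seq)
    p = predict(x, weights, bias)
    slope = p * (1.0 - p)  # derivative of the sigmoid with respect to its argument
    # d p / d x[i][n] = slope * weights[i][n]; multiplying by the one-hot row
    # keeps only the entry for the nucleotide actually present at position i.
    return [abs(slope * sum(w * v for w, v in zip(w_row, x_row)))
            for w_row, x_row in zip(weights, x)]


# Hypothetical weights for a 4-bp window; rows are positions, columns are A, C, G, T.
toy_weights = [
    [2.0, -1.0, -1.0, -1.0],
    [0.1, 0.2, 0.0, -0.1],
    [-1.0, -1.0, 3.0, -1.0],
    [0.0, 0.0, 0.0, 0.1],
]
scores = saliency("ACGT", toy_weights, bias=0.0)
```

    In this toy setting the third position (G, weight 3.0) receives the largest saliency, mirroring how a saliency map highlights the nucleotides that most influence a prediction. For a real network the derivative would be obtained by automatic differentiation rather than by hand.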

  14. Deep Motif Dashboard: Visualizing and Understanding Genomic Sequences Using Deep Neural Networks

    PubMed Central

    Lanchantin, Jack; Singh, Ritambhara; Wang, Beilun; Qi, Yanjun

    2018-01-01

    Deep neural network (DNN) models have recently obtained state-of-the-art prediction accuracy for the transcription factor binding site (TFBS) classification task. However, it remains unclear how these approaches identify meaningful DNA sequence signals and give insights as to why TFs bind to certain locations. In this paper, we propose a toolkit called the Deep Motif Dashboard (DeMo Dashboard) which provides a suite of visualization strategies to extract motifs, or sequence patterns, from deep neural network models for TFBS classification. We demonstrate how to visualize and understand three important DNN models: convolutional, recurrent, and convolutional-recurrent networks. Our first visualization method is finding a test sequence’s saliency map, which uses first-order derivatives to describe the importance of each nucleotide in making the final prediction. Second, considering that recurrent models make predictions in a temporal manner (from one end of a TFBS sequence to the other), we introduce temporal output scores, indicating the prediction score of a model over time for a sequential input. Lastly, a class-specific visualization strategy finds the optimal input sequence for a given TFBS positive class via stochastic gradient optimization. Our experimental results indicate that a convolutional-recurrent architecture performs the best among the three architectures. The visualization techniques indicate that CNN-RNN makes predictions by modeling both motifs as well as dependencies among them. PMID:27896980

  15. Computational Identification of Diverse Mechanisms Underlying Transcription Factor-DNA Occupancy

    PubMed Central

    Cheng, Qiong; Kazemian, Majid; Pham, Hannah; Blatti, Charles; Celniker, Susan E.; Wolfe, Scot A.; Brodsky, Michael H.; Sinha, Saurabh

    2013-01-01

    ChIP-based genome-wide assays of transcription factor (TF) occupancy have emerged as a powerful, high-throughput method to understand transcriptional regulation, especially on a global scale. This has led to great interest in the underlying biochemical mechanisms that direct TF-DNA binding, with the ultimate goal of computationally predicting a TF's occupancy profile in any cellular condition. In this study, we examined the influence of various potential determinants of TF-DNA binding on a much larger scale than previously undertaken. We used a thermodynamics-based model of TF-DNA binding, called “STAP,” to analyze 45 TF-ChIP data sets from Drosophila embryonic development. We built a cross-validation framework that compares a baseline model, based on the ChIP'ed (“primary”) TF's motif, to more complex models where binding by secondary TFs is hypothesized to influence the primary TF's occupancy. Candidate interacting TFs were chosen based on RNA-seq expression data from the time point of the ChIP experiment. We found widespread evidence of both cooperative and antagonistic effects by secondary TFs, and explicitly quantified these effects. We were able to identify multiple classes of interactions, including (1) long-range interactions between primary and secondary motifs (separated by ≤150 bp), suggestive of indirect effects such as chromatin remodeling, (2) short-range interactions with specific inter-site spacing biases, suggestive of direct physical interactions, and (3) overlapping binding sites suggesting competitive binding. Furthermore, by factoring out the previously reported strong correlation between TF occupancy and DNA accessibility, we were able to categorize the effects into those that are likely to be mediated by the secondary TF's effect on local accessibility and those that utilize accessibility-independent mechanisms. Finally, we conducted in vitro pull-down assays to test model-based predictions of short-range cooperative interactions, and found that seven of the eight TF pairs tested physically interact and that some of these interactions mediate cooperative binding to DNA. PMID:23935523
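
    The comparison between a primary-motif baseline and models with cooperative or antagonistic secondary TFs can be illustrated with a textbook two-site statistical-thermodynamic model. The sketch below is not the STAP implementation; it only assumes one primary site, one secondary site, and a single interaction factor omega (all names are hypothetical). Each site has a statistical weight q, the product of an association constant and the TF concentration, and omega multiplies the weight of the doubly bound state.

```python
def primary_occupancy(q_primary, q_secondary, omega):
    """Equilibrium probability that the primary site is bound.

    q_primary, q_secondary: statistical weights (association constant times
        free TF concentration) of the primary and secondary sites.
    omega: interaction factor for the doubly bound state;
        omega > 1 is cooperative, omega < 1 antagonistic, omega == 1 independent.
    """
    if q_primary < 0 or q_secondary < 0 or omega < 0:
        raise ValueError("weights and omega must be non-negative")
    doubly_bound = omega * q_primary * q_secondary
    # Partition function over the four states: empty, primary only,
    # secondary only, and both sites bound.
    partition = 1.0 + q_primary + q_secondary + doubly_bound
    return (q_primary + doubly_bound) / partition


# Illustrative weights only; not fitted to any ChIP data set.
baseline = primary_occupancy(1.0, 2.0, omega=1.0)      # secondary TF has no effect
cooperative = primary_occupancy(1.0, 2.0, omega=5.0)   # secondary TF helps binding
antagonistic = primary_occupancy(1.0, 2.0, omega=0.1)  # secondary TF hinders binding
```

    With omega equal to 1 the expression reduces to q_primary / (1 + q_primary), the single-site baseline, so any fitted departure of omega from 1 is what such a model would read as a cooperative or antagonistic effect.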

  16. The Role of Flexibility and Conformational Selection in the Binding Promiscuity of PDZ Domains

    PubMed Central

    Münz, Márton; Hein, Jotun; Biggin, Philip C.

    2012-01-01

    In molecular recognition, it is often the case that ligand binding is coupled to conformational change in one or both of the binding partners. Two hypotheses describe the limiting cases involved; the first is the induced fit and the second is the conformational selection model. The conformational selection model requires that the protein adopts conformations that are similar to the ligand-bound conformation in the absence of ligand, whilst the induced-fit model predicts that the ligand-bound conformation of the protein is only accessible when the ligand is actually bound. The flexibility of the apo protein clearly plays a major role in these interpretations. For many proteins involved in signaling pathways there is the added complication that they are often promiscuous in that they are capable of binding to different ligand partners. The relationship between protein flexibility and promiscuity is an area of active research and is perhaps best exemplified by the PDZ domain family of proteins. In this study we use molecular dynamics simulations to examine the relationship between flexibility and promiscuity in five PDZ domains: the human Dvl2 (Dishevelled-2) PDZ domain, the human Erbin PDZ domain, the PDZ1 domain of InaD (inactivation no after-potential D protein) from fruit fly, the PDZ7 domain of GRIP1 (glutamate receptor interacting protein 1) from rat and the PDZ2 domain of PTP-BL (protein tyrosine phosphatase) from mouse. We show that despite their high structural similarity, the PDZ binding sites have significantly different dynamics. Importantly, the degree of binding pocket flexibility was found to be closely related to the various characteristics of peptide binding specificity and promiscuity of the five PDZ domains. Our findings suggest that the intrinsic motions of the apo structures play a key role in distinguishing functional properties of different PDZ domains and allow us to make predictions that can be experimentally tested. PMID:23133356

  17. Combinatorial multispectral, thermodynamics, docking and site-directed mutagenesis reveal the cognitive characteristics of honey bee chemosensory protein to plant semiochemical.

    PubMed

    Tan, Jing; Song, Xinmi; Fu, Xiaobin; Wu, Fan; Hu, Fuliang; Li, Hongliang

    2018-08-05

    In the chemoreceptive system of insects, there are always some soluble binding proteins, such as antennal-specific chemosensory proteins (CSPs), which are abundantly distributed in the chemosensory sensillar lymph. The antennal-specific CSPs usually have a strong capability to bind diverse semiochemicals, while the detailed interaction between CSPs and the semiochemicals remains unclear. Here, by means of combinatorial multispectral, thermodynamic, docking and site-directed mutagenesis methods, we interpreted in detail a binding interaction between the plant semiochemical β-ionone and antennal-specific CSP1 from the worker honey bee. Thermodynamic parameters (ΔH < 0, ΔS > 0) indicate that the interaction is mainly driven by hydrophobic forces and electrostatic interactions. Docking prediction results showed that two key amino acids, Phe44 and Gln63, may be involved in the interaction of CSP1 with β-ionone. To confirm the role of these two amino acids, site-directed mutagenesis was performed, and the binding constant (K_A) for the two CSP1 mutant proteins was reduced by 60.82% and 46.80% compared to wild-type CSP1. The thermodynamic analysis of the mutant proteins further verified that Phe44 maintains an electrostatic interaction and Gln63 contributes hydrophobic and electrostatic forces. Our investigation provides an initial account of the physicochemical mechanism of the interaction between antennal-specific CSPs in insects, including bees, and plant semiochemicals, and develops a two-stage thermodynamic analysis (wild-type and mutant proteins) combined with multispectral and site-directed mutagenesis methods. Copyright © 2018 Elsevier B.V. All rights reserved.
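
    The reported reductions in the binding constant can be translated into free energy terms with the standard relation ΔG = −RT ln K_A, so that a drop in K_A to a fraction f of the wild-type value corresponds to ΔΔG = −RT ln f. The sketch below applies that relation. The temperature of 298.15 K is an assumption, since the abstract does not state the measurement temperature, and the function names are hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant R in kcal/(mol*K)


def binding_free_energy(k_a, temperature_k=298.15):
    """Standard binding free energy, dG = -R*T*ln(K_A), in kcal/mol.

    k_a is the association constant expressed relative to a 1 M standard state.
    """
    if k_a <= 0:
        raise ValueError("association constant must be positive")
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(k_a)


def ddg_from_ka_reduction(percent_reduction, temperature_k=298.15):
    """Change in binding free energy when K_A drops by the given percentage.

    A positive result means the mutant binds more weakly than wild type.
    """
    remaining = 1.0 - percent_reduction / 100.0
    if not 0.0 < remaining <= 1.0:
        raise ValueError("percent_reduction must be in [0, 100)")
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(remaining)


# The two reductions quoted in the abstract; the temperature is an assumed 298.15 K.
ddg_larger = ddg_from_ka_reduction(60.82)
ddg_smaller = ddg_from_ka_reduction(46.80)
```

    Under the assumed temperature both values come out well below 1 kcal/mol, which illustrates that a roughly twofold change in K_A corresponds to a modest shift in binding free energy.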

  18. The common equine class I molecule Eqca-1*00101 (ELA-A3.1) is characterized by narrow peptide binding and T cell epitope repertoires.

    PubMed

    Bergmann, Tobias; Moore, Carrie; Sidney, John; Miller, Donald; Tallmadge, Rebecca; Harman, Rebecca M; Oseroff, Carla; Wriston, Amanda; Shabanowitz, Jeffrey; Hunt, Donald F; Osterrieder, Nikolaus; Peters, Bjoern; Antczak, Douglas F; Sette, Alessandro

    2015-11-01

    Here we describe a detailed quantitative peptide-binding motif for the common equine leukocyte antigen (ELA) class I allele Eqca-1*00101, present in roughly 25% of Thoroughbred horses. We determined a preliminary binding motif by sequencing endogenously bound ligands. Subsequently, a positional scanning combinatorial library (PSCL) was used to further characterize binding specificity and derive a quantitative motif involving aspartic acid in position 2 and hydrophobic residues at the C-terminus. Using this motif, we selected and tested 9- and 10-mer peptides derived from the equine herpesvirus type 1 (EHV-1) proteome for their capacity to bind Eqca-1*00101. PSCL predictions were very efficient, with a receiver operating characteristic (ROC) curve performance of 0.877, and 87 peptides derived from 40 different EHV-1 proteins were identified with affinities of 500 nM or higher. Quantitative analysis revealed that Eqca-1*00101 has a narrow peptide-binding repertoire, in comparison to those of most human, non-human primate, and mouse class I alleles. Peripheral blood mononuclear cells from six EHV-1-infected, or vaccinated but uninfected, Eqca-1*00101-positive horses were used in IFN-γ enzyme-linked immunospot (ELISPOT) assays. When we screened the 87 Eqca-1*00101-binding peptides for T cell reactivity, only one Eqca-1*00101 epitope, derived from the intermediate-early protein ICP4, was identified. Thus, despite its common occurrence in several horse breeds, Eqca-1*00101 is associated with a narrow binding repertoire and a similarly narrow T cell response to an important equine viral pathogen. Intriguingly, these features are shared with other human and macaque major histocompatibility complex (MHC) molecules with a similar specificity for D in position 2 or 3 in their main anchor motif.
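
    The ROC performance quoted above is most naturally read as an area under the ROC curve (AUC), although the abstract does not say so explicitly. As a sketch under that assumption, the function below computes AUC as the probability that a randomly chosen binder receives a higher prediction score than a randomly chosen non-binder, counting ties as one half. Labelling a peptide as a binder when its measured IC50 is at most 500 nM is also an assumed reading of the threshold, and all values shown are made up.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, is_binder in zip(scores, labels) if is_binder]
    negatives = [s for s, is_binder in zip(scores, labels) if not is_binder]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # Count pairs where the binder outranks the non-binder; ties count one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Made-up measurements and prediction scores; NOT data from the study.
measured_ic50_nm = [20, 150, 480, 900, 5000, 30000]
predicted_score = [0.9, 0.4, 0.8, 0.6, 0.3, 0.1]
is_binder = [ic50 <= 500 for ic50 in measured_ic50_nm]  # assumed 500 nM cutoff
auc = roc_auc(predicted_score, is_binder)
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of binders from non-binders; the pairwise form used here is quadratic in the number of peptides, which is acceptable for small sets like this one.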

  19. The common equine class I molecule Eqca-1*00101 (ELA-A3.1) is characterized by narrow peptide binding and T cell epitope repertoires

    PubMed Central

    Bergmann, Tobias; Moore, Carrie; Sidney, John; Miller, Donald; Tallmadge, Rebecca; Harman, Rebecca M.; Oseroff, Carla; Wriston, Amanda; Shabanowitz, Jeffrey; Hunt, Donald F.; Osterrieder, Nikolaus; Peters, Bjoern; Antczak, Douglas F.; Sette, Alessandro

    2016-01-01

    Here we describe a detailed quantitative peptide-binding motif for the common equine leukocyte antigen (ELA) class I allele Eqca-1*00101, present in roughly 25% of Thoroughbred horses. We determined a preliminary binding motif by sequencing endogenously bound ligands. Subsequently, a positional scanning combinatorial library (PSCL) was used to further characterize binding specificity and derive a quantitative motif involving aspartic acid in position 2 and hydrophobic residues at the C-terminus. Using this motif, we selected and tested 9- and 10-mer peptides derived from the equine herpesvirus type 1 (EHV-1) proteome for their capacity to bind Eqca-1*00101. PSCL predictions were very efficient, with a receiver operating characteristic (ROC) curve performance of 0.877, and 87 peptides derived from 40 different EHV-1 proteins were identified with affinities of 500 nM or higher. Quantitative analysis revealed that Eqca-1*00101 has a narrow peptide-binding repertoire, in comparison to those of most human, non-human primate, and mouse class I alleles. Peripheral blood mononuclear cells from six EHV-1-infected, or vaccinated but uninfected, Eqca-1*00101-positive horses were used in IFN-γ enzyme-linked immunospot (ELISPOT) assays. When we screened the 87 Eqca-1*00101-binding peptides for T cell reactivity, only one Eqca-1*00101 epitope, derived from the intermediate-early protein ICP4, was identified. Thus, despite its common occurrence in several horse breeds, Eqca-1*00101 is associated with a narrow binding repertoire and a similarly narrow T cell response to an important equine viral pathogen. Intriguingly, these features are shared with other human and macaque major histocompatibility complex (MHC) molecules with a similar specificity for D in position 2 or 3 in their main anchor motif. PMID:26399241

  20. Identification of regulatory targets of tissue-specific transcription factors: application to retina-specific gene regulation

    PubMed Central

    Qian, Jiang; Esumi, Noriko; Chen, Yangjian; Wang, Qingliang; Chowers, Itay; Zack, Donald J.

    2005-01-01

    Identification of tissue-specific gene regulatory networks can yield insights into the molecular basis of a tissue's development, function and pathology. Here, we present a computational approach designed to identify potential regulatory target genes of photoreceptor cell-specific transcription factors (TFs). The approach is based on the hypothesis that genes related to the retina in terms of expression, disease and/or function are more likely to be the targets of retina-specific TFs than other genes. A list of genes that are preferentially expressed in retina was obtained by integrating expressed sequence tag, SAGE and microarray datasets. The regulatory targets of retina-specific TFs are enriched in this set of retina-related genes. A Bayesian approach was employed to integrate information about binding site location relative to a gene's transcription start site. Our method was applied to three retina-specific TFs, CRX, NRL and NR2E3, and a number of potential targets were predicted. To experimentally assess the validity of the bioinformatic predictions, mobility shift, transient transfection and chromatin immunoprecipitation assays were performed with five predicted CRX targets, and the results were suggestive of CRX regulation in 5/5, 3/5 and 4/5 cases, respectively. Together, these experiments strongly suggest that RP1, GUCY2D and ABCA4 are novel targets of CRX. PMID:15967807
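
    The abstract does not spell out how the Bayesian integration of binding site location was formulated. Purely as an illustration of the general idea, the sketch below applies Bayes' rule to update the probability that a gene is a regulatory target after observing a predicted binding site in a given distance bin relative to the transcription start site. The prior, the likelihoods and the function name are all hypothetical and are not values or code from the study.

```python
def posterior_target_probability(prior, likelihood_if_target, likelihood_if_not_target):
    """Bayes' rule: P(target | site location) from a prior and two likelihoods.

    prior: P(gene is a target) before looking at binding site location.
    likelihood_if_target: P(site falls in this distance bin | gene is a target).
    likelihood_if_not_target: P(site falls in this distance bin | gene is not a target).
    """
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability")
    evidence = (prior * likelihood_if_target
                + (1.0 - prior) * likelihood_if_not_target)
    if evidence == 0:
        raise ValueError("observation has zero probability under both hypotheses")
    return prior * likelihood_if_target / evidence


# Hypothetical numbers: a site close to the transcription start site is assumed
# to be three times as likely for true targets as for non-targets.
posterior = posterior_target_probability(
    prior=0.10,
    likelihood_if_target=0.60,
    likelihood_if_not_target=0.20,
)
```

    With these illustrative inputs the probability rises from 0.10 to 0.25, showing how location evidence can sharpen a prior derived from, for example, membership in a set of retina-related genes.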
