Predicting protein-binding regions in RNA using nucleotide profiles and compositions.
Choi, Daesik; Park, Byungkyu; Chae, Hanju; Lee, Wook; Han, Kyungsook
2017-03-14
Motivated by the increased amount of data on protein-RNA interactions and the availability of complete genome sequences of several organisms, many computational methods have been proposed to predict binding sites in protein-RNA interactions. However, most computational methods are limited to finding RNA-binding sites in proteins instead of protein-binding sites in RNAs. Predicting protein-binding sites in RNA is more challenging than predicting RNA-binding sites in proteins. Recent computational methods for finding protein-binding sites in RNAs have several drawbacks for practical use. We developed a new support vector machine (SVM) model for predicting protein-binding regions in mRNA sequences. The model uses sequence profiles constructed from log-odds scores of mono- and di-nucleotides and nucleotide compositions. The model was evaluated by standard 10-fold cross validation, leave-one-protein-out (LOPO) cross validation and independent testing. Since actual mRNA sequences have more non-binding regions than protein-binding regions, we tested the model on several datasets with different ratios of protein-binding regions to non-binding regions. The best performance of the model was obtained in a balanced dataset of positive and negative instances. 10-fold cross validation with a balanced dataset achieved a sensitivity of 91.6%, a specificity of 92.4%, an accuracy of 92.0%, a positive predictive value (PPV) of 91.7%, a negative predictive value (NPV) of 92.3% and a Matthews correlation coefficient (MCC) of 0.840. LOPO cross validation showed a lower performance than the 10-fold cross validation, but the performance remained high (87.6% accuracy and 0.752 MCC). In testing the model on independent datasets, it achieved an accuracy of 82.2% and an MCC of 0.656. Testing our model and other state-of-the-art methods on the same dataset showed that our model performed better than the others.
Sequence profiles of log-odds scores of mono- and di-nucleotides were much more powerful features than nucleotide compositions in finding protein-binding regions in RNA sequences. However, a slight performance gain was obtained when the sequence profiles were used along with nucleotide compositions. These are preliminary results of ongoing research, but they demonstrate the potential of our approach as a powerful predictor of protein-binding regions in RNA. The program and supporting data are available at http://bclab.inha.ac.kr/RBPbinding.
Dhanda, Sandeep Kumar; Grifoni, Alba; Pham, John; Vaughan, Kerrie; Sidney, John; Peters, Bjoern; Sette, Alessandro
2018-01-01
Unwanted immune responses against protein therapeutics can reduce efficacy or lead to adverse reactions. T-cell responses are key to the development of these unwanted responses and are directed against immunodominant regions within the protein sequence, often associated with binding to several allelic variants of HLA class II molecules (promiscuous binders). Herein, we report a novel computational strategy to predict 'de-immunized' peptides, based on previous studies of erythropoietin protein immunogenicity. The method first predicts promiscuous binding regions within the target protein sequence and then identifies residue substitutions predicted to reduce HLA binding. Further, the method anticipates the effect of any given substitution on flanking peptides, thereby circumventing the creation of nascent HLA-binding regions. As a proof-of-principle, the algorithm was applied to Vatreptacog α, an engineered Factor VII molecule associated with unintended immunogenicity. The algorithm correctly predicted the two immunogenic peptides containing the engineered residues. As a further validation, we selected and evaluated the immunogenicity of seven substitutions predicted to simultaneously reduce HLA binding for both peptides, five control substitutions with no predicted reduction in HLA-binding capacity, and additional flanking region controls. In vitro immunogenicity was detected in 21.4% of the cultures of peptides predicted to have reduced HLA binding and 11.4% of the flanking regions, compared with 46% for the cultures of the peptides predicted to be immunogenic. This method has been implemented as an interactive application, freely available online at http://tools.iedb.org/deimmunization/. © 2017 John Wiley & Sons Ltd.
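The flanking-peptide check described above can be sketched as follows: a substitution at one residue changes every overlapping peptide window, so each of those windows has to be re-scored. This is only an illustration under stated assumptions. The window length of 15, the function names, and the `predict_binding` callback (any scorer where a higher value means stronger predicted HLA binding) are hypothetical and are not the interface of the published tool.

```python
def windows_overlapping(sequence, position, k=15):
    """Return (start, peptide) for every k-mer of `sequence` that covers
    the 0-based residue index `position`."""
    if not 0 <= position < len(sequence):
        raise ValueError("position is outside the sequence")
    first = max(0, position - k + 1)
    last = min(position, len(sequence) - k)
    return [(s, sequence[s:s + k]) for s in range(first, last + 1)]


def substitute(sequence, position, residue):
    """Return a copy of `sequence` with one residue replaced."""
    return sequence[:position] + residue + sequence[position + 1:]


def substitution_is_safe(sequence, position, residue, predict_binding, k=15):
    """True if no window covering the substituted residue scores higher
    after the substitution than before (hypothetical acceptance rule)."""
    mutated = substitute(sequence, position, residue)
    before = windows_overlapping(sequence, position, k)
    after = windows_overlapping(mutated, position, k)
    return all(predict_binding(new) <= predict_binding(old)
               for (_, old), (_, new) in zip(before, after))


# Toy scorer for illustration only: counts leucines in the window.
toy_score = lambda peptide: peptide.count("L")
```

With the toy scorer, `substitution_is_safe("AALAAALAAA", 2, "G", toy_score, k=3)` accepts the change because every affected window scores no higher afterwards, while replacing position 0 with `"L"` is rejected.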
Predicting MHC-II binding affinity using multiple instance regression
EL-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant
2011-01-01
Reliably predicting the ability of antigen peptides to bind to major histocompatibility complex class II (MHC-II) molecules is an essential step in developing new vaccines. Uncovering the amino acid sequence correlates of the binding affinity of MHC-II binding peptides is important for understanding pathogenesis and immune response. The task of predicting MHC-II binding peptides is complicated by the significant variability in their length. Most existing computational methods for predicting MHC-II binding peptides focus on identifying a nine-amino-acid core region in each binding peptide. We formulate the problems of qualitatively and quantitatively predicting flexible-length MHC-II peptides as multiple instance learning and multiple instance regression problems, respectively. Based on this formulation, we introduce MHCMIR, a novel method for predicting MHC-II binding affinity using multiple instance regression. We present results of experiments using several benchmark datasets that show that MHCMIR is competitive with the state-of-the-art methods for predicting MHC-II binding peptides. An online web server that implements the MHCMIR method for MHC-II binding affinity prediction is freely accessible at http://ailab.cs.iastate.edu/mhcmir. PMID:20855923
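The multiple-instance view described above treats each variable-length peptide as a bag whose instances are its candidate nine-residue cores. A minimal sketch of that representation follows. Taking the maximum instance score as the bag score is a common multiple-instance convention assumed here for illustration; it is not claimed to be the aggregation MHCMIR uses, and `instance_score` is a hypothetical per-core scorer.

```python
def nine_mer_bag(peptide, core_length=9):
    """Return all contiguous windows of `core_length` residues (the bag)."""
    if len(peptide) < core_length:
        raise ValueError("peptide is shorter than the core length")
    return [peptide[i:i + core_length]
            for i in range(len(peptide) - core_length + 1)]


def bag_score(peptide, instance_score, core_length=9):
    """Score a peptide by its best-scoring candidate core (assumed rule)."""
    return max(instance_score(core)
               for core in nine_mer_bag(peptide, core_length))


# An 11-residue peptide yields three candidate cores.
bag = nine_mer_bag("ACDEFGHIKLM")
```

The number of instances grows with peptide length (length minus eight for nine-residue cores), which is how this formulation accommodates the length variability mentioned in the abstract.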
NASA Astrophysics Data System (ADS)
Keskin, Ozlem; Ma, Buyong; Rogale, Kristina; Gunasekaran, K.; Nussinov, Ruth
2005-06-01
Understanding and ultimately predicting protein associations is immensely important for functional genomics and drug design. Here, we propose that binding sites have preferred organizations. First, the hot spots cluster within densely packed 'hot regions'. Within these regions, they form networks of interactions. Thus, hot spots located within a hot region contribute cooperatively to the stability of the complex. However, the contributions of separate, independent hot regions are additive. Moreover, hot spots are often already pre-organized in the unbound (free) protein states. Describing a binding site through independent local hot regions has implications for binding site definition, design and parametrization for prediction. The compactness and cooperativity emphasize the similarity between binding and folding. This proposition is grounded in computation and experiment. It explains why summation of the interactions may over-estimate the stability of the complex. Furthermore, statistically, charge-charge coupling of the hot spots is disfavored. However, since within the highly packed regions the solvent is screened, the electrostatic contributions are strengthened. Thus, we propose a new description of protein binding sites: a site consists of (one or a few) self-contained cooperative regions. Since the residue hot spots are those conserved by evolution, proteins binding multiple partners at the same sites are expected to use all or some combination of these regions.
iDBPs: a web server for the identification of DNA binding proteins.
Nimrod, Guy; Schushan, Maya; Szilágyi, András; Leslie, Christina; Ben-Tal, Nir
2010-03-01
The iDBPs server uses the three-dimensional (3D) structure of a query protein to predict whether it binds DNA. First, the algorithm predicts the functional region of the protein based on its evolutionary profile; the assumption is that large clusters of conserved residues are good markers of functional regions. Next, various characteristics of the predicted functional region as well as global features of the protein are calculated, such as the average surface electrostatic potential, the dipole moment and cluster-based amino acid conservation patterns. Finally, a random forests classifier is used to predict whether the query protein is likely to bind DNA and to estimate the prediction confidence. We have trained and tested the classifier on various datasets and shown that it outperformed related methods. On a dataset that reflects the fraction of DNA binding proteins (DBPs) in a proteome, the area under the ROC curve was 0.90. The application of the server to an updated version of the N-Func database, which contains proteins of unknown function with solved 3D-structure, suggested new putative DBPs for experimental studies. http://idbps.tau.ac.il/
Dweep, Harsh; Sticht, Carsten; Pandey, Priyanka; Gretz, Norbert
2011-10-01
MicroRNAs are small, non-coding RNA molecules that can complementarily bind to the mRNA 3'-UTR region to regulate gene expression by transcriptional repression or induction of mRNA degradation. Increasing evidence suggests a new mechanism by which miRNAs may regulate target gene expression by binding in promoter and amino acid coding regions. Most of the existing databases on miRNAs are restricted to the mRNA 3'-UTR region. To address this issue, we present miRWalk, a comprehensive database on miRNAs, which hosts predicted as well as validated miRNA binding sites and information on all known genes of human, mouse and rat. All mRNAs, mitochondrial genes and 10 kb upstream flanking regions of all known genes of human, mouse and rat were analyzed for putative miRNA binding sites by using a newly developed algorithm named 'miRWalk' as well as eight already established programs. An automated and extensive text-mining search was performed on the PubMed database to extract validated information on miRNAs. The combined information was put into a MySQL database. miRWalk presents predicted and validated information on miRNA-target interaction. Such a resource enables researchers to validate new targets of miRNA not only on the 3'-UTR, but also on the other regions of all known genes. The 'Validated Target module' is updated every month and the 'Predicted Target module' is updated every 6 months. miRWalk is freely available at http://mirwalk.uni-hd.de/. Copyright © 2011 Elsevier Inc. All rights reserved.
iDBPs: a web server for the identification of DNA binding proteins
Nimrod, Guy; Schushan, Maya; Szilágyi, András; Leslie, Christina; Ben-Tal, Nir
2010-01-01
Summary: The iDBPs server uses the three-dimensional (3D) structure of a query protein to predict whether it binds DNA. First, the algorithm predicts the functional region of the protein based on its evolutionary profile; the assumption is that large clusters of conserved residues are good markers of functional regions. Next, various characteristics of the predicted functional region as well as global features of the protein are calculated, such as the average surface electrostatic potential, the dipole moment and cluster-based amino acid conservation patterns. Finally, a random forests classifier is used to predict whether the query protein is likely to bind DNA and to estimate the prediction confidence. We have trained and tested the classifier on various datasets and shown that it outperformed related methods. On a dataset that reflects the fraction of DNA binding proteins (DBPs) in a proteome, the area under the ROC curve was 0.90. The application of the server to an updated version of the N-Func database, which contains proteins of unknown function with solved 3D-structure, suggested new putative DBPs for experimental studies. Availability: http://idbps.tau.ac.il/ Contact: NirB@tauex.tau.ac.il Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20089514
PredictProtein—an open resource for online prediction of protein structural and functional features
Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard
2014-01-01
PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431
Local functional descriptors for surface comparison based binding prediction
2012-01-01
Background: Molecular recognition in proteins occurs due to appropriate arrangements of physical, chemical, and geometric properties of an atomic surface. Similar surface regions should create similar binding interfaces. Effective methods for comparing surface regions can be used in identifying similar regions, and to predict interactions without regard to the underlying structural scaffold that creates the surface. Results: We present a new descriptor for protein functional surfaces and algorithms for using these descriptors to compare protein surface regions to identify ligand binding interfaces. Our approach uses descriptors of local regions of the surface, and assembles collections of matches to compare larger regions. Our approach uses a variety of physical, chemical, and geometric properties, adaptively weighting these properties as appropriate for different regions of the interface. Our approach builds a classifier based on a training corpus of examples of binding sites of the target ligand. The constructed classifiers can be applied to a query protein providing a probability for each position on the protein that the position is part of a binding interface. We demonstrate the effectiveness of the approach on a number of benchmarks, demonstrating performance that is comparable to the state-of-the-art, with an approach with more generality than these prior methods. Conclusions: Local functional descriptors offer a new method for protein surface comparison that is sufficiently flexible to serve in a variety of applications. PMID:23176080
Galzitskaya, Oxana; Deryusheva, Eugenia; Machulin, Andrey; Nemashkalova, Ekaterina; Glyakina, Anna
2018-06-21
Accurately predicting flexible loops in different protein families is challenging, yet important because of the crucial functions associated with these regions. Results of the currently available programs for prediction of loops vary from protein to protein. We used eight programs to predict flexible regions in the G-domain of 23 representative G-proteins with known 3D structures. The results of the predictions demonstrate that the FoldUnfold program predicts loop positions better than the PONDR, RONN, DisEMBL, IUPred, GlobPlot 2, FoldIndex, and MobiDB programs. When classifying the predicted loops (rigid/flexible) according to the Debye-Waller fluctuation factors, our data reveal a weak correlation between the B-factors and the average number of closed residues according to the FoldUnfold program; the percentage of overlapping characteristics (residue fold/unfold status) of the protein residues from the two methods is about 60-70%. According to the FoldUnfold program, for G-proteins with posttranslational modifications, binding-site residues surrounded by disorder-promoting glycine and alanine residues yield more flexible binding sites for fatty acids, while methionine, cysteine and isoleucine residues provide more rigid binding sites. Thus, our research demonstrates additional possibilities of the FoldUnfold program for prediction of flexible regions and characteristics of individual residues in a different protein family. Copyright © Bentham Science Publishers.
Brain mu-opioid receptor binding predicts treatment outcome in cocaine-abusing outpatients
Ghitza, Udi E.; Preston, Kenzie L.; Epstein, David H.; Kuwabara, Hiroto; Endres, Christopher J.; Bencherif, Badreddine; Boyd, Susan J.; Copersino, Marc L.; Frost, J. James; Gorelick, David A.
2010-01-01
Background: Cocaine users not seeking treatment have increased regional brain mu-opioid receptor (mOR) binding that correlates with cocaine craving and tendency to relapse. In cocaine-abusing outpatients in treatment, the relationship of mOR binding and treatment outcome is unknown. Methods: We determined whether regional brain mOR binding before treatment correlates with outcome and compared it to standard clinical predictors of outcome. Twenty-five individuals seeking outpatient treatment for cocaine abuse or dependence (DSM-IV) received up to 12 weeks of cognitive-behavioral therapy and cocaine-abstinence reinforcement whereby each cocaine-free urine was reinforced with vouchers redeemable for goods. Regional brain mOR binding was measured before treatment using positron emission tomography (PET) with [11C] carfentanil (a selective mOR agonist). Main outcome measures were: 1) overall percentage of urines positive for cocaine during the first month of treatment, and 2) longest duration (weeks) of abstinence from cocaine during treatment, all verified by urine toxicology. Results: Elevated mOR binding in the medial frontal and middle frontal gyri before treatment correlated with greater cocaine use during treatment. Elevated mOR binding in the anterior cingulate, medial frontal, middle frontal, middle temporal, and sub-lobar insular gyri correlated with shorter duration of cocaine abstinence during treatment. Regional mOR binding contributed significant predictive power for treatment outcome beyond that of standard clinical variables such as baseline drug and alcohol use. Conclusions: Elevated mOR binding in brain regions associated with reward sensitivity is a significant independent predictor of treatment outcome in cocaine-abusing outpatients, suggesting a key role for the brain endogenous opioid system in cocaine addiction. PMID:20579973
The Role of Genome Accessibility in Transcription Factor Binding in Bacteria.
Gomes, Antonio L C; Wang, Harris H
2016-04-01
ChIP-seq enables genome-scale identification of regulatory regions that govern gene expression. However, the biological insights generated from ChIP-seq analysis have been limited to predictions of binding sites and cooperative interactions. Furthermore, ChIP-seq data often poorly correlate with in vitro measurements or predicted motifs, highlighting that binding affinity alone is insufficient to explain transcription factor (TF)-binding in vivo. One possibility is that binding sites are not equally accessible across the genome. A more comprehensive biophysical representation of TF-binding is required to improve our ability to understand, predict, and alter gene expression. Here, we show that genome accessibility is a key parameter that impacts TF-binding in bacteria. We developed a thermodynamic model that parameterizes ChIP-seq coverage in terms of genome accessibility and binding affinity. The role of genome accessibility is validated using a large-scale ChIP-seq dataset of the M. tuberculosis regulatory network. We find that accounting for genome accessibility led to a model that explains 63% of the ChIP-seq profile variance, while a model based on motif score alone explains only 35% of the variance. Moreover, our framework enables de novo ChIP-seq peak prediction and is useful for inferring TF-binding peaks in new experimental conditions by reducing the need for additional experiments. We observe that the genome is more accessible in intergenic regions, and that increased accessibility is positively correlated with gene expression and anti-correlated with distance to the origin of replication. Our biophysically motivated model provides a more comprehensive description of TF-binding in vivo from first principles towards a better representation of gene regulation in silico, with promising applications in systems biology.
Binding ligand prediction for proteins using partial matching of local surface patches.
Sael, Lee; Kihara, Daisuke
2010-01-01
Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction, which captures local surface features of ligand binding pockets. The function of proteins, specifically their binding ligands, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods, which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over the global pocket shape-based method previously developed by our group.
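The weighted bipartite matching step mentioned above can be illustrated on a small distance matrix between the surface patches of two pockets. The sketch below is an assumption-laden toy: it finds the minimum-cost one-to-one assignment by brute force, which is only practical for a handful of patches. How distances between patch descriptors are computed and weighted in the published method is not specified in the abstract, so the matrix is taken as given and the function name is hypothetical.

```python
from itertools import permutations


def best_patch_matching(dist):
    """Minimum-cost one-to-one matching of pocket A patches to pocket B patches.

    dist[i][j] is the distance between patch i of pocket A and patch j of
    pocket B. Pocket A must not have more patches than pocket B; unmatched
    patches of B are simply left out, which allows a partial comparison.
    """
    n, m = len(dist), len(dist[0])
    if n > m:
        raise ValueError("pocket A must not have more patches than pocket B")
    best_cost, best_cols = float("inf"), None
    # Try every way of assigning distinct B patches to the A patches.
    for cols in permutations(range(m), n):
        cost = sum(dist[i][j] for i, j in enumerate(cols))
        if cost < best_cost:
            best_cost, best_cols = cost, cols
    return best_cost, list(enumerate(best_cols))


# Hypothetical distances: 2 patches in pocket A, 3 patches in pocket B.
cost, pairs = best_patch_matching([[1.0, 5.0, 9.0],
                                   [4.0, 2.0, 8.0]])
```

For realistic patch counts, a polynomial-time assignment solver such as `scipy.optimize.linear_sum_assignment` would replace the brute-force loop.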
Binding Ligand Prediction for Proteins Using Partial Matching of Local Surface Patches
Sael, Lee; Kihara, Daisuke
2010-01-01
Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction, which captures local surface features of ligand binding pockets. The function of proteins, specifically their binding ligands, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods, which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over the global pocket shape-based method previously developed by our group. PMID:21614188
Disfani, Fatemeh Miri; Hsu, Wei-Lun; Mizianty, Marcin J.; Oldfield, Christopher J.; Xue, Bin; Dunker, A. Keith; Uversky, Vladimir N.; Kurgan, Lukasz
2012-01-01
Motivation: Molecular recognition features (MoRFs) are short binding regions located within longer intrinsically disordered regions that bind to protein partners via disorder-to-order transitions. MoRFs are implicated in important processes including signaling and regulation. However, only a limited number of experimentally validated MoRFs is known, which motivates development of computational methods that predict MoRFs from protein chains. Results: We introduce a new MoRF predictor, MoRFpred, which identifies all MoRF types (α, β, coil and complex). We develop a comprehensive dataset of annotated MoRFs to build and empirically compare our method. MoRFpred utilizes a novel design in which annotations generated by sequence alignment are fused with predictions generated by a Support Vector Machine (SVM), which uses a custom designed set of sequence-derived features. The features provide information about evolutionary profiles, selected physiochemical properties of amino acids, and predicted disorder, solvent accessibility and B-factors. Empirical evaluation on several datasets shows that MoRFpred outperforms related methods: α-MoRF-Pred that predicts α-MoRFs and ANCHOR which finds disordered regions that become ordered when bound to a globular partner. We show that our predicted (new) MoRF regions have non-random sequence similarity with native MoRFs. We use this observation along with the fact that predictions with higher probability are more accurate to identify putative MoRF regions. We also identify a few sequence-derived hallmarks of MoRFs. They are characterized by dips in the disorder predictions and higher hydrophobicity and stability when compared to adjacent (in the chain) residues. Availability: http://biomine.ece.ualberta.ca/MoRFpred/; http://biomine.ece.ualberta.ca/MoRFpred/Supplement.pdf Contact: lkurgan@ece.ualberta.ca Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22689782
PRISM offers a comprehensive genomic approach to transcription factor function prediction
Wenger, Aaron M.; Clarke, Shoa L.; Guturu, Harendra; Chen, Jenny; Schaar, Bruce T.; McLean, Cory Y.; Bejerano, Gill
2013-01-01
The human genome encodes 1500–2000 different transcription factors (TFs). ChIP-seq is revealing the global binding profiles of a fraction of TFs in a fraction of their biological contexts. These data show that the majority of TFs bind directly next to a large number of context-relevant target genes, that most binding is distal, and that binding is context specific. Because of the effort and cost involved, ChIP-seq is seldom used in search of novel TF function. Such exploration is instead done using expression perturbation and genetic screens. Here we propose a comprehensive computational framework for transcription factor function prediction. We curate 332 high-quality nonredundant TF binding motifs that represent all major DNA binding domains, and improve cross-species conserved binding site prediction to obtain 3.3 million conserved, mostly distal, binding site predictions. We combine these with 2.4 million facts about all human and mouse gene functions, in a novel statistical framework, in search of enrichments of particular motifs next to groups of target genes of particular functions. Rigorous parameter tuning and a harsh null are used to minimize false positives. Our novel PRISM (predicting regulatory information from single motifs) approach obtains 2543 TF function predictions in a large variety of contexts, at a false discovery rate of 16%. The predictions are highly enriched for validated TF roles, and 45 of 67 (67%) tested binding site regions in five different contexts act as enhancers in functionally matched cells. PMID:23382538
Inadequate Reference Datasets Biased toward Short Non-epitopes Confound B-cell Epitope Prediction
Rahman, Kh. Shamsur; Chowdhury, Erfan Ullah; Sachse, Konrad; Kaltenboeck, Bernhard
2016-01-01
X-ray crystallography has shown that an antibody paratope typically binds 15–22 amino acids (aa) of an epitope, of which 2–5 randomly distributed amino acids contribute most of the binding energy. In contrast, researchers typically choose for B-cell epitope mapping short peptide antigens in antibody binding assays. Furthermore, short 6–11-aa epitopes, and in particular non-epitopes, are over-represented in published B-cell epitope datasets that are commonly used for development of B-cell epitope prediction approaches from protein antigen sequences. We hypothesized that such suboptimal length peptides result in weak antibody binding and cause false-negative results. We tested the influence of peptide antigen length on antibody binding by analyzing data on more than 900 peptides used for B-cell epitope mapping of immunodominant proteins of Chlamydia spp. We demonstrate that short 7–12-aa peptides of B-cell epitopes bind antibodies poorly; thus, epitope mapping with short peptide antigens falsely classifies many B-cell epitopes as non-epitopes. We also show in published datasets of confirmed epitopes and non-epitopes a direct correlation between length of peptide antigens and antibody binding. Elimination of short, ≤11-aa epitope/non-epitope sequences improved datasets for evaluation of in silico B-cell epitope prediction. Achieving up to 86% accuracy, protein disorder tendency is the best indicator of B-cell epitope regions for chlamydial and published datasets. For B-cell epitope prediction, the most effective approach is plotting disorder of protein sequences with the IUPred-L scale, followed by antibody reactivity testing of 16–30-aa peptides from peak regions. This strategy overcomes the well known inaccuracy of in silico B-cell epitope prediction from primary protein sequences. PMID:27189949
Schmidt, Florian; Gasparoni, Nina; Gasparoni, Gilles; Gianmoena, Kathrin; Cadenas, Cristina; Polansky, Julia K.; Ebert, Peter; Nordström, Karl; Barann, Matthias; Sinha, Anupam; Fröhler, Sebastian; Xiong, Jieyi; Dehghani Amirabad, Azim; Behjati Ardakani, Fatemeh; Hutter, Barbara; Zipprich, Gideon; Felder, Bärbel; Eils, Jürgen; Brors, Benedikt; Chen, Wei; Hengstler, Jan G.; Hamann, Alf; Lengauer, Thomas; Rosenstiel, Philip; Walter, Jörn; Schulz, Marcel H.
2017-01-01
The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, Histone-Marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find low affinity binding sites to improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively. PMID:27899623
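To make the idea of scoring candidate TF binding sites inside an open-chromatin region concrete, the sketch below scans a sequence with a position weight matrix and reports the best log-odds window. This is a deliberately simplified stand-in, not TEPIC's affinity computation: the uniform background of 0.25, the use of the maximum window score, and the toy matrix are all assumptions made for illustration.

```python
import math


def window_log_odds(window, pwm, background=0.25):
    """Sum of log2(probability / background) over the motif positions.

    `pwm` is a list with one dict per motif position mapping each base to a
    probability. Probabilities are assumed non-zero (pseudocounts applied).
    """
    return sum(math.log2(column[base] / background)
               for base, column in zip(window, pwm))


def best_site(sequence, pwm):
    """Return (offset, score) of the highest-scoring window in `sequence`."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    scores = [(window_log_odds(sequence[i:i + width], pwm), i)
              for i in range(len(sequence) - width + 1)]
    score, offset = max(scores)
    return offset, score


# Toy two-position motif preferring "AC" (hypothetical values).
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
offset, score = best_site("GGACGG", toy_pwm)
```

A presence/absence classification would threshold this score, whereas the abstract argues that keeping a quantitative value, including low-affinity sites, explains gene expression variability better.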
Haider, Kamran; Huggins, David J
2013-10-28
Intermolecular interactions in the aqueous phase must compete with the interactions between the two binding partners and their solvating water molecules. In biological systems, water molecules in protein binding sites cluster at well-defined hydration sites and can form strong hydrogen-bonding interactions with backbone and side-chain atoms. Displacement of such water molecules is only favorable when the ligand can form strong compensating hydrogen bonds. Conversely, water molecules in hydrophobic regions of protein binding sites make only weak interactions, and the requirements for favorable displacement are less stringent. The propensity of water molecules for displacement can be identified using inhomogeneous fluid solvation theory (IFST), a statistical mechanical method that decomposes the solvation free energy of a solute into the contributions from different spatial regions and identifies potential binding hotspots. In this study, we employed IFST to study the displacement of water molecules from the ATP binding site of Hsp90, using a test set of 103 ligands. The predicted contribution of a hydration site to the hydration free energy was found to correlate well with the observed displacement. Additionally, we investigated if this correlation could be improved by using the energetic scores of favorable probe groups binding at the location of hydration sites, derived from a multiple copy simultaneous search (MCSS) method. The probe binding scores were not highly predictive of the observed displacement and did not improve the predictivity when used in combination with IFST-based hydration free energies. The results show that IFST alone can be used to reliably predict the observed displacement of water molecules in Hsp90. However, MCSS can augment IFST calculations by suggesting which functional groups should be used to replace highly displaceable water molecules. Such an approach could be very useful in improving the hit-to-lead process for new drug targets.
Ambroggio, Xavier; Jiang, Lubin; Aebig, Joan; Obiakor, Harold; Lukszo, Jan; Narum, David L
2013-01-01
The malaria parasite, Plasmodium falciparum, and related parasites use a variety of proteins with Duffy-Binding Like (DBL) domains to bind glycoproteins on the surface of host cells. Among these proteins, the 175 kDa erythrocyte binding antigen, EBA-175, specifically binds to glycophorin A on the surface of human erythrocytes during the process of merozoite invasion. The domain responsible for glycophorin A binding was identified as region II (RII), which contains two DBL domains, F1 and F2. The crystal structure of this region revealed a dimer that is presumed to represent the glycophorin A binding conformation, as sialic acid binding sites and large cavities are observed at the dimer interface. The dimer interface is largely composed of two loops from within each monomer, identified as the F1 and F2 β-fingers, that contact depressions in the opposing monomers in a similar manner. Previous studies have identified a panel of five monoclonal antibodies (mAbs), termed R215 to R218 and R256, that bind to RII and inhibit invasion of erythrocytes to varying extents. In this study, we predict the F2 β-finger region as the conformational epitope for mAbs R215, R217, and R256, and confirm binding of the most effective blocking mAbs, R217 and R215, to a synthetic peptide mimic of the F2 β-finger. Localization of the epitope to the dimerization and glycan binding sites of EBA-175 RII and site-directed mutagenesis within the predicted epitope are consistent with R215 and R217 blocking erythrocyte invasion by Plasmodium falciparum by preventing formation of the EBA-175-glycophorin A complex.
A comparison of progestin and androgen receptor binding using the CoMFA technique
NASA Astrophysics Data System (ADS)
Loughney, Deborah A.; Schwender, Charles F.
1992-12-01
A series of 48 steroids has been studied with the SYBYL QSAR module using Relative Binding Affinities (RBAs) to progesterone and androgen receptors obtained from the literature. Models for the progesterone and androgen data were developed. Both models show regions where sterics and electrostatics correlate to binding affinity but are different for androgen and progesterone which suggests differences possibly important for receptor selectivity. The progesterone model is more predictive than the androgen (predictive r2 of 0.725 vs. 0.545 for progesterone and androgen, respectively).
USDA-ARS's Scientific Manuscript database
Major histocompatibility complex (MHC) class I molecules regulate adaptive immune responses through the presentation of antigenic peptides to CD8-positive T-cells. Polymorphisms in the peptide binding region of class I molecules determine peptide binding affinity and stability during antigen presenta...
Large-scale binding ligand prediction by improved patch-based method Patch-Surfer2.0
Zhu, Xiaolei; Xiong, Yi; Kihara, Daisuke
2015-01-01
Motivation: Ligand binding is a key aspect of the function of many proteins. Thus, binding ligand prediction provides important insight in understanding the biological function of proteins. Binding ligand prediction is also useful for drug design and examining potential drug side effects. Results: We present a computational method named Patch-Surfer2.0, which predicts binding ligands for a protein pocket. By representing and comparing pockets at the level of small local surface patches that characterize physicochemical properties of the local regions, the method can identify binding pockets of the same ligand even if they do not share globally similar shapes. Properties of local patches are represented by an efficient mathematical representation, 3D Zernike Descriptor. Patch-Surfer2.0 has significant technical improvements over our previous prototype, which includes a new feature that captures approximate patch position with a geodesic distance histogram. Moreover, we constructed a large comprehensive database of ligand binding pockets that will be searched against by a query. The benchmark shows better performance of Patch-Surfer2.0 over existing methods. Availability and implementation: http://kiharalab.org/patchsurfer2.0/ Contact: dkihara@purdue.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25359888
Glover, Karen; Mei, Yang; Sinha, Sangita C
2016-10-01
Many proteins contain intrinsically disordered regions (IDRs) lacking stable secondary and ordered tertiary structure. IDRs are often implicated in macromolecular interactions, and may undergo structural transitions upon binding to interaction partners. However, as binding partners of many protein IDRs are unknown, these structural transitions are difficult to verify and often are poorly understood. In this study we describe a method to identify IDRs that are likely to undergo helical transitions upon binding. This method combines bioinformatics analyses followed by circular dichroism spectroscopy to monitor 2,2,2-trifluoroethanol (TFE)-induced changes in secondary structure content of these IDRs. Our results demonstrate that there is no significant change in the helicity of IDRs that are not predicted to fold upon binding. IDRs that are predicted to fold fall into two groups: one group does not become helical in the presence of TFE and includes examples of IDRs that form β-strands upon binding, while the other group becomes more helical and includes examples that are known to fold into helices upon binding. Therefore, we propose that bioinformatics analyses combined with experimental evaluation using TFE may provide a general method to identify IDRs that undergo binding-induced disorder-to-helix transitions. Copyright © 2016 Elsevier B.V. All rights reserved.
MHC2NNZ: A novel peptide binding prediction approach for HLA DQ molecules
NASA Astrophysics Data System (ADS)
Xie, Jiang; Zeng, Xu; Lu, Dongfang; Liu, Zhixiang; Wang, Jiao
2017-07-01
The major histocompatibility complex class II (MHC-II) molecule plays a crucial role in immunology. Computational prediction of MHC-II binding peptides can help researchers understand the mechanism of immune systems and design vaccines. Most of the prediction algorithms for MHC-II to date have focused on human leukocyte antigen (HLA, the name of MHC in humans) molecules encoded in the DR locus. However, HLA DQ molecules are equally important, and less progress has been made on them because they are more difficult to handle experimentally. In this study, we propose an artificial neural network-based approach called MHC2NNZ to predict peptides binding to HLA DQ molecules. Unlike previous artificial neural network-based methods, MHC2NNZ not only considers sequence similarity features but also captures chemical and physical properties, and a novel method incorporating these properties is proposed to represent peptide flanking regions (PFR). Furthermore, MHC2NNZ improves the prediction accuracy by incorporating amino acid preferences at more specific positions of the peptide binding core. Evaluated on 3549 peptides binding to the six most frequent HLA DQ molecules, MHC2NNZ is demonstrated to outperform other state-of-the-art MHC-II prediction methods.
CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining
Navarro, Carmen; Lopez, Francisco J.; Cano, Carlos; Garcia-Alcalde, Fernando; Blanco, Armando
2014-01-01
Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools allow the detection of significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) their scope is limited to promoter or conserved regions of the genome; 2) they do not allow the identification of combinations involving more than two motifs; 3) they require prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner performs a blind search of CRMs without any prior information about target CRMs or limitation on the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer.
CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by the user. PMID:25268582
Berghof, Tom V. L.; Visker, Marleen H. P. W.; Arts, Joop A. J.; Parmentier, Henk K.; van der Poel, Jan J.; Vereijken, Addie L. J.; Bovenhuis, Henk
2018-01-01
Natural antibodies (NAb) are antigen binding antibodies present in individuals without a previous exposure to this antigen. Keyhole limpet hemocyanin (KLH)-binding NAb levels were previously associated with survival in chickens. This suggests that selective breeding for KLH-binding NAb may increase survival by means of improved general disease resistance. Genome-wide association studies (GWAS) were performed to identify genes underlying genetic variation in NAb levels. The studied population consisted of 1,628 adolescent layer chickens with observations for titers of KLH-binding NAb of the isotypes IgM, IgA, IgG, the total KLH-binding (IgT) NAb titers, total antibody concentrations of the isotypes IgM, IgA, IgG, and the total antibodies concentration in plasma. GWAS were performed using 57,636 single-nucleotide polymorphisms (SNP). One chromosomal region on chromosome 4 was associated with KLH-binding IgT NAb, and total IgM concentration, and especially with KLH-binding IgM NAb. The region of interest was fine mapped by imputing the region of the study population to whole genome sequence, and subsequently performing an association study using the imputed sequence variants. 16 candidate genes were identified, of which FAM114A1, Toll-like receptor 1 family member B (TLR1B), TLR1A, Krüppel-like factor 3 (KLF3) showed the strongest associations. SNP located in coding regions of the candidate genes were checked for predicted changes in protein functioning. One SNP (at 69,965,939 base pairs) received the maximum impact score from two independent prediction tools, which makes this SNP the most likely causal variant. This SNP is located in TLR1A, which suggests a fundamental role of TLR1A on regulation of IgM levels (i.e., KLH-binding IgM NAb, and total IgM concentration), or B cells biology, or both. This study contributes to increased understanding of (genetic) regulation of KLH-binding NAb levels, and total antibody concentrations. PMID:29375555
Schmidt, Florian; Gasparoni, Nina; Gasparoni, Gilles; Gianmoena, Kathrin; Cadenas, Cristina; Polansky, Julia K; Ebert, Peter; Nordström, Karl; Barann, Matthias; Sinha, Anupam; Fröhler, Sebastian; Xiong, Jieyi; Dehghani Amirabad, Azim; Behjati Ardakani, Fatemeh; Hutter, Barbara; Zipprich, Gideon; Felder, Bärbel; Eils, Jürgen; Brors, Benedikt; Chen, Wei; Hengstler, Jan G; Hamann, Alf; Lengauer, Thomas; Rosenstiel, Philip; Walter, Jörn; Schulz, Marcel H
2017-01-09
The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, Histone-Marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find low affinity binding sites to improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Prediction and Reduction of the Aggregation of Monoclonal Antibodies.
van der Kant, Rob; Karow-Zwick, Anne R; Van Durme, Joost; Blech, Michaela; Gallardo, Rodrigo; Seeliger, Daniel; Aßfalg, Kerstin; Baatsen, Pieter; Compernolle, Griet; Gils, Ann; Studts, Joey M; Schulz, Patrick; Garidel, Patrick; Schymkowitz, Joost; Rousseau, Frederic
2017-04-21
Protein aggregation remains a major area of focus in the production of monoclonal antibodies. Improving the intrinsic properties of antibodies can improve manufacturability, attrition rates, safety, formulation, titers, immunogenicity, and solubility. Here, we explore the potential of predicting and reducing the aggregation propensity of monoclonal antibodies, based on the identification of aggregation-prone regions and their contribution to the thermodynamic stability of the protein. Although aggregation-prone regions are thought to occur in the antigen binding region to drive hydrophobic binding with antigen, we were able to rationally design variants that display a marked decrease in aggregation propensity while retaining antigen binding through the introduction of artificial aggregation gatekeeper residues. The reduction in aggregation propensity was accompanied by an increase in expression titer, showing that reducing protein aggregation is beneficial throughout the development process. The data presented show that this approach can significantly reduce liabilities in novel therapeutic antibodies and proteins, leading to a more efficient path to clinical studies. Copyright © 2017 The Authors. Published by Elsevier Ltd.. All rights reserved.
In Silico Prediction and In Vitro Characterization of Multifunctional Human RNase3
Kuo, Ping-Hsueh; Chen, Chien-Jung; Chang, Hsiu-Hui; Fang, Shun-lung; Wu, Wei-Shuo; Lai, Yiu-Kay; Pai, Tun-Wen; Chang, Margaret Dah-Tsyr
2013-01-01
The human ribonuclease A (hRNaseA) superfamily consists of thirteen members that share high structural similarity but exhibit divergent physiological functions other than RNase activity. During evolution, the hRNaseA superfamily has gained novel functions, which may be preserved in a unique region or domain to account for additional molecular interactions. hRNase3 has multiple functions including ribonucleolytic, heparan sulfate (HS) binding, cellular binding, endocytic, lipid destabilization, cytotoxic, and antimicrobial activities. In this study, three putative multifunctional regions, 34RWRCK38 (HBR1), 75RSRFR79 (HBR2), and 101RPGRR105 (HBR3), of hRNase3 have been identified employing in silico sequence analysis and validated employing in vitro activity assays. A heparin binding peptide containing HBR1 is characterized to act as a key element associated with HS binding, cellular binding, and lipid binding activities. In this study, we provide novel insights to identify functional regions of hRNase3 that may have implications for all hRNaseA superfamily members. PMID:23484086
Marsh, Lorraine
2015-01-01
Many systems in biology rely on binding of ligands to target proteins in a single high-affinity conformation with a favorable ΔG. Alternatively, interactions of ligands with protein regions that allow diffuse binding, distributed over multiple sites and conformations, can exhibit favorable ΔG because of their higher entropy. Diffuse binding may be biologically important for multidrug transporters and carrier proteins. A fine-grained computational method for numerical integration of total binding ΔG arising from diffuse regional interaction of a ligand in multiple conformations using a Markov Chain Monte Carlo (MCMC) approach is presented. This method yields a metric that quantifies the influence on overall ligand affinity of ligand binding to multiple, distinct sites within a protein binding region. This metric is essentially a measure of dispersion in equilibrium ligand binding and depends on both the number of potential sites of interaction and the distribution of their individual predicted affinities. Analysis of test cases indicates that, for some ligand/protein pairs involving transporters and carrier proteins, diffuse binding contributes greatly to total affinity, whereas in other cases the influence is modest. This approach may be useful for studying situations where "nonspecific" interactions contribute to biological function.
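The dependence of total affinity on the number of sites and the spread of their individual affinities can be illustrated with the standard Boltzmann-weighted sum over independent, mutually exclusive sites, ΔG_total = -RT ln Σ exp(-ΔG_i/RT). The sketch below is only that textbook relation, not the MCMC integration described in the abstract; the function name and the example energies are hypothetical.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)

def total_binding_dg(site_dgs, temperature=298.15):
    """Combine per-site binding free energies (kcal/mol) into one total.

    Assumes independent, mutually exclusive sites (an illustrative
    simplification, not the method of the abstract above).
    """
    rt = R_KCAL * temperature
    # Work with -dG/RT and use log-sum-exp for numerical stability.
    scaled = [-dg / rt for dg in site_dgs]
    largest = max(scaled)
    log_sum = largest + math.log(sum(math.exp(s - largest) for s in scaled))
    return -rt * log_sum

# Hypothetical example: one strong site versus many weak sites.
single_site = total_binding_dg([-6.0])
diffuse = total_binding_dg([-4.5] * 20)
```

With these made-up numbers, twenty weak sites gain roughly RT ln 20 (about 1.8 kcal/mol at 298 K) over a single weak site, which is the entropic effect the abstract attributes to diffuse binding.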
Mehdi, Haider; Naqvi, Asma; Kamboh, M. Ilyas
2008-01-01
Human β2-glycoprotein I (β2GPI) binds to recombinant hepatitis B surface antigen (rHBsAg), but the location of the binding domain on β2GPI is unknown. It has been suggested that the lipid rather than the protein moiety of rHBsAg binds to β2GPI. Since β2GPI binds to anionic phospholipids (PL) through its lipid binding region in the fifth domain of β2GPI, we predicted that this lipid binding region may also be involved in binding rHBsAg. In this study, we examined rHBsAg binding to two naturally occurring mutants of β2GPI, Cys306Gly and Trp316Ser, or evolutionarily conserved hydrophobic amino acid sequence, Leu313-Ala314-Phe315 in the fifth domain of β2GPI. The two naturally occurring mutations and two mutagenized amino acids, Leu313Gly or Phe315Ser, disrupted the binding of recombinant β2GPI (rβ2GPI) to both rHBsAg and cardiolipin (CL), an anionic PL. These results suggest that rHBsAg and CL share the same region in the fifth domain of β2GPI. Further support for this conclusion was provided by competitive ELISA, where CL-bound rβ2GPI was incubated with increasing amounts of rHBsAg. As expected, pre-incubation of rβ2GPI with CL precluded binding to rHBsAg, indicating that CL and rHBsAg bind to the same region on β2GPI. Our data provide evidence that the lipid (PL) rather than the protein moiety of rHBsAg binds to β2GPI and that this binding region is located in the fifth domain of β2GPI, which also binds to anionic PL. PMID:18230366
Large-scale binding ligand prediction by improved patch-based method Patch-Surfer2.0.
Zhu, Xiaolei; Xiong, Yi; Kihara, Daisuke
2015-03-01
Ligand binding is a key aspect of the function of many proteins. Thus, binding ligand prediction provides important insight in understanding the biological function of proteins. Binding ligand prediction is also useful for drug design and examining potential drug side effects. We present a computational method named Patch-Surfer2.0, which predicts binding ligands for a protein pocket. By representing and comparing pockets at the level of small local surface patches that characterize physicochemical properties of the local regions, the method can identify binding pockets of the same ligand even if they do not share globally similar shapes. Properties of local patches are represented by an efficient mathematical representation, 3D Zernike Descriptor. Patch-Surfer2.0 has significant technical improvements over our previous prototype, which includes a new feature that captures approximate patch position with a geodesic distance histogram. Moreover, we constructed a large comprehensive database of ligand binding pockets that will be searched against by a query. The benchmark shows better performance of Patch-Surfer2.0 over existing methods. http://kiharalab.org/patchsurfer2.0/ CONTACT: dkihara@purdue.edu Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Computational approach to analyze isolated ssDNA aptamers against angiotensin II.
Heiat, Mohammad; Najafi, Ali; Ranjbar, Reza; Latifi, Ali Mohammad; Rasaee, Mohammad Javad
2016-07-20
Aptamers are highly structured oligonucleotides that bind to their targets through a specific 3-D conformation. Commonly, not all nucleotides, such as those in the fixed primer-binding region and some other sequences, are vital for aptamer folding and interaction. Elimination of unnecessary regions requires trustworthy prediction tools to reduce experimental effort and errors. Here we introduce a modified in-silico approach to predict the 3-D structure of aptamers and their target interactions. To design an approach for computational analysis of isolated ssDNA aptamers (FLC112, FLC125 and their truncated core regions, CRC112 and CRC125), their secondary and tertiary structures were modeled by Mfold and RNAComposer, respectively. Output PDB files were modified from RNA to DNA in the Discovery Studio Visualizer software. Using the ZDOCK server, the aptamer-target interactions were predicted. Finally, the interaction scores were compared with the experimental results. In-silico interaction scores and the experimental outcomes were in the same descending order of FLC112>CRC125>CRC112>FLC125, with similar intensity. The consistency of the in-silico method with the experimental outputs affirmed that the present method may be a reliable approach. It also showed that accurate in-silico predictions can be used as a credible reference for assessing the binding potency of aptamer fragments. Copyright © 2016 Elsevier B.V. All rights reserved.
Kume, Akiko; Kawai, Shun; Kato, Ryuji; Iwata, Shinmei; Shimizu, Kazunori; Honda, Hiroyuki
2017-02-01
To investigate the binding properties of a peptide sequence, we conducted principal component analysis (PCA) of the physicochemical features of a tetramer peptide library comprised of 512 peptides, and the variables were reduced to two principal components. We selected IL-2 and IgG as model proteins and the binding affinity to these proteins was assayed using the 512 peptides mentioned above. PCA of binding affinity data showed that 16 and 18 variables were suitable for localizing IL-2 and IgG high-affinity binding peptides, respectively, into a restricted region of the PCA plot. We then investigated whether the binding affinity of octamer peptide libraries could be predicted using the identified region in the tetramer PCA. The results show that octamer high-affinity binding peptides were also concentrated in the tetramer high-affinity binding region of both IL-2 and IgG. The average fluorescence intensity of high-affinity binding peptides was 3.3- and 2.1-fold higher than that of low-affinity binding peptides for IL-2 and IgG, respectively. We conclude that PCA may be used to identify octamer peptides with high- or low-affinity binding properties from data from a tetramer peptide library. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
Nebulette interacts with filamin C.
Holmes, William B; Moncman, Carole L
2008-02-01
The actin-binding proteins, nebulette, and nebulin, are comprised of a four-domain layout containing an acidic N-terminal region, a repeat domain, a serine-rich-linker region, and a Src homology-3 domain. Both proteins contain homologous N-terminal regions that are predicted to be in different environments within the sarcomere. The nebulin acidic N-terminal region is found at the distal ends of the thin filaments. Nebulette, however, is predicted to extend 150 nm from the center of the Z-line. To dissect out the functions of the N-terminal domain of nebulette, we have performed a yeast two-hybrid screen using nebulette residues 1-86 as bait. We have identified filamin-C, ZASP-1, and tropomyosin-1 as binding partners. Characterization of the nebulette-filamin interaction indicates that filamin-C predominantly interacts with the modules. These data suggest that filamin-C, a known component of striated muscle Z-lines, interacts with nebulette modules. Copyright 2007 Wiley-Liss, Inc.
Turatsinze, Jean-Valery; Thomas-Chollier, Morgane; Defrance, Matthieu; van Helden, Jacques
2008-01-01
This protocol shows how to detect putative cis-regulatory elements and regions enriched in such elements with the regulatory sequence analysis tools (RSAT) web server (http://rsat.ulb.ac.be/rsat/). The approach applies to known transcription factors, whose binding specificity is represented by position-specific scoring matrices, using the program matrix-scan. The detection of individual binding sites is known to return many false predictions. However, results can be strongly improved by estimating P value, and by searching for combinations of sites (homotypic and heterotypic models). We illustrate the detection of sites and enriched regions with a study case, the upstream sequence of the Drosophila melanogaster gene even-skipped. This protocol is also tested on random control sequences to evaluate the reliability of the predictions. Each task requires a few minutes of computation time on the server. The complete protocol can be executed in about one hour.
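Scanning a sequence with a position-specific scoring matrix can be sketched as a sliding-window sum of per-column log-odds weights. The code below is a minimal illustration of that general idea and is not the RSAT matrix-scan implementation; the function names, the pseudocount scheme, the uniform background, and the toy count matrix are all assumptions made for the example.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, background=None, pseudo=1.0):
    """Convert per-column base counts into log-odds weights.

    counts: list of dicts (one per motif column) mapping base -> count.
    The pseudocount is distributed according to the background
    frequencies (an assumed, commonly used convention).
    """
    background = background or {b: 0.25 for b in BASES}
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + pseudo
        matrix.append({
            b: math.log(
                (column.get(b, 0) + pseudo * background[b]) / total / background[b]
            )
            for b in BASES
        })
    return matrix

def scan(sequence, matrix):
    """Score every window of the sequence (uppercase A/C/G/T only)."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits

# Hypothetical 4-column motif built from ten aligned "TATA" sites.
toy_counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
toy_hits = scan("GGTATAGG", log_odds_matrix(toy_counts))
best_hit = max(toy_hits, key=lambda hit: hit[2])
```

A raw score like this returns many false positives on long sequences, which is why the protocol pairs it with P value estimation and searches for combinations of sites.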
Vazquez-Anderson, Jorge; Mihailovic, Mia K.; Baldridge, Kevin C.; Reyes, Kristofer G.; Haning, Katie; Cho, Seung Hee; Amador, Paul; Powell, Warren B.
2017-01-01
Current approaches to design efficient antisense RNAs (asRNAs) rely primarily on a thermodynamic understanding of RNA–RNA interactions. However, these approaches depend on structure predictions and have limited accuracy, arguably due to overlooking important cellular environment factors. In this work, we develop a biophysical model to describe asRNA–RNA hybridization that incorporates in vivo factors using large-scale experimental hybridization data for three model RNAs: a group I intron, CsrB and a tRNA. A unique element of our model is the estimation of the availability of the target region to interact with a given asRNA using a differential entropic consideration of suboptimal structures. We showcase the utility of this model by evaluating its prediction capabilities in four additional RNAs: a group II intron, Spinach II, 2-MS2 binding domain and glgC 5′ UTR. Additionally, we demonstrate the applicability of this approach to other bacterial species by predicting sRNA–mRNA binding regions in two newly discovered, though uncharacterized, regulatory RNAs. PMID:28334800
Lefkowith, J B; Di Valerio, R; Norris, J; Glick, G D; Alexander, A L; Jackson, L; Gilkeson, G S
1996-08-01
We recently produced a panel of seven glomerular-binding mAbs from a nephritic MRL-lpr mouse that bind to histones/nucleosomes (group I) or DNA (group II) adherent to glomerular basement membrane. To elucidate the molecular basis of their binding and ontogeny, we sequenced their variable (V) regions, analyzed the apparent somatic mutations, and predicted their three-dimensional structures. There were two clonally related sets (3 of 4 in group I, 3 of 3 in group II) both of the VHJ1558 family, and one mAb of the VH 7183 family. V region somatic mutations within clonally related sets had little effect on glomerular binding and did not appear to be selected for based on glomerular binding. The VH regions were most homologous with those from autoantibodies to histones, DNA, or IgG (i.e., rheumatoid factors), the Vkappa regions, with those from autoantibodies to small nuclear ribonucleoproteins (snRNP). The VH regions also exhibited an unusual VD junction (in the group I clonally related set) and an overall high content of charged amino acids (arginine, aspartic acid) in complementarity-determining regions (CDRs), particularly in CDR3. Molecular modeling studies suggested that the Fv regions of these mAbs converge to form a flat, open surface with a net positive charge. The CDR arginines in group I mAbs appear to be located in Ag contact regions of the binding cleft. In sum, these data suggest that glomerulotropic mAbs are a highly restricted set of Abs with distinctive molecular features that may mediate their binding to glomeruli.
BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations
Wang, Junbai; Batmanov, Kirill
2015-01-01
Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein–DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein–DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions. PMID:26202972
Prediction of Protein-Protein Interaction Sites Using Electrostatic Desolvation Profiles
Fiorucci, Sébastien; Zacharias, Martin
2010-01-01
Protein-protein complex formation involves removal of water from the interface region. Surface regions with a small free energy penalty for water removal or desolvation may correspond to preferred interaction sites. A method to calculate the electrostatic free energy of placing a neutral low-dielectric probe at various protein surface positions has been designed and applied to characterize putative interaction sites. Based on solutions of the finite-difference Poisson equation, this method also includes long-range electrostatic contributions and the protein solvent boundary shape in contrast to accessible-surface-area-based solvation energies. Calculations on a large set of proteins indicate that in many cases (>90%), the known binding site overlaps with one of the six regions of lowest electrostatic desolvation penalty (overlap with the lowest desolvation region for 48% of proteins). Since the onset of electrostatic desolvation occurs even before direct protein-protein contact formation, it may help guide proteins toward the binding region in the final stage of complex formation. It is interesting that the probe desolvation properties associated with residue types were found to depend to some degree on whether the residue was outside of or part of a binding site. The probe desolvation penalty was on average smaller if the residue was part of a binding site compared to other surface locations. Applications to several antigen-antibody complexes demonstrated that the approach might be useful not only to predict protein interaction sites in general but to map potential antigenic epitopes on protein surfaces. PMID:20441756
Andreatta, Massimo; Karosiene, Edita; Rasmussen, Michael; Stryhn, Anette; Buus, Søren; Nielsen, Morten
2015-11-01
A key event in the generation of a cellular response against malicious organisms through the endocytic pathway is binding of peptidic antigens by major histocompatibility complex class II (MHC class II) molecules. The bound peptide is then presented on the cell surface where it can be recognized by T helper lymphocytes. NetMHCIIpan is a state-of-the-art method for the quantitative prediction of peptide binding to any human or mouse MHC class II molecule of known sequence. In this paper, we describe an updated version of the method with improved peptide binding register identification. Binding register prediction is concerned with determining the minimal core region of nine residues directly in contact with the MHC binding cleft, a crucial piece of information both for the identification and design of CD4(+) T cell antigens. When applied to a set of 51 crystal structures of peptide-MHC complexes with known binding registers, the new method NetMHCIIpan-3.1 significantly outperformed the earlier 3.0 version. We illustrate the impact of accurate binding core identification for the interpretation of T cell cross-reactivity using tetramer double staining with a CMV epitope and its variants mapped to the epitope binding core. NetMHCIIpan is publicly available at http://www.cbs.dtu.dk/services/NetMHCIIpan-3.1 .
Harrison, Thomas; Ruiz, Jaime; Sloan, Daniel B.; Ben-Hur, Asa; Boucher, Christina
2016-01-01
Pentatricopeptide repeat containing proteins (PPRs) bind to RNA transcripts originating from mitochondria and plastids. There are two classes of PPR proteins. The P class contains tandem P-type motif sequences, and the PLS class contains alternating P, L and S type sequences. In this paper, we describe a novel tool that predicts PPR-RNA interaction; specifically, our method, which we call aPPRove, determines where and how a PLS-class PPR protein will bind to RNA when given a PPR and one or more RNA transcripts by using a combinatorial binding code for site specificity proposed by Barkan et al. Our results demonstrate that aPPRove successfully locates how and where a PPR protein belonging to the PLS class can bind to RNA. For each binding event it outputs the binding site, the amino-acid-nucleotide interaction, and its statistical significance. Furthermore, we show that our method can be used to predict binding events for PLS-class proteins using a known edit site and the statistical significance of aligning the PPR protein to that site. In particular, we use our method to make a conjecture regarding an interaction between CLB19 and the second intronic region of ycf3. The aPPRove web server can be found at www.cs.colostate.edu/~approve. PMID:27560805
Camacho, Carlos J
2005-08-01
The CAPRI-II experiment added an extra level of complexity to the problem of predicting protein-protein interactions by including 5 targets for which participants had to build or complete the 3-dimensional (3D) structure of either the receptor or ligand based on the structure of a close homolog. In this article, we describe how modeling key side-chains using molecular dynamics (MD) in explicit solvent improved the recognition of the binding region of a free energy-based computational docking method. In particular, we show that MD is able to predict with relatively high accuracy the rotamer conformation of the anchor side-chains important for molecular recognition as suggested by Rajamani et al. (Proc Natl Acad Sci USA 2004;101:11287-11292). As expected, the conformations are some of the most common rotamers for the given residue, while latch side-chains that undergo induced fit upon binding are forced into less common conformations. Using these models as starting conformations in conjunction with the rigid-body docking server ClusPro and the flexible docking algorithm SmoothDock, we produced valuable predictions for 6 of the 9 targets in CAPRI-II, missing only the 3 targets that underwent significant structural rearrangements upon binding. We also show that our free energy-based scoring function, consisting of the sum of van der Waals, Coulombic electrostatic with a distance-dependent dielectric, and desolvation free energy, successfully discriminates the nativelike conformation of our submitted predictions. The latter emphasizes the critical role that thermodynamics plays in our methodology, and validates the generality of the algorithm to predict protein interactions.
CisMapper: predicting regulatory interactions from transcription factor ChIP-seq data
O'Connor, Timothy; Bodén, Mikael
2017-01-01
Identifying the genomic regions and regulatory factors that control the transcription of genes is an important, unsolved problem. The current method of choice predicts transcription factor (TF) binding sites using chromatin immunoprecipitation followed by sequencing (ChIP-seq), and then links the binding sites to putative target genes solely on the basis of the genomic distance between them. Evidence from chromatin conformation capture experiments shows that this approach is inadequate due to long-distance regulation via chromatin looping. We present CisMapper, which predicts the regulatory targets of a TF using the correlation between a histone mark at the TF's bound sites and the expression of each gene across a panel of tissues. Using both chromatin conformation capture and differential expression data, we show that CisMapper is more accurate at predicting the target genes of a TF than the distance-based approaches currently used, and is particularly advantageous for predicting the long-range regulatory interactions typical of tissue-specific gene expression. CisMapper also predicts which TF binding sites regulate a given gene more accurately than using genomic distance. Unlike distance-based methods, CisMapper can predict which transcription start site of a gene is regulated by a particular binding site of the TF. PMID:28204599
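The correlation idea in this abstract can be illustrated with a minimal sketch. It is not CisMapper's implementation: the function names, the toy tissue panel, and the choice to rank genes by signed Pearson correlation are all assumptions made here for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (sd_x * sd_y)

def rank_targets(mark_signal, expression):
    """Rank candidate genes by correlation with the histone-mark signal.

    mark_signal: histone-mark level at one TF-bound site, one value per tissue.
    expression:  hypothetical dict of gene name -> expression per tissue.
    """
    scores = {gene: pearson(mark_signal, values)
              for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy panel of four tissues (made-up numbers, not real data).
mark = [1.0, 2.0, 3.0, 4.0]
expr = {
    "geneA": [2.0, 4.0, 6.0, 8.0],   # tracks the mark
    "geneB": [4.0, 3.0, 2.0, 1.0],   # anti-correlated
    "geneC": [1.0, 1.0, 1.0, 1.0],   # flat
}
ranked = rank_targets(mark, expr)
```

In this toy setup the gene whose expression follows the mark across tissues ranks first regardless of its genomic distance from the site, which is the contrast with distance-based linking that the abstract draws.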
Discrete Molecular Dynamics Can Predict Helical Prestructured Motifs in Disordered Proteins
Han, Kyou-Hoon; Dokholyan, Nikolay V.; Tompa, Péter; Kalmár, Lajos; Hegedűs, Tamás
2014-01-01
Intrinsically disordered proteins (IDPs) lack a stable tertiary structure, but their short binding regions termed Pre-Structured Motifs (PreSMo) can form transient secondary structure elements in solution. Although disordered proteins are crucial in many biological processes and designing strategies to modulate their function is highly important, both experimental and computational tools to describe their conformational ensembles and the initial steps of folding are sparse. Here we report that discrete molecular dynamics (DMD) simulations combined with the replica exchange (RX) method efficiently sample the conformational space and detect regions populating α-helical conformational states in disordered protein regions. While the available computational methods predict secondary structural propensities in IDPs based on the observation of protein-protein interactions, our ab initio method rests on physical principles of protein folding and dynamics. We show that RX-DMD predicts α-PreSMos with high confidence confirmed by comparison to experimental NMR data. Moreover, the method also can dissect α-PreSMos in close vicinity to each other and indicate helix stability. Importantly, simulations with disordered regions forming helices in X-ray structures of complexes indicate that a preformed helix is frequently the binding element itself, while in other cases it may have a role in initiating the binding process. Our results indicate that RX-DMD provides a breakthrough in the structural and dynamical characterization of disordered proteins by generating the structural ensembles of IDPs even when experimental data are not available. PMID:24763499
Karttunen, Mikko; Choy, Wing-Yiu; Cino, Elio A
2018-06-07
Nuclear factor erythroid 2-related factor 2 (Nrf2) is a transcription factor and principal regulator of the antioxidant pathway. The Kelch domain of Kelch-like ECH-associated protein 1 (Keap1) binds to motifs in the N-terminal region of Nrf2, promoting its degradation. There is interest in developing ligands that can compete with Nrf2 for binding to Kelch, thereby activating its transcriptional activities and increasing antioxidant levels. Using experimental Δ G bind values of Kelch-binding motifs determined previously, a revised hydrophobicity-based model was developed for estimating Δ G bind from amino acid sequence and applied to rank potential uncharacterized Kelch-binding motifs identified from interaction databases and BLAST searches. Model predictions and molecular dynamics (MD) simulations suggested that full-length MAD2A binds Kelch more favorably than a high-affinity 20-mer Nrf2 E78P peptide, but that the motif in isolation is not a particularly strong binder. Endeavoring to develop shorter peptides for activating Nrf2, new designs were created based on the E78P peptide, some of which showed considerable propensity to form binding-competent structures in MD, and were predicted to interact with Kelch more favorably than the E78P peptide. The peptides could be promising new ligands for enhancing the oxidative stress response.
Kaus, Joseph W; Harder, Edward; Lin, Teng; Abel, Robert; McCammon, J Andrew; Wang, Lingle
2015-06-09
Recent advances in improved force fields and sampling methods have made it possible for the accurate calculation of protein–ligand binding free energies. Alchemical free energy perturbation (FEP) using an explicit solvent model is one of the most rigorous methods to calculate relative binding free energies. However, for cases where there are high energy barriers separating the relevant conformations that are important for ligand binding, the calculated free energy may depend on the initial conformation used in the simulation due to the lack of complete sampling of all the important regions in phase space. This is particularly true for ligands with multiple possible binding modes separated by high energy barriers, making it difficult to sample all relevant binding modes even with modern enhanced sampling methods. In this paper, we apply a previously developed method that provides a corrected binding free energy for ligands with multiple binding modes by combining the free energy results from multiple alchemical FEP calculations starting from all enumerated poses, and the results are compared with Glide docking and MM-GBSA calculations. From these calculations, the dominant ligand binding mode can also be predicted. We apply this method to a series of ligands that bind to c-Jun N-terminal kinase-1 (JNK1) and obtain improved free energy results. The dominant ligand binding modes predicted by this method agree with the available crystallography, while both Glide docking and MM-GBSA calculations incorrectly predict the binding modes for some ligands. The method also helps separate the force field error from the ligand sampling error, such that deviations in the predicted binding free energy from the experimental values likely indicate possible inaccuracies in the force field. 
An error in the force field for a subset of the ligands studied was identified using this method, and improved free energy results were obtained by correcting the partial charges assigned to the ligands. This improved the root-mean-square error (RMSE) for the predicted binding free energy from 1.9 kcal/mol with the original partial charges to 1.3 kcal/mol with the corrected partial charges.
2016-01-01
Recent advances in improved force fields and sampling methods have made it possible for the accurate calculation of protein–ligand binding free energies. Alchemical free energy perturbation (FEP) using an explicit solvent model is one of the most rigorous methods to calculate relative binding free energies. However, for cases where there are high energy barriers separating the relevant conformations that are important for ligand binding, the calculated free energy may depend on the initial conformation used in the simulation due to the lack of complete sampling of all the important regions in phase space. This is particularly true for ligands with multiple possible binding modes separated by high energy barriers, making it difficult to sample all relevant binding modes even with modern enhanced sampling methods. In this paper, we apply a previously developed method that provides a corrected binding free energy for ligands with multiple binding modes by combining the free energy results from multiple alchemical FEP calculations starting from all enumerated poses, and the results are compared with Glide docking and MM-GBSA calculations. From these calculations, the dominant ligand binding mode can also be predicted. We apply this method to a series of ligands that bind to c-Jun N-terminal kinase-1 (JNK1) and obtain improved free energy results. The dominant ligand binding modes predicted by this method agree with the available crystallography, while both Glide docking and MM-GBSA calculations incorrectly predict the binding modes for some ligands. The method also helps separate the force field error from the ligand sampling error, such that deviations in the predicted binding free energy from the experimental values likely indicate possible inaccuracies in the force field. 
An error in the force field for a subset of the ligands studied was identified using this method, and improved free energy results were obtained by correcting the partial charges assigned to the ligands. This improved the root-mean-square error (RMSE) for the predicted binding free energy from 1.9 kcal/mol with the original partial charges to 1.3 kcal/mol with the corrected partial charges. PMID:26085821
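As a rough illustration of how free energies from several enumerated poses can be merged into one value, the sketch below applies the standard Boltzmann combination of independent binding modes, ΔG = -kT ln Σ exp(-ΔG_i/kT). This formula is an assumption here: the abstract does not state the exact correction the authors use, and the function name and example energies are hypothetical.

```python
from math import exp, log

KT = 0.593  # kcal/mol, approximately kT at 298 K

def combine_binding_modes(mode_free_energies, kt=KT):
    """Combine per-pose binding free energies (kcal/mol) into one value.

    Returns the combined free energy and the Boltzmann population of each
    pose. The lowest energy is factored out before exponentiating so the
    sum stays numerically stable.
    """
    lowest = min(mode_free_energies)
    weights = [exp(-(g - lowest) / kt) for g in mode_free_energies]
    total = sum(weights)
    combined = lowest - kt * log(total)
    populations = [w / total for w in weights]
    return combined, populations

# Two equally favorable poses: the combined value is lower by kT*ln(2).
dg_equal, pops_equal = combine_binding_modes([-8.0, -8.0])

# One clearly dominant pose (hypothetical values).
dg_mixed, pops_mixed = combine_binding_modes([-9.0, -7.0])
dominant_pose = pops_mixed.index(max(pops_mixed))
```

The populations also give a simple way to name the dominant binding mode, which is the kind of prediction the abstract compares against crystallography.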
Identification of Candidate Transcription Factor Binding Sites in the Cattle Genome
Bickhart, Derek M.; Liu, George E.
2013-01-01
A resource that provides candidate transcription factor binding sites (TFBSs) does not currently exist for cattle. Such data is necessary, as predicted sites may serve as excellent starting locations for future omics studies to develop transcriptional regulation hypotheses. In order to generate this resource, we employed a phylogenetic footprinting approach—using sequence conservation across cattle, human and dog—and position-specific scoring matrices to identify 379,333 putative TFBSs upstream of nearly 8000 Mammalian Gene Collection (MGC) annotated genes within the cattle genome. Comparisons of our predictions to known binding site loci within the PCK1, ACTA1 and G6PC promoter regions revealed 75% sensitivity for our method of discovery. Additionally, we intersected our predictions with known cattle SNP variants in dbSNP and on the Illumina BovineHD 770k and Bos 1 SNP chips, finding 7534, 444 and 346 overlaps, respectively. Due to our stringent filtering criteria, these results represent high quality predictions of putative TFBSs within the cattle genome. All binding site predictions are freely available at http://bfgl.anri.barc.usda.gov/BovineTFBS/ or http://199.133.54.77/BovineTFBS. PMID:23433959
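The position-specific scoring matrix step mentioned above can be sketched as follows. This is a generic log-odds PSSM scan, not the authors' pipeline: the example sites, pseudocount, uniform background and threshold are assumptions, and the phylogenetic footprinting (conservation) filter is not modeled.

```python
from math import log2

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from aligned, equal-length binding sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pssm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pssm.append({
            base: log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm

def scan(sequence, pssm, threshold):
    """Return (start, window, score) for every window scoring >= threshold.

    Assumes the sequence contains only A, C, G and T.
    """
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

# Hypothetical aligned sites and a short upstream sequence (made-up data).
sites = ["TATAAA", "TATAAA", "TATATA", "TATAAA"]
pssm = build_pssm(sites)
hits = scan("GGCTATAAAGGC", pssm, threshold=5.0)
```

In a footprinting workflow, hits like these would then be kept only where the same region is conserved across the compared species.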
Prediction of protein-protein interaction sites using electrostatic desolvation profiles.
Fiorucci, Sébastien; Zacharias, Martin
2010-05-19
Protein-protein complex formation involves removal of water from the interface region. Surface regions with a small free energy penalty for water removal or desolvation may correspond to preferred interaction sites. A method to calculate the electrostatic free energy of placing a neutral low-dielectric probe at various protein surface positions has been designed and applied to characterize putative interaction sites. Based on solutions of the finite-difference Poisson equation, this method also includes long-range electrostatic contributions and the protein solvent boundary shape in contrast to accessible-surface-area-based solvation energies. Calculations on a large set of proteins indicate that in many cases (>90%), the known binding site overlaps with one of the six regions of lowest electrostatic desolvation penalty (overlap with the lowest desolvation region for 48% of proteins). Since the onset of electrostatic desolvation occurs even before direct protein-protein contact formation, it may help guide proteins toward the binding region in the final stage of complex formation. It is interesting that the probe desolvation properties associated with residue types were found to depend to some degree on whether the residue was outside of or part of a binding site. The probe desolvation penalty was on average smaller if the residue was part of a binding site compared to other surface locations. Applications to several antigen-antibody complexes demonstrated that the approach might be useful not only to predict protein interaction sites in general but to map potential antigenic epitopes on protein surfaces. Copyright (c) 2010 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Vazquez-Anderson, Jorge; Mihailovic, Mia K; Baldridge, Kevin C; Reyes, Kristofer G; Haning, Katie; Cho, Seung Hee; Amador, Paul; Powell, Warren B; Contreras, Lydia M
2017-05-19
Current approaches to design efficient antisense RNAs (asRNAs) rely primarily on a thermodynamic understanding of RNA-RNA interactions. However, these approaches depend on structure predictions and have limited accuracy, arguably due to overlooking important cellular environment factors. In this work, we develop a biophysical model to describe asRNA-RNA hybridization that incorporates in vivo factors using large-scale experimental hybridization data for three model RNAs: a group I intron, CsrB and a tRNA. A unique element of our model is the estimation of the availability of the target region to interact with a given asRNA using a differential entropic consideration of suboptimal structures. We showcase the utility of this model by evaluating its prediction capabilities in four additional RNAs: a group II intron, Spinach II, 2-MS2 binding domain and glgC 5′ UTR. Additionally, we demonstrate the applicability of this approach to other bacterial species by predicting sRNA-mRNA binding regions in two newly discovered, though uncharacterized, regulatory RNAs. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Rational redesign of neutral endopeptidase binding to merlin and moesin proteins
Niv, Masha Y; Iida, Katsuyuki; Zheng, Rong; Horiguchi, Akio; Shen, Ruoqian; Nanus, David M
2009-01-01
Neutral endopeptidase (NEP) is a 90- to 110-kDa cell-surface peptidase that is normally expressed by numerous tissues but whose expression is lost or reduced in a variety of malignancies. The anti-tumorigenic function of NEP is mediated not only by its catalytic activity but also through direct protein–protein interactions of its cytosolic region with several binding partners, including Lyn kinase, PTEN, and ezrin/radixin/moesin (ERM) proteins. We have previously shown that mutation of the K19K20K21 basic cluster in NEPs' cytosolic region to residues QNI disrupts binding to the ERM proteins. Here we show that the ERM-related protein merlin (NF2) does not bind NEP or its cytosolic region. Using experimental data, threading, and sequence analysis, we predicted the involvement of moesin residues E159Q160 in binding to the NEP cytosolic domain. Mutation of these residues to NL (to mimic the corresponding N159L160 residues in the nonbinder merlin) disrupted moesin binding to NEP. Mutation of residues N159L160Y161K162M163 in merlin to the corresponding moesin residues resulted in NEP binding to merlin. This engineered NEP peptide–merlin interaction was diminished by the QNI mutation in NEP, supporting the role of the NEP basic cluster in binding. We thus identified the region of interaction between NEP and moesin, and engineered merlin into a NEP-binding protein. These data form the basis for further exploration of the details of NEP-ERM binding and function. PMID:19388049
Electrostatics, structure prediction, and the energy landscapes for protein folding and binding.
Tsai, Min-Yeh; Zheng, Weihua; Balamurugan, D; Schafer, Nicholas P; Kim, Bobby L; Cheung, Margaret S; Wolynes, Peter G
2016-01-01
While being long in range and therefore weakly specific, electrostatic interactions are able to modulate the stability and folding landscapes of some proteins. The relevance of electrostatic forces for steering the docking of proteins to each other is widely acknowledged, however, the role of electrostatics in establishing specifically funneled landscapes and their relevance for protein structure prediction are still not clear. By introducing Debye-Hückel potentials that mimic long-range electrostatic forces into the Associative memory, Water mediated, Structure, and Energy Model (AWSEM), a transferable protein model capable of predicting tertiary structures, we assess the effects of electrostatics on the landscapes of thirteen monomeric proteins and four dimers. For the monomers, we find that adding electrostatic interactions does not improve structure prediction. Simulations of ribosomal protein S6 show, however, that folding stability depends monotonically on electrostatic strength. The trend in predicted melting temperatures of the S6 variants agrees with experimental observations. Electrostatic effects can play a range of roles in binding. The binding of the protein complex KIX-pKID is largely assisted by electrostatic interactions, which provide direct charge-charge stabilization of the native state and contribute to the funneling of the binding landscape. In contrast, for several other proteins, including the DNA-binding protein FIS, electrostatics causes frustration in the DNA-binding region, which favors its binding with DNA but not with its protein partner. This study highlights the importance of long-range electrostatics in functional responses to problems where proteins interact with their charged partners, such as DNA, RNA, as well as membranes. © 2015 The Protein Society.
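For readers unfamiliar with the Debye-Hückel form, a minimal sketch of a screened Coulomb pair energy is given below. The functional form is the textbook one; the dielectric constant of 80 and the 10 Å screening length are illustrative assumptions, not the parameters used in AWSEM.

```python
from math import exp

COULOMB = 332.06  # kcal*angstrom/(mol*e^2), Coulomb constant in these units

def debye_huckel(q1, q2, r, dielectric=80.0, screening_length=10.0):
    """Screened Coulomb energy (kcal/mol) between two point charges.

    q1, q2: charges in units of the elementary charge.
    r: separation in angstroms.
    dielectric, screening_length: assumed illustrative values.
    """
    if r <= 0:
        raise ValueError("separation must be positive")
    return COULOMB * q1 * q2 / (dielectric * r) * exp(-r / screening_length)

attractive = debye_huckel(+1.0, -1.0, 10.0)   # opposite charges
repulsive = debye_huckel(+1.0, +1.0, 10.0)    # like charges
farther = debye_huckel(+1.0, -1.0, 20.0)      # same pair, twice as far apart
```

The exponential factor makes the interaction decay faster than bare Coulomb with distance, and shortening the screening length mimics raising the ionic strength.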
Lee, Yong-Jik; Lee, Sang-Jae; Kim, Seong-Bo; Lee, Sang Jun; Lee, Sung Haeng; Lee, Dong-Woo
2014-03-18
Structural genomics demonstrates that despite low levels of structural similarity of proteins comprising a metabolic pathway, their substrate binding regions are likely to be conserved. Herein based on the 3D-structures of the α/β-fold proteins involved in the ara operon, we attempted to predict the substrate binding residues of thermophilic Geobacillus stearothermophilus L-arabinose isomerase (GSAI) with no 3D-structure available. Comparison of the structures of L-arabinose catabolic enzymes revealed a conserved feature to form the substrate-binding modules, which can be extended to predict the substrate binding site of GSAI (i.e., D195, E261 and E333). Moreover, these data implicated that proteins in the l-arabinose metabolic pathway might retain their substrate binding niches as the modular structure through conserved molecular evolution even with totally different structural scaffolds. Copyright © 2014 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Structure and Dynamics Analysis on Plexin-B1 Rho GTPase Binding Domain as a Monomer and Dimer
2015-01-01
Plexin-B1 is a single-pass transmembrane receptor. Its Rho GTPase binding domain (RBD) can associate with small Rho GTPases and can also self-bind to form a dimer. In total, more than 400 ns of NAMD molecular dynamics simulations were performed on RBD monomer and dimer. Different analysis methods, such as root mean squared fluctuation (RMSF), order parameters (S2), dihedral angle correlation, transfer entropy, principal component analysis, and dynamical network analysis, were carried out to characterize the motions seen in the trajectories. RMSF results show that after binding, the L4 loop becomes more rigid, but the L2 loop and a number of residues in other regions become slightly more flexible. Calculating order parameters (S2) for CH, NH, and CO bonds on both backbone and side chain shows that the L4 loop becomes essentially rigid after binding, but part of the L1 loop becomes slightly more flexible. Backbone dihedral angle cross-correlation results show that loop regions such as the L1 loop including residues Q25 and G26, the L2 loop including residue R61, and the L4 loop including residues L89–R91, are highly correlated compared to other regions in the monomer form. Analysis of the correlated motions at these residues, such as Q25 and R61, indicates two signal pathways. Transfer entropy calculations on the RBD monomer and dimer forms suggest that the binding process should be driven by the L4 loop and C-terminus. However, after binding, the L4 loop functions as the motion responder. The signal pathways in RBD were predicted based on a dynamical network analysis method using the pathways predicted from the dihedral angle cross-correlation calculations as input. It is found that the shortest pathways predicted from both inputs can overlap, but signal pathway 2 (from F90 to R61) is more dominant and overlaps all of the routes of pathway 1 (from F90 to P111). 
This project confirms the allosteric mechanism in signal transmission inside the RBD network, which was in part proposed in the previous experimental study. PMID:24901636
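Of the analyses listed in this abstract, RMSF is the simplest to show in code. The sketch below computes the per-atom RMSF from a trajectory; it assumes the frames have already been aligned to a common reference, and the toy two-atom trajectory is made up rather than taken from the Plexin-B1 simulations.

```python
from math import sqrt

def rmsf(trajectory):
    """Per-atom root mean squared fluctuation.

    trajectory: list of frames, each frame a list of (x, y, z) tuples,
    assumed to be pre-aligned. Returns one value per atom, in the same
    length unit as the coordinates.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for atom in range(n_atoms):
        # Time-averaged position of this atom.
        mean = [sum(frame[atom][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[atom][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(sqrt(msd))
    return result

# Toy trajectory: atom 0 is fixed, atom 1 oscillates by 1 unit along x.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
values = rmsf(trajectory)
```

Comparing such per-residue values between monomer and dimer trajectories is how statements like "the L4 loop becomes more rigid after binding" are typically quantified.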
Yakhnin, Helen; Baker, Carol S.; Berezin, Igor; Evangelista, Michael A.; Rassin, Alisa; Romeo, Tony; Babitzke, Paul
2011-01-01
The RNA binding protein CsrA is the central component of a conserved global regulatory system that activates or represses gene expression posttranscriptionally. In every known example of CsrA-mediated translational control, CsrA binds to the 5′ untranslated region of target transcripts, thereby repressing translation initiation and/or altering the stability of the RNA. Furthermore, with few exceptions, repression by CsrA involves binding directly to the Shine-Dalgarno sequence and blocking ribosome binding. sdiA encodes the quorum-sensing receptor for N-acyl-l-homoserine lactone in Escherichia coli. Because sdiA indirectly stimulates transcription of csrB, which encodes a small RNA (sRNA) antagonist of CsrA, we further explored the relationship between sdiA and the Csr system. Primer extension analysis revealed four putative transcription start sites within 85 nucleotides of the sdiA initiation codon. Potential σ70-dependent promoters were identified for each of these primer extension products. In addition, two CsrA binding sites were predicted in the initially translated region of sdiA. Expression of chromosomally integrated sdiA′-′lacZ translational fusions containing the entire promoter and CsrA binding site regions indicates that CsrA represses sdiA expression. The results from gel shift and footprint studies demonstrate that tight binding of CsrA requires both of these sites. Furthermore, the results from toeprint and in vitro translation experiments indicate that CsrA represses translation of sdiA by directly competing with 30S ribosomal subunit binding. Thus, this represents the first example of CsrA preventing translation by interacting solely within the coding region of an mRNA target. PMID:21908661
Prediction of the binding sites of huperzine A in acetylcholinesterase by docking studies
NASA Astrophysics Data System (ADS)
Pang, Yuan-Ping; Kozikowski, Alan P.
1994-12-01
We have performed docking studies with the SYSDOC program on acetylcholinesterase (AChE) to predict the binding sites in AChE of huperzine A (HA), which is a potent and selective, reversible inhibitor of AChE. The unique aspects of our docking studies include the following: (i) Molecular flexibility of the guest and the host is taken into account, which permits both to change their conformations upon binding. (ii) The binding energy is evaluated by a sum of energies of steric, electrostatic and hydrogen bonding interactions. In the energy calculation no grid approximation is used, and all hydrogen atoms of the system are treated explicitly. (iii) The energy of cation-π interactions between the guest and the host, which is important in the binding of AChE, is included in the calculated binding energy. (iv) Docking is performed in all regions of the host's binding cavity. Based on our docking studies and the pharmacological results reported for HA and its analogs, we predict that HA binds to the bottom of the binding cavity of AChE (the gorge) with its ammonium group interacting with Trp84, Phe330, Glu199 and Asp72 (catalytic site), and to the opening of the gorge with its ammonium group partially interacting with Trp279 (peripheral site). At the catalytic site, three partially overlapping subsites of HA were identified which might provide a dynamic view of binding of HA to the catalytic site.
Antibody specific epitope prediction-emergence of a new paradigm.
Sela-Culang, Inbal; Ofran, Yanay; Peters, Bjoern
2015-04-01
The development of accurate tools for predicting B-cell epitopes is important but difficult. Traditional methods have examined which regions in an antigen are likely binding sites of an antibody. However, it is becoming increasingly clear that most antigen surface residues will be able to bind one or more of the myriad of possible antibodies. In recent years, new approaches have emerged for predicting an epitope for a specific antibody, utilizing information encoded in antibody sequence or structure. Applying such antibody-specific predictions to groups of antibodies in combination with easily obtainable experimental data improves the performance of epitope predictions. We expect that further advances of such tools will be possible with the integration of immunoglobulin repertoire sequencing data. Copyright © 2015 Elsevier B.V. All rights reserved.
Predicted structure of MIF/CD74 and RTL1000/CD74 complexes.
Meza-Romero, Roberto; Benedek, Gil; Leng, Lin; Bucala, Richard; Vandenbark, Arthur A
2016-04-01
Macrophage migration inhibitory factor (MIF) is a key cytokine in autoimmune and inflammatory diseases that attracts and then retains activated immune cells from the periphery to the tissues. MIF exists as a homotrimer and its effects are mediated through its primary receptor, CD74 (the class II invariant chain that exhibits a highly structured trimerization domain), present on class II expressing cells. Although a number of binding residues have been identified between MIF and CD74 trimers, their spatial orientation has not been established. Using a docking program in silico, we have modeled binding interactions between CD74 and MIF as well as CD74 and a competitive MIF inhibitor, RTL1000, a partial MHC class II construct that is currently in clinical trials for multiple sclerosis. These analyses revealed 3 binding sites on the MIF trimer that each were predicted to bind one CD74 trimer through interactions with two distinct 5 amino acid determinants. Surprisingly, predicted binding of one CD74 trimer to a single RTL1000 antagonist utilized the same two 5 residue determinants, providing strong suggestive evidence in support of the MIF binding regions on CD74. Taken together, our structural modeling predicts a new MIF(CD74)3 dodecamer that may provide the basis for increased MIF potency and the requirement for ~3-fold excess RTL1000 to achieve full antagonism.
Nagano, Yukio; Furuhashi, Hirofumi; Inaba, Takehito; Sasaki, Yukiko
2001-01-01
Complementary DNA encoding a DNA-binding protein, designated PLATZ1 (plant AT-rich sequence- and zinc-binding protein 1), was isolated from peas. The amino acid sequence of the protein is similar to those of other uncharacterized proteins predicted from the genome sequences of higher plants. However, no paralogous sequences have been found outside the plant kingdom. Multiple alignments among these paralogous proteins show that several cysteine and histidine residues are invariant, suggesting that these proteins are a novel class of zinc-dependent DNA-binding proteins with two distantly located regions, C-x2-H-x11-C-x2-C-x(4–5)-C-x2-C-x(3–7)-H-x2-H and C-x2-C-x(10–11)-C-x3-C. In an electrophoretic mobility shift assay, the zinc chelator 1,10-o-phenanthroline inhibited DNA binding, and two distant zinc-binding regions were required for DNA binding. A protein blot with 65ZnCl2 showed that both regions are required for zinc-binding activity. The PLATZ1 protein non-specifically binds to A/T-rich sequences, including the upstream region of the pea GTPase pra2 and plastocyanin petE genes. Expression of the PLATZ1 repressed those of the reporter constructs containing the coding sequence of luciferase gene driven by the cauliflower mosaic virus (CaMV) 35S90 promoter fused to the tandem repeat of the A/T-rich sequences. These results indicate that PLATZ1 is a novel class of plant-specific zinc-dependent DNA-binding protein responsible for A/T-rich sequence-mediated transcriptional repression. PMID:11600698
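The two spacing patterns quoted in this abstract translate directly into regular expressions. The sketch below is one way to do that; the helper name is hypothetical, and the test sequences are synthetic strings built to fit the patterns, not PLATZ1 or any real protein.

```python
import re

def motif_to_regex(motif):
    """Convert a pattern such as 'C-x2-H-x(4\u20135)-C' to a regex string.

    'xN' means N arbitrary residues; 'x(A\u2013B)' means between A and B
    arbitrary residues, with an en dash separating the bounds as in the
    abstract.
    """
    parts = []
    for token in motif.split("-"):
        if token.startswith("x"):
            span = token[1:].strip("()").replace("\u2013", ",")
            parts.append(".{" + span + "}")
        else:
            parts.append(token)
    return "".join(parts)

REGION_1 = "C-x2-H-x11-C-x2-C-x(4\u20135)-C-x2-C-x(3\u20137)-H-x2-H"
REGION_2 = "C-x2-C-x(10\u201311)-C-x3-C"

regex_1 = re.compile(motif_to_regex(REGION_1))
regex_2 = re.compile(motif_to_regex(REGION_2))

# Synthetic sequences (alanine filler) constructed to satisfy each pattern.
synthetic_1 = ("GG" + "C" + "AA" + "H" + "A" * 11 + "C" + "AA" + "C"
               + "A" * 4 + "C" + "AA" + "C" + "A" * 3 + "H" + "AA" + "H" + "GG")
synthetic_2 = "C" + "AA" + "C" + "A" * 10 + "C" + "AAA" + "C"

match_1 = regex_1.search(synthetic_1)
match_2 = regex_2.search(synthetic_2)
```

A scan like this only checks residue spacing; it says nothing about whether a matching protein actually binds zinc or DNA, which the abstract establishes experimentally.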
Licht, J D; Hanna-Rose, W; Reddy, J C; English, M A; Ro, M; Grossel, M; Shaknovich, R; Hansen, U
1994-01-01
We previously demonstrated that the Drosophila Krüppel protein is a transcriptional repressor with separable DNA-binding and transcriptional repression activities. In this study, the minimal amino (N)-terminal repression region of the Krüppel protein was defined by transferring regions of the Krüppel protein to a heterologous DNA-binding protein, the lacI protein. Fusion of a predicted alpha-helical region from amino acids 62 to 92 in the N terminus of the Krüppel protein was sufficient to transfer repression activity. This putative alpha-helix has several hydrophobic surfaces, as well as a glutamine-rich surface. Mutants containing multiple amino acid substitutions of the glutamine residues demonstrated that this putative alpha-helical region is essential for repression activity of a Krüppel protein containing the entire N-terminal and DNA-binding regions. Furthermore, one point mutant with only a single glutamine on this surface altered to lysine abolished the ability of the Krüppel protein to repress, indicating the importance of the amino acid at residue 86 for repression. The N terminus also contained an adjacent activation region localized between amino acids 86 and 117. Finally, in accordance with predictions from primary amino acid sequence similarity, a repression region from the Drosophila even-skipped protein, which was six times more potent than that of the Krüppel protein in the mammalian cells, was characterized. This segment included a hydrophobic stretch of 11 consecutive alanine residues and a proline-rich region. PMID:8196644
The role of receptor topology in the vitamin D3 uptake and Ca2+ response systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morrill, Gene A., E-mail: gene.morrill@einstein.yu.edu; Kostellow, Adele B.; Gupta, Raj K.
The steroid hormone, vitamin D3, regulates gene transcription via at least two receptors and initiates putative rapid response systems at the plasma membrane. The vitamin D receptor (VDR) binds vitamin D3 and a second receptor, importin-4, imports the VDR-vitamin D3 complex into the nucleus via nuclear pores. Here we present evidence that the Homo sapiens VDR homodimer contains two transmembrane (TM) helices (327E – D342), two TM “half-helix” (264K − N276), one or more large channels, and 16 cholesterol binding (CRAC/CARC) domains. The importin-4 monomer exhibits 3 pore-lining regions (226E – L251; 768V – G783; 876S – A891) and 16 CRAC/CARC domains. The MEMSAT algorithm indicates that VDR and importin-4 may not be restricted to cytoplasm and nucleus. VDR homodimer TM helix-topology predicts insertion into the plasma membrane, with two 84 residue C-terminal regions being extracellular. Similarly, MEMSAT predicts importin-4 insertion into the plasma membrane with 226 residue extracellular N-terminal regions and 96 residue C-terminal extracellular loops; with the pore-lining regions contributing gated Ca2+ channels. The PoreWalker algorithm indicates that, of the 427 residues in each VDR monomer, 91 line the largest channel, including two vitamin D3 binding sites and residues from both the TM helix and “half-helix”. Cholesterol-binding domains also extend into the channel within the ligand binding region. Programmed changes in bound cholesterol may regulate both membrane Ca2+ response systems and vitamin D3 uptake as well as receptor internalization by the endomembrane system culminating in uptake of the vitamin D3-VDR-importin-4 complex into the nucleus.
Trabanino, Rene J.; Hall, Spencer E.; Vaidehi, Nagarajan; Floriano, Wely B.; Kam, Victor W. T.; Goddard, William A.
2004-01-01
G-protein-coupled receptors (GPCRs) are involved in cell communication processes and with mediating such senses as vision, smell, taste, and pain. They constitute a prominent superfamily of drug targets, but an atomic-level structure is available for only one GPCR, bovine rhodopsin, making it difficult to use structure-based methods to design receptor-specific drugs. We have developed the MembStruk first principles computational method for predicting the three-dimensional structure of GPCRs. In this article we validate the MembStruk procedure by comparing its predictions with the high-resolution crystal structure of bovine rhodopsin. The crystal structure of bovine rhodopsin has the second extracellular (EC-II) loop closed over the transmembrane regions by making a disulfide linkage between Cys-110 and Cys-187, but we speculate that opening this loop may play a role in the activation process of the receptor through the cysteine linkage with helix 3. Consequently we predicted two structures for bovine rhodopsin from the primary sequence (with no input from the crystal structure)—one with the EC-II loop closed as in the crystal structure, and the other with the EC-II loop open. The MembStruk-predicted structure of bovine rhodopsin with the closed EC-II loop deviates from the crystal by 2.84 Å coordinate root mean-square (CRMS) in the transmembrane region main-chain atoms. The predicted three-dimensional structures for other GPCRs can be validated only by predicting binding sites and energies for various ligands. For such predictions we developed the HierDock first principles computational method. We validate HierDock by predicting the binding site of 11-cis-retinal in the crystal structure of bovine rhodopsin. Scanning the whole protein without using any prior knowledge of the binding site, we find that the best scoring conformation in rhodopsin is 1.1 Å CRMS from the crystal structure for the ligand atoms. 
This predicted conformation has the carbonyl O only 2.82 Å from the N of Lys-296. Making this Schiff base bond and minimizing leads to a final conformation only 0.62 Å CRMS from the crystal structure. We also used HierDock to predict the binding site of 11-cis-retinal in the MembStruk-predicted structure of bovine rhodopsin (closed loop). Scanning the whole protein structure leads to a structure in which the carbonyl O is only 2.85 Å from the N of Lys-296. Making this Schiff base bond and minimizing leads to a final conformation only 2.92 Å CRMS from the crystal structure. The good agreement of the ab initio-predicted protein structures and ligand binding site with experiment validates the use of the MembStruk and HierDock first principles' methods. Since these methods are generic and applicable to any GPCR, they should be useful in predicting the structures of other GPCRs and the binding site of ligands to these proteins. PMID:15041637
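The coordinate root mean-square (CRMS) deviations quoted in the abstract above compare matched atoms in a predicted and a crystal structure. A minimal sketch of that calculation follows, assuming the two coordinate lists are already superimposed and listed in the same atom order; the function name `crms` and the toy coordinates are illustrative and are not taken from MembStruk or HierDock.

```python
import math

def crms(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) points.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the matched pair of atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example with made-up coordinates (in Å)
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(crms(predicted, crystal))
```

In practice the value depends on which atoms are included (for example, main-chain atoms only, or ligand atoms only), which is why the abstract reports separate figures for the transmembrane region and for the ligand.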
Yoshino, M; Tsutsumi, K; Kanazawa, A
2015-01-01
β-Conglycinin, a major component of seed storage protein in soybean, comprises three subunits: α, α' and β. The expression of genes for these subunits is strictly controlled during embryogenesis. The proximal promoter region up to 245 bp upstream of the transcription start site of the α subunit gene sufficiently confers spatial and temporal control of transcription in embryos. Here, the binding profile of nuclear proteins in the proximal promoter region of the α subunit gene was analysed. DNase I footprinting analysis indicated binding of proteins to the RY element and DNA regions including box I, a region conserved in cognate gene promoters. An electrophoretic mobility shift assay (EMSA) using different portions of box I as a probe revealed that multiple portions of box I bind to nuclear proteins. In addition, an EMSA using nuclear proteins extracted from embryos at different developmental stages indicated that the levels of major DNA-protein complexes on box I increased during embryo maturation. These results are consistent with the notion that box I is important for the transcriptional control of seed storage protein genes. Furthermore, the present data suggest that nuclear proteins bind to novel motifs in box I including 5'-TCAATT-3' rather than to predicted cis-regulatory elements. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
Salmon, D; Hanocq-Quertier, J; Paturiaux-Hanocq, F; Pays, A; Tebabi, P; Nolan, D P; Michel, A; Pays, E
1997-12-15
The Trypanosoma brucei transferrin (Tf) receptor is a heterodimer encoded by ESAG7 and ESAG6, two genes contained in the different polycistronic transcription units of the variant surface glycoprotein (VSG) gene. The sequence of ESAG7/6 differs slightly between different units, so that receptors with different affinities for Tf are expressed alternatively following transcriptional switching of VSG expression sites during antigenic variation of the parasite. Based on the sequence homology between pESAG7/6 and the N-terminal domain of VSGs, it can be predicted that the four blocks containing the major sequence differences between pESAG7 and pESAG6 form surface-exposed loops and generate the ligand-binding site. The exchange of a few amino acids in this region between pESAG6s encoded by different VSG units greatly increased the affinity for bovine Tf. Similar changes in other regions were ineffective, while mutations predicted to alter the VSG-like structure abolished the binding. Chimeric proteins containing the N-terminal dimerization domain of VSG and the C-terminal half of either pESAG7 or pESAG6, which contains the ligand-binding domain, can form heterodimers that bind Tf. Taken together, these data provided evidence that the T.brucei Tf receptor is structurally related to the N-terminal domain of the VSG and that the ligand-binding site corresponds to the exposed surface loops of the protein.
Hales, J. B.
2011-01-01
The process of associating items encountered over time and across variable time delays is fundamental for creating memories in daily life, such as for stories and episodes. Forming associative memory for temporally discontiguous items involves medial temporal lobe structures and additional neocortical processing regions, including prefrontal cortex, parietal lobe, and lateral occipital regions. However, most prior memory studies, using concurrently presented stimuli, have failed to examine the temporal aspect of successful associative memory formation to identify when activity in these brain regions is predictive of associative memory formation. In the current study, functional MRI data were acquired while subjects were shown pairs of sequentially presented visual images with a fixed interitem delay within pairs. This design allowed the entire time course of the trial to be analyzed, starting from onset of the first item, across the 5.5-s delay period, and through offset of the second item. Subjects then completed a postscan recognition test for the items and associations they encoded during the scan and their confidence for each. After controlling for item-memory strength, we isolated brain regions selectively involved in associative encoding. Consistent with prior findings, increased regional activity predicting subsequent associative memory success was found in anterior medial temporal lobe regions of left perirhinal and entorhinal cortices and in left prefrontal cortex and lateral occipital regions. The temporal separation within each pair, however, allowed extension of these findings by isolating the timing of regional involvement, showing that increased response in these regions occurs during binding but not during maintenance. PMID:21248058
CORECLUST: identification of the conserved CRM grammar together with prediction of gene regulation.
Nikulova, Anna A; Favorov, Alexander V; Sutormin, Roman A; Makeev, Vsevolod J; Mironov, Andrey A
2012-07-01
Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory 'grammar', or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method CORECLUST (COnservative REgulatory CLUster STructure) that predicts CRMs based on a set of positional weight matrices. Given regulatory regions of orthologous and/or co-regulated genes, CORECLUST constructs a CRM model by revealing the conserved rules that describe the relative location of binding sites. The constructed model may be consequently used for the genome-wide prediction of similar CRMs, and thus detection of co-regulated genes, and for the investigation of the regulatory grammar of the system. Compared with related methods, CORECLUST shows better performance at identification of CRMs conferring muscle-specific gene expression in vertebrates and early-developmental CRMs in Drosophila.
NASA Technical Reports Server (NTRS)
Patil, Shameekumar; Takezawa, D.; Poovaiah, B. W.
1995-01-01
Calcium, a universal second messenger, regulates diverse cellular processes in eukaryotes. Ca2+ and Ca2+/calmodulin-regulated protein phosphorylation play a pivotal role in amplifying and diversifying the action of Ca2+-mediated signals. A chimeric Ca2+/calmodulin-dependent protein kinase (CCaMK) gene with a visinin-like Ca2+-binding domain was cloned and characterized from lily. The cDNA clone contains an open reading frame coding for a protein of 520 amino acids. The predicted structure of CCaMK contains a catalytic domain followed by two regulatory domains, a calmodulin-binding domain and a visinin-like Ca2+-binding domain. The amino-terminal region of CCaMK contains all 11 conserved subdomains characteristic of serine/threonine protein kinases. The calmodulin-binding region of CCaMK has high homology (79%) to the alpha subunit of mammalian Ca2+/calmodulin-dependent protein kinase. The calmodulin-binding region is fused to a neural visinin-like domain that contains three Ca2+-binding EF-hand motifs and a biotin-binding site. The Escherichia coli-expressed protein (approx. 56 kDa) binds calmodulin in a Ca2+-dependent manner. Furthermore, 45Ca-binding assays revealed that CCaMK directly binds Ca2+. The CCaMK gene is preferentially expressed in developing anthers. Southern blot analysis revealed that CCaMK is encoded by a single gene. The structural features of the gene suggest that it has multiple regulatory controls and could play a unique role in Ca2+ signaling in plants.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.
2004-08-06
The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Measuring conservation of sequence features closely linked to function--such as binding-site clustering--makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
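The idea of searching for regions with a high density of predicted binding sites can be sketched as a sliding-window count over site positions. The example below is only an illustration of that general idea: the function name `dense_windows`, the window length and the minimum site count are hypothetical and are not the parameters used in the study.

```python
def dense_windows(site_positions, window, min_sites):
    """Return (start, count) for each window anchored at a site that holds >= min_sites sites.

    site_positions: genomic coordinates of predicted binding sites (any order).
    window: window length in bp (hypothetical parameter).
    min_sites: minimum number of sites for a window to be reported (hypothetical parameter).
    """
    positions = sorted(site_positions)
    clusters = []
    right = 0
    for left, start in enumerate(positions):
        # Advance the right edge to the first site outside [start, start + window)
        while right < len(positions) and positions[right] < start + window:
            right += 1
        count = right - left
        if count >= min_sites:
            clusters.append((start, count))
    return clusters

# Made-up site coordinates: three sites close together, the rest scattered
print(dense_windows([10, 20, 30, 500, 1000, 1010], window=50, min_sites=3))
```

Overlapping windows that pass the threshold would normally be merged into a single candidate region; that step is omitted here to keep the sketch short.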
Chen, Zhen-Yong; Guo, Xiao-Jiang; Chen, Zhong-Xu; Chen, Wei-Ying; Wang, Ji-Rui
2017-06-01
The binding sites of transcription factors (TFs) in upstream DNA regions are called transcription factor binding sites (TFBSs). TFBSs are important elements for regulating gene expression. To date, there have been few studies on the profiles of TFBSs in plants. In total, 4,873 sequences with 5' upstream regions from 8,530 wheat fl-cDNA sequences were used to predict TFBSs. We found 4,572 TFBSs for the MADS TF family, which was twice as many as for bHLH (1,951), B3 (1,951), HB superfamily (1,914), ERF (1,820), and AP2/ERF (1,725) TFs, and was approximately four times higher than the remaining TFBS types. The percentage of TFBSs and TF members showed a distinct distribution in different tissues. Overall, the distribution of TFBSs in the upstream regions of wheat fl-cDNA sequences differed significantly. Meanwhile, high frequencies of some types of TFBSs were found in specific regions in the upstream sequences. Both TFs and fl-cDNA with TFBSs predicted in the same tissues exhibited specific distribution preferences for regulating gene expression. The tissue-specific analysis of TFs and fl-cDNA with TFBSs provides useful information for functional research, and can be used to identify relationships between tissue-specific TFs and fl-cDNA with TFBSs. Moreover, the positional distribution of TFBSs indicates that some types of wheat TFBS have different positional distribution preferences in the upstream regions of genes.
Predicting a small molecule-kinase interaction map: A machine learning approach
2011-01-01
Background We present a machine learning approach to the problem of protein ligand interaction prediction. We focus on a set of binding data obtained from 113 different protein kinases and 20 inhibitors. It was attained through ATP site-dependent binding competition assays and constitutes the first available dataset of this kind. We extract information about the investigated molecules from various data sources to obtain an informative set of features. Results A Support Vector Machine (SVM) as well as a decision tree algorithm (C5/See5) is used to learn models based on the available features which in turn can be used for the classification of new kinase-inhibitor pair test instances. We evaluate our approach using different feature sets and parameter settings for the employed classifiers. Moreover, the paper introduces a new way of evaluating predictions in such a setting, where different amounts of information about the binding partners can be assumed to be available for training. Results on an external test set are also provided. Conclusions In most of the cases, the presented approach clearly outperforms the baseline methods used for comparison. Experimental results indicate that the applied machine learning methods are able to detect a signal in the data and predict binding affinity to some extent. For SVMs, the binding prediction can be improved significantly by using features that describe the active site of a kinase. For C5, besides diversity in the feature set, alignment scores of conserved regions turned out to be very useful. PMID:21708012
Fanning, T; Singer, M
1987-01-01
Recent work suggests that one or more members of the highly repeated LINE-1 (L1) DNA family found in all mammals may encode one or more proteins. Here we report the sequence of a portion of an L1 cloned from the domestic cat (Felis catus). These data permit comparison of the L1 sequences in four mammalian orders (Carnivore, Lagomorph, Rodent and Primate) and the comparison supports the suggested coding potential. In two separate, noncontiguous regions in the carboxy terminal half of the proteins predicted from the DNA sequences, there are several strongly conserved segments. In one region, these share homology with known or suspected reverse transcriptases, as described by others in rodents and primates. In the second region, closer to the carboxy terminus, the strongly conserved segments are over 90% homologous among the four orders. One of the latter segments is cysteine rich and resembles the putative metal binding domains of nucleic acid binding proteins, including those of TFIIIA and retroviruses. PMID:3562227
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thomas, P.M.; Wohllk, N.; Huang, E.
1996-09-01
Familial persistent hyperinsulinemic hypoglycemia of infancy is a disorder of glucose homeostasis and is characterized by unregulated insulin secretion and profound hypoglycemia. Loss-of-function mutations in the second nucleotide-binding fold of the sulfonylurea receptor, a subunit of the pancreatic-islet β-cell ATP-dependent potassium channel, have been demonstrated to be causative for persistent hyperinsulinemic hypoglycemia of infancy. We now describe three additional mutations in the first nucleotide-binding fold of the sulfonylurea-receptor gene. One point mutation disrupts the highly conserved Walker A motif of the first nucleotide-binding-fold region. The other two mutations occur in noncoding sequences required for RNA processing and are predicted to disrupt the normal splicing pathway of the sulfonylurea-receptor mRNA precursor. These data suggest that both nucleotide-binding-fold regions of the sulfonylurea receptor are required for normal regulation of β-cell ATP-dependent potassium channel activity and insulin secretion. 32 refs., 4 figs., 1 tab.
Shazman, Shula; Elber, Gershon; Mandel-Gutfreund, Yael
2011-09-01
Protein-nucleic acid interactions play a critical role in all steps of the gene expression pathway. Nucleic acid (NA) binding proteins interact with their partners, DNA or RNA, via distinct regions on their surface that are characterized by an ensemble of chemical, physical and geometrical properties. In this study, we introduce a novel methodology based on differential geometry, commonly used in face recognition, to characterize and predict NA binding surfaces on proteins. Applying the method to experimentally solved three-dimensional structures of proteins, we successfully distinguish double-stranded DNA (dsDNA) binding proteins from single-stranded RNA (ssRNA) binding proteins, with 83% accuracy. We show that the method is insensitive to conformational changes that occur upon binding and can be applicable for de novo protein-function prediction. Remarkably, when concentrating on the zinc finger motif, we distinguish successfully between RNA and DNA binding interfaces possessing the same binding motif even within the same protein, as demonstrated for the RNA polymerase transcription factor TFIIIA. In conclusion, we present a novel methodology to characterize protein surfaces, which can accurately tell dsDNA binding interfaces apart from ssRNA binding interfaces. The strength of our method in recognizing fine-tuned differences on NA binding interfaces makes it applicable for many other molecular recognition problems, with potential implications for drug design.
Wang, Guohua; Wang, Fang; Huang, Qian; Li, Yu; Liu, Yunlong; Wang, Yadong
2015-01-01
Transcription factors are proteins that bind to DNA sequences to regulate gene transcription. The transcription factor binding sites are short DNA sequences (5-20 bp long) specifically bound by one or more transcription factors. The identification of transcription factor binding sites and prediction of their function continue to be challenging problems in computational biology. In this study, by integrating the DNase I hypersensitive sites with known position weight matrices in the TRANSFAC database, the transcription factor binding sites in gene regulatory regions are identified. Based on the global gene expression patterns in cervical cancer HeLaS3 cells and HeLaS3-ifnα4h cells (HeLaS3 cells treated with interferon for 4 hours), we present a model-based computational approach to predict a set of transcription factors that potentially cause such differential gene expression. Significantly, 6 out of 10 predicted functional factors, including IRF, IRF-2, IRF-9, IRF-1, IRF-3 and ICSBP, belong to the interferon regulatory factor family and upregulate gene expression levels in response to the interferon treatment. Another factor, ISGF-3, is also a transcriptional activator induced by interferon alpha. The predictions of our model remain consistent when different selection criteria are used for the transcription factor binding sites. Our model demonstrated the potential to computationally identify the functional transcription factors in gene regulation.
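Scanning a sequence with a position weight matrix, the basic operation behind matching TRANSFAC-style matrices to regulatory regions, can be sketched as follows. This is a generic log-odds scan under stated assumptions, not the model described in the abstract: the count matrix, the uniform background, the pseudocount and the score threshold are all made up for illustration.

```python
import math

def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into log2-odds scores against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2((column.get(base, 0) + pseudocount) / total / background)
            for base in "ACGT"
        })
    return matrix

def scan(sequence, matrix, threshold):
    """Return (start, score) for every window whose summed log-odds score meets the threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical 4-bp motif built from ten aligned copies of "GAAA"
counts = [{"G": 10}, {"A": 10}, {"A": 10}, {"A": 10}]
pwm = log_odds_matrix(counts)
print(scan("TTGAAATT", pwm, threshold=5.0))
```

A fuller pipeline would scan both strands and restrict the search to open-chromatin regions such as DNase I hypersensitive sites; both steps are left out of this sketch.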
Mechanism of human antibody-mediated neutralization of Marburg virus.
Flyak, Andrew I; Ilinykh, Philipp A; Murin, Charles D; Garron, Tania; Shen, Xiaoli; Fusco, Marnie L; Hashiguchi, Takao; Bornholdt, Zachary A; Slaughter, James C; Sapparapu, Gopal; Klages, Curtis; Ksiazek, Thomas G; Ward, Andrew B; Saphire, Erica Ollmann; Bukreyev, Alexander; Crowe, James E
2015-02-26
The mechanisms by which neutralizing antibodies inhibit Marburg virus (MARV) are not known. We isolated a panel of neutralizing antibodies from a human MARV survivor that bind to MARV glycoprotein (GP) and compete for binding to a single major antigenic site. Remarkably, several of the antibodies also bind to Ebola virus (EBOV) GP. Single-particle EM structures of antibody-GP complexes reveal that all of the neutralizing antibodies bind to MARV GP at or near the predicted region of the receptor-binding site. The presence of the glycan cap or mucin-like domain blocks binding of neutralizing antibodies to EBOV GP, but not to MARV GP. The data suggest that MARV-neutralizing antibodies inhibit virus by binding to infectious virions at the exposed MARV receptor-binding site, revealing a mechanism of filovirus inhibition. Copyright © 2015 Elsevier Inc. All rights reserved.
Receptor-like genes in the major resistance locus of lettuce are subject to divergent selection.
Meyers, B C; Shen, K A; Rohani, P; Gaut, B S; Michelmore, R W
1998-01-01
Disease resistance genes in plants are often found in complex multigene families. The largest known cluster of disease resistance specificities in lettuce contains the RGC2 family of genes. We compared the sequences of nine full-length genomic copies of RGC2 representing the diversity in the cluster to determine the structure of genes within this family and to examine the evolution of its members. The transcribed regions range from at least 7.0 to 13.1 kb, and the cDNAs contain deduced open reading frames of approximately 5.5 kb. The predicted RGC2 proteins contain a nucleotide binding site and irregular leucine-rich repeats (LRRs) that are characteristic of resistance genes cloned from other species. Unique features of the RGC2 gene products include a bipartite LRR region with >40 repeats. At least eight members of this family are transcribed. The level of sequence diversity between family members varied in different regions of the gene. The ratio of nonsynonymous (Ka) to synonymous (Ks) nucleotide substitutions was lowest in the region encoding the nucleotide binding site, which is the presumed effector domain of the protein. The LRR-encoding region showed an alternating pattern of conservation and hypervariability. This alternating pattern of variation was also found in all comparisons within families of resistance genes cloned from other species. The Ka/Ks ratios indicate that diversifying selection has resulted in increased variation at these codons. The patterns of variation support the predicted structure of LRR regions with solvent-exposed hypervariable residues that are potentially involved in binding pathogen-derived ligands. PMID:9811792
Maleki, Soheila J.; Teuber, Suzanne S.; Cheng, Hsiaopo; Chen, Deliang; Comstock, Sarah S.; Ruan, Sanbao; Schein, Catherine H.
2011-01-01
Background Cross reactivity between peanuts and tree nuts implies that similar IgE epitopes are present in their proteins. Objective To determine whether walnut sequences similar to known peanut IgE binding sequences, according to the property distance (PD) scale implemented in the Structural Database of Allergenic Proteins (SDAP), react with IgE from sera of patients with allergy to walnut and/or peanut. Methods Patient sera were characterized by Western blotting for IgE-binding to nut protein extracts, and to peptides from walnut and peanut allergens, similar to known peanut epitopes as defined by low PD values, synthesized on membranes. Competitive ELISA was used to show that peanut and predicted walnut epitope sequences compete with purified Ara h 2 for binding to IgE in serum from a cross-reactive patient. Results Sequences from the vicilin walnut allergen Jug r 2 which had low PD values to epitopes of the peanut allergen Ara h 2, a 2s-albumin, bound IgE in sera from five patients who reacted to either walnut, peanut or both. A walnut epitope recognized by 6 patients mapped to a surface-exposed region on a model of the N-terminal pro-region of Jug r 2. A predicted walnut epitope competed for IgE binding to Ara h 2 in serum as well as the known IgE epitope from Ara h 2. Conclusions Sequences with low PD value (<8.5) to known IgE epitopes could contribute to cross-reactivity between allergens. This further validates the PD scoring method for predicting cross-reactive epitopes in allergens. PMID:21883278
Qin, Pengmin; Duncan, Niall W; Wiebking, Christine; Gravel, Paul; Lyttelton, Oliver; Hayes, Dave J; Verhaeghe, Jeroen; Kostikov, Alexey; Schirrmacher, Ralf; Reader, Andrew J; Northoff, Georg
2012-01-01
Recent imaging studies have demonstrated that levels of resting γ-aminobutyric acid (GABA) in the visual cortex predict the degree of stimulus-induced activity in the same region. These studies have used the presentation of discrete visual stimuli; the change from closed eyes to open also represents a simple visual stimulus, however, and has been shown to induce changes in local brain activity and in functional connectivity between regions. We thus aimed to investigate the role of the GABA system, specifically GABA(A) receptors, in the changes in brain activity between the eyes closed (EC) and eyes open (EO) state in order to provide detail at the receptor level to complement previous studies of GABA concentrations. We conducted an fMRI study involving two different modes of the change from EC to EO: an EO and EC block design, allowing the modeling of the haemodynamic response, followed by longer periods of EC and EO to allow the measuring of functional connectivity. The same subjects also underwent [18F]flumazenil PET to measure GABA(A) receptor binding potentials. It was demonstrated that the local-to-global ratio of GABA(A) receptor binding potential in the visual cortex predicted the degree of changes in neural activity from EC to EO. This same relationship was also shown in the auditory cortex. Furthermore, the local-to-global ratio of GABA(A) receptor binding potential in the visual cortex also predicted the change in functional connectivity between the visual and auditory cortex from EC to EO. These findings contribute to our understanding of the role of GABA(A) receptors in stimulus-induced neural activity in local regions and in inter-regional functional connectivity.
NASA Astrophysics Data System (ADS)
Rhea, James R.; Young, Thomas C.
1987-10-01
The proton binding characteristics of humic acids extracted from the sediments of Cranberry Pond, an acidic water body located in the Adirondack Mountain region of New York State, were explored by the application of a multiligand distribution model. The model characterizes a class of proton binding sites by mean log K values and the standard deviations of log K values about the mean. Mean log K values and their relative abundances were determined directly from experimental titration data. The model accurately predicts the binding of protons by the humic acids for pH values in the range 3.5 to 10.0.
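A distribution model of this kind can be sketched numerically. The example below assumes, as an illustration only, that each class of sites has log K values following a normal distribution with the given mean and standard deviation, and that K is a proton association constant, so a site with a given log K is protonated with probability 1 / (1 + 10^(pH − log K)). The function name, the integration range of ±6 standard deviations and the numeric values are hypothetical and are not taken from the study.

```python
import math

def fraction_protonated(pH, mean_log_k, sd_log_k, steps=2001):
    """Average protonated fraction for one site class with normally distributed log K.

    Assumption: log K ~ Normal(mean_log_k, sd_log_k), K being a proton association constant.
    """
    if sd_log_k <= 0:
        # Degenerate case: a single discrete site type
        return 1.0 / (1.0 + 10 ** (pH - mean_log_k))
    lo = mean_log_k - 6 * sd_log_k
    width = 12 * sd_log_k / (steps - 1)
    weighted = 0.0
    total_weight = 0.0
    for i in range(steps):
        log_k = lo + i * width
        # Unnormalized Gaussian weight for this log K value
        weight = math.exp(-0.5 * ((log_k - mean_log_k) / sd_log_k) ** 2)
        weighted += weight / (1.0 + 10 ** (pH - log_k))
        total_weight += weight
    return weighted / total_weight

# Made-up parameters: one site class with mean log K = 4.5 and standard deviation 1.0
for pH in (3.5, 4.5, 10.0):
    print(pH, fraction_protonated(pH, mean_log_k=4.5, sd_log_k=1.0))
```

With several site classes, the total bound-proton fraction would be the abundance-weighted sum of the per-class values, which is where the relative abundances mentioned in the abstract would enter.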
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rhea, J.R.; Young, T.C.
1987-01-01
The proton binding characteristics of humic acids extracted from the sediments of Cranberry Pond, an acidic water body located in the Adirondack Mountain region of New York State, were explored by the application of a multiligand distribution model. The model characterizes a class of proton binding sites by mean log K values and the standard deviations of log K values about the mean. Mean log K values and their relative abundances were determined directly from experimental titration data. The model accurately predicts the binding of protons by the humic acids for pH values in the range 3.5 to 10.0.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Polonskaya, Zhanna; Benham, Craig J.; Hearing, Janet
The minimal replicator of the Epstein-Barr virus (EBV) latent cycle origin of DNA replication oriP is composed of two binding sites for the Epstein-Barr virus nuclear antigen-1 (EBNA-1) and flanking inverted repeats that bind the telomere repeat binding factor TRF2. Although not required for minimal replicator activity, additional binding sites for EBNA-1 and TRF2 and one or more auxiliary elements located to the right of the EBNA-1/TRF2 sites are required for the efficient replication of oriP plasmids. Another region of oriP that is predicted to be destabilized by DNA supercoiling is shown here to be an important functional component of oriP. The ability of DNA fragments of unrelated sequence and possessing supercoiled-induced DNA duplex destabilized (SIDD) structures, but not fragments characterized by helically stable DNA, to substitute for this component of oriP demonstrates a role for the SIDD region in the initiation of oriP-plasmid DNA replication.
Cheng, Chia-Yang; Chu, Chia-Han; Hsu, Hung-Wei; Hsu, Fang-Rong; Tang, Chung Yi; Wang, Wen-Ching; Kung, Hsing-Jien; Chang, Pei-Ching
2014-01-01
Post-translational modification (PTM) of transcription factors and chromatin remodelling proteins is recognized as a major mechanism by which transcriptional regulation occurs. Chromatin immunoprecipitation (ChIP) in combination with high-throughput sequencing (ChIP-seq) is being applied as a gold standard when studying the genome-wide binding sites of transcription factors (TFs). This has greatly improved our understanding of protein-DNA interactions on a genome-wide scale. However, current ChIP-seq peak calling tools are not sufficiently sensitive and are unable to simultaneously identify post-translationally modified TFs based on ChIP-seq analysis; this is largely due to the widespread presence of multiple modified TFs. Using SUMO-1 modification as an example, we describe here an improved approach that allows the simultaneous identification of the particular genomic binding regions of all TFs with SUMO-1 modification. Traditional peak calling methods are inadequate when identifying multiple TF binding sites that involve long genomic regions, and therefore we designed a ChIP-seq processing pipeline for the detection of peaks via a combinatorial fusion method. Then, we annotated the peaks with known transcription factor binding sites (TFBS) using the Transfac Matrix Database (v7.0), which predicts potential SUMOylated TFs. Next, the peak calling result was further analyzed based on promoter proximity, TFBS annotation and a literature review, and was validated by ChIP-real-time quantitative PCR (qPCR) and ChIP-reChIP real-time qPCR. The results show clearly that SUMOylated TFs can be pinpointed using our pipeline. A methodology is presented that analyzes SUMO-1 ChIP-seq patterns and predicts related TFs. Our analysis uses three peak calling tools. The fusion of these different tools increases the precision of the peak calling results. The TFBS annotation method is able to predict potential SUMOylated TFs.
Here, we offer a new approach that enhances ChIP-seq data analysis and allows the identification of multiple SUMOylated TF binding sites simultaneously, which can then be utilized for other functional PTM binding site prediction in future.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sprague, E.R.; Wang, C.; Baker, D.
Herpes simplex virus type-1 expresses a heterodimeric Fc receptor, gE-gI, on the surfaces of virions and infected cells that binds the Fc region of host immunoglobulin G and is implicated in the cell-to-cell spread of virus. gE-gI binds immunoglobulin G at the basic pH of the cell surface and releases it at the acidic pH of lysosomes, consistent with a role in facilitating the degradation of antiviral antibodies. Here we identify the C-terminal domain of the gE ectodomain (CgE) as the minimal Fc-binding domain and present a 1.78-Å CgE structure. A 5-Å gE-gI/Fc crystal structure, which was independently verified by a theoretical prediction method, reveals that CgE binds Fc at the CH2-CH3 interface, the binding site for several mammalian and bacterial Fc-binding proteins. The structure identifies interface histidines that may confer pH-dependent binding and regions of CgE implicated in cell-to-cell spread of virus. The ternary organization of the gE-gI/Fc complex is compatible with antibody bipolar bridging, which can interfere with the antiviral immune response.
Slayton, Mark; Hossain, Tanvir; Biegalke, Bonita J
2018-05-01
The human cytomegalovirus (HCMV) UL34 gene encodes sequence-specific DNA-binding proteins (pUL34) which are required for viral replication. Interactions of pUL34 with DNA binding sites repress transcription of two viral immune evasion genes, US3 and US9. Twelve additional predicted pUL34-binding sites are present in the HCMV genome (strain AD169), with three binding sites concentrated near the HCMV origin of lytic replication (oriLyt). We used ChIP-seq analysis of pUL34-DNA interactions to confirm that pUL34 binds to the oriLyt region during infection. Mutagenesis of the UL34-binding sites in an oriLyt-containing plasmid significantly reduced viral-mediated oriLyt-dependent DNA replication. Mutagenesis of these sites in the HCMV genome reduced the replication efficiencies of the resulting viruses. Protein-protein interaction analyses demonstrated that pUL34 interacts with the viral proteins IE2, UL44, and UL84, which are essential for viral DNA replication, suggesting that pUL34-DNA interactions in the oriLyt region are involved in the DNA replication cascade. Copyright © 2018 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Iryani, I.; Amelia, F.; Iswendi, I.
2018-04-01
Cervical cancer triggered by human papillomavirus (HPV) infection is the second leading cause of death among women worldwide. The binding site of the E1-E2 protein of HPV 16 is not yet known from a 3D structure, so in this study we examine the structure of the E1-E2 protein from human papillomavirus type 16 and identify its potential binding sites using biphenylsulfonacetic acid as an inhibitor. SWISS-MODEL was used for 3D structure prediction, and PDB entries 2V9P (E1 protein) and 2NNU (E2 protein), having 52.32% and 100% identity respectively, were selected as templates. For the 3D model structures developed for E1 and E2, the proportions in the core and allowed regions were 99.2% and 99.5%, respectively. The ligand binding sites were predicted using the online server MetaPocket 2.0, and MOE 2009.10 was used for docking. The E1 and E2 proteins of HPV-16 have three potential binding sites that can interact with inhibitors. Docking of biphenylsulfonacetic acid at these binding sites shows that the ligand interacts with the protein through hydrogen bonds to Lys 403, Arg 410 and His 551 in the first pocket, Tyr 32 and Leu 99 in the second pocket, and Lys 558 and Lys 517 in the third pocket.
Kawai, Ryoko; Araki, Mitsugu; Yoshimura, Masashi; Kamiya, Narutoshi; Ono, Masahiro; Saji, Hideo; Okuno, Yasushi
2018-05-16
Development of new diagnostic imaging probes for Alzheimer's disease, such as positron emission tomography (PET) and single photon emission computed tomography (SPECT) probes, has been strongly desired. In this study, we investigated the most accessible amyloid β (Aβ) binding site of [(123)I]IMPY, a Thioflavin-T-derived SPECT probe, using experimental and computational methods. First, we performed a competitive inhibition assay with Orange-G, which recognizes the KLVFFA region in Aβ fibrils, suggesting that IMPY and Orange-G bind to different sites in Aβ fibrils. Next, we precisely predicted the IMPY binding site on a multiple-protofilament Aβ fibril model using computational approaches, consisting of molecular dynamics and docking simulations. We generated possible IMPY-binding structures using docking simulations to identify candidates for probe-binding sites. The binding free energy of IMPY with the Aβ fibril was calculated by a free energy simulation method, MP-CAFEE. These computational results suggest that IMPY preferentially binds to an interfacial pocket located between two protofilaments and is stabilized mainly through hydrophobic interactions. Finally, our computational approach was validated by comparing it with the experimental results. The present study demonstrates the possibility of computational approaches to screen new PET/SPECT probes for Aβ imaging.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.
2004-08-06
Background: The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. Results: We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Conclusions: Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
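Searching for regions with a high density of predicted binding sites can be sketched as a sliding-window scan over site coordinates. The following is only an illustration of that general idea, not the screen used in the study: the function name, the window length and the minimum site count are hypothetical, and real screens typically also weigh site quality and factor identity.

```python
def find_clusters(site_positions, window, min_sites):
    """Return merged (start, end) intervals containing at least
    `min_sites` predicted sites within `window` base pairs.

    `site_positions` holds start coordinates of predicted binding
    sites on one sequence; all thresholds here are illustrative.
    """
    pos = sorted(site_positions)
    clusters = []
    right = 0
    for left in range(len(pos)):
        # Extend the right edge while sites still fall inside the window
        while right < len(pos) and pos[right] - pos[left] < window:
            right += 1
        if right - left >= min_sites:
            start, end = pos[left], pos[right - 1]
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous dense window: merge the two
                prev_start, prev_end = clusters[-1]
                clusters[-1] = (prev_start, max(prev_end, end))
            else:
                clusters.append((start, end))
    return clusters

sites = [100, 150, 200, 260, 5000, 9000, 9050, 9100]
print(find_clusters(sites, window=300, min_sites=3))
```

Running the same scan on the orthologous sequence of a second species and comparing the resulting intervals is one simple way to ask whether clustering, rather than primary sequence, is conserved.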
ProMateus—an open research approach to protein-binding sites analysis
Neuvirth, Hani; Heinemann, Uri; Birnbaum, David; Tishby, Naftali; Schreiber, Gideon
2007-01-01
The development of bioinformatic tools by individual labs results in the abundance of parallel programs for the same task. For example, identification of binding site regions between interacting proteins is done using: ProMate, WHISCY, PPI-Pred, PINUP and others. All servers first identify unique properties of binding sites and then incorporate them into a predictor. Obviously, the resulting prediction would improve if the most suitable parameters from each of those predictors were incorporated into one server. However, because of the variation in methods and databases, this is currently not feasible. Here, the protein-binding site prediction server is extended into a general protein-binding sites research tool, ProMateus. This web tool, based on ProMate's infrastructure, enables the easy exploration and incorporation of new features and databases by the user, providing an evaluation of the benefit of individual features and their combination within a set framework. This transforms the individual research into a community exercise, bringing out the best from all users for optimized predictions. The analysis is demonstrated on a database of protein-protein and protein-DNA interactions. This approach is basically different from that used in generating meta-servers. The implications of the open-research approach are discussed. ProMateus is available at http://bip.weizmann.ac.il/promate. PMID:17488838
Soloff, Paul H; Chiappetta, Laurel; Mason, Neale Scott; Becker, Carl; Price, Julie C
2014-06-30
Impulsivity and aggressiveness are personality traits associated with a vulnerability to suicidal behavior. Behavioral expression of these traits differs by gender and has been related to central serotonergic function. We assessed the relationships between serotonin-2A receptor function, gender, and personality traits in borderline personality disorder (BPD), a disorder characterized by impulsive-aggression and recurrent suicidal behavior. Participants, who included 33 BPD patients and 27 healthy controls (HC), were assessed for Axis I and II disorders with the Structured Clinical Interview for DSM-IV and the International Personality Disorders Examination, and with the Diagnostic Interview for Borderline Patients-Revised for BPD. Depressed mood, impulsivity, aggression, and temperament were assessed with standardized measures. Positron emission tomography with [(18)F]altanserin as ligand and arterial blood sampling was used to determine the binding potentials (BPND) of serotonin-2A receptors in 11 regions of interest. Data were analyzed using Logan graphical analysis, controlling for age and non-specific binding. Among BPD subjects, aggression, Cluster B co-morbidity, antisocial PD, and childhood abuse were each related to altanserin binding. BPND values predicted impulsivity and aggression in BPD females (but not BPD males), and in HC males (but not HC females.) Altanserin binding was greater in BPD females than males in every contrast, but it did not discriminate suicide attempters from non-attempters. Region-specific differences in serotonin-2A receptor binding related to diagnosis and gender predicted clinical expression of aggression and impulsivity. Vulnerability to suicidal behavior in BPD may be related to serotonin-2A binding through expression of personality risk factors. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Majoros, William H; Ohler, Uwe
2010-12-16
The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.
Analysis of functional importance of binding sites in the Drosophila gap gene network model.
Kozlov, Konstantin; Gursky, Vitaly V; Kulakovskiy, Ivan V; Dymova, Arina; Samsonova, Maria
2015-01-01
The statistical thermodynamics based approach provides a promising framework for construction of the genotype-phenotype map in many biological systems. Among important aspects of a good model connecting the DNA sequence information with that of a molecular phenotype (gene expression) is the selection of regulatory interactions and relevant transcription factor binding sites. As the model may predict different levels of the functional importance of specific binding sites in different genomic and regulatory contexts, it is essential to formulate and study such models under different modeling assumptions. We elaborate a two-layer model for the Drosophila gap gene network and include in the model a combined set of transcription factor binding sites and concentration-dependent regulatory interaction between gap genes hunchback and Kruppel. We show that the new variants of the model are more consistent in terms of gene expression predictions for various genetic constructs in comparison to previous work. We quantify the functional importance of binding sites by calculating their impact on gene expression in the model and calculate how these impacts correlate across all sites under different modeling assumptions. The assumption about the dual interaction between hb and Kr leads to the most consistent modeling results, but, on the other hand, may obscure the existence of indirect interactions between binding sites in regulatory regions of distinct genes. The analysis confirms the previously formulated regulation concept of many weak binding sites working in concert. The model predicts a more or less uniform distribution of functionally important binding sites over the sets of experimentally characterized regulatory modules and other open chromatin domains.
HotRegion: a database of predicted hot spot clusters.
Cukuroglu, Engin; Gursoy, Attila; Keskin, Ozlem
2012-01-01
Hot spots are energetically important residues at protein interfaces and they are not randomly distributed across the interface but rather clustered. These clustered hot spots form hot regions. Hot regions are important for the stability of protein complexes, as well as providing specificity to binding sites. We propose a database called HotRegion, which provides the hot region information of the interfaces by using predicted hot spot residues, and structural properties of these interface residues such as pair potentials of interface residues, accessible surface area (ASA) and relative ASA values of interface residues of both monomer and complex forms of proteins. Also, the 3D visualization of the interface and interactions among hot spot residues are provided. HotRegion is accessible at http://prism.ccbb.ku.edu.tr/hotregion.
kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets
Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.
2013-01-01
Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
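The k-mer features mentioned above can be illustrated with a small counting routine. This is a minimal sketch of how a sequence might be turned into a fixed-length k-mer frequency vector, not the kmer-SVM server's code; the function names are hypothetical, and details such as reverse-complement handling and the SVM kernel are deliberately left out.

```python
from itertools import product

def kmer_counts(seq, k):
    """Count every overlapping k-mer over the alphabet ACGT.

    Windows containing other characters (for example N) are skipped.
    """
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
    return counts

def feature_vector(seq, k):
    """Normalised k-mer frequencies in a fixed (sorted) k-mer order."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    return [counts[key] / total if total else 0.0 for key in sorted(counts)]

counts = kmer_counts("ACGTAC", 2)
print({kmer: n for kmer, n in counts.items() if n})
print(len(feature_vector("ACGTAC", 2)))
```

Vectors of this kind, computed for bound and unbound regions, could then be passed to any SVM implementation; k-mers with large positive or negative weights would be the candidate predictive sequence features.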
kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets.
Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S; Beer, Michael A
2013-07-01
Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167-80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org.
Seong, Ki Moon; Park, Hweon; Kim, Seong Jung; Ha, Hyo Nam; Lee, Jae Yung; Kim, Joon
2007-06-01
A yeast transcriptional activator, Gcn4p, induces the expression of genes that are involved in amino acid and purine biosynthetic pathways under amino acid starvation. Gcn4p has an acidic activation domain in the central region and a bZIP domain in the C-terminus that is divided into the DNA-binding motif and dimerization leucine zipper motif. In order to identify amino acids in the DNA-binding motif of Gcn4p which are involved in transcriptional activation, we constructed mutant libraries in the DNA-binding motif through an innovative application of random mutagenesis. A mutant library made from oligonucleotides that were mutated randomly according to the Poisson distribution showed an actual mutation frequency in good agreement with expected values. This method could save the time and effort needed to create a mutant library with a predictable mutation frequency. Based on the studies using the mutant libraries constructed by the new method, the specific residues of the DNA-binding domain in Gcn4p appear to be involved in the transcriptional activities on a conserved binding site.
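The expected mutation frequencies referred to above can be worked out directly from the Poisson distribution. The sketch below assumes that each position of the oligonucleotide mutates independently at the same small rate, so that the number of mutations per clone is approximately Poisson with mean equal to length times rate; the oligonucleotide length and per-position rate are hypothetical values, not taken from the study.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson mean of lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def expected_mutation_spectrum(n_positions, per_site_rate, max_k):
    """Expected fraction of clones carrying 0..max_k mutations.

    Assumes independent mutation at every position with the same
    rate (an illustrative simplification).
    """
    lam = n_positions * per_site_rate
    return [poisson_pmf(k, lam) for k in range(max_k + 1)]

# Hypothetical library: 30 randomised positions, 5% mutation rate each
spectrum = expected_mutation_spectrum(30, 0.05, 5)
for k, fraction in enumerate(spectrum):
    print(k, round(fraction, 3))
```

Comparing such expected fractions with the mutation counts observed in sequenced clones is one way to check that a library has the intended mutation load.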
Bubis, José; Martínez, Juan Carlos; Calabokis, Maritza; Ferreira, Joilyneth; Sanz-Rodríguez, Carlos E; Navas, Victoria; Escalona, José Leonardo; Guo, Yurong; Taylor, Susan S
2018-03-01
The full gene sequence encoding for the Trypanosoma equiperdum ortholog of the cAMP-dependent protein kinase (PKA) regulatory (R) subunits was cloned. A poly-His tagged construct was generated [TeqR-like(His)8], and the protein was expressed in bacteria and purified to homogeneity. The size of the purified TeqR-like(His)8 was determined to be ∼57,000 Da by molecular exclusion chromatography indicating that the parasite protein is a monomer. Limited proteolysis with various proteases showed that the T. equiperdum R-like protein possesses a hinge region very susceptible to proteolysis. The recombinant TeqR-like(His)8 did not bind either [(3)H]cAMP or [(3)H]cGMP up to concentrations of 0.40 and 0.65 μM, respectively, and neither the parasite protein nor its proteolytically generated carboxy-terminal large fragments were capable of binding to a cAMP-Sepharose affinity column. Bioinformatics analyses predicted that the carboxy-terminal region of the trypanosomal R-like protein appears to fold similarly to the analogous region of all known PKA R subunits. However, the protein amino-terminal portion seems to be unrelated and shows homology with proteins that contained Leu-rich repeats, a folding motif that is particularly appropriate for protein-protein interactions. In addition, the three-dimensional structure of the T. equiperdum protein was modeled using the crystal structure of the bovine PKA RIα subunit as template. Molecular docking experiments predicted critical changes in the environment of the two putative nucleotide binding clefts of the parasite protein, and the resulting binding energy differences support the lack of cyclic nucleotide binding in the trypanosomal R-like protein. Copyright © 2017 Elsevier B.V. and Société Française de Biochimie et Biologie Moléculaire (SFBBM). All rights reserved.
Qin, Lingyun; Liu, Huili; Chen, Rong; Zhou, Jingjing; Cheng, Xiyao; Chen, Yao; Huang, Yongqi; Su, Zhengding
2017-11-07
The oncoprotein MdmX (mouse double minute X) is highly homologous to Mdm2 (mouse double minute 2) in terms of their amino acid sequences and three-dimensional conformations, but Mdm2 inhibitors exhibit very weak affinity for MdmX, providing an excellent model for exploring how protein conformation distinguishes and alters inhibitor binding. The intrinsic conformation flexibility of proteins plays pivotal roles in determining and predicting the binding properties and the design of inhibitors. Although the molecular dynamics simulation approach enables us to understand protein-ligand interactions, the mechanism underlying how a flexible binding pocket adapts an inhibitor has been less explored experimentally. In this work, we have investigated how the intrinsic flexible regions of the N-terminal domain of MdmX (N-MdmX) affect the affinity of the Mdm2 inhibitor nutlin-3a using protein engineering. Guided by heteronuclear nuclear Overhauser effect measurements, we identified the flexible regions that affect inhibitor binding affinity around the ligand-binding pocket on N-MdmX. A disulfide engineering mutant, N-MdmX(C25-C110/C76-C88), which incorporated two staples to rigidify the ligand-binding pocket, allowed an affinity for nutlin-3a higher than that of wild-type N-MdmX (Kd ∼ 0.48 μM vs Kd ∼ 20.3 μM). Therefore, this mutant provides not only an effective protein model for screening and designing of MdmX inhibitors but also a valuable clue for enhancing the intermolecular interactions of the pharmacophores of a ligand with pronounced flexible regions. In addition, our results revealed an allosteric ligand-binding mechanism of N-MdmX in which the ligand initially interacts with a compact core, followed by augmenting intermolecular interactions with intrinsic flexible regions. This strategy should also be applicable to many other protein targets to accelerate drug discovery.
Shazman, Shula; Elber, Gershon; Mandel-Gutfreund, Yael
2011-01-01
Protein-nucleic acid interactions play a critical role in all steps of the gene expression pathway. Nucleic acid (NA) binding proteins interact with their partners, DNA or RNA, via distinct regions on their surface that are characterized by an ensemble of chemical, physical and geometrical properties. In this study, we introduce a novel methodology based on differential geometry, commonly used in face recognition, to characterize and predict NA binding surfaces on proteins. Applying the method to experimentally solved three-dimensional structures of proteins, we successfully classify double-stranded DNA (dsDNA) from single-stranded RNA (ssRNA) binding proteins, with 83% accuracy. We show that the method is insensitive to conformational changes that occur upon binding and can be applicable for de novo protein-function prediction. Remarkably, when concentrating on the zinc finger motif, we distinguish successfully between RNA and DNA binding interfaces possessing the same binding motif even within the same protein, as demonstrated for the RNA polymerase transcription-factor, TFIIIA. In conclusion, we present a novel methodology to characterize protein surfaces, which can accurately tell apart dsDNA from ssRNA binding interfaces. The strength of our method in recognizing fine-tuned differences on NA binding interfaces makes it applicable for many other molecular recognition problems, with potential implications for drug design. PMID:21693557
Root-Bernstein, Robert; Root-Bernstein, Meredith
2016-05-21
We have proposed that the ribosome may represent a missing link between prebiotic chemistries and the first cells. One of the predictions that follows from this hypothesis, which we test here, is that ribosomal RNA (rRNA) must have encoded the proteins necessary for ribosomal function. In other words, the rRNA also functioned pre-biotically as mRNA. Since these ribosome-binding proteins (rb-proteins) must bind to the rRNA, but the rRNA also functioned as mRNA, it follows that rb-proteins should bind to their own mRNA as well. This hypothesis can be contrasted to a "null" hypothesis in which rb-proteins evolved independently of the rRNA sequences and therefore there should be no necessary similarity between the rRNA to which rb-proteins bind and the mRNA that encodes the rb-protein. Five types of evidence reported here support the plausibility of the hypothesis that the mRNA encoding rb-proteins evolved from rRNA: (1) the ubiquity of rb-protein binding to their own mRNAs and autogenous control of their own translation; (2) the higher-than-expected incidence of Arginine-rich modules associated with RNA binding that occurs in rRNA-encoded proteins; (3) the fact that rRNA-binding regions of rb-proteins are homologous to their mRNA binding regions; (4) the higher than expected incidence of rb-protein sequences encoded in rRNA that are of a high degree of homology to their mRNA as compared with a random selection of other proteins; and (5) rRNA in modern prokaryotes and eukaryotes encodes functional proteins. None of these results can be explained by the null hypothesis that assumes independent evolution of rRNA and the mRNAs encoding ribosomal proteins. Also noteworthy is that very few proteins bind their own mRNAs that are not associated with ribosome function. 
Further tests of the hypothesis are suggested: (1) experimental testing of whether rRNA-encoded proteins bind to rRNA at their coding sites; (2) whether tRNA synthetases, which are also known to bind to their own mRNAs, are encoded by the tRNA sequences themselves; (3) and the prediction that archaeal and prokaryotic (DNA-based) genomes were built around rRNA "genes" so that rRNA-related sequences will be found to make up an unexpectedly high proportion of these genomes. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Chappell, James D.; Duong, Joy L.; Wright, Benjamin W.; Dermody, Terence S.
2000-01-01
The reovirus attachment protein, σ1, is responsible for strain-specific patterns of viral tropism in the murine central nervous system and receptor binding on cultured cells. The σ1 protein consists of a fibrous tail domain proximal to the virion surface and a virion-distal globular head domain. To better understand mechanisms of reovirus attachment to cells, we conducted studies to identify the region of σ1 that binds cell surface carbohydrate. Chimeric and truncated σ1 proteins derived from prototype reovirus strains type 1 Lang (T1L) and type 3 Dearing (T3D) were expressed in insect cells by using a baculovirus vector. Assessment of expressed protein susceptibility to proteolytic cleavage, binding to anti-σ1 antibodies, and oligomerization indicates that the chimeric and truncated σ1 proteins are properly folded. To assess carbohydrate binding, recombinant σ1 proteins were tested for the capacity to agglutinate mammalian erythrocytes and to bind sialic acid presented on glycophorin, the cell surface molecule bound by type 3 reovirus on human erythrocytes. Using a panel of two wild-type and ten chimeric and truncated σ1 proteins, the sialic acid-binding domain of type 3 σ1 was mapped to a region of sequence proposed to form the more amino terminal of two predicted β-sheet structures in the tail. This unit corresponds to morphologic region T(iii) observed in computer-processed electron micrographs of σ1 protein purified from virions. In contrast, the homologous region of T1L σ1 sequence was not implicated in carbohydrate binding; rather, sequences in the distal portion of the tail known as the neck were required. Results of these studies demonstrate that a functional receptor-binding domain, which uses sialic acid as its ligand, is contained within morphologic region T(iii) of the type 3 σ1 tail. Furthermore, our findings indicate that T1L and T3D σ1 proteins contain different arrangements of receptor-binding domains. PMID:10954547
Chappell, J D; Duong, J L; Wright, B W; Dermody, T S
2000-09-01
The reovirus attachment protein, sigma1, is responsible for strain-specific patterns of viral tropism in the murine central nervous system and receptor binding on cultured cells. The sigma1 protein consists of a fibrous tail domain proximal to the virion surface and a virion-distal globular head domain. To better understand mechanisms of reovirus attachment to cells, we conducted studies to identify the region of sigma1 that binds cell surface carbohydrate. Chimeric and truncated sigma1 proteins derived from prototype reovirus strains type 1 Lang (T1L) and type 3 Dearing (T3D) were expressed in insect cells by using a baculovirus vector. Assessment of expressed protein susceptibility to proteolytic cleavage, binding to anti-sigma1 antibodies, and oligomerization indicates that the chimeric and truncated sigma1 proteins are properly folded. To assess carbohydrate binding, recombinant sigma1 proteins were tested for the capacity to agglutinate mammalian erythrocytes and to bind sialic acid presented on glycophorin, the cell surface molecule bound by type 3 reovirus on human erythrocytes. Using a panel of two wild-type and ten chimeric and truncated sigma1 proteins, the sialic acid-binding domain of type 3 sigma1 was mapped to a region of sequence proposed to form the more amino terminal of two predicted beta-sheet structures in the tail. This unit corresponds to morphologic region T(iii) observed in computer-processed electron micrographs of sigma1 protein purified from virions. In contrast, the homologous region of T1L sigma1 sequence was not implicated in carbohydrate binding; rather, sequences in the distal portion of the tail known as the neck were required. Results of these studies demonstrate that a functional receptor-binding domain, which uses sialic acid as its ligand, is contained within morphologic region T(iii) of the type 3 sigma1 tail. 
Furthermore, our findings indicate that T1L and T3D sigma1 proteins contain different arrangements of receptor-binding domains.
Amaral, Catarina; Pimentel, Catarina; Matos, Rute G; Arraiano, Cecília M; Matzapetakis, Manolis; Rodrigues-Pousada, Claudina
2013-01-01
In Saccharomyces cerevisiae, the transcription factor Yap8 is a key determinant in arsenic stress response. Contrary to Yap1, another basic region-leucine zipper (bZIP) yeast regulator, Yap8 has a very restricted DNA-binding specificity and only orchestrates the expression of ACR2 and ACR3 genes. In the DNA-binding basic region, Yap8 has three distinct amino acid residues, Leu26, Ser29 and Asn31, at sites of highly conserved positions in the other Yap family of transcriptional regulators and Pap1 of Schizosaccharomyces pombe. To evaluate whether these residues are relevant to Yap8 specificity, we first built a homology model of the complex Yap8bZIP-DNA based on Pap1-DNA crystal structure. Several Yap8 mutants were then generated in order to confirm the contribution of the residues predicted to interact with DNA. Using bioinformatics analysis together with in vivo and in vitro approaches, we have identified several conserved residues critical for Yap8-DNA binding. Moreover, our data suggest that Leu26 is required for Yap8 binding to DNA and that this residue, together with Asn31, hinders Yap1 response element recognition by Yap8, thus narrowing its DNA-binding specificity. Furthermore, our results point to a role of these two amino acids in the stability of the Yap8-DNA complex.
Transcription Factor Map Alignment of Promoter Regions
Blanco, Enrique; Messeguer, Xavier; Smith, Temple F; Guigó, Roderic
2006-01-01
We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments. PMID:16733547
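The idea of aligning sequences of factor labels rather than nucleotides can be illustrated with a minimal sketch. Note the assumptions: the abstract says the authors adapted a restriction-map alignment algorithm, which is not reproduced here; the code below swaps in a plain Needleman-Wunsch global alignment over label lists, ignores site positions and prediction scores, and uses hypothetical scoring parameters and hypothetical example maps.

```python
def align_labels(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of two lists of binding-factor labels.

    Simplified stand-in (plain Needleman-Wunsch), not the published
    TF-map algorithm; match/mismatch/gap values are illustrative.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two labels
                score[i - 1][j] + gap,       # label of a against a gap
                score[i][j - 1] + gap,       # label of b against a gap
            )
    return score[n][m]


# Hypothetical promoter maps, written as ordered lists of factor labels.
promoter_1 = ["SP1", "TBP", "CREB", "NFKB"]
promoter_2 = ["SP1", "CREB", "NFKB"]
print(align_labels(promoter_1, promoter_2))
```

Two promoters with no detectable nucleotide similarity can still score well here if their predicted sites carry the same labels in the same order, which is the property the abstract relies on.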
NASA Technical Reports Server (NTRS)
Moore, C. P.; Rodney, G.; Zhang, J. Z.; Santacruz-Toloza, L.; Strasburg, G.; Hamilton, S. L.
1999-01-01
The skeletal muscle Ca2+ release channel (RYR1) is regulated by calmodulin in both its Ca2+-free (apocalmodulin) and Ca2+-bound (Ca2+ calmodulin) states. Apocalmodulin is an activator of the channel, and Ca2+ calmodulin is an inhibitor of the channel. Both apocalmodulin and Ca2+ calmodulin binding sites on RYR1 are destroyed by a mild tryptic digestion of the sarcoplasmic reticulum membranes, but calmodulin (either form), bound to RYR1 prior to tryptic digestion, protects both the apocalmodulin and Ca2+ calmodulin sites from tryptic destruction. The protected sites are after arginines 3630 and 3637 on RYR1. These studies suggest that both Ca2+ calmodulin and apocalmodulin bind to the same or overlapping regions on RYR1 and block access of trypsin to sites at amino acids 3630 and 3637. This sequence is part of a predicted Ca2+ CaM binding site of amino acids 3614-3642 [Takeshima, H., et al. (1989) Nature 339, 439-445].
Schwartz, J A; Mizukami, H
1991-06-01
A novel arrangement is proposed for the association of the 90 kDa heat shock protein (hsp 90) dimer and the human estrogen receptor (hER) monomer. Secondary structure analyses of the hsp 90 molecule reveal the presence of a cysteine-containing, leucine-rich, heptad repeat, which we refer to as region C. Similar analyses on the hER, at its hormone binding domain (HBD), have indicated the presence of a central subdomain bordered by 2 alpha-helical flanking segments which also display the heptad substructure. Due to its predicted potential for conformational change (1) we refer to this central subdomain as the Helix Conversion Unit or HCU. It contains an HX5C peptide and shares significant homology with the metal-binding domain of a gag-encoded HIV-LAV protein (2). We predict that, by virtue of its presence in duplicate, region C may be capable of simultaneous leucine zipper-like pairing with the hER at its flanking helices, as well as the formation of a shared CCHC-box-type metal binding link with the same hER at the putative HCU which lies in between.
Conservation of Matrix Attachment Region-Binding Filament-Like Protein 1 among Higher Plants
Harder, Patricia A.; Silverstein, Rebecca A.; Meier, Iris
2000-01-01
The interaction of chromatin with the nuclear matrix via matrix attachment regions (MARs) on the DNA is considered to be of fundamental importance for higher-order chromatin organization and the regulation of gene expression. We have previously isolated a novel nuclear matrix-localized protein (MFP1) from tomato (Lycopersicon esculentum) that preferentially binds to MAR DNA. Tomato MFP1 has a predicted filament-protein-like structure and is associated with the nuclear envelope via an N-terminal targeting domain. Based on the antigenic relationship, we report here that MFP1 is conserved in a large number of dicot and monocot species. Several cDNAs were cloned from tobacco (Nicotiana tabacum) and shown to correspond to two tobacco MFP1 genes. Comparison of the primary and predicted secondary structures of MFP1 from tomato, tobacco, and Arabidopsis indicates a high degree of conservation of the N-terminal targeting domain, the overall putative coiled-coil structure of the protein, and the C-terminal DNA-binding domain. In addition, we show that tobacco MFP1 is regulated in an organ-specific and developmental fashion, and that this regulation occurs at the level of transcription or RNA stability. PMID:10631266
Nonlinear scoring functions for similarity-based ligand docking and binding affinity prediction.
Brylinski, Michal
2013-11-25
A common strategy for virtual screening considers a systematic docking of a large library of organic compounds into the target sites in protein receptors, with promising leads selected based on favorable intermolecular interactions. Despite continuous progress in the modeling of protein-ligand interactions for pharmaceutical design, important challenges remain; thus, the development of novel techniques is required. In this communication, we describe eSimDock, a new approach to ligand docking and binding affinity prediction. eSimDock employs nonlinear machine learning-based scoring functions to improve the accuracy of ligand ranking and similarity-based binding pose prediction, and to increase the tolerance to structural imperfections in the target structures. In large-scale benchmarking using the Astex/CCDC data set, we show that 53.9% (67.9%) of the predicted ligand poses have RMSD of <2 Å (<3 Å). Moreover, using binding sites predicted by recently developed eFindSite, eSimDock models ligand binding poses with an RMSD of 4 Å for 50.0-39.7% of the complexes at the protein homology level limited to 80-40%. Simulations against non-native receptor structures, whose mean backbone rearrangements vary from 0.5 to 5.0 Å Cα-RMSD, show that the ratio of docking accuracy and the estimated upper bound is at a constant level of ∼0.65. The Pearson correlation coefficient between experimental Ki values and those predicted by eSimDock for a large data set of the crystal structures of protein-ligand complexes from BindingDB is 0.58, which decreases only to 0.46 when target structures distorted to 3.0 Å Cα-RMSD are used. Finally, two case studies demonstrate that eSimDock can be customized to specific applications as well. These encouraging results show that the performance of eSimDock is largely unaffected by the deformations of ligand binding regions; thus, it represents a practical strategy for across-proteome virtual screening using protein models.
eSimDock is freely available to the academic community as a Web server at http://www.brylinski.org/esimdock .
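The two evaluation measures quoted in this abstract, pose RMSD and the Pearson correlation between experimental and predicted affinities, can be sketched as follows. This is only an illustration of the standard definitions, not eSimDock code: the function names are hypothetical, the RMSD assumes both poses are already in the same coordinate frame with atoms in the same order, and the coordinates and affinity values are made up for the example.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered 3D point sets.

    No superposition is performed; both poses are assumed to be expressed
    in the same receptor frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only (not taken from the study).
reference_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
predicted_pose = [(0.2, 0.1, 0.0), (1.4, 0.3, 0.1), (1.9, 1.2, 0.4)]
print(f"pose RMSD: {rmsd(reference_pose, predicted_pose):.2f} Å")

experimental = [6.1, 7.4, 5.2, 8.0]
predicted = [5.8, 7.0, 6.0, 7.5]
print(f"Pearson r: {pearson(experimental, predicted):.2f}")
```

A pose would count toward the "<2 Å" fraction in the benchmark if its RMSD to the crystallographic ligand fell below that threshold.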
Sarmady, Mahdi; Dampier, William; Tozeren, Aydin
2011-01-01
Virus proteins alter protein pathways of the host toward the synthesis of viral particles by breaking and making edges via binding to host proteins. In this study, we developed a computational approach to predict viral sequence hotspots for binding to host proteins based on sequences of viral and host proteins and literature-curated virus-host protein interactome data. We use a motif discovery algorithm repeatedly on collections of sequences of viral proteins and immediate binding partners of their host targets and choose only those motifs that are conserved on viral sequences and highly statistically enriched among binding partners of virus protein targeted host proteins. Our results match experimental data on binding sites of Nef to host proteins such as MAPK1, VAV1, LCK, HCK, HLA-A, CD4, FYN, and GNB2L1 with high statistical significance, but the approach is a poor predictor of Nef binding sites on highly flexible, hoop-like regions. Predicted hotspots recapture CD8 cell epitopes of HIV Nef, highlighting their importance in modulating virus-host interactions. Host proteins potentially targeted or outcompeted by Nef appear to crowd the T cell receptor, natural killer cell mediated cytotoxicity, and neurotrophin signaling pathways. Scanning of HIV Nef motifs on multiple alignments of hepatitis C protein NS5A produces results consistent with literature, indicating the potential value of the hotspot discovery in advancing our understanding of virus-host crosstalk. PMID:21738584
Narayan, Vikram; Halada, Petr; Hernychová, Lenka; Chong, Yuh Ping; Žáková, Jitka; Hupp, Ted R; Vojtesek, Borivoj; Ball, Kathryn L
2011-04-22
The interferon-regulated transcription factor and tumor suppressor protein IRF-1 is predicted to be largely disordered outside of the DNA-binding domain. One of the advantages of intrinsically disordered protein domains is thought to be their ability to take part in multiple, specific but low affinity protein interactions; however, relatively few IRF-1-interacting proteins have been described. The recent identification of a functional binding interface for the E3-ubiquitin ligase CHIP within the major disordered domain of IRF-1 led us to ask whether this region might be employed more widely by regulators of IRF-1 function. Here we describe the use of peptide aptamer-based affinity chromatography coupled with mass spectrometry to define a multiprotein binding interface on IRF-1 (Mf2 domain; amino acids 106-140) and to identify Mf2-binding proteins from A375 cells. Based on their function as known transcriptional regulators, a selection of the Mf2 domain-binding proteins (NPM1, TRIM28, and YB-1) have been validated using in vitro and cell-based assays. Interestingly, although NPM1, TRIM28, and YB-1 all bind to the Mf2 domain, they have differing amino acid specificities, demonstrating the degree of combinatorial diversity and specificity available through linear interaction motifs.
HITS-CLIP yields genome-wide insights into brain alternative RNA processing
NASA Astrophysics Data System (ADS)
Licatalosi, Donny D.; Mele, Aldo; Fak, John J.; Ule, Jernej; Kayikci, Melis; Chi, Sung Wook; Clark, Tyson A.; Schweitzer, Anthony C.; Blume, John E.; Wang, Xuning; Darnell, Jennifer C.; Darnell, Robert B.
2008-11-01
Protein-RNA interactions have critical roles in all aspects of gene expression. However, applying biochemical methods to understand such interactions in living tissues has been challenging. Here we develop a genome-wide means of mapping protein-RNA binding sites in vivo, by high-throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP). HITS-CLIP analysis of the neuron-specific splicing factor Nova revealed extremely reproducible RNA-binding maps in multiple mouse brains. These maps provide genome-wide in vivo biochemical footprints confirming the previous prediction that the position of Nova binding determines the outcome of alternative splicing; moreover, they are sufficiently powerful to predict Nova action de novo. HITS-CLIP revealed a large number of Nova-RNA interactions in 3' untranslated regions, leading to the discovery that Nova regulates alternative polyadenylation in the brain. HITS-CLIP, therefore, provides a robust, unbiased means to identify functional protein-RNA interactions in vivo.
Basharat, Zarrin; Yasmin, Azra
2015-08-01
Ebola is a highly pathogenic enveloped virus responsible for deadly outbreaks of severe hemorrhagic fever. It enters human cells by binding a multifunctional cholesterol transporter Niemann-Pick C1 (NPC1) protein. Post translational modification (PTM) information for NPC1 is crucial to understand Ebola virus (EBOV) entry and action due to changes in phosphorylation or glycosylation at the binding site. It is difficult and costly to experimentally assess this type of interaction, so an in silico strategy was employed. Identification of phosphorylation sites, including conserved residues that could be possible targets for 21 predicted kinases, was followed by a study of the interplay between phosphorylation and O-β-GlcNAc modification of NPC1. Results revealed that only 4 out of 48 predicted phosphosites exhibited O-β-GlcNAc activity. Predicted outcomes were integrated with residue conservation and 3D structural information. Three Yin Yang sites were located in the α-helix regions and were conserved in studied vertebrate and mammalian species. Only one modification site, S425, was found in the β-turn region located near the N-terminus of NPC1 and was found to differ in pig, mouse, cobra and humans. The predictions suggest that Yin Yang sites may not be important for virus attachment to NPC1, whereas phosphosite 473 may be important for binding and hence entry of Ebola virus. This information could be useful in addressing further experimental studies and therapeutic strategies targeting PTM events in EBOV entry. Copyright © 2015 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bhattacharya, Monolekha; Das, Amit Kumar, E-mail: amitk@hijli.iitkgp.ernet.in
Highlights: ► The regulatory sequences recognized by TcrX have been identified. ► The regulatory region comprises inverted repeats separated by a 30 bp region. ► The mode of binding of TcrX to the regulatory sequence is unique. ► The in silico TcrX-DNA docked model binds one of the inverted repeats. ► Both phosphorylated and unphosphorylated TcrX bind the regulatory sequence in vitro. Abstract: TcrY, a histidine kinase, and TcrX, a response regulator, constitute a two-component system in Mycobacterium tuberculosis. tcrX, which is expressed during iron scarcity, is instrumental in the survival of iron-dependent M. tuberculosis. However, the regulator of tcrX/Y has not been fully characterized. Crosslinking studies of TcrX reveal that it can form oligomers in vitro. Electrophoretic mobility shift assays (EMSAs) show that TcrX recognizes two regions in the promoter that are composed of inverted repeats separated by ~30 bp. The dimeric in silico model of TcrX predicts binding to one of these inverted repeat regions. Site-directed mutagenesis and radioactive phosphorylation indicate that D54 of TcrX is phosphorylated by H256 of TcrY. However, phosphorylated and unphosphorylated TcrX bind the regulatory sequence with equal efficiency, which was shown with an EMSA using the D54A TcrX mutant.
Schlecht, Ulrich; Erb, Ionas; Demougin, Philippe; Robine, Nicolas; Borde, Valérie; van Nimwegen, Erik; Nicolas, Alain
2008-01-01
The autonomously replicating sequence binding factor 1 (Abf1) was initially identified as an essential DNA replication factor and later shown to be a component of the regulatory network controlling mitotic and meiotic cell cycle progression in budding yeast. The protein is thought to exert its functions via specific interaction with its target site as part of distinct protein complexes, but its roles during mitotic growth and meiotic development are only partially understood. Here, we report a comprehensive approach aiming at the identification of direct Abf1-target genes expressed during fermentation, respiration, and sporulation. Computational prediction of the protein's target sites was integrated with a genome-wide DNA binding assay in growing and sporulating cells. The resulting data were combined with the output of expression profiling studies using wild-type versus temperature-sensitive alleles. This work identified 434 protein-coding loci as being transcriptionally dependent on Abf1. More than 60% of their putative promoter regions contained a computationally predicted Abf1 binding site and/or were bound by Abf1 in vivo, identifying them as direct targets. The present study revealed numerous loci previously unknown to be under Abf1 control, and it yielded evidence for the protein's variable DNA binding pattern during mitotic growth and meiotic development. PMID:18305101
Ligand deconstruction: Why some fragment binding positions are conserved and others are not.
Kozakov, Dima; Hall, David R; Jehle, Stefan; Luo, Lingqi; Ochiana, Stefan O; Jones, Elizabeth V; Pollastri, Michael; Allen, Karen N; Whitty, Adrian; Vajda, Sandor
2015-05-19
Fragment-based drug discovery (FBDD) relies on the premise that the fragment binding mode will be conserved on subsequent expansion to a larger ligand. However, no general condition has been established to explain when fragment binding modes will be conserved. We show that a remarkably simple condition can be developed in terms of how fragments coincide with binding energy hot spots--regions of the protein where interactions with a ligand contribute substantial binding free energy--the locations of which can easily be determined computationally. Because a substantial fraction of the free energy of ligand binding comes from interacting with the residues in the energetically most important hot spot, a ligand moiety that sufficiently overlaps with this region will retain its location even when other parts of the ligand are removed. This hypothesis is supported by eight case studies. The condition helps identify whether a protein is suitable for FBDD, predicts the size of fragments required for screening, and determines whether a fragment hit can be extended into a higher affinity ligand. Our results show that ligand binding sites can usefully be thought of in terms of an anchor site, which is the top-ranked hot spot and dominates the free energy of binding, surrounded by a number of weaker satellite sites that confer improved affinity and selectivity for a particular ligand and that it is the intrinsic binding potential of the protein surface that determines whether it can serve as a robust binding site for a suitably optimized ligand.
Kozakov, Dima; Grove, Laurie E.; Hall, David R.; Bohnuud, Tanggis; Mottarella, Scott; Luo, Lingqi; Xia, Bing; Beglov, Dmitri; Vajda, Sandor
2016-01-01
FTMap is a computational mapping server that identifies binding hot spots of macromolecules, i.e., regions of the surface with major contributions to the ligand binding free energy. To use FTMap, users submit a protein, DNA, or RNA structure in PDB format. FTMap samples billions of positions of small organic molecules used as probes and scores the probe poses using a detailed energy expression. Regions that bind clusters of multiple probe types identify the binding hot spots, in good agreement with experimental data. FTMap serves as basis for other servers, namely FTSite to predict ligand binding sites, FTFlex to account for side chain flexibility, FTMap/param to parameterize additional probes, and FTDyn to map ensembles of protein structures. Applications include determining druggability of proteins, identifying ligand moieties that are most important for binding, finding the most bound-like conformation in ensembles of unliganded protein structures, and providing input for fragment based drug design. FTMap is more accurate than classical mapping methods such as GRID and MCSS, and is much faster than the more recent approaches to protein mapping based on mixed molecular dynamics. Using 16 probe molecules, the FTMap server finds the hot spots of an average size protein in less than an hour. Since FTFlex performs mapping for all low energy conformers of side chains in the binding site, its completion time is proportionately longer. PMID:25855957
NASA Astrophysics Data System (ADS)
Smith, P. J.; Popelier, P. L. A.
2004-02-01
The present day abundance of cheap computing power enables the use of quantum chemical ab initio data in Quantitative Structure-Activity Relationships (QSARs). Optimised bond lengths are a new such class of descriptors, which we have successfully used previously in representing electronic effects in medicinal and ecological QSARs (enzyme inhibitory activity, hydrolysis rate constants and pKas). Here we use AM1 and HF/3-21G* bond lengths in conjunction with Partial Least Squares (PLS) and a Genetic Algorithm (GA) to predict the Corticosteroid-Binding Globulin (CBG) binding activity of the classic steroid data set, and the antibacterial activity of nitrofuran derivatives. The current procedure, which does not require molecular alignment, produces good r2 and q2 values. Moreover, it highlights regions in the common steroid skeleton deemed relevant to the active regions of the steroids and nitrofuran derivatives.
Prediction of Water Binding to Protein Hydration Sites with a Discrete, Semiexplicit Solvent Model.
Setny, Piotr
2015-12-08
Buried water molecules are ubiquitous in protein structures and are found at the interface of most protein-ligand complexes. Determining their distribution and thermodynamic effect is a challenging yet important task, of great practical value for the modeling of biomolecular structures and their interactions. In this study, we present a novel method aimed at the prediction of buried water molecules in protein structures and estimation of their binding free energies. It is based on a semiexplicit, discrete solvation model, which we previously introduced in the context of small molecule hydration. The method is applicable to all macromolecular structures described by a standard all-atom force field, and predicts complete solvent distribution within a single run with modest computational cost. We demonstrate that it indicates positions of buried hydration sites, including those filled by more than one water molecule, and accurately differentiates them from regions that are sterically accessible to water but void. The obtained estimates of water binding free energies are in fair agreement with reference results determined with the double decoupling method.
Li, Hao; Redinbo, Matthew R.; Venkatesh, Madhukumar; Ekins, Sean; Chaudhry, Anik; Bloch, Nicolin; Negassa, Abdissa; Mukherjee, Paromita; Kalpana, Ganjam; Mani, Sridhar
2013-01-01
The pregnane X receptor (PXR) is a master regulator of xenobiotic metabolism, and its activity is critical toward understanding the pathophysiology of several diseases, including inflammation, cancer, and steatosis. Previous studies have demonstrated that ketoconazole binds to ligand-activated PXR and antagonizes receptor control of gene expression. Structure-function as well as computational docking analysis suggested a putative binding region containing critical charge clamp residues Gln-272 and Phe-264 on the AF-2 surface of PXR. To define the antagonist binding surface(s) of PXR, we developed a novel assay to identify key amino acid residues on PXR based on a yeast two-hybrid screen that examined mutant forms of PXR. This screen identified multiple “gain-of-function” mutants that were “resistant” to the PXR antagonist effects of ketoconazole. We then compared our screen results identifying key PXR residues to those predicted by computational methods. Of 15 potential or putative binding residues based on docking, we identified three residues in the yeast screen that were then systematically verified to functionally interact with ketoconazole using mammalian assays. Among the residues confirmed by our study was Ser-208, which is on the opposite side of the protein from the AF-2 region critical for receptor regulation. The identification of new locations for antagonist binding on the surface or buried in PXR indicates novel aspects to the mechanism of receptor antagonism. These results significantly expand our understanding of antagonist binding sites on the surface of PXR and suggest new avenues to regulate this receptor for clinical applications. PMID:23525103
Characterization of the molecular basis of group II intron RNA recognition by CRS1-CRM domains.
Keren, Ido; Klipcan, Liron; Bezawork-Geleta, Ayenachew; Kolton, Max; Shaya, Felix; Ostersetzer-Biran, Oren
2008-08-22
CRM (chloroplast RNA splicing and ribosome maturation) is a recently recognized RNA-binding domain of ancient origin that has been retained in eukaryotic genomes only within the plant lineage. Whereas in bacteria CRM domains exist as single domain proteins involved in ribosome maturation, in plants they are found in a family of proteins that contain between one and four repeats. Several members of this family with multiple CRM domains have been shown to be required for the splicing of specific plastidic group II introns. Detailed biochemical analysis of one of these factors in maize, CRS1, demonstrated its high affinity and specific binding to the single group II intron whose splicing it facilitates, the plastid-encoded atpF intron RNA. Through its association with two intronic regions, CRS1 guides the folding of atpF intron RNA into its predicted "catalytically active" form. To understand how multiple CRM domains cooperate to achieve high affinity sequence-specific binding to RNA, we analyzed the RNA binding affinity and specificity associated with each individual CRM domain in CRS1; whereas CRM3 bound tightly to the RNA, CRM1 associated specifically with a unique region found within atpF intron domain I. CRM2, which demonstrated only low binding affinity, also seems to form specific interactions with regions localized to domains I, III, and IV. We further show that CRM domains share structural similarities and RNA binding characteristics with the well known RNA recognition motif domain.
Identification of DNA-Binding Proteins Using Structural, Electrostatic and Evolutionary Features
Nimrod, Guy; Szilágyi, András; Leslie, Christina; Ben-Tal, Nir
2009-01-01
Summary DNA binding proteins (DBPs) often take part in various crucial processes of the cell's life cycle. Therefore, the identification and characterization of these proteins are of great importance. We present here a random forests classifier for identifying DBPs among proteins with known three-dimensional structures. First, clusters of evolutionarily conserved regions (patches) on the protein's surface are detected using the PatchFinder algorithm; previous studies showed that these regions are typically the proteins' functionally important regions. Next, we train a classifier using features like the electrostatic potential, cluster-based amino acid conservation patterns and the secondary structure content of the patches, as well as features of the whole protein including its dipole moment. Using 10-fold cross validation on a dataset of 138 DNA-binding proteins and 110 proteins which do not bind DNA, the classifier achieved a sensitivity and a specificity of 0.90, which is overall better than the performance of previously published methods. Furthermore, when we tested 5 different methods on 11 new DBPs which did not appear in the original dataset, only our method annotated all correctly. The resulting classifier was applied to a collection of 757 proteins of known structure and unknown function. Of these proteins, 218 were predicted to bind DNA, and we anticipate that some of them interact with DNA using new structural motifs. The use of complementary computational tools supports the notion that at least some of them do bind DNA. PMID:19233205
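For readers less familiar with the reported measures, sensitivity and specificity follow directly from the confusion counts of a binary classifier. The sketch below is a generic illustration, not the authors' evaluation code; the function name is hypothetical and the counts are invented, chosen only so that the class sizes match the 138 DNA-binding and 110 non-binding proteins mentioned above.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn refer to true DNA-binding proteins, tn/fp to non-binding ones.
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of binders recovered
        "specificity": tn / (tn + fp),  # fraction of non-binders rejected
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts for a 138-positive / 110-negative dataset;
# these are NOT the per-class counts reported by the study.
metrics = binary_metrics(tp=124, fn=14, tn=99, fp=11)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a 10-fold cross validation these counts would be accumulated over the ten held-out folds before the ratios are taken.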
Rapid evolution of cis-regulatory sequences via local point mutations
NASA Technical Reports Server (NTRS)
Stone, J. R.; Wray, G. A.
2001-01-01
Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
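The kind of simulation described above can be illustrated with a toy model. This is a deliberately simplified sketch under stated assumptions, not the authors' simulation: it follows a single sequence lineage rather than a population, so it says nothing about fixation, and the function name `generations_until_site`, the motif, the region length and the mutation rate are all hypothetical values chosen for illustration.

```python
import random


def generations_until_site(motif, region_len, mu, rng, max_gen=100000):
    """Count generations until `motif` first appears in a random region.

    Each generation, every position mutates independently with probability
    `mu` to one of the three other bases (neutral point mutations only).
    Returns None if the motif has not appeared after `max_gen` generations.
    """
    seq = [rng.choice("ACGT") for _ in range(region_len)]
    for gen in range(max_gen + 1):
        if motif in "".join(seq):
            return gen
        for i in range(region_len):
            if rng.random() < mu:
                seq[i] = rng.choice([b for b in "ACGT" if b != seq[i]])
    return None


# Hypothetical parameters: a 6-bp site in a 200-bp region.
rng = random.Random(42)
waits = [generations_until_site("TGACGT", 200, 1e-3, rng, max_gen=20000)
         for _ in range(5)]
print(waits)
```

Repeating the run many times and summarising the waiting times would give a rough sense of how quickly short motifs arise by chance; a population-genetic model would be needed to address fixation.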
Modular structural elements in the replication origin region of Tetrahymena rDNA.
Du, C; Sanzgiri, R P; Shaiu, W L; Choi, J K; Hou, Z; Benbow, R M; Dobbs, D L
1995-01-01
Computer analyses of the DNA replication origin region in the amplified rRNA genes of Tetrahymena thermophila identified a potential initiation zone in the 5'NTS [Dobbs, Shaiu and Benbow (1994), Nucleic Acids Res. 22, 2479-2489]. This region consists of a putative DNA unwinding element (DUE) aligned with predicted bent DNA segments, nuclear matrix or scaffold associated region (MAR/SAR) consensus sequences, and other common modular sequence elements previously shown to be clustered in eukaryotic chromosomal origin regions. In this study, two mung bean nuclease-hypersensitive sites in super-coiled plasmid DNA were localized within the major DUE-like element predicted by thermodynamic analyses. Three restriction fragments of the 5'NTS region predicted to contain bent DNA segments exhibited anomalous migration characteristic of bent DNA during electrophoresis on polyacrylamide gels. Restriction fragments containing the 5'NTS region bound Tetrahymena nuclear matrices in an in vitro binding assay, consistent with an association of the replication origin region with the nuclear matrix in vivo. The direct demonstration in a protozoan origin region of elements previously identified in Drosophila, chick and mammalian origin regions suggests that clusters of modular structural elements may be a conserved feature of eukaryotic chromosomal origins of replication. PMID:7784181
Nguyen, Quan H; Tellam, Ross L; Naval-Sanchez, Marina; Porto-Neto, Laercio R; Barendse, William; Reverter, Antonio; Hayes, Benjamin; Kijas, James; Dalrymple, Brian P
2018-01-01
Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets. PMID:29618048
Sael, Lee; Kihara, Daisuke
2012-01-01
Functional elucidation of proteins is one of the essential tasks in biology. Function of a protein, specifically, small ligand molecules that bind to a protein, can be predicted by finding similar local surface regions in binding sites of known proteins. Here, we developed an alignment free local surface comparison method for predicting a ligand molecule which binds to a query protein. The algorithm, named Patch-Surfer, represents a binding pocket as a combination of segmented surface patches, each of which is characterized by its geometrical shape, the electrostatic potential, the hydrophobicity, and the concaveness. Representing a pocket by a set of patches is effective to absorb difference of global pocket shape while capturing local similarity of pockets. The shape and the physicochemical properties of surface patches are represented using the 3D Zernike descriptor, which is a series expansion of mathematical 3D function. Two pockets are compared using a modified weighted bipartite matching algorithm, which matches similar patches from the two pockets. Patch-Surfer was benchmarked on three datasets, which consist in total of 390 proteins that bind to one of 21 ligands. Patch-Surfer showed superior performance to existing methods including a global pocket comparison method, Pocket-Surfer, which we have previously introduced. Particularly, as intended, the accuracy showed large improvement for flexible ligand molecules, which bind to pockets in different conformations. PMID:22275074
Shinzato, Naoya; Enoki, Miho; Sato, Hiroaki; Nakamura, Kohei; Matsui, Toru; Kamagata, Yoichi
2008-10-01
Two methyl coenzyme M reductases (MCRs) encoded by the mcr and mrt operons of the hydrogenotrophic methanogen Methanothermobacter thermautotrophicus ΔH are expressed in response to H2 availability. In the present study, cis elements and trans-acting factors responsible for the gene expression of MCRs were investigated by using electrophoretic mobility shift assay (EMSA) and affinity particle purification. A survey of their operator regions by EMSA with protein extracts from mrt-expressing cultures restricted them to 46- and 41-bp-long mcr and mrt upstream regions, respectively. Affinity particle purification of DNA-binding proteins conjugated with putative operator regions resulted in the retrieval of a protein attributed to IMP dehydrogenase-related protein VII (IMPDH VII). IMPDH VII is predicted to have a winged helix-turn-helix DNA-binding motif and two cystathionine beta-synthase domains, and it has been suspected to be an energy-sensing module. EMSA with oligonucleotide probes with unusual sequences showed that the binding site of IMPDH VII mostly overlaps the factor B-responsible element-TATA box of the mcr operon. The results presented here suggest that IMPDH VII encoded by MTH126 is a plausible candidate for the transcriptional regulator of the mcr operon in this methanogen.
A Structural Model for Binding of the Serine-Rich Repeat Adhesin GspB to Host Carbohydrate Receptors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pyburn, Tasia M.; Bensing, Barbara A.; Xiong, Yan Q.
2014-10-02
GspB is a serine-rich repeat (SRR) adhesin of Streptococcus gordonii that mediates binding of this organism to human platelets via its interaction with sialyl-T antigen on the receptor GPIbα. This interaction appears to be a major virulence determinant in the pathogenesis of infective endocarditis. To address the mechanism by which GspB recognizes its carbohydrate ligand, we determined the high-resolution x-ray crystal structure of the GspB binding region (GspBBR), both alone and in complex with a disaccharide precursor to sialyl-T antigen. Analysis of the GspBBR structure revealed that it is comprised of three independently folded subdomains or modules: (1) an Ig-fold resembling a CnaA domain from prokaryotic pathogens; (2) a second Ig-fold resembling the binding region of mammalian Siglecs; (3) a subdomain of unique fold. The disaccharide was found to bind in a pocket within the Siglec subdomain, but at a site distinct from that observed in mammalian Siglecs. Confirming the biological relevance of this binding pocket, we produced three isogenic variants of S. gordonii, each containing a single point mutation of a residue lining this binding pocket. These variants have reduced binding to carbohydrates of GPIbα. Further examination of purified GspBBR-R484E showed reduced binding to sialyl-T antigen while S. gordonii harboring this mutation did not efficiently bind platelets and showed a significant reduction in virulence, as measured by an animal model of endocarditis. Analysis of other SRR proteins revealed that the predicted binding regions of these adhesins also had a modular organization, with those known to bind carbohydrate receptors having modules homologous to the Siglec and Unique subdomains of GspBBR. This suggests that the binding specificity of the SRR family of adhesins is determined by the type and organization of discrete modules within the binding domains, which may affect the tropism of organisms for different tissues.
PhyloGibbs-MP: Module Prediction and Discriminative Motif-Finding by Gibbs Sampling
Siddharthan, Rahul
2008-01-01
PhyloGibbs, our recent Gibbs-sampling motif-finder, takes phylogeny into account in detecting binding sites for transcription factors in DNA and assigns posterior probabilities to its predictions obtained by sampling the entire configuration space. Here, in an extension called PhyloGibbs-MP, we widen the scope of the program, addressing two major problems in computational regulatory genomics. First, PhyloGibbs-MP can localise predictions to small, undetermined regions of a large input sequence, thus effectively predicting cis-regulatory modules (CRMs) ab initio while simultaneously predicting binding sites in those modules—tasks that are usually done by two separate programs. PhyloGibbs-MP's performance at such ab initio CRM prediction is comparable with or superior to dedicated module-prediction software that use prior knowledge of previously characterised transcription factors. Second, PhyloGibbs-MP can predict motifs that differentiate between two (or more) different groups of regulatory regions, that is, motifs that occur preferentially in one group over the others. While other “discriminative motif-finders” have been published in the literature, PhyloGibbs-MP's implementation has some unique features and flexibility. Benchmarks on synthetic and actual genomic data show that this algorithm is successful at enhancing predictions of differentiating sites and suppressing predictions of common sites and compares with or outperforms other discriminative motif-finders on actual genomic data. Additional enhancements include significant performance and speed improvements, the ability to use “informative priors” on known transcription factors, and the ability to output annotations in a format that can be visualised with the Generic Genome Browser. In stand-alone motif-finding, PhyloGibbs-MP remains competitive, outperforming PhyloGibbs-1.0 and other programs on benchmark data. PMID:18769735
Ramírez-Iglesias, José Rubén; Pérez-Gordones, María Carolina; Del Castillo, Jesús Rafael; Mijares, Alfredo; Benaim, Gustavo; Mendoza, Marta
2018-05-09
The plasma membrane Ca2+-ATPase (PMCA) from trypanosomatids lacks a classical calmodulin (CaM) binding domain, although CaM stimulated activities have been detected by biochemical assays. Recently we proposed that the Trypanosoma equiperdum CaM-sensitive PMCA (TePMCA) contains a potential 1-18 CaM-binding motif at the C-terminal region of the pump. In the present study, we evaluated the potential CaM-binding motifs using CaM from Trypanosoma cruzi and either the recombinant full length TePMCA C-terminal sequence (P14) or synthetic peptides comprising different regions of the C-terminal domain. We demonstrated that P14 and a synthetic peptide corresponding to residues 1037-1062 (which contains the predicted 1-18 binding motif) competed efficiently for binding to TcCaM, exhibiting similar IC50s of 200 nM. A stable complex of this peptide and TcCaM was formed in the presence of Ca2+, as determined by native-polyacrylamide gel electrophoresis. A predicted structure obtained by molecular docking showed an interaction of the 1-18 binding motif with the Ca2+/CaM complex. Moreover, when the peptide was incubated with CaM and Ca2+, a blue shift in the tryptophan fluorescence spectrum (from 350 to 329 nm) was observed. Substitutions at W1039 and F1056 strongly decreased both CaM-peptide interaction and the complex assembly. Our results demonstrated the presence of a functional 1-18 motif at the TePMCA C-terminal domain. Furthermore, on the basis of spectrofluorometric assays and the resulting structure modeled by docking we propose that the L1042 and W1060 residues might also participate as anchors to form a 1-4-18-22 motif. Copyright © 2018 Elsevier B.V. All rights reserved.
Sarkar, Debasree; Patra, Piya; Ghosh, Abhirupa; Saha, Sudipto
2016-01-01
A considerable proportion of protein-protein interactions (PPIs) in the cell are estimated to be mediated by very short peptide segments that approximately conform to specific sequence patterns known as linear motifs (LMs), often present in the disordered regions of eukaryotic proteins. These peptides have been found to interact with low affinity and are able to bind to multiple interactors, thus playing an important role in the PPI networks involving date hubs. In this work, PPI data and a de novo motif identification method (MEME) were used to identify such peptides in three cancer-associated hub proteins: MYC, APC and MDM2. The peptides corresponding to the significant LMs identified for each hub protein were aligned, and the overlapping regions across these peptides were termed overlapping linear peptides (OLPs). These OLPs were thus predicted to be responsible for multiple PPIs of the corresponding hub proteins and a scoring system was developed to rank them. We predicted six OLPs in MYC and five OLPs in MDM2 that scored higher than OLP predictions from randomly generated protein sets. Two OLP sequences from the C-terminal of MYC were predicted to bind with FBXW7, a component of an E3 ubiquitin-protein ligase complex involved in proteasomal degradation of MYC. Similarly, we identified peptides in the C-terminal of MDM2 interacting with FKBP3, which has a specific role in auto-ubiquitinylation of MDM2. The peptide sequences predicted in MYC and MDM2 look promising for designing orthosteric inhibitors against possible disease-associated PPIs. Since these OLPs can interact with other proteins as well, these inhibitors should be specific to the targeted interactor to prevent undesired side-effects. This computational framework has been designed to predict and rank the peptide regions that may mediate multiple PPIs and can be applied to other disease-associated date hub proteins for prediction of novel therapeutic targets of small molecule PPI modulators.
A computational method for predicting regulation of human microRNAs on the influenza virus genome
2013-01-01
Background: While it has been suggested that host microRNAs (miRNAs) may downregulate viral gene expression as an antiviral defense mechanism, such a mechanism has not been explored in the influenza virus for human flu studies. As it is difficult to conduct related experiments on humans, computational studies can provide some insight. Although many computational tools have been designed for miRNA target prediction, there is a need for cross-species prediction, especially for predicting viral targets of human miRNAs. However, finding putative human miRNAs targeting the influenza virus genome is still challenging. Results: We developed machine-learning features and conducted comprehensive data training for predicting interactions between H1N1 genome segments and host miRNA. We defined our seed region as the first ten nucleotides of the miRNA, counted from the 5' end toward the 3' end, and integrated various features for the prediction, including the number of consecutive matching bases in the 10-base seed region, a triplet feature in seed regions, thermodynamic energy, a penalty for bulges and wobbles at binding sites, and the secondary structure of viral RNA. Conclusions: Compared to general predictive models, our model fully takes into account the conservation patterns and features of viral RNA secondary structures, and greatly improves the prediction accuracy. Our model identified some key miRNAs including hsa-miR-489, hsa-miR-325, hsa-miR-876-3p and hsa-miR-2117, which target HA, PB2, MP and NS of H1N1, respectively. Our study provided an interesting hypothesis concerning the miRNA-based antiviral defense mechanism against influenza virus in humans, i.e., the binding between human miRNA and viral RNAs may not result in gene silencing but rather may block viral RNA replication. PMID:24565017
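One of the features named above, the number of consecutive matching bases in the 10-base seed region, can be sketched as follows. This is an illustrative reading of that feature, not the authors' implementation: the function name `longest_seed_match` is hypothetical, only Watson-Crick pairs are counted (G-U wobbles are treated as mismatches here, which is an assumption), and no bulges are modelled.

```python
# Watson-Crick pairs for RNA; G-U wobble pairs are deliberately excluded (assumption).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def longest_seed_match(mirna, target_site, seed_len=10):
    """Longest run of consecutive Watson-Crick pairs within the miRNA seed.

    Both sequences are given 5'->3'. Because the duplex is antiparallel,
    the 5' end of the miRNA pairs with the 3' end of the target site.
    """
    seed = mirna[:seed_len]
    partner = target_site[::-1][:seed_len]  # read the target 3'->5'
    best = run = 0
    for m_base, t_base in zip(seed, partner):
        if (m_base, t_base) in WATSON_CRICK:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


# Toy sequences (hypothetical, not real miRNA or viral sequences).
print(longest_seed_match("ACGUACGUAC", "GUACGUACGU"))  # perfectly complementary seed
print(longest_seed_match("ACGUACGUAC", "GUACGCACGU"))  # one mismatch inside the seed
```

In a full feature vector this count would sit alongside the other features listed in the abstract, such as duplex energy and structural accessibility, which are not modelled here.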
Informative priors based on transcription factor structural class improve de novo motif discovery.
Narlikar, Leelavati; Gordân, Raluca; Ohler, Uwe; Hartemink, Alexander J
2006-07-15
An important problem in molecular biology is to identify the locations at which a transcription factor (TF) binds to DNA, given a set of DNA sequences believed to be bound by that TF. In previous work, we showed that information in the DNA sequence of a binding site is sufficient to predict the structural class of the TF that binds it. In particular, this suggests that we can predict which locations in any DNA sequence are more likely to be bound by certain classes of TFs than others. Here, we argue that traditional methods for de novo motif finding can be significantly improved by adopting an informative prior probability that a TF binding site occurs at each sequence location. To demonstrate the utility of such an approach, we present priority, a powerful new de novo motif finding algorithm. Using data from TRANSFAC, we train three classifiers to recognize binding sites of basic leucine zipper, forkhead, and basic helix loop helix TFs. These classifiers are used to equip priority with three class-specific priors, in addition to a default prior to handle TFs of other classes. We apply priority and a number of popular motif finding programs to sets of yeast intergenic regions that are reported by ChIP-chip to be bound by particular TFs. priority identifies motifs the other methods fail to identify, and correctly predicts the structural class of the TF recognizing the identified binding sites. Supplementary material and code can be found at http://www.cs.duke.edu/~amink/.
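The core idea above, weighting each candidate position by an informative prior before scoring it against a motif model, can be sketched with Bayes' rule. This is a minimal illustration, not the priority algorithm itself, which the abstract describes only at a high level. The function name `site_posterior`, the toy position weight matrix, the uniform background and the prior values are all hypothetical, and the sketch assumes exactly one site per sequence.

```python
def site_posterior(seq, pwm, prior, background=0.25):
    """Posterior probability of the binding-site start position.

    `pwm` is a list of dicts mapping base -> probability at each motif column.
    `prior[i]` is the prior probability that a site starts at position i.
    The posterior is proportional to prior * (motif likelihood / background likelihood).
    """
    width = len(pwm)
    n_starts = len(seq) - width + 1
    weights = []
    for i in range(n_starts):
        ratio = 1.0
        for j in range(width):
            ratio *= pwm[j][seq[i + j]] / background
        weights.append(prior[i] * ratio)
    total = sum(weights)
    return [w / total for w in weights]


# Toy 2-column motif that prefers "GT" (hypothetical values).
pwm = [
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
seq = "ACGTAC"
uniform = [0.2] * 5
skewed = [0.96, 0.01, 0.01, 0.01, 0.01]  # strong prior belief in position 0

print([round(p, 3) for p in site_posterior(seq, pwm, uniform)])
print([round(p, 3) for p in site_posterior(seq, pwm, skewed)])
```

With the uniform prior the best-scoring start is the "GT" at index 2; with the skewed prior the posterior mass moves to index 0, which shows how strongly a positional prior can steer the result.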
Banderas, Alvaro; Guiliani, Nicolas
2013-01-01
The biomining bacterium Acidithiobacillus ferrooxidans oxidizes sulfide ores and promotes metal solubilization. The efficiency of this process depends on the attachment of cells to surfaces, a process regulated by quorum sensing (QS) cell-to-cell signalling in many Gram-negative bacteria. At. ferrooxidans has a functional QS system and the presence of AHLs enhances its attachment to pyrite. However, direct targets of the QS transcription factor AfeR remain unknown. In this study, a bioinformatic approach was used to infer possible AfeR direct targets based on the particular palindromic features of the AfeR binding site. A set of Hidden Markov Models designed to maintain palindromic regions and vary non-palindromic regions was used to screen for putative binding sites. By annotating the context of each predicted binding site (PBS), we classified them according to their positional coherence relative to other putative genomic structures such as start codons, RNA polymerase promoter elements and intergenic regions. We further used the Multiple EM for Motif Elicitation algorithm (MEME) to further filter out low homology PBSs. In summary, 75 target-genes were identified, 34 of which have a higher confidence level. Among the identified genes, we found afeR itself, zwf, genes encoding glycosyltransferase activities, metallo-beta lactamases, and active transport-related proteins. Glycosyltransferases and Zwf (Glucose 6-phosphate-1-dehydrogenase) might be directly involved in polysaccharide biosynthesis and attachment to minerals by At. ferrooxidans cells during the bioleaching process. PMID:23959118
GenProBiS: web server for mapping of sequence variants to protein binding sites.
Konc, Janez; Skrlj, Blaz; Erzen, Nika; Kunej, Tanja; Janezic, Dusanka
2017-07-03
Discovery of potentially deleterious sequence variants is important and has wide implications for research and generation of new hypotheses in human and veterinary medicine, and drug discovery. The GenProBiS web server maps sequence variants to protein structures from the Protein Data Bank (PDB), and further to protein-protein, protein-nucleic acid, protein-compound, and protein-metal ion binding sites. The concept of a protein-compound binding site is understood in the broadest sense, which includes glycosylation and other post-translational modification sites. Binding sites were defined by local structural comparisons of whole protein structures using the Protein Binding Sites (ProBiS) algorithm and transposition of ligands from the similar binding sites found to the query protein using the ProBiS-ligands approach with new improvements introduced in GenProBiS. Binding site surfaces were generated as three-dimensional grids encompassing the space occupied by predicted ligands. The server allows intuitive visual exploration of comprehensively mapped variants, such as human somatic mis-sense mutations related to cancer and non-synonymous single nucleotide polymorphisms from 21 species, within the predicted binding sites regions for about 80 000 PDB protein structures using fast WebGL graphics. The GenProBiS web server is open and free to all users at http://genprobis.insilab.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Dayan, Avraham; Babin, Gilad; Ganoth, Assaf; Kayouf, Nivin Samir; Nitoker Eliaz, Neta; Mukkala, Srijana; Tsfadia, Yossi; Fleminger, Gideon
2017-08-01
Titanium (Ti) and its alloys are widely used in orthodontic and orthopedic implants by virtue of their high biocompatibility, mechanical strength, and high resistance to corrosion. Biointegration of the implants with the tissue requires strong interactions, which involve biological molecules, proteins in particular, with metal oxide surfaces. An exocellular high-affinity titanium dioxide (TiO2)-binding protein (TiBP), purified from Rhodococcus ruber, has been previously studied in our lab. This protein was shown to be homologous with the orthologous cytoplasmic rhodococcal dihydrolipoamide dehydrogenase (rhDLDH). We have found that rhDLDH and its human homolog (hDLDH) share the TiO2-binding capabilities with TiBP. Intrigued by the unique TiO2-binding properties of hDLDH, we anticipated that it may serve as a molecular bridge between Ti-based medical structures and human tissues. The objective of the current study was to locate the region and the amino acids of the protein that mediate the protein-TiO2 surface interaction. We demonstrated the role of acidic amino acids in the nonelectrostatic enzyme/dioxide interactions at neutral pH. The observation that the interaction of DLDH with various metal oxides is independent of their isoelectric values strengthens this notion. DLDH does not lose its enzymatic activity upon binding to TiO2, indicating that neither the enzyme undergoes major conformational changes nor the TiO2 binding site is blocked. Docking predictions suggest that both rhDLDH and hDLDH bind TiO2 through similar regions located far from the active site and the dimerization sites. The putative TiO2-binding regions of both the bacterial and human enzymes were found to contain a CHED (Cys, His, Glu, Asp) motif, which has been shown to participate in metal-binding sites in proteins. Copyright © 2017 John Wiley & Sons, Ltd.
DuMond, Jenna F.; Ramkissoon, Kevin; Zhang, Xue; Izumi, Yuichiro; Wang, Xujing; Eguchi, Koji; Gao, Shouguo; Mukoyama, Masashi; Ferraris, Joan D.
2016-01-01
NFAT5 is an osmoregulated transcription factor that particularly increases expression of genes involved in protection against hypertonicity. Transcription factors often contain unstructured regions that bind co-regulatory proteins that are crucial for their function. The NH2-terminal region of NFAT5 contains regions predicted to be intrinsically disordered. We used peptide aptamer-based affinity chromatography coupled with mass spectrometry to identify protein preys pulled down by one or more overlapping 20 amino acid peptide baits within a predicted NH2-terminal unstructured region of NFAT5. We identify a total of 467 unique protein preys that associate with at least one NH2-terminal peptide bait from NFAT5 in either cytoplasmic or nuclear extracts from HEK293 cells treated with elevated, normal, or reduced NaCl concentrations. Different sets of proteins are pulled down from nuclear vs. cytoplasmic extracts. We used GeneCards to ascertain known functions of the protein preys. The protein preys include many that were previously known, but also many novel ones. Consideration of the novel ones suggests many aspects of NFAT5 regulation, interaction and function that were not previously appreciated, for example, hypertonicity inhibits NFAT5 by sumoylating it and the NFAT5 protein preys include components of the CHTOP complex that desumoylate proteins, an action that should contribute to activation of NFAT5. PMID:26757802
Conformational control and DNA-binding mechanism of the metazoan origin recognition complex.
Bleichert, Franziska; Leitner, Alexander; Aebersold, Ruedi; Botchan, Michael R; Berger, James M
2018-06-26
In eukaryotes, the heterohexameric origin recognition complex (ORC) coordinates replication onset by facilitating the recruitment and loading of the minichromosome maintenance 2-7 (Mcm2-7) replicative helicase onto DNA to license origins. Drosophila ORC can adopt an autoinhibited configuration that is predicted to prevent Mcm2-7 loading; how the complex is activated and whether other ORC homologs can assume this state are not known. Using chemical cross-linking and mass spectrometry, biochemical assays, and electron microscopy (EM), we show that the autoinhibited state of Drosophila ORC is populated in solution, and that human ORC can also adopt this form. ATP binding to ORC supports a transition from the autoinhibited state to an active configuration, enabling the nucleotide-dependent association of ORC with both DNA and Cdc6. An unstructured N-terminal region adjacent to the conserved ATPase domain of Orc1 is shown to be required for high-affinity ORC-DNA interactions, but not for activation. ORC optimally binds DNA duplexes longer than the predicted footprint of the ORC ATPases associated with a variety of cellular activities (AAA+) and winged-helix (WH) folds; cryo-EM analysis of Drosophila ORC bound to DNA and Cdc6 indicates that ORC contacts DNA outside of its central core region, bending the DNA away from its central DNA-binding channel. Our findings indicate that ORC autoinhibition may be common to metazoans and that ORC-Cdc6 remodels origin DNA before Mcm2-7 recruitment and loading.
Narayan, Vikram; Halada, Petr; Hernychová, Lenka; Chong, Yuh Ping; Žáková, Jitka; Hupp, Ted R.; Vojtesek, Borivoj; Ball, Kathryn L.
2011-01-01
The interferon-regulated transcription factor and tumor suppressor protein IRF-1 is predicted to be largely disordered outside of the DNA-binding domain. One of the advantages of intrinsically disordered protein domains is thought to be their ability to take part in multiple, specific but low affinity protein interactions; however, relatively few IRF-1-interacting proteins have been described. The recent identification of a functional binding interface for the E3-ubiquitin ligase CHIP within the major disordered domain of IRF-1 led us to ask whether this region might be employed more widely by regulators of IRF-1 function. Here we describe the use of peptide aptamer-based affinity chromatography coupled with mass spectrometry to define a multiprotein binding interface on IRF-1 (Mf2 domain; amino acids 106–140) and to identify Mf2-binding proteins from A375 cells. Based on their function as known transcriptional regulators, a selection of the Mf2 domain-binding proteins (NPM1, TRIM28, and YB-1) have been validated using in vitro and cell-based assays. Interestingly, although NPM1, TRIM28, and YB-1 all bind to the Mf2 domain, they have differing amino acid specificities, demonstrating the degree of combinatorial diversity and specificity available through linear interaction motifs. PMID:21245151
Dehghani, Hossein; Ghobakhloo, Sepideh; Neishabury, Maryam
2016-08-01
In our previous studies on the Iranian β-thalassemia (β-thal) patients, we identified an association between the severity of the β-thal phenotype and the polymorphic palindromic site at the 5' hypersensitive site 4-locus control region (5'HS4-LCR) of the β-globin gene cluster. Furthermore, a linkage disequilibrium was observed between this region and XmnI-HBG2 in the patient population. Based on this data, it was suggested that the well-recognized phenotype-ameliorating role assigned to positive XmnI could be associated with its linked elements in the LCR. To investigate the functional significance of polymorphisms at the 5'HS4-LCR, we studied its influence on binding of transcription factors. Web-based predictions of transcription factor binding revealed a binding site for runt-related transcription factor 1 (RUNX1), when the allele at the center of the palindrome (TGGGG(A/G)CCCCA) was A but not when it was G. Furthermore, electromobility shift assay (EMSA) presented evidence in support of allele-specific binding of RUNX1 to 5'HS4. Considering that RUNX1 is a well-known regulator of hematopoiesis, these preliminary data suggest the importance of further studies to confirm this interaction and consequently investigate its functional and phenotypical relevance. These studies could help us to understand the molecular mechanism behind the phenotype modifying role of the 5'HS4-LCR polymorphic palindromic region (rs16912979), which has been observed in previous studies.
Ligand deconstruction: Why some fragment binding positions are conserved and others are not
Kozakov, Dima; Hall, David R.; Jehle, Stefan; Luo, Lingqi; Ochiana, Stefan O.; Jones, Elizabeth V.; Pollastri, Michael; Allen, Karen N.; Whitty, Adrian; Vajda, Sandor
2015-01-01
Fragment-based drug discovery (FBDD) relies on the premise that the fragment binding mode will be conserved on subsequent expansion to a larger ligand. However, no general condition has been established to explain when fragment binding modes will be conserved. We show that a remarkably simple condition can be developed in terms of how fragments coincide with binding energy hot spots—regions of the protein where interactions with a ligand contribute substantial binding free energy—the locations of which can easily be determined computationally. Because a substantial fraction of the free energy of ligand binding comes from interacting with the residues in the energetically most important hot spot, a ligand moiety that sufficiently overlaps with this region will retain its location even when other parts of the ligand are removed. This hypothesis is supported by eight case studies. The condition helps identify whether a protein is suitable for FBDD, predicts the size of fragments required for screening, and determines whether a fragment hit can be extended into a higher affinity ligand. Our results show that ligand binding sites can usefully be thought of in terms of an anchor site, which is the top-ranked hot spot and dominates the free energy of binding, surrounded by a number of weaker satellite sites that confer improved affinity and selectivity for a particular ligand and that it is the intrinsic binding potential of the protein surface that determines whether it can serve as a robust binding site for a suitably optimized ligand. PMID:25918377
Yasmin, T; Nabi, A H M Nurun
2016-05-01
Ebola virus (EBV) has become a serious threat to public health. Different approaches were applied to predict continuous and discontinuous B cell epitopes as well as T cell epitopes from the sequence-based and available three-dimensional structural analyses of each protein of EBV. Peptides '(79)VPSATKRWGFRSGVPP(94)' from GP1 and '(515)LHYWTTQDEGAAIGLA(530)' from GP2 of Ebola were found to be the consensus peptide sequences predicted as linear B cell epitopes, of which the latter contains a region, (519)TTQDEG(524), that fulfilled all the criteria of accessibility, hydrophilicity, flexibility and beta turn region for becoming an ideal B cell epitope. Different nonamers were obtained as T cell epitopes that interacted with different numbers of MHC class I and class II alleles with a binding affinity of <100 nM. Interestingly, these nonamers also bound to the MHC class I alleles most prevalent in African and South Asian regions. Of these, the 'LANETTQAL' and 'FLYDRLAST' nonamers were predicted to be the most potent T cell epitopes; they interacted with eight and twelve class I alleles that covered 63.79% and 54.16% of the world population, respectively. These nonamers were found to be the core sequences of 15mer peptides that interacted with the most common class II allele, HLA-DRB1*01:01. They were further validated for their binding to specific class I alleles using a docking technique. Thus, these predicted epitopes may be used as vaccine targets against EBV and can be validated in model hosts to verify their efficacy as vaccines.
Allele-Specific Transcription Factor Binding in Pig Calpastatin Promoter Regions
USDA-ARS's Scientific Manuscript database
The identification of predictive DNA markers for pork quality would allow U.S. pork producers and breeders to more quickly and efficiently select genetically superior animals for production of consistent, high quality meat. Genome scans have identified QTL for tenderness on pig chromosome 2 which ha...
Iakhiaeva, Elena; Iakhiaev, Alexei; Zwieb, Christian
2010-11-13
Human cells depend critically on the signal recognition particle (SRP) for the sorting and delivery of their proteins. The SRP is a ribonucleoprotein complex which binds to signal sequences of secretory polypeptides as they emerge from the ribosome. Among the six proteins of the eukaryotic SRP, the largest protein, SRP72, is essential for protein targeting and possesses a poorly characterized RNA binding domain. We delineated the minimal region of SRP72 capable of forming a stable complex with an SRP RNA fragment. The region encompassed residues 545 to 585 of the full-length human SRP72 and contained a lysine-rich cluster (KKKKKKKKGK) at positions 552 to 561 as well as a conserved Pfam motif with the sequence PDPXRWLPXXER at positions 572 to 583. We demonstrated by site-directed mutagenesis that both regions participated in the formation of a complex with the RNA. In agreement with biochemical data and results from chymotryptic digestion experiments, molecular modeling of SRP72 implied that the invariant W577 was located inside the predicted structure of an RNA binding domain. The 11-nucleotide 5e motif contained within the SRP RNA fragment was shown by comparative electrophoresis on native polyacrylamide gels to conform to an RNA kink-turn. The model of the complex suggested that the conserved A240 of the K-turn, previously identified as being essential for the binding to SRP72, could protrude into a groove of the SRP72 RNA binding domain, similar but not identical to how other K-turn recognizing proteins interact with RNA. The results from the presented experiments provided insights into the molecular details of a functionally important and structurally interesting RNA-protein interaction. A model for how a ligand binding pocket of SRP72 can accommodate a new RNA K-turn in the 5e region of the eukaryotic SRP RNA is proposed.
2010-01-01
Background: Human cells depend critically on the signal recognition particle (SRP) for the sorting and delivery of their proteins. The SRP is a ribonucleoprotein complex which binds to signal sequences of secretory polypeptides as they emerge from the ribosome. Among the six proteins of the eukaryotic SRP, the largest protein, SRP72, is essential for protein targeting and possesses a poorly characterized RNA binding domain. Results: We delineated the minimal region of SRP72 capable of forming a stable complex with an SRP RNA fragment. The region encompassed residues 545 to 585 of the full-length human SRP72 and contained a lysine-rich cluster (KKKKKKKKGK) at positions 552 to 561 as well as a conserved Pfam motif with the sequence PDPXRWLPXXER at positions 572 to 583. We demonstrated by site-directed mutagenesis that both regions participated in the formation of a complex with the RNA. In agreement with biochemical data and results from chymotryptic digestion experiments, molecular modeling of SRP72 implied that the invariant W577 was located inside the predicted structure of an RNA binding domain. The 11-nucleotide 5e motif contained within the SRP RNA fragment was shown by comparative electrophoresis on native polyacrylamide gels to conform to an RNA kink-turn. The model of the complex suggested that the conserved A240 of the K-turn, previously identified as being essential for the binding to SRP72, could protrude into a groove of the SRP72 RNA binding domain, similar but not identical to how other K-turn recognizing proteins interact with RNA. Conclusions: The results from the presented experiments provided insights into the molecular details of a functionally important and structurally interesting RNA-protein interaction. A model for how a ligand binding pocket of SRP72 can accommodate a new RNA K-turn in the 5e region of the eukaryotic SRP RNA is proposed. PMID:21073748
Berillo, Olga; Régnier, Mireille; Ivashchenko, Anatoly
2014-01-01
microRNAs are small RNA molecules that inhibit the translation of target genes. microRNA binding sites are located in the untranslated regions as well as in the coding domains. We describe the TmiRUSite and TmiROSite scripts, developed in Python, as tools for extracting the nucleotide sequences of miRNA binding sites together with the amino acid sequences they encode. The scripts can also retrieve a set of additional flanking nucleotides to the left and right of the binding site. The scripts present all retrieved data in table formats that are easy to analyse further. The predicted data are useful in molecular and evolutionary biology studies, including studies of miRNA binding sites in animals and plants. The TmiRUSite and TmiROSite scripts are available for free from the authors upon request and for download at https://sites.google.com/site/malaheenee/downloads.
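The extraction step described in this abstract can be sketched as follows. This is a minimal illustration, not the TmiRUSite or TmiROSite code: the function name site_with_flanks, the 0-based half-open coordinates, and the assumption that the coding sequence starts in frame 0 are all hypothetical choices made for the example.

```python
from itertools import product

# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def site_with_flanks(cds, start, end, flank=0):
    """Return (left flank, site, right flank, encoded residues).

    cds        : coding sequence read in frame 0 (assumption of this sketch)
    start, end : 0-based, end-exclusive coordinates of the binding site
    flank      : number of extra nucleotides to report on each side
    """
    cds = cds.upper().replace("U", "T")  # accept RNA or DNA alphabets
    left = cds[max(0, start - flank):start]
    site = cds[start:end]
    right = cds[end:end + flank]
    # Widen the window to whole codons so every site nucleotide is covered.
    codon_start = start - start % 3
    codon_end = min(end + (-end) % 3, len(cds))
    codons = [cds[i:i + 3] for i in range(codon_start, codon_end, 3)]
    residues = "".join(CODON_TABLE.get(c, "X") for c in codons)
    return left, site, right, residues


result = site_with_flanks("AUGGCUAAACGUUGA", 4, 8, flank=2)
print("\t".join(result))
```

Incomplete or ambiguous codons are reported as X rather than guessed, which keeps the table output aligned with the nucleotide coordinates.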
Selection of the simplest RNA that binds isoleucine
LOZUPONE, CATHERINE; CHANGAYIL, SHANKAR; MAJERFELD, IRENE; YARUS, MICHAEL
2003-01-01
We have identified the simplest RNA binding site for isoleucine using selection-amplification (SELEX), by shrinking the size of the randomized region until affinity selection is extinguished. Such a protocol can be useful because selection does not necessarily make the simplest active motif most prominent, as is often assumed. We find an isoleucine binding site that behaves exactly as predicted for the site that requires fewest nucleotides. This UAUU motif (16 highly conserved positions; 27 total), is also the most abundant site in successful selections on short random tracts. The UAUU site, now isolated independently at least 63 times, is a small asymmetric internal loop. Conserved loop sequences include isoleucine codon and anticodon triplets, whose nucleotides are required for amino acid binding. This reproducible association between isoleucine and its coding sequences supports the idea that the genetic code is, at least in part, a stereochemical residue of the most easily isolated RNA–amino acid binding structures. PMID:14561881
Development of estrogen receptor beta binding prediction model using large sets of chemicals.
Sakkiah, Sugunadevi; Selvaraj, Chandrabose; Gong, Ping; Zhang, Chaoyang; Tong, Weida; Hong, Huixiao
2017-11-03
We developed an ERβ binding prediction model that, together with our previously developed ERα binding model, facilitates the identification of chemicals that specifically bind ERβ or ERα. Decision Forest was used to train the ERβ binding prediction model on a large set of compounds obtained from EADB. Model performance was estimated through 1000 iterations of 5-fold cross validation. Prediction confidence was analyzed using predictions from the cross validations. Informative chemical features for ERβ binding were identified by analyzing how frequently each chemical descriptor was used in the models built during the 5-fold cross validations. 1000 permutations were conducted to assess chance correlation. The average accuracy of the 5-fold cross validations was 93.14% with a standard deviation of 0.64%. Prediction confidence analysis indicated that the higher the prediction confidence, the more accurate the predictions. Permutation testing revealed that the prediction model is unlikely to have arisen by chance. Eighteen descriptors were identified as informative for ERβ binding prediction. Applying the prediction model to data from the ToxCast project yielded a sensitivity of 90-92%. Our results demonstrate that ERβ binding of chemicals can be accurately predicted using the developed model. Coupled with our previously developed ERα prediction model, this model is expected to facilitate drug development through identification of chemicals that specifically bind ERβ or ERα.
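The validation protocol in this abstract (repeated 5-fold cross validation plus label permutation to estimate chance correlation) can be sketched in a few lines. The sketch below makes several assumptions: a toy nearest-centroid classifier stands in for Decision Forest, the data are synthetic, 20 repeats are used instead of 1000, and all function names are hypothetical.

```python
import random
from statistics import mean, stdev


def fit_centroids(X, y):
    """Toy stand-in classifier: one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared distance)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def cv_accuracy(X, y, k, rng):
    """One round of k-fold cross validation; returns overall accuracy."""
    order = list(range(len(X)))
    rng.shuffle(order)
    correct = 0
    for f in range(k):
        test = order[f::k]
        held_out = set(test)
        train = [i for i in order if i not in held_out]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in test)
    return correct / len(X)


def repeated_cv(X, y, k=5, repeats=20, seed=0):
    """Mean and standard deviation of accuracy over repeated k-fold runs."""
    rng = random.Random(seed)
    scores = [cv_accuracy(X, y, k, rng) for _ in range(repeats)]
    return mean(scores), stdev(scores)


def permutation_scores(X, y, k=5, permutations=20, seed=1):
    """Cross-validated accuracy after shuffling labels (chance baseline)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(permutations):
        shuffled = list(y)
        rng.shuffle(shuffled)
        scores.append(cv_accuracy(X, shuffled, k, rng))
    return scores


# Synthetic, well-separated two-class data (illustrative only).
X = [[0.1 * i, 0.0] for i in range(10)] + [[5.0 + 0.1 * i, 1.0] for i in range(10)]
y = [0] * 10 + [1] * 10
mean_acc, sd_acc = repeated_cv(X, y)
null_scores = permutation_scores(X, y)
print(mean_acc, sd_acc, mean(null_scores))
```

If the accuracy on the true labels sits well above the distribution of permuted-label accuracies, the model is unlikely to reflect chance correlation, which is the logic behind the permutation result reported above.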
NASA Astrophysics Data System (ADS)
Stornaiuolo, Mariano; Bruno, Agostino; Botta, Lorenzo; Regina, Giuseppe La; Cosconati, Sandro; Silvestri, Romano; Marinelli, Luciana; Novellino, Ettore
2015-10-01
A Cannabinoid Receptor 1 (CB1) binding site for the selective allosteric modulator ORG27569 is here identified through an integrated approach of consensus pocket prediction, mutagenesis studies and mass spectrometry. This unprecedented ORG27569 pocket presents the structural features of a Cholesterol Consensus Motif, a cholesterol-interacting region already found in other GPCRs. ORG27569 and cholesterol have opposite effects on CB1 affinity for orthosteric ligands. Moreover, a rise in the intracellular cholesterol level results in CB1 trafficking to the axonal region of neuronal cells, whereas ORG27569 binding induces CB1 enrichment at the soma. This control of receptor migration among functionally different membrane regions of the cell further contributes to downstream signalling and adds a previously unknown mechanism underpinning CB1 modulation by ORG27569 that goes beyond a mere control of receptor affinity for orthosteric ligands.
Nucleotide Interdependency in Transcription Factor Binding Sites in the Drosophila Genome
Dresch, Jacqueline M.; Zellers, Rowan G.; Bork, Daniel K.; Drewell, Robert A.
2016-01-01
A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development. PMID:27330274
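One simple way to look for the kind of dependency between nonadjacent positions that this abstract discusses is to compute mutual information between pairs of columns in a set of aligned binding sites. The sketch below is an illustration under that assumption; it is not the MARZ algorithm, and the example sites are invented.

```python
from collections import Counter
from math import log2


def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of aligned sites."""
    n = len(sites)
    p_i = Counter(s[i] for s in sites)
    p_j = Counter(s[j] for s in sites)
    p_ij = Counter((s[i], s[j]) for s in sites)
    return sum(
        (c / n) * log2((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
        for (a, b), c in p_ij.items()
    )


# Invented aligned sites: the first and last positions always co-vary.
sites = ["ACGT", "TCGA", "ACGT", "TCGA"]
print(mutual_information(sites, 0, 3))  # co-varying, nonadjacent pair
print(mutual_information(sites, 0, 1))  # position 1 is invariant
```

A position weight matrix treats every column independently, so it would score the invented sequence ACGA as well as the observed sites; a model that includes the dependency between the first and last positions would not.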
Keilwagen, Jens; Grau, Jan; Paponov, Ivan A; Posch, Stefan; Strickert, Marc; Grosse, Ivo
2011-02-10
Transcription factors are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in promoters. The de-novo discovery of transcription factor binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not been fully solved yet. Here, we present a de-novo motif discovery tool called Dispom for finding differentially abundant transcription factor binding sites that models existing positional preferences of binding sites and adjusts the length of the motif in the learning process. Evaluating Dispom, we find that its prediction performance is superior to existing tools for de-novo motif discovery for 18 benchmark data sets with planted binding sites, and for a metazoan compendium based on experimental data from micro-array, ChIP-chip, ChIP-DSL, and DamID as well as Gene Ontology data. Finally, we apply Dispom to find binding sites differentially abundant in promoters of auxin-responsive genes extracted from Arabidopsis thaliana microarray data, and we find a motif that can be interpreted as a refined auxin responsive element predominately positioned in the 250-bp region upstream of the transcription start site. Using an independent data set of auxin-responsive genes, we find in genome-wide predictions that the refined motif is more specific for auxin-responsive genes than the canonical auxin-responsive element. In general, Dispom can be used to find differentially abundant motifs in sequences of any origin. However, the positional distribution learned by Dispom is especially beneficial if all sequences are aligned to some anchor point like the transcription start site in case of promoter sequences. We demonstrate that the combination of searching for differentially abundant motifs and inferring a position distribution from the data is beneficial for de-novo motif discovery. Hence, we make the tool freely available as a component of the open-source Java framework Jstacs and as a stand-alone application at http://www.jstacs.de/index.php/Dispom.
Improve the prediction of RNA-binding residues using structural neighbours.
Li, Quan; Cao, Zanxia; Liu, Haiyan
2010-03-01
The interactions between RNA-binding proteins (RBPs) and RNA play key roles in managing some of the cell's basic functions. The identification and prediction of RNA binding sites is important for understanding the RNA-binding mechanism. Computational approaches are being developed to predict RNA-binding residues based on sequence- or structure-derived features. To achieve higher prediction accuracy, improvements on current prediction methods are necessary. We found that the structural neighbors of RNA-binding and non-RNA-binding residues have different amino acid compositions. Combining this structure-derived feature with evolutionary (PSSM) and other structural information (secondary structure and solvent accessibility) significantly improves the predictions over existing methods. Using a multiple linear regression approach and 6-fold cross validation, our best model achieves an overall correct rate of 87.8% and an MCC of 0.47, with a specificity of 93.4%, and correctly predicts 52.4% of the RNA-binding residues for a dataset containing 107 non-homologous RNA-binding proteins. Compared with existing methods, including the amino acid compositions of structural neighbors leads to a clear improvement. A web server was developed for predicting RNA binding residues in a protein sequence (or structure), which is available at http://mcgill.3322.org/RNA/.
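The performance measures quoted in this and several other abstracts in this listing (sensitivity, specificity, accuracy, MCC) all derive from the four cells of a confusion matrix. A minimal sketch follows; the function name and the example counts are made up for illustration and are not taken from any of the studies.

```python
from math import sqrt


def binary_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of true binders recovered
    specificity = tn / (tn + fp)          # fraction of non-binders rejected
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "mcc": mcc}


# Made-up counts, for illustration only.
metrics = binary_metrics(tp=40, tn=40, fp=10, fn=10)
print(metrics)
```

Because MCC uses all four cells, it stays informative on imbalanced data, which is why a model can report high accuracy and specificity alongside a moderate MCC when binding residues are the minority class.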
Vyas, Vivek K; Ghate, Manjunath; Patel, Kinjal; Qureshi, Gulamnizami; Shah, Surmil
2015-08-01
Ang II-AT1 receptors play an important role in mediating virtually all of the physiological actions of Ang II. Several drugs (sartans) are available that block the AT1 receptor effectively and lower blood pressure in patients with hypertension. Currently, there is no experimental Ang II-AT1 structure available; therefore, in this study we modeled the Ang II-AT1 receptor structure using homology modeling, followed by identification and characterization of binding sites to assess the druggability of the receptor. Homology models were constructed using MODELLER and the I-TASSER server, then refined and validated using PROCHECK, in which 96.9% of 318 residues were present in the favoured regions of the Ramachandran plots. Because various Ang II-AT1 receptor antagonists are already marketed as antihypertensive drugs, we performed a docking study together with binding site prediction algorithms to predict different binding pockets on the modeled proteins. The identification of 3D structures and binding sites for various known drugs will guide the structure-based design of novel compounds as Ang II-AT1 receptor antagonists for the treatment of hypertension.
Computational design of an endo-1,4-β-xylanase ligand binding site
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morin, Andrew; Kaufmann, Kristian W.; Fortenberry, Carie
2012-09-05
The field of computational protein design has experienced important recent success. However, the de novo computational design of high-affinity protein-ligand interfaces is still largely an open challenge. Using the Rosetta program, we attempted the in silico design of a high-affinity protein interface to a small peptide ligand. We chose the thermophilic endo-1,4-β-xylanase from Nonomuraea flexuosa as the protein scaffold on which to perform our designs. Over the course of the study, 12 proteins derived from this scaffold were produced and assayed for binding to the target ligand. Unfortunately, none of the designed proteins displayed evidence of high-affinity binding. Structural characterization of four designed proteins revealed that although the predicted structure of the protein model was highly accurate, this structural accuracy did not translate into accurate prediction of binding affinity. Crystallographic analyses indicate that the lack of binding affinity is possibly due to unaccounted for protein dynamics in the 'thumb' region of our design scaffold intrinsic to the family 11 β-xylanase fold. Further computational analysis revealed two specific, single amino acid substitutions responsible for an observed change in backbone conformation, and decreased dynamic stability of the catalytic cleft. These findings offer new insight into the dynamic and structural determinants of the β-xylanase proteins.
Vieira, Marcos C; Zinder, Daniel; Cobey, Sarah
2018-01-01
High-affinity antibodies arise within weeks of infection from the evolution of B-cell receptors under selection to improve antigen recognition. This rapid adaptation is enabled by the distribution of highly mutable “hotspot” motifs in B-cell receptor genes. High mutability in antigen-binding regions (complementarity determining regions [CDRs]) creates variation in binding affinity, whereas low mutability in structurally important regions (framework regions [FRs]) may reduce the frequency of destabilizing mutations. During the response, loss of mutational hotspots and changes in their distribution across CDRs and FRs are predicted to compromise the adaptability of B-cell receptors, yet the contributions of different mechanisms to gains and losses of hotspots remain unclear. We reconstructed changes in anti-HIV B-cell receptor sequences and show that mutability losses were ∼56% more frequent than gains in both CDRs and FRs, with the higher relative mutability of CDRs maintained throughout the response. At least 21% of the total mutability loss was caused by synonymous mutations. However, nonsynonymous substitutions caused most (79%) of the mutability loss in CDRs. Because CDRs also show strong positive selection, this result suggests that selection for mutations that increase binding affinity contributed to loss of mutability in antigen-binding regions. Although recurrent adaptation to evolving viruses could indirectly select for high mutation rates, we found no evidence of indirect selection to increase or retain hotspots. Our results suggest mutability losses are intrinsic to both the neutral and adaptive evolution of B-cell populations and might constrain their adaptation to rapidly evolving pathogens such as HIV and influenza. PMID:29688540
Dromey, James A; Weenink, Sarah M; Peters, Günther H; Endl, Josef; Tighe, Patrick J; Todd, Ian; Christie, Michael R
2004-04-01
IA-2 is a major target of autoimmunity in type 1 diabetes. IA-2 responsive T cells recognize determinants within regions represented by amino acids 787-817 and 841-869 of the molecule. Epitopes for IA-2 autoantibodies are largely conformational and not well defined. In this study, we used peptide phage display and homology modeling to characterize the epitope of a monoclonal IA-2 Ab (96/3) from a human type 1 diabetic patient. This Ab competes for IA-2 binding with Abs from the majority of patients with type 1 diabetes and therefore binds a region close to common autoantibody epitopes. Alignment of peptides obtained after screening phage-displayed peptide libraries with purified 96/3 identified a consensus binding sequence of Asn-x-Glu-x-x-(aromatic)-x-x-Gly. The predicted surface on a three-dimensional homology model of the tyrosine phosphatase domain of IA-2 was analyzed for clusters of Asn, Glu, and aromatic residues and amino acids contributing to the epitope investigated using site-directed mutagenesis. Mutation of each of amino acids Asn(858), Glu(836), and Trp(799) reduced 96/3 Ab binding by >45%. Mutations of these residues also inhibited binding of serum autoantibodies from IA-2 Ab-positive type 1 diabetic patients. This study identifies a region commonly recognized by autoantibodies in type 1 diabetes that overlaps with dominant T cell determinants.
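The consensus binding sequence reported in this abstract, Asn-x-Glu-x-x-(aromatic)-x-x-Gly, maps directly onto a regular expression for scanning peptide sequences such as phage-display hits. The sketch below assumes that "aromatic" means Phe, Trp or Tyr, which the abstract does not specify, and uses an invented test peptide. It applies to linear peptides only: the mutated residues on IA-2 itself (Asn858, Glu836, Trp799) are not contiguous in sequence, consistent with a conformational epitope.

```python
import re

# Asn-x-Glu-x-x-(aromatic)-x-x-Gly in one-letter code.
# Assumption: "aromatic" is taken as F, W or Y.
# The lookahead lets overlapping matches be reported as well.
CONSENSUS = re.compile(r"(?=(N.E..[FWY]..G))")


def find_consensus(peptide):
    """Return (1-based start position, matched 9-mer) for every hit."""
    return [(m.start() + 1, m.group(1)) for m in CONSENSUS.finditer(peptide.upper())]


# Invented peptide, for illustration only.
hits = find_consensus("AANAEKLWSTGAA")
print(hits)
```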
Identification of DNA-binding proteins using structural, electrostatic and evolutionary features.
Nimrod, Guy; Szilágyi, András; Leslie, Christina; Ben-Tal, Nir
2009-04-10
DNA-binding proteins (DBPs) participate in various crucial processes in the life-cycle of the cells, and the identification and characterization of these proteins is of great importance. We present here a random forests classifier for identifying DBPs among proteins with known 3D structures. First, clusters of evolutionarily conserved regions (patches) on the surface of proteins were detected using the PatchFinder algorithm; earlier studies showed that these regions are typically the functionally important regions of proteins. Next, we trained a classifier using features like the electrostatic potential, cluster-based amino acid conservation patterns and the secondary structure content of the patches, as well as features of the whole protein, including its dipole moment. Using 10-fold cross-validation on a dataset of 138 DBPs and 110 proteins that do not bind DNA, the classifier achieved a sensitivity and a specificity of 0.90, which is overall better than the performance of published methods. Furthermore, when we tested five different methods on 11 new DBPs that did not appear in the original dataset, only our method annotated all correctly. The resulting classifier was applied to a collection of 757 proteins of known structure and unknown function. Of these proteins, 218 were predicted to bind DNA, and we anticipate that some of them interact with DNA using new structural motifs. The use of complementary computational tools supports the notion that at least some of them do bind DNA.
Schaefke, Bernhard; Wang, Tzi-Yuan; Wang, Chuen-Yi; Li, Wen-Hsiung
2015-01-01
Gene expression evolution occurs through changes in cis- or trans-regulatory elements or both. Interactions between transcription factors (TFs) and their binding sites (TFBSs) constitute one of the most important points where these two regulatory components intersect. In this study, we investigated the evolution of TFBSs in the promoter regions of different Saccharomyces strains and species. We divided the promoter of a gene into the proximal region and the distal region, which are defined, respectively, as the 200-bp region upstream of the transcription starting site and as the 200-bp region upstream of the proximal region. We found that the predicted TFBSs in the proximal promoter regions tend to be evolutionarily more conserved than those in the distal promoter regions. Additionally, Saccharomyces cerevisiae strains used in the fermentation of alcoholic drinks have experienced more TFBS losses than gains compared with strains from other environments (wild strains, laboratory strains, and clinical strains). We also showed that differences in TFBSs correlate with the cis component of gene expression evolution between species (comparing S. cerevisiae and its sister species Saccharomyces paradoxus) and within species (comparing two closely related S. cerevisiae strains). PMID:26220934
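The promoter partition defined in this abstract (a 200-bp proximal region immediately upstream of the transcription start site and a 200-bp distal region immediately upstream of that) amounts to two slices of the upstream sequence. A minimal sketch, assuming the input is the upstream sequence written 5' to 3' and ending just before the start site; the function name is hypothetical.

```python
def split_promoter(upstream, width=200):
    """Split an upstream sequence into (proximal, distal) regions.

    upstream : sequence 5'->3' whose last base lies immediately before the TSS
    width    : region length in bp (200 in the study described above)
    """
    proximal = upstream[-width:]            # the width bp nearest the TSS
    distal = upstream[-2 * width:-width]    # the width bp upstream of proximal
    return proximal, distal


# Synthetic sequence: 50 bp of extra upstream context, then distal, then proximal.
upstream = "G" * 50 + "A" * 200 + "C" * 200
proximal, distal = split_promoter(upstream)
print(len(proximal), len(distal))
```

Predicted binding sites can then be assigned to one region or the other by their coordinates before comparing conservation between the two.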
NASA Astrophysics Data System (ADS)
Basu, Sankar; Söderquist, Fredrik; Wallner, Björn
2017-05-01
The focus of the computational structural biology community has taken a dramatic shift over the past one-and-a-half decades from the classical protein structure prediction problem to the possible understanding of intrinsically disordered proteins (IDP) or proteins containing regions of disorder (IDPR). The current interest lies in the unraveling of a disorder-to-order transitioning code embedded in the amino acid sequences of IDPs/IDPRs. Disordered proteins are characterized by an enormous amount of structural plasticity which makes them promiscuous in binding to different partners, multi-functional in cellular activity and atypical in folding energy landscapes resembling partially folded molten globules. Also, their involvement in several deadly human diseases (e.g. cancer, cardiovascular and neurodegenerative diseases) makes them attractive drug targets, and important for a biochemical understanding of the disease(s). The study of the structural ensemble of IDPs is rather difficult, in particular for transient interactions. When bound to a structured partner, an IDPR adopts an ordered conformation in the complex. The residues that undergo this disorder-to-order transition are called protean residues, which are generally found in short contiguous stretches, and the first step in understanding the modus operandi of an IDP/IDPR would be to predict these residues. There are a few available methods which predict these protean segments from their amino acid sequences; however, their performance reported in the literature leaves clear room for improvement. With this background, the current study presents 'Proteus', a random forest classifier that predicts the likelihood of a residue undergoing a disorder-to-order transition upon binding to a potential partner protein. The prediction is based on features that can be calculated using the amino acid sequence alone. 
Proteus compares favorably with existing methods, predicting twice as many true positives as the second-best method (55% vs. 27%) with much higher precision on an independent data set. The current study also sheds some light on a possible 'disorder-to-order' transitioning consensus, untangled, yet embedded in the amino acid sequence of IDPs. Some guidelines have also been suggested for proceeding with real-life structural modeling involving an IDPR using Proteus.
Resetca, Diana; Haftchenary, Sina; Gunning, Patrick T; Wilson, Derek J
2014-11-21
The activity of the transcription factor signal transducer and activator of transcription 3 (STAT3) is dysregulated in a number of hematological and solid malignancies. Development of pharmacological STAT3 Src homology 2 (SH2) domain interaction inhibitors holds great promise for cancer therapy, and a novel class of salicylic acid-based STAT3 dimerization inhibitors that includes orally bioavailable drug candidates has been recently developed. The compounds SF-1-066 and BP-1-102 are predicted to bind to the STAT3 SH2 domain. However, given the highly unstructured and dynamic nature of the SH2 domain, experimental confirmation of this prediction was elusive. We have interrogated the protein-ligand interaction of STAT3 with these small molecule inhibitors by means of time-resolved electrospray ionization hydrogen-deuterium exchange mass spectrometry. Analysis of site-specific evolution of deuterium uptake induced by the complexation of STAT3 with SF-1-066 or BP-1-102 under physiological conditions enabled the mapping of the in silico predicted inhibitor binding site to the STAT3 SH2 domain. The binding of both inhibitors to the SH2 domain resulted in significant local decreases in dynamics, consistent with solvent exclusion at the inhibitor binding site and increased rigidity of the inhibitor-complexed SH2 domain. Interestingly, inhibitor binding induced hot spots of allosteric perturbations outside of the SH2 domain, manifesting mainly as increased deuterium uptake, in regions of STAT3 important for DNA binding and nuclear localization. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.
Hogan, Daniel J; Riordan, Daniel P; Gerber, André P; Herschlag, Daniel; Brown, Patrick O
2008-10-28
RNA-binding proteins (RBPs) have roles in the regulation of many post-transcriptional steps in gene expression, but relatively few RBPs have been systematically studied. We searched for the RNA targets of 40 proteins in the yeast Saccharomyces cerevisiae: a selective sample of the approximately 600 annotated and predicted RBPs, as well as several proteins not annotated as RBPs. At least 33 of these 40 proteins, including three of the four proteins that were not previously known or predicted to be RBPs, were reproducibly associated with specific sets of a few to several hundred RNAs. Remarkably, many of the RBPs we studied bound mRNAs whose protein products share identifiable functional or cytotopic features. We identified specific sequences or predicted structures significantly enriched in target mRNAs of 16 RBPs. These potential RNA-recognition elements were diverse in sequence, structure, and location: some were found predominantly in 3'-untranslated regions, others in 5'-untranslated regions, some in coding sequences, and many in two or more of these features. Although this study only examined a small fraction of the universe of yeast RBPs, 70% of the mRNA transcriptome had significant associations with at least one of these RBPs, and on average, each distinct yeast mRNA interacted with three of the RBPs, suggesting the potential for a rich, multidimensional network of regulation. These results strongly suggest that combinatorial binding of RBPs to specific recognition elements in mRNAs is a pervasive mechanism for multi-dimensional regulation of their post-transcriptional fate.
Development of Design Rules for Reliable Antisense RNA Behavior in E. coli.
Hoynes-O'Connor, Allison; Moon, Tae Seok
2016-12-16
A key driver of synthetic biology is the development of designable genetic parts with predictable behaviors that can be quickly implemented in complex genetic systems. However, the intrinsic complexity of gene regulation can make the rational design of genetic parts challenging. This challenge is apparent in the design of antisense RNA (asRNA) regulators. Though asRNAs are well-known regulators, the literature governing their design is conflicting and leaves the synthetic biology community without clear asRNA design rules. The goal of this study is to perform a comprehensive experimental characterization and statistical analysis of 121 unique asRNA regulators in order to resolve the conflicts that currently exist in the literature. asRNAs usually consist of two regions, the Hfq binding site and the target binding region (TBR). First, the behaviors of several high-performing Hfq binding sites were compared, in terms of their ability to improve repression efficiencies and their orthogonality. Next, a large-scale analysis of TBR design parameters identified asRNA length, the thermodynamics of asRNA-mRNA complex formation, and the percent of target mismatch as key parameters for TBR design. These parameters were used to develop simple asRNA design rules. Finally, these design rules were applied to construct both a simple and a complex genetic circuit containing different asRNAs, and predictable behavior was observed in both circuits. The results presented in this study will drive synthetic biology forward by providing useful design guidelines for the construction of asRNA regulators with predictable behaviors.
Yokoyama, Ken Daigoro; Pollock, David D.
2012-01-01
Functional modification of regulatory proteins can affect hundreds of genes throughout the genome, and is therefore thought to be almost universally deleterious. This belief, however, has recently been challenged. A potential example comes from transcription factor SP1, for which statistical evidence indicates that motif preferences were altered in eutherian mammals. Here, we set out to discover possible structural and theoretical explanations, evaluate the role of selection in SP1 evolution, and discover effects on coregulatory proteins. We show that SP1 motif preferences were convergently altered in birds as well as mammals, inducing coevolutionary changes in over 800 regulatory regions. Structural and phylogenic evidence implicates a single causative amino acid replacement at the same SP1 position along both lineages. Furthermore, paralogs SP3 and SP4, which coregulate SP1 target genes through competitive binding to the same sites, have accumulated convergent replacements at the homologous position multiple times during eutherian and bird evolution, presumably to preserve competitive binding. To determine plausibility, we developed and implemented a simple model of transcription factor and binding site coevolution. This model predicts that, in contrast to prevailing beliefs, even small selective benefits per locus can drive concurrent fixation of transcription factor and binding site mutants under a broad range of conditions. Novel binding sites tend to arise de novo, rather than by mutation from ancestral sites, a prediction substantiated by SP1-binding site alignments. Thus, multiple lines of evidence indicate that selection has driven convergent evolution of transcription factors along with their binding sites and coregulatory proteins. PMID:23019068
Determinants of BH3 Binding Specificity for Mcl-1 versus Bcl-xL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dutta, Sanjib; Gullá, Stefano; Chen, T. Scott
2010-06-25
Interactions among Bcl-2 family proteins are important for regulating apoptosis. Prosurvival members of the family interact with proapoptotic BH3 (Bcl-2-homology-3)-only members, inhibiting execution of cell death through the mitochondrial pathway. Structurally, this interaction is mediated by binding of the α-helical BH3 region of the proapoptotic proteins to a conserved hydrophobic groove on the prosurvival proteins. Native BH3-only proteins exhibit selectivity in binding prosurvival members, as do small molecules that block these interactions. Understanding the sequence and structural basis of interaction specificity in this family is important, as it may allow the prediction of new Bcl-2 family associations and/or the design of new classes of selective inhibitors to serve as reagents or therapeutics. In this work, we used two complementary techniques - yeast surface display screening from combinatorial peptide libraries and SPOT peptide array analysis - to elucidate specificity determinants for binding to Bcl-xL versus Mcl-1, two prominent prosurvival proteins. We screened a randomized library and identified BH3 peptides that bound to either Mcl-1 or Bcl-xL selectively or to both with high affinity. The peptides competed with native ligands for binding into the conserved hydrophobic groove, as illustrated in detail by a crystal structure of a specific peptide bound to Mcl-1. Mcl-1-selective peptides from the screen were highly specific for binding Mcl-1 in preference to Bcl-xL, Bcl-2, Bcl-w, and Bfl-1, whereas Bcl-xL-selective peptides showed some cross-interaction with related proteins Bcl-2 and Bcl-w. Mutational analyses using SPOT arrays revealed the effects of 170 point mutations made in the background of a peptide derived from the BH3 region of Bim, and a simple predictive model constructed using these data explained much of the specificity observed in our Mcl-1 versus Bcl-xL binders.
In silico analysis of miRNA-mediated gene regulation in OCA and OA genes.
Kamaraj, Balu; Gopalakrishnan, Chandrasekhar; Purohit, Rituraj
2014-12-01
Albinism is an autosomal recessive genetic disorder caused by low secretion of melanin. The oculocutaneous albinism (OCA) and ocular albinism (OA) genes are responsible for melanin production and are also potential targets for miRNAs. miRNAs inhibit protein synthesis partially or completely by binding to the 3'UTR of the mRNA, thus regulating gene expression. In this analysis, we predicted genetic variation in the 3'UTR of the transcript that could be a reason for low melanin production and thus cause albinism. Single nucleotide polymorphisms (SNPs) in the 3'UTR of the mRNA of OCA and OA genes can create new binding sites for miRNAs; miRNA binding to the mRNA may then inhibit translation partially or completely, control gene expression and lead to hypopigmentation. We have developed a computational procedure to determine the SNPs in the 3'UTR of the mRNA of OCA (TYR, OCA2, TYRP1 and SLC45A2) and OA (GPR143) genes that are a potential cause of albinism. We identified 37 SNPs in five genes that are predicted to create 87 new binding sites on mRNA, which may lead to abrogation of the translation process. Expression analysis confirms that these genes are highly expressed in skin and eye regions. Enrichment analysis supports that these genes are mainly involved in eye pigmentation and the melanin biosynthesis process. The network analysis also shows how the genes interact and are expressed in a complex network. This insight provides clues for wet-lab researchers to understand the expression pattern of OCA and OA genes and the binding of mRNA and miRNA upon mutation, which is responsible for inhibition of the translation process at the genomic level.
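The idea of a 3'UTR SNP creating a new miRNA binding site can be illustrated with a deliberately simplified sketch. The abstract does not say which prediction tools or criteria were used, so everything here is an assumption: the helper names `seed_site` and `snp_creates_site` are hypothetical, only a perfect 7-mer match to miRNA positions 2-8 is considered, and the sequences in the usage note are toy data.

```python
# RNA complement table; input DNA is converted to RNA (T -> U) first.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def seed_site(mirna):
    """Return the mRNA sequence (5'->3') that pairs with miRNA nt 2-8."""
    seed = mirna.upper().replace("T", "U")[1:8]
    # The target site is the reverse complement of the seed.
    return seed.translate(COMPLEMENT)[::-1]

def snp_creates_site(utr, pos, alt, mirna):
    """True if substituting `alt` at 0-based `pos` creates a seed match
    that is absent from the reference 3'UTR (simplified criterion)."""
    ref_utr = utr.upper().replace("T", "U")
    alt_utr = ref_utr[:pos] + alt.upper().replace("T", "U") + ref_utr[pos + 1:]
    site = seed_site(mirna)
    return site not in ref_utr and site in alt_utr
```

For example, with the toy miRNA `UGAGGUAGUAGGUUGUAUAGUU` the seed-match site is `CUACCUC`, and `snp_creates_site("AAACUACGUCAAA", 7, "C", ...)` reports a newly created site. Real predictors typically also weigh site context and pairing energy, which this sketch ignores.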
Determinants of BH3 binding specificity for Mcl-1 vs. Bcl-xL
Dutta, Sanjib; Gullá, Stefano; Chen, T. Scott; Fire, Emiko; Grant, Robert A.; Keating, Amy E.
2010-01-01
Interactions among Bcl-2 family proteins are important for regulating apoptosis. Pro-survival members of the family interact with pro-apoptotic BH3-only members, inhibiting execution of cell death through the mitochondrial pathway. Structurally, this interaction is mediated by binding of the alpha-helical BH3 region of the pro-apoptotic proteins to a conserved hydrophobic groove on the pro-survival proteins. Native BH3-only proteins exhibit selectivity in binding pro-survival members, as do small molecules that block these interactions. Understanding the sequence and structural basis of interaction specificity in this family is important, as it may allow the prediction of new Bcl-2 family associations and/or the design of new classes of selective inhibitors to serve as reagents or therapeutics. In this work we used two complementary techniques, yeast surface display screening from combinatorial peptide libraries and SPOT peptide array analysis, to elucidate specificity determinants for binding to Bcl-xL vs. Mcl-1, two prominent pro-survival proteins. We screened a randomized library and identified BH3 peptides that bound to either Mcl-1 or Bcl-xL selectively, or to both with high affinity. The peptides competed with native ligands for binding into the conserved hydrophobic groove, as illustrated in detail by a crystal structure of a specific peptide bound to Mcl-1. Mcl-1 selective peptides from the screen were highly specific for binding Mcl-1 in preference to Bcl-xL, Bcl-2, Bcl-w and Bfl-1, whereas Bcl-xL selective peptides showed some cross-interaction with related proteins Bcl-2 and Bcl-w. Mutational analyses using SPOT arrays revealed the effects of 170 point mutations made in the background of a peptide derived from the BH3 region of Bim, and a simple predictive model constructed using these data explained much of the specificity observed in our Mcl-1 vs. Bcl-xL binders. PMID:20363230
Nissan, Gal; Manulis-Sasson, Shulamit; Chalupowicz, Laura; Teper, Doron; Yeheskel, Adva; Pasmanik-Chor, Metsada; Sessa, Guido; Barash, Isaac
2012-02-01
The type III effector HsvG of the gall-forming Pantoea agglomerans pv. gypsophilae is a DNA-binding protein that is imported to the host nucleus and involved in host specificity. The DNA-binding region of HsvG was delineated to 266 amino acids located within a secondary structure region near the N-terminus of the protein but did not display any homology to canonical DNA-binding motifs. A binding site selection procedure was used to isolate a target gene of HsvG, named HSVGT, in Gypsophila paniculata. HSVGT is a predicted acidic protein of the DnaJ family with 244 amino acids. It harbors characteristic conserved motifs of a eukaryotic transcription factor, including a bipartite nuclear localization signal, zinc finger, and leucine zipper DNA-binding motifs. Quantitative real-time polymerase chain reaction analysis demonstrated that HSVGT transcription is specifically induced in planta within 2 h after inoculation with the wild-type P. agglomerans pv. gypsophilae compared with the hsvG mutant. Induction of HSVGT reached a peak of sixfold at 4 h after inoculation and progressively declined thereafter. Gel-shift assay demonstrated that HsvG binds to the HSVGT promoter, indicating that HSVGT is a direct target of HsvG. Our results support the hypothesis that HsvG functions as a transcription factor in gypsophila.
Tsai, Keng-Chang; Jian, Jhih-Wei; Yang, Ei-Wen; Hsu, Po-Chiang; Peng, Hung-Pin; Chen, Ching-Tai; Chen, Jun-Bo; Chang, Jeng-Yih; Hsu, Wen-Lian; Yang, An-Suei
2012-01-01
Non-covalent protein-carbohydrate interactions mediate molecular targeting in many biological processes. Prediction of non-covalent carbohydrate binding sites on protein surfaces not only provides insights into the functions of the query proteins; information on key carbohydrate-binding residues could suggest site-directed mutagenesis experiments, design therapeutics targeting carbohydrate-binding proteins, and provide guidance in engineering protein-carbohydrate interactions. In this work, we show that non-covalent carbohydrate binding sites on protein surfaces can be predicted with relatively high accuracy when the query protein structures are known. The prediction capabilities were based on a novel encoding scheme of the three-dimensional probability density maps describing the distributions of 36 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types. The machine learning algorithms predicted tentative carbohydrate binding sites on query proteins by recognizing the characteristic interacting atom distribution patterns specific for carbohydrate binding sites from known protein structures. The prediction results for all protein atom types were integrated into surface patches as tentative carbohydrate binding sites based on normalized prediction confidence level. The prediction capabilities of the predictors were benchmarked by a 10-fold cross validation on 497 non-redundant proteins with known carbohydrate binding sites. The predictors were further tested on an independent test set with 108 proteins. The residue-based Matthews correlation coefficient (MCC) for the independent test was 0.45, with prediction precision and sensitivity (or recall) of 0.45 and 0.49 respectively. In addition, 111 unbound carbohydrate-binding protein structures for which the structures were determined in the absence of the carbohydrate ligands were predicted with the trained predictors. 
The overall prediction MCC was 0.49. Independent tests on anti-carbohydrate antibodies showed that the carbohydrate antigen binding sites were predicted with comparable accuracy. These results demonstrate that the predictors are among the best in carbohydrate binding site predictions to date. PMID:22848404
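The residue-based figures quoted above (MCC, precision, sensitivity) all derive from a confusion matrix. A minimal sketch of the standard definitions follows; the function name `binary_metrics` is an assumption, and returning 0.0 when a denominator is zero is one common convention, not something stated in the source.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Precision, recall (sensitivity) and Matthews correlation
    coefficient from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; report 0.0 in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "mcc": mcc}
```

Because binding residues are usually a small minority of surface residues, MCC is a more informative summary than accuracy for this kind of per-residue benchmark.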
Owczarek, C M; Layton, M J; Metcalf, D; Lock, P; Willson, T A; Gough, N M; Nicola, N A
1993-01-01
Human leukaemia inhibitory factor (hLIF) binds to both human and mouse LIF receptors (LIF-R), while mouse LIF (mLIF) binds only to mouse LIF-R. Moreover, hLIF binds with higher affinity to the mLIF-R than does mLIF. In order to define the regions of the hLIF molecule responsible for species-specific interaction with the hLIF-R and for the unusual high-affinity binding to the mLIF-R, a series of 15 mouse/human LIF hybrids has been generated. Perhaps surprisingly, both of these properties mapped to the same region of the hLIF molecule. The predominant contribution was from residues in the loop linking the third and fourth helices, with lesser contributions from residues in the third helix and the loop connecting the second and third helices in the predicted three-dimensional structure. Since all chimeras retained full biological activity and receptor-binding activity on mouse cells, and there was little variation in the specific biological activity of the purified proteins, it can be concluded that the overall secondary and tertiary structures of each chimera were intact. This observation also implied that the primary binding sites on mLIF and hLIF for the mLIF-R were unaltered by inter-species domain swapping. Consequently, the site on the hLIF molecule that confers species-specific binding to the hLIF-R and higher affinity binding to the mLIF-R, must constitute an additional interaction site to that used by both mLIF and hLIF to bind to the mLIF-R. These studies define a maximum of 15 amino acid differences between hLIF and mLIF that are responsible for the different properties of these proteins. PMID:8253075
Andreotti, Renato; Pedroso, Marisela S; Caetano, Alexandre R; Martins, Natália F
2008-01-01
This paper reports the sequence analysis of Bm86 Campo Grande strain comparing it with Bm86 and Bm95 antigens from the preparations TickGardPLUS and Gavac, respectively. The PCR product was cloned into pMOSBlue and sequenced. The secondary structure prediction tool PSIPRED was used to calculate alpha helices and beta strand contents of the predicted polypeptide. The hydrophobicity profile was calculated using the algorithms from the Hopp and Woods method, in addition to identification of potential MHC class-I binding regions in the antigens. Pair-wise alignment revealed that the similarity between Bm86 Campo Grande strain and Bm86 is 0.2% higher than that between Bm86 Campo Grande strain and Bm95 antigens. The identities were 96.5% and 96.3% respectively. Major suggestive differences in hydrophobicity were predicted among the sequences in two specific regions.
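Percent identity from a pairwise alignment, as reported above, can be computed as follows. This is a sketch under stated assumptions: the two sequences are already aligned with `-` as the gap character, and columns containing a gap are excluded from the denominator. The abstract does not say which denominator convention was used, so the exact percentages may differ under another convention.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity over gap-free columns of two aligned sequences."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)
```

Calling `percent_identity("ACDE-G", "ACDFHG")` compares five gap-free columns, four of which match.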
USDA-ARS's Scientific Manuscript database
Transcription factors (TFs) are proteins that regulate the expression of target genes by binding to specific elements in their regulatory regions. Transcriptional regulators (TRs) also regulate the expression of target genes; however, they operate indirectly via interaction with the basal transcript...
Assessing the binding of cholinesterase inhibitors by docking and molecular dynamics studies.
Ali, M Rejwan; Sadoqi, Mostafa; Møller, Simon G; Boutajangout, Allal; Mezei, Mihaly
2017-09-01
In this report we assessed by docking and molecular dynamics the binding mechanisms of three FDA-approved Alzheimer drugs, inhibitors of the enzyme acetylcholinesterase (AChE): donepezil, galantamine and rivastigmine. Docking with the software packages Autodock-Vina, PatchDock and Plant reproduced the conformations of the inhibitor-enzyme complexes within 2 Å RMSD of the X-ray structures. Free-energy scores show strong affinity of the inhibitors for the enzyme binding pocket. Three independent molecular dynamics simulation runs indicated general stability of donepezil, galantamine and rivastigmine in their respective enzyme binding pockets (also referred to as the gorge) as well as a tendency to form hydrogen bonds with the water molecules. The binding of rivastigmine in the Torpedo californica AChE binding pocket is interesting, as it eventually undergoes carbamylation and breaks apart according to the X-ray structure of the complex. A similarity search in the ZINC database and targeted docking on the gorge region of the AChE enzyme gave new putative inhibitor molecules with high predicted binding affinity, suitable for potential biophysical and biological assessments. Copyright © 2017 Elsevier Inc. All rights reserved.
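The 2 Å criterion above refers to the root-mean-square deviation between docked and crystallographic ligand coordinates. A minimal sketch of that calculation is given here, assuming both poses list the same atoms in the same order and already share the receptor's coordinate frame, so no superposition is performed; the function name `rmsd` and the tuple-based input are illustrative choices.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) tuples.

    No fitting or superposition is applied: both poses are assumed
    to be expressed in the same coordinate frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

A docked pose would then count as reproducing the crystal pose when `rmsd(docked, xray) <= 2.0`, with coordinates in ångströms.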
Dissociation of glucocerebrosidase dimer in solution by its co-factor, saposin C
Gruschus, James M.; Jiang, Zhiping; Yap, Thai Leong; ...
2015-01-16
Mutations in the gene for the lysosomal enzyme glucocerebrosidase (GCase) cause Gaucher disease and are the most common risk factor for Parkinson disease (PD). Analytical ultracentrifugation of 8 μM GCase shows equilibrium between monomer and dimer forms. However, in the presence of its co-factor saposin C (Sap C), only monomer GCase is seen. Isothermal calorimetry confirms that Sap C associates with GCase in solution in a 1:1 complex (Kd = 2.1 ± 1.1 μM). Saturation cross-transfer NMR determined that the region of Sap C contacting GCase includes residues 63–66 and 74–76, which is distinct from the region known to enhance GCase activity. Because α-synuclein (α-syn), a protein closely associated with PD etiology, competes with Sap C for GCase binding, its interaction with GCase was also measured by ultracentrifugation and saturation cross-transfer. Unlike Sap C, binding of α-syn to GCase does not affect multimerization. However, adding α-syn reduces saturation cross-transfer from Sap C to GCase, confirming displacement. To explore where Sap C might disrupt multimeric GCase, GCase x-ray structures were analyzed using the program PISA, which predicted stable dimer and tetramer forms. In conclusion, for the most frequently predicted multimer interface, the GCase active sites are partially buried, suggesting that Sap C might disrupt the multimer by binding near the active site.
Detection of functionally important regions in "hypothetical proteins" of known structure.
Nimrod, Guy; Schushan, Maya; Steinberg, David M; Ben-Tal, Nir
2008-12-10
Structural genomics initiatives provide ample structures of "hypothetical proteins" (i.e., proteins of unknown function) at an ever increasing rate. However, without function annotation, this structural goldmine is of little use to biologists who are interested in particular molecular systems. To this end, we used (an improved version of) the PatchFinder algorithm for the detection of functional regions on the protein surface, which could mediate its interactions with, e.g., substrates, ligands, and other proteins. Examination, using a data set of annotated proteins, showed that PatchFinder outperforms similar methods. We collected 757 structures of hypothetical proteins and their predicted functional regions in the N-Func database. Inspection of several of these regions demonstrated that they are useful for function prediction. For example, we suggested an interprotein interface and a putative nucleotide-binding site. A web-server implementation of PatchFinder and the N-Func database are available at http://patchfinder.tau.ac.il/.
Solution Structure and Molecular Interactions of Lamin B Receptor Tudor Domain*
Liokatis, Stamatis; Edlich, Christian; Soupsana, Katerina; Giannios, Ioannis; Panagiotidou, Parthena; Tripsianes, Konstantinos; Sattler, Michael; Georgatos, Spyros D.; Politou, Anastasia S.
2012-01-01
Lamin B receptor (LBR) is a polytopic protein of the nuclear envelope thought to connect the inner nuclear membrane with the underlying nuclear lamina and peripheral heterochromatin. To better understand the function of this protein, we have examined in detail its nucleoplasmic region, which is predicted to harbor a Tudor domain (LBR-TD). Structural analysis by multidimensional NMR spectroscopy establishes that LBR-TD indeed adopts a classical β-barrel Tudor fold in solution, which, however, features an incomplete aromatic cage. Removal of LBR-TD renders LBR more mobile at the plane of the nuclear envelope, but the isolated module does not bind to nuclear lamins, heterochromatin proteins (MeCP2), and nucleosomes, nor does it associate with methylated Arg/Lys residues through its aromatic cage. Instead, LBR-TD exhibits tight and stoichiometric binding to the “histone-fold” region of unassembled, free histone H3, suggesting an interesting role in histone assembly. Consistent with such a role, robust binding to native nucleosomes is observed when LBR-TD is extended toward its carboxyl terminus, to include an area rich in Ser-Arg residues. The Ser-Arg region, alone or in combination with LBR-TD, binds both unassembled and assembled H3/H4 histones, suggesting that the TD/RS interface may operate as a “histone chaperone-like platform.” PMID:22052904
Secondary structure prediction and structure-specific sequence analysis of single-stranded DNA.
Dong, F; Allawi, H T; Anderson, T; Neri, B P; Lyamichev, V I
2001-08-01
DNA sequence analysis by oligonucleotide binding is often affected by interference with the secondary structure of the target DNA. Here we describe an approach that improves DNA secondary structure prediction by combining enzymatic probing of DNA by structure-specific 5'-nucleases with an energy minimization algorithm that utilizes the 5'-nuclease cleavage sites as constraints. The method can identify structural differences between two DNA molecules caused by minor sequence variations such as a single nucleotide mutation. It also demonstrates the existence of long-range interactions between DNA regions separated by >300 nt and the formation of multiple alternative structures by a 244 nt DNA molecule. The differences in the secondary structure of DNA molecules revealed by 5'-nuclease probing were used to design structure-specific probes for mutation discrimination that target the regions of structural, rather than sequence, differences. We also demonstrate the performance of structure-specific 'bridge' probes complementary to non-contiguous regions of the target molecule. The structure-specific probes do not require the high stringency binding conditions necessary for methods based on mismatch formation and permit mutation detection at temperatures from 4 to 37 degrees C. Structure-specific sequence analysis is applied for mutation detection in the Mycobacterium tuberculosis katG gene and for genotyping of the hepatitis C virus.
New Cysteine-Rich Ice-Binding Protein Secreted from Antarctic Microalga, Chloromonas sp.
Jung, Woongsic; Campbell, Robert L; Gwak, Yunho; Kim, Jong Im; Davies, Peter L; Jin, EonSeon
2016-01-01
Many microorganisms in Antarctica survive in the cold environment there by producing ice-binding proteins (IBPs) to control the growth of ice around them. An IBP from the Antarctic freshwater microalga, Chloromonas sp., was identified and characterized. The length of the Chloromonas sp. IBP (ChloroIBP) gene was 3.2 kb with 12 exons, and the molecular weight of the protein deduced from the ChloroIBP cDNA was 34.0 kDa. Expression of the ChloroIBP gene was up- and down-regulated by freezing and warming conditions, respectively. Western blot analysis revealed that native ChloroIBP was secreted into the culture medium. This protein has fifteen cysteines and is extensively disulfide bonded as shown by in-gel mobility shifts between oxidizing and reducing conditions. The open-reading frame of ChloroIBP was cloned and over-expressed in Escherichia coli to investigate the IBP’s biochemical characteristics. Recombinant ChloroIBP produced as a fusion protein with thioredoxin was purified by affinity chromatography and formed single ice crystals of a dendritic shape with a thermal hysteresis activity of 0.4±0.02°C at a concentration of 5 mg/ml. In silico structural modeling indicated that the three-dimensional structure of ChloroIBP was that of a right-handed β-helix. Site-directed mutagenesis of ChloroIBP showed that a conserved region of six parallel T-X-T motifs on the β-2 face was the ice-binding region, as predicted from the model. In addition to disulfide bonding, hydrophobic interactions between inward-pointing residues on the β-1 and β-2 faces, in the region of ice-binding motifs, were crucial to maintaining the structural conformation of ice-binding site and the ice-binding activity of ChloroIBP. PMID:27097164
Cogan, Peter S; Koch, Tad H
2003-11-20
The synthesis and preliminary evaluation of a doxorubicin-formaldehyde conjugate tethered to the nonsteroidal antiandrogen, cyanonilutamide (RU 56279), for the treatment of prostate cancer are reported. The relative ability of the targeting group to bind to the human androgen receptor was studied as a function of the tether. The tether served to attach the antiandrogen to the doxorubicin-formaldehyde conjugate via an N-Mannich base of a salicylamide derivative. The salicylamide was selected to serve as a trigger release mechanism to separate the doxorubicin-formaldehyde conjugate from the targeting group after it has bound to the androgen receptor. The remaining part of the tether consisted of a linear group that spanned from the 5-position of the salicylamide to the 3'-position of cyanonilutamide. The structures explored for the linear region of the tether were derivatives of di(ethylene glycol), tri(ethylene glycol), N,N'-disubstituted-piperazine, and 2-butyne-1,4-diol. Relative binding affinities of the tethers bound to the targeting group for the human androgen receptor were measured using a [3H]mibolerone competition assay and varied from 18% of nilutamide binding for the butynediol-based linear region to less than 1% for one of the piperazine derivatives. The complete targeted drug with the butynediol-based linear region has a relative binding affinity of 10%. This relative binding affinity is encouraging in light of the cocrystal structure of the human androgen receptor ligand-binding domain bound to the steroid Metribolone, which predicts very limited space for a tether connecting the antiandrogen on the inside to the cytotoxin on the outside.
Identifying mRNA sequence elements for target recognition by human Argonaute proteins
Li, Jingjing; Kim, TaeHyung; Nutiu, Razvan; Ray, Debashish; Hughes, Timothy R.; Zhang, Zhaolei
2014-01-01
It is commonly known that mammalian microRNAs (miRNAs) guide the RNA-induced silencing complex (RISC) to target mRNAs through the seed-pairing rule. However, recent experiments that coimmunoprecipitate the Argonaute proteins (AGOs), the central catalytic component of RISC, have consistently revealed extensive AGO-associated mRNAs that lack seed complementarity with miRNAs. We herein test the hypothesis that AGO has its own binding preference within target mRNAs, independent of guide miRNAs. By systematically analyzing the data from in vivo cross-linking experiments with human AGOs, we have identified a structurally accessible and evolutionarily conserved region (∼10 nucleotides in length) that alone can accurately predict AGO–mRNA associations, independent of the presence of miRNA binding sites. Within this region, we further identified an enriched motif that was replicable on independent AGO-immunoprecipitation data sets. We used RNAcompete to enumerate the RNA-binding preference of human AGO2 to all possible 7-mer RNA sequences and validated the AGO motif in vitro. These findings reveal a novel function of AGOs as sequence-specific RNA-binding proteins, which may aid miRNAs in recognizing their targets with high specificity. PMID:24663241
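The phrase "all possible 7-mer RNA sequences" can be made concrete with a short sketch. The Python snippet below is illustrative only: it is not the RNAcompete design or the authors' code, and the function name `all_kmers` and the alphabet ordering are assumptions introduced here.

```python
from itertools import product

RNA_ALPHABET = "ACGU"  # assumed ordering; any fixed order works


def all_kmers(k, alphabet=RNA_ALPHABET):
    """Return every length-k string over the alphabet, in product order."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]


seven_mers = all_kmers(7)
print(len(seven_mers))  # 4**7 = 16384
```

A binding-preference table for a protein would then hold one score per entry of `seven_mers`; how those scores are measured is outside the scope of this sketch.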
Smaczniak, Cezary; Muiño, Jose M; Chen, Dijun; Angenent, Gerco C; Kaufmann, Kerstin
2017-08-01
Floral organ identities in plants are specified by the combinatorial action of homeotic master regulatory transcription factors. However, how these factors achieve their regulatory specificities is still largely unclear. Genome-wide in vivo DNA binding data show that homeotic MADS domain proteins recognize partly distinct genomic regions, suggesting that DNA binding specificity contributes to functional differences of homeotic protein complexes. We used in vitro systematic evolution of ligands by exponential enrichment followed by high-throughput DNA sequencing (SELEX-seq) on several floral MADS domain protein homo- and heterodimers to measure their DNA binding specificities. We show that specification of reproductive organs is associated with distinct binding preferences of a complex formed by SEPALLATA3 and AGAMOUS. Binding specificity is further modulated by different binding site spacing preferences. Combination of SELEX-seq and genome-wide DNA binding data allows differentiation between targets in specification of reproductive versus perianth organs in the flower. We validate the importance of DNA binding specificity for organ-specific gene regulation by modulating promoter activity through targeted mutagenesis. Our study shows that intrafamily protein interactions affect DNA binding specificity of floral MADS domain proteins. Differential DNA binding of MADS domain protein complexes plays a role in the specificity of target gene regulation. © 2017 American Society of Plant Biologists. All rights reserved.
Laje, Gonzalo; Cannon, Dara M; Allen, Andrew S; Klaver, Jackie M; Peck, Summer A; Liu, Xinmin; Manji, Husseini K; Drevets, Wayne C; McMahon, Francis J
2010-07-01
In a previous study we showed that genetic variation in HTR2A, which encodes the serotonin 2A receptor, influenced outcome of citalopram treatment in patients with major depressive disorder. Since chronic administration of citalopram, which selectively and potently inhibits the serotonin transporter (5-HTT), putatively enhances serotonergic transmission, it is conceivable that genetic variation within HTR2A also influences pretreatment 5-HTT function or serotonergic transmission. The present study used positron emission tomography (PET) and the selective 5-HTT ligand, [11C]DASB, to investigate whether the HTR2A marker alleles that predict treatment outcome also predict differences in 5-HTT binding. Brain levels of 5-HTT were assessed in vivo using PET measures of the non-displaceable component of the [11C]DASB binding potential (BPND). DNA from 43 patients and healthy volunteers, all unmedicated, was genotyped with 14 single nucleotide polymorphisms located within or around HTR2A. Allelic association with BPND was assessed in eight brain regions, with covariates to control for race and ethnicity. We detected allelic association between [11C]DASB BPND in thalamus and three markers in a region spanning the 3' untranslated region and second intron of HTR2A (rs7333412, p=0.000045; rs7997012, p=0.000086; rs977003, p=0.000069). The association signal at rs7333412 remained significant (p<0.05) after applying corrections for multiple testing via permutation. Genetic variation in HTR2A that was previously associated with citalopram treatment outcome was also associated with thalamic 5-HTT binding. While further work is needed to identify the actual functional genetic variants involved, these results suggest that a relationship exists between genetic variation in HTR2A and either 5-HTT expression or central serotonergic transmission that influences the therapeutic response to 5-HTT inhibition in major depression.
CaMELS: In silico prediction of calmodulin binding proteins and their binding sites.
Abbasi, Wajid Arshad; Asif, Amina; Andleeb, Saiqa; Minhas, Fayyaz Ul Amir Afsar
2017-09-01
Due to Ca2+-dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet-lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet-lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state of the art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large-margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm which allows more effective learning over imprecisely annotated CaM-binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome-wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif-based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub-sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels. © 2017 Wiley Periodicals, Inc.
Hieber, A David; Bugos, Robert C; Verhoeven, Amy S; Yamamoto, Harry Y
2002-01-01
Violaxanthin de-epoxidase (VDE) is localized in the thylakoid lumen and catalyzes the de-epoxidation of violaxanthin to form antheraxanthin and zeaxanthin. VDE is predicted to be a lipocalin protein with a central barrel structure flanked by a cysteine-rich N-terminal domain and a glutamate-rich C-terminal domain. A full-length Arabidopsis thaliana (L.) Heynh. VDE and deletion mutants of the N- and C-terminal regions were expressed in Escherichia coli and tobacco (Nicotiana tabacum L. cv. Xanthi) plants. High expression of VDE in E. coli was achieved after adding the argU gene that encodes the E. coli arginine AGA tRNA. However, the specific activity of VDE expressed in E. coli was low, possibly due to incorrect folding. Removal of just 4 amino acids from the N-terminal region abolished all VDE activity whereas 71 C-terminal amino acids could be removed without affecting activity. The difficulties with expression in E. coli were overcome by expressing the Arabidopsis VDE in tobacco. The transformed tobacco exhibited a 13- to 19-fold increase in VDE specific activity, indicating correct protein folding. These plants also demonstrated an increase in the initial rate of nonphotochemical quenching consistent with an increased initial rate of de-epoxidation. Deletion mutations of the C-terminal region suggest that this region is important for binding of VDE to the thylakoid membrane. Accordingly, in vitro lipid-micelle binding experiments identified a region of 12 amino acids that is potentially part of a membrane-binding domain. The transformed tobacco plants are the first reported example of plants with an increased level of VDE activity.
Ma, Xin; Guo, Jing; Sun, Xiao
2016-01-01
DNA-binding proteins are fundamentally important in cellular processes. Several computational-based methods have been developed to improve the prediction of DNA-binding proteins in previous years. However, insufficient work has been done on the prediction of DNA-binding proteins from protein sequence information. In this paper, a novel predictor, DNABP (DNA-binding proteins), was designed to predict DNA-binding proteins using the random forest (RF) classifier with a hybrid feature. The hybrid feature contains two types of novel sequence features, which reflect information about the conservation of physicochemical properties of the amino acids, and the binding propensity of DNA-binding residues and non-binding propensities of non-binding residues. The comparisons with each feature demonstrated that these two novel features contributed most to the improvement in predictive ability. Furthermore, to improve the prediction performance of the DNABP model, feature selection using the minimum redundancy maximum relevance (mRMR) method combined with incremental feature selection (IFS) was carried out during the model construction. The results showed that the DNABP model could achieve 86.90% accuracy, 83.76% sensitivity, 90.03% specificity and a Matthews correlation coefficient of 0.727. High prediction accuracy and performance comparisons with previous research suggested that DNABP could be a useful approach to identify DNA-binding proteins from sequence information. The DNABP web server system is freely available at http://www.cbi.seu.edu.cn/DNABP/.
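The four figures quoted above (accuracy, sensitivity, specificity, Matthews correlation coefficient) all derive from a 2x2 confusion matrix. The sketch below shows the standard formulas in Python; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the DNABP study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    # MCC is conventionally reported as 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=90, fp=20, tn=80, fn=10)
print({name: round(value, 3) for name, value in metrics.items()})
```

Unlike accuracy, the MCC uses all four cells of the matrix, which is why it is often reported alongside accuracy when class sizes differ.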
Hillmer, Ansel T.; Wooten, Dustin W.; Tudorascu, Dana L.; Barnhart, Todd E.; Ahlers, Elizabeth O.; Resch, Leslie M.; Larson, Julie A.; Converse, Alexander K.; Moore, Colleen F.; Schneider, Mary L.; Christian, Bradley T.
2014-01-01
Background Previous studies have found interrelationships between the serotonin system and alcohol self-administration. The goal of this work was to directly observe in vivo effects of chronic ethanol self-administration on serotonin 5-HT1A receptor binding with [18F]mefway PET neuroimaging in rhesus monkeys. Subjects were first imaged alcohol-naïve and again during chronic ethanol self-administration to quantify changes in 5-HT1A receptor binding. Methods Fourteen rhesus monkey subjects (10.7-12.8 years) underwent baseline [18F]mefway PET scans prior to alcohol exposure. Subjects then drank gradually increasing ethanol doses over four months as an induction period, immediately followed by at least nine months ad libitum ethanol access. A post [18F]mefway PET scan was acquired during the final three months of ad libitum ethanol self-administration. 5-HT1A receptor binding was assayed with binding potential (BPND) using the cerebellum as a reference region. Changes in 5-HT1A binding during chronic ethanol self-administration were examined. Relationships of binding metrics with daily ethanol self-administration were also assessed. Results Widespread increases in 5-HT1A binding were observed during chronic ethanol self-administration, independent of the amount of ethanol consumed. A positive correlation between 5-HT1A binding in the raphe nuclei and average daily ethanol self-administration was also observed, indicating that baseline 5-HT1A binding in this region predicted drinking levels. Conclusions The increase in 5-HT1A binding levels during chronic ethanol self-administration demonstrates an important modulation of the serotonin system due to chronic alcohol exposure. Furthermore, the correlation between 5-HT1A binding in the raphe nuclei and daily ethanol self-administration indicates a relationship between the serotonin system and alcohol self-administration. PMID:25220896
T-Epitope Designer: A HLA-peptide binding prediction server.
Kangueane, Pandjassarame; Sakharkar, Meena Kishore
2005-05-15
The current challenge in synthetic vaccine design is the development of a methodology to identify and test short antigen peptides as potential T-cell epitopes. Recently, we described a HLA-peptide binding model (using structural properties) capable of predicting peptides binding to any HLA allele. Consequently, we have developed a web server named T-EPITOPE DESIGNER to facilitate HLA-peptide binding prediction. The prediction server is based on a model that defines peptide binding pockets using information gleaned from X-ray crystal structures of HLA-peptide complexes, followed by the estimation of peptide binding to binding pockets. Thus, the prediction server enables the calculation of peptide binding to HLA alleles. This model is superior to many existing methods because of its potential application to any given HLA allele whose sequence is clearly defined. The web server finds potential application in T cell epitope vaccine design. http://www.bioinformation.net/ted/
Position specific variation in the rate of evolution in transcription factor binding sites
Moses, Alan M; Chiang, Derek Y; Kellis, Manolis; Lander, Eric S; Eisen, Michael B
2003-01-01
Background The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution. Results Here we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms. Conclusion As has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. 
The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA. PMID:12946282
Busby, Ben; Oashi, Taiji; Willis, Chris D.; Ackermann, Maegen A.; Kontrogianni-Konstantopoulos, Aikaterini; MacKerell, Alexander D.; Bloch, Robert J.
2012-01-01
Small ankyrin 1 (sAnk1; also Ank1.5) is an integral protein of the sarcoplasmic reticulum in skeletal and cardiac muscle cells, where it is thought to bind to the C-terminal region of obscurin, a large modular protein that surrounds the contractile apparatus. Using fusion proteins in vitro, in combination with site directed mutagenesis and surface plasmon resonance measurements, we previously showed that the binding site on sAnk1 for obscurin consists in part of six lysine and arginine residues. Here we show that four charged residues in the high affinity binding site on obscurin for sAnk1, between residues 6316-6345, consisting of three glutamates and a lysine, are necessary, but not sufficient, for this site on obscurin to bind with high affinity to sAnk1. We also identify specific complementary mutations in sAnk1 that can partially or completely compensate for the changes in binding caused by charge-switching mutations in obscurin. We used molecular modeling to develop structural models of residues 6322-6339 of obscurin bound to sAnk1. The models, based on a combination of Brownian and molecular dynamics simulations, predict that the binding site on sAnk1 for obscurin is organized as two ankyrin-like repeats, with the last α-helical segment oriented at an angle to the nearby helices, allowing lysine-6338 of obscurin to form an ionic interaction with aspartate-111 of sAnk1. This prediction was validated by double mutant cycle experiments. Our results are consistent with a model in which electrostatic interactions between specific pairs of side chains on obscurin and sAnk1 promote binding and complex formation. PMID:21333652
Clifford, Jacob; Adami, Christoph
2015-09-02
Transcription factor binding to the surface of DNA regulatory regions is one of the primary mechanisms regulating gene expression levels. A common probabilistic approach to modeling protein-DNA interactions at the sequence level uses position weight matrices (PWMs), which estimate the joint probability of a DNA binding site sequence by assuming positional independence within the DNA sequence. Here we construct conditional PWMs that depend on the motif signatures in the flanking DNA sequence, by conditioning known binding site loci on the presence or absence of additional binding sites in the flanking sequence of each site's locus. Pooling known sites with similar flanking sequence patterns allows for the estimation of the conditional distribution function over the binding site sequences. We apply our model to the Dorsal transcription factor binding sites active in patterning the Dorsal-Ventral axis of Drosophila development. We find that those binding sites that cooperate with nearby Twist sites on average contain about 0.5 bits of information about the presence of Twist transcription factor binding sites in the flanking sequence. We also find that Dorsal binding site detectors conditioned on flanking sequence information make better predictions about what is a Dorsal site relative to background DNA than detection without information about flanking sequence features.
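A position weight matrix under the positional-independence assumption can be sketched in a few lines of Python. The code below is a minimal illustration, not the authors' implementation: the function names, the pseudocount of 1, the uniform background of 0.25 and the toy sites are all assumptions made for this example.

```python
import math
from collections import Counter

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0):
    """Estimate per-position base probabilities from equal-length aligned sites."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        pwm.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return pwm


def log_odds_score(pwm, seq, background=0.25):
    """Sum per-position log2 odds; valid only if positions are independent."""
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, seq))


# Toy aligned sites (hypothetical, not Dorsal sites).
toy_sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
pwm = build_pwm(toy_sites)
print(log_odds_score(pwm, "ACGT") > log_odds_score(pwm, "TTTT"))  # True
```

A conditional variant in the spirit of the abstract could call `build_pwm` separately on each pool of sites, for example sites with and without a flanking partner motif, and score a candidate with the matrix that matches its flanking context.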
2013-01-01
Background Heparin cofactor II (HCII) is a circulating protease inhibitor, one which contains an N-terminal acidic extension (HCII 1-75) unique within the serpin superfamily. Deletion of HCII 1-75 greatly reduces the ability of glycosaminoglycans (GAGs) to accelerate the inhibition of thrombin, and abrogates HCII binding to thrombin exosite 1. While a minor portion of HCII 1-75 can be visualized in a crystallized HCII-thrombin S195A complex, the role of the rest of the extension is not well understood and the affinity of the HCII 1-75 interaction has not been quantitatively characterized. To address these issues, we expressed HCII 1-75 as a small, N-terminally hexahistidine-tagged polypeptide in E. coli. Results Immobilized purified HCII 1-75 bound active α-thrombin and active-site inhibited FPR-ck- or S195A-thrombin, but not exosite-1-disrupted γT-thrombin, in microtiter plate assays. Biotinylated HCII 1-75 immobilized on streptavidin chips bound α-thrombin and FPR-ck-thrombin with similar KD values of 330-340 nM. HCII 1-75 competed thrombin binding to chip-immobilized HCII 1-75 more effectively than HCII 54-75 but less effectively than the C-terminal dodecapeptide of hirudin (mean Ki values of 2.6, 8.5, and 0.29 μM, respectively). This superiority over HCII 54-75 was also demonstrated in plasma clotting assays and in competing the heparin-catalysed inhibition of thrombin by plasma-derived HCII; HCII 1-53 had no effect in either assay. Molecular modelling of HCII 1-75 correctly predicted those portions of the acidic extension that had been previously visualized in crystal structures, and suggested that an α-helix found between residues 26 and 36 stabilizes one found between residues 61-67. The latter region has been previously shown by deletion mutagenesis and crystallography to play a crucial role in the binding of HCII to thrombin exosite 1. 
Conclusions Assuming that the KD value for HCII 1-75 of 330-340 nM faithfully predicts that of this region in intact HCII, and that 1-75 binding to exosite 1 is GAG-dependent, our results support a model in which thrombin first binds to GAGs, followed by HCII addition to the ternary complex and release of HCII 1-75 for exosite 1 binding and serpin mechanism inhibition. They further suggest that, in isolated or transferred form, the entire HCII 1-75 region is required to ensure maximal binding of thrombin exosite 1. PMID:23496873
Transcripts with in silico predicted RNA structure are enriched everywhere in the mouse brain
2012-01-01
Background Post-transcriptional control of gene expression is mostly conducted by specific elements in untranslated regions (UTRs) of mRNAs, in collaboration with specific binding proteins and RNAs. In several well characterized cases, these RNA elements are known to form stable secondary structures. RNA secondary structures also may have major functional implications for long noncoding RNAs (lncRNAs). Recent transcriptional data has indicated the importance of lncRNAs in brain development and function. However, no methodical efforts to investigate this have been undertaken. Here, we aim to systematically analyze the potential for RNA structure in brain-expressed transcripts. Results By comprehensive spatial expression analysis of the adult mouse in situ hybridization data of the Allen Mouse Brain Atlas, we show that transcripts (coding as well as non-coding) associated with in silico predicted structured probes are highly and significantly enriched in almost all analyzed brain regions. Functional implications of these RNA structures and their role in the brain are discussed in detail along with specific examples. We observe that mRNAs with a structure prediction in their UTRs are enriched for binding, transport and localization gene ontology categories. In addition, after manual examination we observe agreement between RNA binding protein interaction sites near the 3’ UTR structures and correlated expression patterns. Conclusions Our results show a potential use for RNA structures in expressed coding as well as noncoding transcripts in the adult mouse brain, and describe the role of structured RNAs in the context of intracellular signaling pathways and regulatory networks. Based on this data we hypothesize that RNA structure is widely involved in transcriptional and translational regulatory mechanisms in the brain and ultimately plays a role in brain function. PMID:22651826
Schaefke, Bernhard; Wang, Tzi-Yuan; Wang, Chuen-Yi; Li, Wen-Hsiung
2015-07-27
Gene expression evolution occurs through changes in cis- or trans-regulatory elements or both. Interactions between transcription factors (TFs) and their binding sites (TFBSs) constitute one of the most important points where these two regulatory components intersect. In this study, we investigated the evolution of TFBSs in the promoter regions of different Saccharomyces strains and species. We divided the promoter of a gene into the proximal region and the distal region, which are defined, respectively, as the 200-bp region upstream of the transcription starting site and as the 200-bp region upstream of the proximal region. We found that the predicted TFBSs in the proximal promoter regions tend to be evolutionarily more conserved than those in the distal promoter regions. Additionally, Saccharomyces cerevisiae strains used in the fermentation of alcoholic drinks have experienced more TFBS losses than gains compared with strains from other environments (wild strains, laboratory strains, and clinical strains). We also showed that differences in TFBSs correlate with the cis component of gene expression evolution between species (comparing S. cerevisiae and its sister species Saccharomyces paradoxus) and within species (comparing two closely related S. cerevisiae strains). © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
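The proximal and distal promoter definitions above translate directly into coordinate arithmetic. The following Python sketch assumes 0-based, half-open intervals and a `tss` argument giving the 0-based position of the transcription start site; the function name `promoter_regions` and these coordinate conventions are assumptions, not details from the study.

```python
def promoter_regions(tss, strand, width=200):
    """Return (proximal, distal) as 0-based half-open (start, end) intervals.

    proximal: the `width` bp immediately upstream of the TSS
    distal:   the `width` bp immediately upstream of the proximal region
    """
    if strand == "+":
        # Upstream means smaller coordinates; clamp at the chromosome start.
        proximal = (max(tss - width, 0), tss)
        distal = (max(tss - 2 * width, 0), max(tss - width, 0))
    elif strand == "-":
        # Upstream means larger coordinates; the chromosome length is unknown
        # here, so no clamping is applied at the far end.
        proximal = (tss + 1, tss + 1 + width)
        distal = (tss + 1 + width, tss + 1 + 2 * width)
    else:
        raise ValueError("strand must be '+' or '-'")
    return proximal, distal


print(promoter_regions(1000, "+"))  # ((800, 1000), (600, 800))
print(promoter_regions(1000, "-"))  # ((1001, 1201), (1201, 1401))
```

Predicted binding sites could then be assigned to a region by testing whether their coordinates fall inside the corresponding interval.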
Schneider, Markus; Rosam, Mathias; Glaser, Manuel; Patronov, Atanas; Shah, Harpreet; Back, Katrin Christiane; Daake, Marina Angelika; Buchner, Johannes; Antes, Iris
2016-10-01
Substrate binding to Hsp70 chaperones is involved in many biological processes, and the identification of potential substrates is important for a comprehensive understanding of these events. We present a multi-scale pipeline for an accurate, yet efficient prediction of peptides binding to the Hsp70 chaperone BiP by combining sequence-based prediction with molecular docking and MMPBSA calculations. First, we measured the binding of 15mer peptides from known substrate proteins of BiP by peptide array (PA) experiments and performed an accuracy assessment of the PA data by fluorescence anisotropy studies. Several sequence-based prediction models were fitted using this and other peptide binding data. A structure-based position-specific scoring matrix (SB-PSSM) derived solely from structural modeling data forms the core of all models. The matrix elements are based on a combination of binding energy estimations, molecular dynamics simulations, and analysis of the BiP binding site, which led to new insights into the peptide binding specificities of the chaperone. Using this SB-PSSM, peptide binders could be predicted with high selectivity even without training of the model on experimental data. Additional training further increased the prediction accuracies. Subsequent molecular docking (DynaDock) and MMGBSA/MMPBSA-based binding affinity estimations for predicted binders allowed the identification of the correct binding mode of the peptides as well as the calculation of nearly quantitative binding affinities. The general concept behind the developed multi-scale pipeline can readily be applied to other protein-peptide complexes with linearly bound peptides, for which sufficient experimental binding data for the training of classical sequence-based prediction models is not available. Proteins 2016; 84:1390-1407. © 2016 Wiley Periodicals, Inc.
Baresic, Mario; Salatino, Silvia; Kupr, Barbara
2014-01-01
Skeletal muscle tissue shows an extraordinary cellular plasticity, but the underlying molecular mechanisms are still poorly understood. Here, we use a combination of experimental and computational approaches to unravel the complex transcriptional network of muscle cell plasticity centered on the peroxisome proliferator-activated receptor γ coactivator 1α (PGC-1α), a regulatory nexus in endurance training adaptation. By integrating data on genome-wide binding of PGC-1α and gene expression upon PGC-1α overexpression with comprehensive computational prediction of transcription factor binding sites (TFBSs), we uncover a hitherto-underestimated number of transcription factor partners involved in mediating PGC-1α action. In particular, principal component analysis of TFBSs at PGC-1α binding regions predicts that, besides the well-known role of the estrogen-related receptor α (ERRα), the activator protein 1 complex (AP-1) plays a major role in regulating the PGC-1α-controlled gene program of the hypoxia response. Our findings thus reveal the complex transcriptional network of muscle cell plasticity controlled by PGC-1α. PMID:24912679
Tsai, Hung-Ji; Baller, Joshua A.; Liachko, Ivan; Koren, Amnon; Burrack, Laura S.; Hickman, Meleah A.; Thevandavakkam, Mathuravani A.; Rusche, Laura N.
2014-01-01
Origins of DNA replication are key genetic elements, yet their identification remains elusive in most organisms. In previous work, we found that centromeres contain origins of replication (ORIs) that are determined epigenetically in the pathogenic yeast Candida albicans. In this study, we used origin recognition complex (ORC) binding and nucleosome occupancy patterns in Saccharomyces cerevisiae and Kluyveromyces lactis to train a machine learning algorithm to predict the position of active arm (noncentromeric) origins in the C. albicans genome. The model identified bona fide active origins as determined by the presence of replication intermediates on nondenaturing two-dimensional (2D) gels. Importantly, these origins function at their native chromosomal loci and also as autonomously replicating sequences (ARSs) on a linear plasmid. A “mini-ARS screen” identified at least one and often two ARS regions of ≥100 bp within each bona fide origin. Furthermore, a 15-bp AC-rich consensus motif was associated with the predicted origins and conferred autonomous replicating activity to the mini-ARSs. Thus, while centromeres and the origins associated with them are epigenetic, arm origins are dependent upon critical DNA features, such as a binding site for ORC and a propensity for nucleosome exclusion. PMID:25182328
Empirically Optimized Flow Cytometric Immunoassay Validates Ambient Analyte Theory
Parpia, Zaheer A.; Kelso, David M.
2010-01-01
Ekins’ ambient analyte theory predicts, counterintuitively, that an immunoassay’s limit of detection can be improved by reducing the amount of capture antibody. It also anticipates that results should be insensitive to the volume of sample as well as the amount of capture antibody added. The objective of this study is to empirically validate all of the performance characteristics predicted by Ekins’ theory. Flow cytometric analysis was used to detect binding between a fluorescent ligand and capture microparticles since it can directly measure fractional occupancy, the primary response variable in ambient analyte theory. After experimentally determining ambient analyte conditions, comparisons were carried out between ambient and non-ambient assays in terms of their signal strengths, limits of detection, and their sensitivity to variations in reaction volume and number of particles. The critical number of binding sites required for an assay to be in the ambient analyte region was estimated to be 0.1VKd. As predicted, such assays exhibited superior signal/noise levels and limits of detection, and were not affected by variations in sample volume and number of binding sites. When the signal detected measures fractional occupancy, ambient analyte theory is an excellent guide to developing assays with superior performance characteristics. PMID:20152793
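The 0.1VKd criterion can be made concrete with a short worked calculation. The sketch below reads V as the reaction volume in litres and Kd as the dissociation constant in mol/L, so that 0.1·V·Kd is an amount in moles that converts to a site count through Avogadro's number; that reading, the function names, and the example values are assumptions, not details given in the abstract. The occupancy formula is the standard single-site equilibrium expression, which applies when the capture sites deplete a negligible fraction of the analyte.

```python
AVOGADRO = 6.02214076e23  # entities per mole (exact SI value)


def critical_binding_sites(volume_l: float, kd_molar: float, factor: float = 0.1) -> float:
    """Site count at the assumed ambient-analyte boundary: factor * V * Kd * N_A."""
    if volume_l <= 0 or kd_molar <= 0:
        raise ValueError("volume and Kd must be positive")
    return factor * volume_l * kd_molar * AVOGADRO


def fractional_occupancy(analyte_molar: float, kd_molar: float) -> float:
    """Equilibrium occupancy F = [A] / (Kd + [A]), assuming free analyte ~ total analyte."""
    if analyte_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be non-negative and Kd positive")
    return analyte_molar / (kd_molar + analyte_molar)


if __name__ == "__main__":
    # Hypothetical assay: 100 microlitre reaction, Kd of 1 nM.
    print(critical_binding_sites(100e-6, 1e-9))
    # Occupancy depends only on concentration, not on volume or site number.
    print(fractional_occupancy(1e-9, 1e-9))
```

Under these assumptions the occupancy function takes no volume or site-count argument at all, which is the formal counterpart of the insensitivity to sample volume and capture antibody amount that the abstract reports.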
Shilling, F M; Krätzschmar, J; Cai, H; Weskamp, G; Gayko, U; Leibow, J; Myles, D G; Nuccitelli, R; Blobel, C P
1997-06-15
Proteins containing a membrane-anchored metalloprotease domain, a disintegrin domain, and a cysteine-rich region (MDC proteins) are thought to play an important role in mammalian fertilization, as well as in somatic cell-cell interactions. We have identified PCR sequence tags encoding the disintegrin domain of five distinct MDC proteins from Xenopus laevis testis cDNA. Four of these sequence tags (xMDC9, xMDC11.1, xMDC11.2, and xMDC13) showed strong similarity to known mammalian MDC proteins, whereas the fifth (xMDC16) apparently represents a novel family member. Northern blot analysis revealed that the mRNA for xMDC16 was only expressed in testis, and not in heart, muscle, liver, ovaries, or eggs, whereas the mRNAs corresponding to the four other PCR products were expressed in testis and in some or all somatic tissues tested. The xMDC16 protein sequence, as predicted from the full-length cDNA, contains a metalloprotease domain with the active-site sequence HEXXH, a disintegrin domain, a cysteine-rich region, an EGF repeat, a transmembrane domain, and a short cytoplasmic tail. To study a potential role for these xMDC proteins in fertilization, peptides corresponding to the predicted integrin-binding domain of each protein were tested for their ability to inhibit X. laevis fertilization. Cyclic and linear xMDC16 peptides inhibited fertilization in a concentration-dependent manner, whereas xMDC16 peptides that were scrambled or had certain amino acid replacements in the predicted integrin-binding domain did not affect fertilization. Cyclic and linear xMDC9 peptides and linear xMDC13 peptides also inhibited fertilization similarly to xMDC16 peptides, whereas peptides corresponding to the predicted integrin-binding site of xMDC11.1 and xMDC11.2 did not. These results are discussed in the context of a model in which multiple MDC protein-receptor interactions are necessary for fertilization to occur.
Effects of the target aspect ratio and intrinsic reactivity onto diffusive search in bounded domains
NASA Astrophysics Data System (ADS)
Grebenkov, Denis S.; Metzler, Ralf; Oshanin, Gleb
2017-10-01
We study the mean first passage time (MFPT) to a reaction event on a specific site in a cylindrical geometry—characteristic, for instance, for bacterial cells, with a concentric inner cylinder representing the nuclear region of the bacterial cell. A similar problem emerges in the description of a diffusive search by a transcription factor protein for a specific binding region on a single strand of DNA. We develop a unified theoretical approach to study the underlying boundary value problem which is based on a self-consistent approximation of the mixed boundary condition. Our approach permits us to derive explicit, novel, closed-form expressions for the MFPT valid for a generic setting with an arbitrary relation between the system parameters. We analyse this general result in the asymptotic limits appropriate for the above-mentioned biophysical problems. Our investigation reveals the crucial role of the target aspect ratio and of the intrinsic reactivity of the binding region, which were disregarded in previous studies. Theoretical predictions are confirmed by numerical simulations.
Nielsen, Morten; Justesen, Sune; Lund, Ole; Lundegaard, Claus; Buus, Søren
2010-11-13
Binding of peptides to Major Histocompatibility class II (MHC-II) molecules plays a central role in governing responses of the adaptive immune system. MHC-II molecules sample peptides from the extracellular space, allowing the immune system to detect the presence of foreign microbes from this compartment. Predicting which peptides bind to an MHC-II molecule is therefore of pivotal importance for understanding the immune response and its effect on host-pathogen interactions. The experimental cost associated with characterizing the binding motif of an MHC-II molecule is significant, and considerable effort has therefore been invested in developing accurate computer methods capable of predicting this binding event. Prediction of peptide binding to MHC-II is complicated by the open binding cleft of the MHC-II molecule, allowing binding of peptides extending out of the binding groove. Moreover, the genes encoding the MHC molecules are immensely diverse, leading to a large set of different MHC molecules each potentially binding a unique set of peptides. Characterizing each MHC-II molecule using peptide-screening binding assays is hence not a viable option. Here, we present an MHC-II binding prediction algorithm aiming at dealing with these challenges. The method is a pan-specific version of the earlier published allele-specific NN-align algorithm and does not require any pre-alignment of the input data. This allows the method to benefit also from information from alleles covered by limited binding data. The method is evaluated on a large and diverse set of benchmark data, and is shown to significantly outperform state-of-the-art MHC-II prediction methods. In particular, the method is found to boost the performance for alleles characterized by limited binding data where conventional allele-specific methods tend to achieve poor prediction accuracy.
The method thus shows great potential for efficiently boosting the accuracy of MHC-II binding prediction, as accurate predictions can be obtained for novel alleles at highly reduced experimental costs. Pan-specific binding predictions can be obtained for all alleles with known protein sequence, and the method can benefit from including data in the training from alleles even where only a few binders are known. The method and benchmark data are available at http://www.cbs.dtu.dk/services/NetMHCIIpan-2.0.
Improved prediction of antibody VL–VH orientation
Marze, Nicholas A.; Lyskov, Sergey; Gray, Jeffrey J.
2016-01-01
Antibodies are important immune molecules with high commercial value and therapeutic interest because of their ability to bind diverse antigens. Computational prediction of antibody structure can quickly reveal valuable information about the nature of these antigen-binding interactions, but only if the models are of sufficient quality. To achieve high model quality during complementarity-determining region (CDR) structural prediction, one must account for the VL–VH orientation. We developed a novel four-metric VL–VH orientation coordinate frame. Additionally, we extended the CDR grafting protocol in RosettaAntibody with a new method that diversifies VL–VH orientation by using 10 VL–VH orientation templates rather than a single one. We tested the multiple-template grafting protocol on two datasets of known antibody crystal structures. During the template-grafting phase, the new protocol improved the fraction of accurate VL–VH orientation predictions from only 26% (12/46) to 72% (33/46) of targets. After the full RosettaAntibody protocol, including CDR H3 remodeling and VL–VH re-orientation, the new protocol produced more candidate structures with accurate VL–VH orientation than the standard protocol in 43/46 targets (93%). The improved ability to predict VL–VH orientation will bolster predictions of other parts of the paratope, including the conformation of CDR H3, a grand challenge of antibody homology modeling. PMID:27276984
Grossman, M J; Lampen, J O
1987-01-01
The location of the repressor gene, blaI, for the beta-lactamase gene blaP of Bacillus licheniformis 749, on the 5' side of blaP, was confirmed by sequencing the bla region of the constitutive mutant 749/C. An amber stop codon, likely to result in a nonfunctional truncated repressor, was found at codon 32 of the 128 codon blaI open reading frame (ORF) located 5' to blaP. In order to study the DNA binding activity of the repressor, the structural gene for blaI, from strain 749, with its ribosome binding site was expressed using a two-plasmid T7 RNA polymerase/promoter system (S. Tabor and C. C. Richardson, Proc. Natl. Acad. Sci. 82, 1074-1078 (1985)). Heat induction of this system in Escherichia coli K38 resulted in the production of BlaI as 5-10% of the soluble cell protein. Repressor protein was then purified by ammonium sulfate fractionation and cation exchange chromatography. The sequence of the N-terminal 28 amino acid residues was determined and was as predicted from the DNA. Binding of BlaI to DNA was detected by the slower migration of protein-DNA complexes during polyacrylamide gel electrophoresis. BlaI was shown to selectively bind DNA fragments carrying the promoter regions of blaI and blaP. PMID:3498148
New PAH gene promoter KLF1 and 3'-region C/EBPalpha motifs influence transcription in vitro.
Klaassen, Kristel; Stankovic, Biljana; Kotur, Nikola; Djordjevic, Maja; Zukic, Branka; Nikcevic, Gordana; Ugrin, Milena; Spasovski, Vesna; Srzentic, Sanja; Pavlovic, Sonja; Stojiljkovic, Maja
2017-02-01
Phenylketonuria (PKU) is a metabolic disease caused by mutations in the phenylalanine hydroxylase (PAH) gene. Although the PAH genotype remains the main determinant of PKU phenotype severity, genotype-phenotype inconsistencies have been reported. In this study, we focused on unanalysed sequences in non-coding PAH gene regions to assess their possible influence on the PKU phenotype. We transiently transfected HepG2 cells with various chloramphenicol acetyl transferase (CAT) reporter constructs which included PAH gene non-coding regions. Selected non-coding regions were indicated by in silico prediction to contain transcription factor binding sites. Furthermore, electrophoretic mobility shift assay (EMSA) and supershift assays were performed to identify which transcription factors were engaged in the interaction. We found a novel KLF1 motif in the PAH promoter, which decreases CAT activity by 50 % in comparison to basal transcription in vitro. The cytosine at the c.-170 promoter position creates an additional binding site for the protein complex involving the KLF1 transcription factor. Moreover, we assessed for the first time the role of a multivariant variable number tandem repeat (VNTR) region located in the 3'-region of the PAH gene. We found that the VNTR3, VNTR7 and VNTR8 constructs had approximately 60 % of CAT activity. The regulation is mediated by the C/EBPalpha transcription factor, present in the protein complex binding to VNTR3. Our study highlighted two novel motifs in the PAH gene, a promoter KLF1 motif and a 3'-region C/EBPalpha motif, which decrease transcription in vitro and, thus, could be considered as PAH expression modifiers. New transcription motifs in non-coding regions will contribute to better understanding of the PKU phenotype complexity and may become important for the optimisation of PKU treatment.
Protein docking prediction using predicted protein-protein interface.
Li, Bin; Kihara, Daisuke
2012-01-10
Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within the top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies from case to case, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts by performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by a second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.
Many human accelerated regions are developmental enhancers
Capra, John A.; Erwin, Genevieve D.; McKinsey, Gabriel; Rubenstein, John L. R.; Pollard, Katherine S.
2013-01-01
The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology. PMID:24218637
Ding, Xi-Qin; Pinon, Delia I; Furse, Kristina E; Lybrand, Terry P; Miller, Laurence J
2002-05-01
Insight into the molecular basis of cholecystokinin (CCK) binding to its receptor has come from receptor mutagenesis and photoaffinity labeling studies, with both contributing to the current hypothesis that the acidic Tyr-sulfate-27 residue within the peptide is situated adjacent to basic Arg(197) in the second loop of the receptor. Here, we refine our understanding of this region of interaction by examining a structure-activity series of these positions within both ligand and receptor and by performing three-dimensional molecular modeling of key pairs of modified ligand and receptor constructs. The important roles of Arg(197) and Tyr-sulfate-27 were supported by the marked negative impact on binding and biological response with their natural partner molecule when the receptor residue was replaced by acidic Asp or Glu and when the peptide residue was replaced by basic Arg, Lys, p-amino-Phe, p-guanidino-Phe, or p-methylamino-Phe. Complementary ligand-receptor charge-exchange experiments were unable to regain the lost function. This was supported by the molecular modeling, which demonstrated that the charge-reversed double mutants could not form a good interaction without extensive rearrangement of receptor conformation. The models further predicted that R197D and R197E mutations would lead to conformational changes in the extracellular domain, and this was experimentally supported by data showing that these mutations decreased peptide agonist and antagonist binding and increased nonpeptidyl antagonist binding. These receptor constructs also had increased susceptibility to trypsin degradation relative to the wild-type receptor. In contrast, the relatively conservative R197K mutation had modest negative impact on peptide agonist binding, again consistent with the modeling demonstration of loss of a series of stabilizing inter- and intramolecular bonds. 
The strong correlation between predicted and experimental results supports the reported refinement in the three-dimensional structure of the CCK-occupied receptor.
Lun, Cheng Man; Samuel, Robin L.; Gillmor, Susan D.; Boyd, Anthony; Smith, L. Courtney
2017-01-01
The purple sea urchin, Strongylocentrotus purpuratus, possesses a sophisticated innate immune system that functions without adaptive capabilities and responds to pathogens effectively by expressing the highly diverse SpTransformer gene family (formerly the Sp185/333 gene family). The swift gene expression response and the sequence diversity of SpTransformer cDNAs suggest that the encoded proteins have immune functions. Individual sea urchins can express up to 260 distinct SpTransformer proteins, and their diversity suggests that different versions may have different functions. Although the deduced proteins are diverse, they share an overall structure of a hydrophobic leader, a glycine-rich N-terminal region, a histidine-rich region, and a C-terminal region. Circular dichroism analysis of a recombinant SpTransformer protein, rSpTransformer-E1 (rSpTrf-E1), demonstrates that it is intrinsically disordered and transforms to α helical in the presence of buffer additives and binding targets. Although native SpTrf proteins are associated with the membranes of perinuclear vesicles in the phagocyte class of coelomocytes and are present on the surface of small phagocytes, they have no predicted transmembrane region or conserved site for glycophosphatidylinositol linkage. To determine whether native SpTrf proteins associate with phagocyte membranes through interactions with lipids, rSpTrf-E1 was incubated with lipid-embedded nylon strips; it binds to phosphatidic acid (PA) through both the glycine-rich region and the histidine-rich region. Synthetic liposomes composed of PA and phosphatidylcholine show binding between rSpTrf-E1 and PA by fluorescence resonance energy transfer, which is associated with leakage of luminal contents, suggesting changes in lipid organization and perhaps liposome lysis.
Interactions with liposomes also change membrane curvature leading to liposome budding, fusion, and invagination, which is associated with PA clustering induced by rSpTrf-E1 binding. Longer incubations result in the extraction of PA from the liposomes, which form disorganized clusters. CD shows that when rSpTrf-E1 binds to PA, it changes its secondary structure from disordered to α helical. These results provide evidence for how SpTransformer proteins may associate with molecules that have exposed phosphates including PA on cell membranes and how the characteristic of protein multimerization may drive changes in the organization of membrane lipids. PMID:28553283
2010-01-01
Background: The binding of peptide fragments of extracellular peptides to class II MHC is a crucial event in the adaptive immune response. Each MHC allotype generally binds a distinct subset of peptides and the enormous number of possible peptide epitopes prevents their complete experimental characterization. Computational methods can utilize the limited experimental data to predict the binding affinities of peptides to class II MHC. Results: We have developed the Regularized Thermodynamic Average, or RTA, method for predicting the affinities of peptides binding to class II MHC. RTA accounts for all possible peptide binding conformations using a thermodynamic average and includes a parameter constraint for regularization to improve accuracy on novel data. RTA was shown to achieve higher accuracy, as measured by AUC, than SMM-align on the same data for all 17 MHC allotypes examined. RTA also gave the highest accuracy on all but three allotypes when compared with results from 9 different prediction methods applied to the same data. In addition, the method correctly predicted the peptide binding register of 17 out of 18 peptide-MHC complexes. Finally, we found that suboptimal peptide binding registers, which are often ignored in other prediction methods, made significant contributions of at least 50% of the total binding energy for approximately 20% of the peptides. Conclusions: The RTA method accurately predicts peptide binding affinities to class II MHC and accounts for multiple peptide binding registers while reducing overfitting through regularization. The method has potential applications in vaccine design and in understanding autoimmune disorders. A web server implementing the RTA prediction method is available at http://bordnerlab.org/RTA/. PMID:20089173
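The idea of averaging over all binding registers instead of keeping only the best one can be sketched with the textbook Boltzmann-weighted free energy, ΔG_eff = -RT ln Σ_i exp(-ΔG_i/RT). This is a generic illustration of a thermodynamic average under stated assumptions, not the published RTA model: the per-register energies, the units (kcal/mol), the temperature, and the function names are hypothetical, and the regularization term mentioned in the abstract is not reproduced.

```python
import math
from typing import List, Sequence

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def thermodynamic_average(register_energies: Sequence[float],
                          temperature_k: float = 298.15) -> float:
    """Effective free energy over registers: -RT * ln(sum_i exp(-dG_i / RT))."""
    if not register_energies:
        raise ValueError("at least one register energy is required")
    rt = R_KCAL * temperature_k
    # Factor out the minimum energy (log-sum-exp trick) for numerical stability.
    e_min = min(register_energies)
    total = sum(math.exp(-(e - e_min) / rt) for e in register_energies)
    return e_min - rt * math.log(total)


def register_weights(register_energies: Sequence[float],
                     temperature_k: float = 298.15) -> List[float]:
    """Boltzmann population of each register; the weights sum to 1."""
    if not register_energies:
        raise ValueError("at least one register energy is required")
    rt = R_KCAL * temperature_k
    e_min = min(register_energies)
    factors = [math.exp(-(e - e_min) / rt) for e in register_energies]
    norm = sum(factors)
    return [f / norm for f in factors]
```

With a single register the average reduces to that register's energy; with several registers of similar energy the effective free energy is lower than the best single register, which is how suboptimal registers can contribute appreciably to the predicted affinity.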
Wei, Qing; La, David; Kihara, Daisuke
2017-01-01
Prediction of protein-protein interaction sites in a protein structure provides important information for elucidating the mechanism of protein function and can also be useful in guiding modeling or design procedures for protein complex structures. Since prediction methods essentially assess the propensity of amino acids that are likely to be part of a protein docking interface, they can help in designing protein-protein interactions. Here, we introduce the BindML and BindML+ methods for predicting protein-protein interaction sites. BindML predicts protein-protein interaction sites by identifying mutation patterns found in known protein-protein complexes using phylogenetic substitution models. BindML+ is an extension of BindML for distinguishing permanent and transient types of protein-protein interaction sites. We developed an interactive web-server that provides a convenient interface to assist in structural visualization of protein-protein interaction site predictions. The input to the web-server is a tertiary structure of interest. BindML and BindML+ are available at http://kiharalab.org/bindml/ and http://kiharalab.org/bindml/plus/ .
Papandreou, Nikos C.; Iconomidou, Vassiliki A.; Willis, Judith H.; Hamodrakas, Stavros J.
2010-01-01
The physical properties of cuticle are determined by the structure of its two major components, cuticular proteins (CPs) and chitin, and, also, by their interactions. A common consensus region (extended R&R Consensus) found in the majority of cuticular proteins, the CPRs, binds to chitin. Previous work established that β-pleated sheet predominates in the Consensus region and we proposed that it is responsible for the formation of helicoidal cuticle. Remote sequence similarity between CPRs and a lipocalin, bovine plasma retinol binding protein (RBP), led us to suggest an antiparallel β-sheet half-barrel structure as the basic folding motif of the R&R Consensus. There are several other families of cuticular proteins. One of the best defined is CPF. Its four members in Anopheles gambiae are expressed during the early stages of either pharate pupal or pharate adult development, suggesting that the proteins contribute to the outer regions of the cuticle, the epi- and/or exocuticle. These proteins did not bind to chitin in the same assay used successfully for CPRs. Although CPFs are distinct in sequence from CPRs, the same lipocalin could also be used to derive homology models for one Anopheles gambiae and one Drosophila melanogaster CPF. For the CPFs, the basic folding motif predicted is an eight-stranded, antiparallel β-sheet, full-barrel structure. Possible implications of this structure are discussed and docking experiments were carried out with one possible Drosophila ligand, 7(Z), 11(Z)-heptacosadiene. PMID:20417215
NASA Astrophysics Data System (ADS)
Athanasiou, Christina; Vasilakaki, Sofia; Dellis, Dimitris; Cournia, Zoe
2018-01-01
Computer-aided drug design has become an integral part of drug discovery and development in the pharmaceutical and biotechnology industry, and is nowadays extensively used in the lead identification and lead optimization phases. The drug design data resource (D3R) organizes challenges against blinded experimental data to prospectively test computational methodologies as an opportunity for improved methods and algorithms to emerge. We participated in Grand Challenge 2 to predict the crystallographic poses of 36 Farnesoid X Receptor (FXR)-bound ligands and the relative binding affinities for two designated subsets of 18 and 15 FXR-bound ligands. Here, we present our methodology for pose and affinity predictions and its evaluation after the release of the experimental data. For predicting the crystallographic poses, we used docking and physics-based pose prediction methods guided by the binding poses of native ligands. For FXR ligands with known chemotypes in the PDB, we accurately predicted their binding modes, while for those with unknown chemotypes the predictions were more challenging. Our group ranked first (based on the median RMSD) out of the 46 groups that submitted complete entries for the binding pose prediction challenge. For the relative binding affinity prediction challenge, we performed free energy perturbation (FEP) calculations coupled with molecular dynamics (MD) simulations. FEP/MD calculations displayed a high success rate in identifying compounds with better or worse binding affinity than the reference (parent) compound. Our studies suggest that when ligands with chemical precedent are available in the literature, binding pose predictions using docking and physics-based methods are reliable; however, predictions are challenging for ligands with completely unknown chemotypes.
We also show that FEP/MD calculations hold predictive value and can nowadays be used in a high throughput mode in a lead optimization project provided that crystal structures of sufficiently high quality are available.
SUCROSE SYNTHASE: ELUCIDATION OF COMPLEX POST-TRANSLATIONAL REGULATORY MECHANISMS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steven C. Huber
2009-05-12
Studies have focused on the enzyme sucrose synthase (SUS), which plays an important role in the metabolism of sucrose in seeds and tubers. There are three isoforms of SUS in maize, referred to as SUS1, SUS-SH1, and SUS2. SUS is generally considered to be a tetrameric protein, but recent evidence suggests that SUS can also occur as a dimeric protein. The formation of tetrameric SUS is regulated by sucrose concentration in vitro and this could also be an important factor in the cellular localization of the protein. We found that high sucrose concentrations, which promote tetramer formation, also inhibit the binding of SUS1 to actin filaments in vitro. Previously, high sucrose concentrations were shown to promote SUS association with the plasma membrane. The specific regions of the SUS molecule involved in oligomerization are not known, but we identified a region of the SUS1 molecule by bioinformatic analysis that was predicted to form a coiled coil. We demonstrated that this sequence could, in fact, self-associate as predicted for a coiled coil, but truncation analysis with the full-length recombinant protein suggested that it was not responsible for formation of dimers or tetramers. However, the coiled coil may function in binding of other proteins to SUS1. Overall, sugar availability may differentially influence the binding of SUS to cellular structures, and these effects may be mediated by changes in the oligomeric nature of the enzyme.
Bergner, Laura M.; Hickman, F. Edward; Wood, Kathleen H.; Wakeman, Carolyn M.; Stone, Hunter H.; Campbell, Tessa J.; Lightcap, Samantha B.; Favors, Sheena M.; Aldridge, Amanda C.
2010-01-01
Temporal coordination of meiosis with spermatid morphogenesis is crucial for successful generation of mature sperm cells. We identified a recessive male sterile Drosophila melanogaster mutant, mitoshell, in which events of spermatid morphogenesis are initiated too early, before meiotic onset. Premature mitochondrial aggregation and fusion lead to an aberrant mitochondrial shell around premeiotic nuclei. Despite successful meiotic karyokinesis, improper mitochondrial localization in mitoshell testes is associated with defective astral central spindles and a lack of contractile rings, leading to meiotic cytokinesis failure. We mapped and cloned the mitoshell gene and found that it encodes a novel protein with a bromodomain-related region. It is conserved in some insect lineages. Bromodomains typically bind to histone acetyl-lysine residues and therefore are often associated with chromatin. The Mitoshell bromodomain-related region is predicted to have an alpha helical structure similar to that of bromodomains, but not all the crucial residues in the ligand-binding loops are conserved. We speculate that Mitoshell may participate in transcriptional regulation of spermatogenesis-specific genes, though perhaps with different ligand specificity compared to traditional bromodomains. PMID:20491580
The identification and functional annotation of RNA structures conserved in vertebrates
Seemann, Stefan E.; Mirza, Aashiq H.; Hansen, Claus; Bang-Berthelsen, Claus H.; Garde, Christian; Christensen-Dalsgaard, Mikkel; Torarinsson, Elfar; Yao, Zizhen; Workman, Christopher T.; Pociot, Flemming; Nielsen, Henrik; Tommerup, Niels; Ruzzo, Walter L.; Gorodkin, Jan
2017-01-01
Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human–mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3′ ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality. PMID:28487280
Martinez, David R; Vandergrift, Nathan; Douglas, Ayooluwa O; McGuire, Erin; Bainbridge, John; Nicely, Nathan I; Montefiori, David C; Tomaras, Georgia D; Fouda, Genevieve G; Permar, Sallie R
2017-05-01
The development of an effective maternal HIV-1 vaccine that could synergize with antiretroviral therapy (ART) to eliminate pediatric HIV-1 infection will require the characterization of maternal immune responses capable of blocking transmission of autologous HIV to the infant. We previously determined that maternal plasma antibody binding to linear epitopes within the variable loop 3 (V3) region of HIV envelope (Env) and neutralizing responses against easy-to-neutralize tier 1 viruses were associated with reduced risk of peripartum HIV infection in the historic U.S. Woman and Infant Transmission Study (WITS) cohort. Here, we defined the fine specificity and function of the potentially protective maternal V3-specific IgG antibodies associated with reduced peripartum HIV transmission risk in this cohort. The V3-specific IgG binding that predicted low risk of mother-to-child-transmission (MTCT) was dependent on the C-terminal flank of the V3 crown and particularly on amino acid position 317, a residue that has also been associated with breakthrough transmission in the RV144 vaccine trial. Remarkably, the fine specificity of potentially protective maternal plasma V3-specific tier 1 virus-neutralizing responses was dependent on the same region in the V3 loop. Our findings suggest that MTCT risk is associated with neutralizing maternal IgG that targets amino acid residues in the C-terminal region of the V3 loop crown, suggesting the importance of the region in immunogen design for maternal vaccines to prevent MTCT. IMPORTANCE Efforts to curb HIV-1 transmission in pediatric populations by antiretroviral therapy (ART) have been highly successful in both developed and developing countries. However, more than 150,000 infants continue to be infected each year, likely due to a combination of late maternal HIV diagnosis, lack of ART access or adherence, and drug-resistant viral strains. 
Defining the fine specificity of maternal humoral responses that partially protect against MTCT of HIV is required to inform the development of a maternal HIV vaccine that will enhance these responses during pregnancy. In this study, we identified amino acid residues targeted by potentially protective maternal V3-specific IgG binding and neutralizing responses, localizing the potentially protective response in the C-terminal region of the V3 loop crown. Our findings have important implications for the design of maternal vaccination strategies that could synergize with ART during pregnancy to achieve the elimination of pediatric HIV infections. Copyright © 2017 American Society for Microbiology.
Identification of key residues for protein conformational transition using elastic network model.
Su, Ji Guo; Xu, Xian Jin; Li, Chun Hua; Chen, Wei Zu; Wang, Cun Xin
2011-11-07
Proteins usually undergo conformational transitions between structurally disparate states to fulfill their functions. The large-scale allosteric conformational transitions are believed to involve some key residues that mediate the conformational movements between different regions of the protein. In the present work, a thermodynamic method based on the elastic network model is proposed to predict the key residues involved in protein conformational transitions. In our method, the key functional sites are identified as the residues whose perturbations largely influence the free energy difference between the protein states before and after transition. Two proteins, the nucleotide binding domain of heat shock protein 70 and human/rat DNA polymerase β, are used as case studies to identify the critical residues responsible for their open-closed conformational transitions. The results show that the functionally important residues are mainly located in the following regions of these two proteins: (1) the bridging point at the interface between the subdomains that control the opening and closure of the binding cleft; (2) the hinge region between different subdomains, which mediates the cooperative motions between the corresponding subdomains; and (3) the substrate binding sites. The similarity in the positions of the key residues for these two proteins may indicate a common mechanism in their conformational transitions.
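As a rough illustration of what an elastic network model starts from, the sketch below builds the connectivity (Kirchhoff) matrix of a Gaussian network model, one common elastic network variant. The abstract does not say which variant, cutoff, or free-energy expression the authors used, so the function name, the 7.0 Å cutoff, and the toy coordinates are all assumptions, and the perturbation analysis itself is not reproduced here.

```python
import math

def kirchhoff_matrix(coords, cutoff=7.0):
    """Connectivity matrix of a Gaussian network model.

    coords: list of (x, y, z) positions, e.g. one C-alpha atom per residue.
    cutoff: contact distance in angstroms (assumed value, not from the abstract).
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Residues closer than the cutoff are joined by a spring.
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma

# Hypothetical toy chain: four residues spaced 3.8 angstroms apart on a line.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
gamma = kirchhoff_matrix(toy_coords)
```

In a perturbation-style analysis, one would typically change the springs attached to one residue at a time and recompute a quantity derived from this matrix for both conformational states.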
Identification and application of self-binding zipper-like sequences in SARS-CoV spike protein.
Zhang, Si Min; Liao, Ying; Neo, Tuan Ling; Lu, Yanning; Liu, Ding Xiang; Vahlne, Anders; Tam, James P
2018-05-22
Self-binding peptides containing zipper-like sequences, such as the Leu/Ile zipper sequence within the coiled coil regions of proteins and the cross-β spine steric zippers within the amyloid-like fibrils, could bind to the protein-of-origin through homophilic sequence-specific zipper motifs. These self-binding sequences represent opportunities for the development of biochemical tools and/or therapeutics. Here, we report on the identification of a putative self-binding β-zipper-forming peptide within the severe acute respiratory syndrome-associated coronavirus spike (S) protein and its application in viral detection. Peptide array scanning of overlapping peptides covering the entire length of S protein identified 34 putative self-binding peptides of six clusters, five of which contained octapeptide core consensus sequences. The Cluster I consensus octapeptide sequence GINITNFR was predicted by Eisenberg's 3D profile method to have high amyloid-like fibrillation potential through steric β-zipper formation. Peptide C6 containing the Cluster I consensus sequence was shown to oligomerize and form amyloid-like fibrils. Taking advantage of this, C6 was further applied to detect the S protein expression in vitro by fluorescence staining. Meanwhile, the coiled-coil-forming Leu/Ile heptad repeat sequences within the S protein were under-represented during peptide array scanning, consistent with the long peptide lengths required to attain high helix-mediated interaction avidity. The data suggest that short β-zipper-like self-binding peptides within the S protein could be identified through combining the peptide scanning and predictive methods, and could be exploited as biochemical detection reagents for viral infection. Copyright © 2018. Published by Elsevier Ltd.
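The overlapping-peptide design behind a peptide array scan can be sketched as a sliding window over the protein sequence. The window length, step size, function name, and the toy sequence below are assumptions for illustration; the abstract does not state the array parameters, and only the octapeptide GINITNFR comes from the source.

```python
def overlapping_peptides(sequence, length=8, step=1):
    """Return (start_index, peptide) pairs tiling the sequence with a sliding window."""
    return [
        (start, sequence[start:start + length])
        for start in range(0, len(sequence) - length + 1, step)
    ]

# Hypothetical toy sequence that embeds the Cluster I consensus octapeptide.
toy_sequence = "MKTAGINITNFRAVLS"
peptides = overlapping_peptides(toy_sequence)

# Locate windows that exactly match the consensus octapeptide.
hits = [start for start, peptide in peptides if peptide == "GINITNFR"]
```

A larger step reduces the number of peptides on the array at the cost of coarser positional resolution.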
Łochowska, Anna; Iwanicka-Nowicka, Roksana; Zielak, Agata; Modelewska, Anna; Thomas, Mark S.; Hryniewicz, Monika M.
2011-01-01
The genome of Burkholderia cenocepacia contains two genes encoding closely related LysR-type transcriptional regulators, CysB and SsuR, involved in control of sulfur assimilation processes. In this study we show that the function of SsuR is essential for the utilization of a number of organic sulfur sources of either environmental or human origin. Among the genes upregulated by SsuR identified here are the tauABC operon encoding a predicted taurine transporter, three tauD-type genes encoding putative taurine dioxygenases, and atsA encoding a putative arylsulfatase. The role of SsuR in expression of these genes/operons was characterized through (i) construction of transcriptional reporter fusions to candidate promoter regions and analysis of their expression in the presence/absence of SsuR and (ii) testing the ability of SsuR to bind SsuR-responsive promoter regions. We also demonstrate that expression of SsuR-activated genes is not repressed in the presence of inorganic sulfate. A more detailed analysis of four SsuR-responsive promoter regions indicated that ∼44 bp of the DNA sequence preceding and/or overlapping the predicted −35 element of such promoters is sufficient for SsuR binding. The DNA sequence homology among SsuR “recognition motifs” at different responsive promoters appears to be limited. PMID:21317335
Srinivasulu, Yerukala Sathipati; Wang, Jyun-Rong; Hsu, Kai-Ti; Tsai, Ming-Ju; Charoenkwan, Phasit; Huang, Wen-Lin; Huang, Hui-Ling; Ho, Shinn-Ying
2015-01-01
Background Protein-protein interactions (PPIs) are involved in various biological processes, and the underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. Results This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded a training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (pKd) of 200 heterodimeric protein complexes. A jackknife test gave a correlation coefficient of 0.34 and a mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn. 
Conclusions The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes. PMID:26681483
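The classification measures quoted for SVM-BAC (sensitivity, specificity, accuracy) all derive from a confusion matrix. The sketch below shows the standard definitions, plus the Matthews correlation coefficient as an additional common summary; the function name and the counts are hypothetical and are not the paper's data.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics for a two-class (high/low affinity) classifier."""
    sensitivity = tp / (tp + fn)          # fraction of true positives recovered
    specificity = tn / (tn + fp)          # fraction of true negatives recovered
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical counts chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=40, fp=10, tn=40, fn=10)
```

With these toy counts every rate is 0.8, while the MCC is 0.6, which illustrates why MCC is often reported alongside accuracy.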
Boehm, M K; Corper, A L; Wan, T; Sohi, M K; Sutton, B J; Thornton, J D; Keep, P A; Chester, K A; Begent, R H; Perkins, S J
2000-03-01
MFE-23 is the first single-chain Fv antibody molecule to be used in patients and is used to target colorectal cancer through its high affinity for carcinoembryonic antigen (CEA), a cell-surface member of the immunoglobulin superfamily. MFE-23 contains an N-terminal variable heavy-chain domain joined by a (Gly(4)Ser)(3) linker to a variable light-chain (V(L)) domain (kappa chain) with an 11-residue C-terminal Myc-tag. Its crystal structure was determined at 2.4 Å resolution by molecular replacement with an R(cryst) of 19.0%. Five of the six antigen-binding loops, L1, L2, L3, H1 and H2, conformed to known canonical structures. The sixth loop, H3, displayed a unique structure, with a beta-hairpin loop and a bifurcated apex characterized by a buried Thr residue. In the crystal lattice, two MFE-23 molecules were associated back-to-back in a manner not seen before. The antigen-binding site displayed a large acidic region located mainly within the H2 loop and a large hydrophobic region within the H3 loop. Even though this structure is unliganded within the crystal, there is an unusually large region of contact between the H1, H2 and H3 loops and the beta-sheet of the V(L) domain of an adjacent molecule (strands DEBA) as a result of intermolecular packing. These interactions exhibited remarkably high surface and electrostatic complementarity. Of seven MFE-23 residues predicted to make contact with antigen, five participated in these lattice contacts, and this model for antigen binding is consistent with previously reported site-specific mutagenesis of MFE-23 and its effect on CEA binding.
Hu, Xiaodan; Zhang, Xiao; Zhong, Jianfeng; Liu, Yuan; Zhang, Cunzheng; Xie, Yajing; Lin, Manman; Xu, Chongxin; Lu, Lina; Zhu, Qing; Liu, Xianjin
2018-05-01
Cadherin-like protein has been identified as the primary Bacillus thuringiensis (Bt) Cry toxin receptor in Lepidoptera pests and plays a key role in Cry toxin insecticidal activity. In this study, we successfully expressed the putative Cry1Ac toxin-binding region (CR7-CR11) of the Plutella xylostella cadherin-like protein in Escherichia coli BL21 (DE3). The expressed CR7-CR11 fragment showed binding ability to Cry1Ac toxin under denaturing (Ligand blot) and non-denaturing (ELISA) conditions. The three-dimensional structure of CR7-CR11 was constructed by homology modeling. Molecular docking of CR7-CR11 and Cry1Ac showed that domain II and domain III of Cry1Ac took part in binding to CR7-CR11, while CR7-CR8 was the region of CR7-CR11 that interacted with Cry1Ac. The interaction within the toxin-receptor complex was found to arise from hydrogen bonds and hydrophobic interactions. Through computer-aided alanine mutation scanning, amino acid residues of Cry1Ac (Met341, Asn442 and Ser486) and CR7-CR11 (Asp32, Arg101 and Arg127) were predicted as the hot spot residues involved in the interaction of the toxin-receptor complex. Finally, we verified the important role of these key amino acid residues by binding assay. These results will lay a foundation for further elucidating the insecticidal mechanism of Cry toxin and enhancing Cry toxin insecticidal activity by molecular modification. Copyright © 2018 Elsevier B.V. All rights reserved.
Mahdavi, Jafar; Oldfield, Neil J.; Wheldon, Lee M.; Wooldridge, Karl G.; Ala'Aldeen, Dlawer A. A.
2012-01-01
Neisseria meningitidis, Haemophilus influenzae and Streptococcus pneumoniae are major bacterial agents of meningitis. They each bind the 37/67-kDa laminin receptor (LamR) via the surface protein adhesins: meningococcal PilQ and PorA, H. influenzae OmpP2 and pneumococcal CbpA. We have previously reported that a surface-exposed loop of the R2 domain of CbpA mediates LamR-binding. Here we have identified the LamR-binding regions of PorA and OmpP2. Using truncated recombinant proteins we show that binding is dependent on amino acids 171–240 and 91–99 of PorA and OmpP2, respectively, which are predicted to localize to the fourth and second surface-exposed loops, respectively, of these proteins. Synthetic peptides corresponding to the loops bound LamR and could block LamR-binding to bacterial ligands in a dose-dependent manner. Meningococci expressing PorA lacking the apex of loop 4 and H. influenzae expressing OmpP2 lacking the apex of loop 2 showed significantly reduced LamR binding. Since both loops are hyper-variable, our data may suggest a molecular basis for the range of LamR-binding capabilities previously reported among different meningococcal and H. influenzae strains. PMID:23049988
Abouseada, Noha M; Assafi, Mahde Saleh A; Mahdavi, Jafar; Oldfield, Neil J; Wheldon, Lee M; Wooldridge, Karl G; Ala'Aldeen, Dlawer A A
2012-01-01
Neisseria meningitidis, Haemophilus influenzae and Streptococcus pneumoniae are major bacterial agents of meningitis. They each bind the 37/67-kDa laminin receptor (LamR) via the surface protein adhesins: meningococcal PilQ and PorA, H. influenzae OmpP2 and pneumococcal CbpA. We have previously reported that a surface-exposed loop of the R2 domain of CbpA mediates LamR-binding. Here we have identified the LamR-binding regions of PorA and OmpP2. Using truncated recombinant proteins we show that binding is dependent on amino acids 171-240 and 91-99 of PorA and OmpP2, respectively, which are predicted to localize to the fourth and second surface-exposed loops, respectively, of these proteins. Synthetic peptides corresponding to the loops bound LamR and could block LamR-binding to bacterial ligands in a dose-dependent manner. Meningococci expressing PorA lacking the apex of loop 4 and H. influenzae expressing OmpP2 lacking the apex of loop 2 showed significantly reduced LamR binding. Since both loops are hyper-variable, our data may suggest a molecular basis for the range of LamR-binding capabilities previously reported among different meningococcal and H. influenzae strains.
Bernini, Andrea; Henrici De Angelis, Lucia; Morandi, Edoardo; Spiga, Ottavia; Santucci, Annalisa; Assfalg, Michael; Molinari, Henriette; Pillozzi, Serena; Arcangeli, Annarosa; Niccolai, Neri
2014-03-01
Hotspot delineation on protein surfaces represents a fundamental step for targeting protein-protein interfaces. Disruptors of protein-protein interactions can be designed provided that the steric features of binding pockets, including the transient ones, can be defined. Molecular dynamics (MD) simulations have been used as a reliable framework for identifying transient pocket openings on the protein surface. Accessible surface area and intramolecular H-bond involvement of protein backbone amides are proposed as descriptors for characterizing binding pocket occurrence and evolution along MD trajectories. TEMPOL-induced paramagnetic perturbations on ¹H-¹⁵N HSQC signals of protein backbone amides have been analyzed as a fragment-based search for surface hotspots, in order to validate MD-predicted pockets. This procedure has been applied to CXCL12, a small chemokine responsible for tumor progression and proliferation. From combined analysis of MD data and paramagnetic profiles, two CXCL12 sites suitable for the binding of small molecules were identified. One of these sites is the already well characterized CXCL12 region involved in the binding to CXCR4 receptor. The other one is a transient pocket predicted by MD simulations, which could not be observed from static analysis of CXCL12 PDB structures. The present results indicate how TEMPOL, instrumental in identifying this transient pocket, can be a powerful tool to delineate minor conformations which can be highly relevant in dynamic discovery of antitumoral drugs. Copyright © 2013 Elsevier B.V. All rights reserved.
A deep learning framework for modeling structural features of RNA-binding protein targets
Zhang, Sai; Zhou, Jingtian; Hu, Hailin; Gong, Haipeng; Chen, Ligong; Cheng, Chao; Zeng, Jianyang
2016-01-01
RNA-binding proteins (RBPs) play important roles in the post-transcriptional control of RNAs. Identifying RBP binding sites and characterizing RBP binding preferences are key steps toward understanding the basic mechanisms of post-transcriptional gene regulation. Though numerous computational methods have been developed for modeling RBP binding preferences, discovering a complete structural representation of the RBP targets by integrating their available structural features in all three dimensions is still a challenging task. In this paper, we develop a general and flexible deep learning framework for modeling structural binding preferences and predicting binding sites of RBPs, which takes (predicted) RNA tertiary structural information into account for the first time. Our framework constructs a unified representation that characterizes the structural specificities of RBP targets in all three dimensions, which can be further used to predict novel candidate binding sites and discover potential binding motifs. Through testing on real CLIP-seq datasets, we have demonstrated that our deep learning framework can automatically extract effective hidden structural features from the encoded raw sequence and structural profiles, and predict accurate RBP binding sites. In addition, we have conducted the first study to show that integrating the additional RNA tertiary structural features can improve the model performance in predicting RBP binding sites, especially for the polypyrimidine tract-binding protein (PTB), which also provides new evidence to support the view that RBPs may have specific tertiary structural binding preferences. In particular, the tests on the internal ribosome entry site (IRES) segments yield satisfactory results with experimental support from the literature and further demonstrate the necessity of incorporating RNA tertiary structural information into the prediction model. 
The source code of our approach can be found in https://github.com/thucombio/deepnet-rbp. PMID:26467480
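Deep learning models for RBP binding sites take a numerically encoded sequence as input. A common choice is one-hot encoding, sketched below. This is an assumption about the general approach and is not taken from the linked repository; the function name, the A/C/G/U column order, and the handling of unknown bases are illustrative only.

```python
NUCLEOTIDES = "ACGU"  # assumed column order

def one_hot_rna(sequence):
    """Encode an RNA (or DNA) sequence as a list of 4-element 0/1 rows.

    Thymine is mapped to uracil; any other symbol (e.g. N) becomes an all-zero row.
    """
    encoded = []
    for base in sequence.upper().replace("T", "U"):
        encoded.append([1 if base == nucleotide else 0 for nucleotide in NUCLEOTIDES])
    return encoded

matrix = one_hot_rna("AUGCN")
```

Structural profiles, where available, are usually appended as extra columns per position, so that sequence and structure share one input matrix.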
Mochly-Rosen, D; Miller, K G; Scheller, R H; Khaner, H; Lopez, J; Smith, B L
1992-09-08
Receptors for activated protein kinase C (RACKs) have been isolated from the particulate cell fraction of heart and brain. We previously demonstrated that binding of protein kinase C (PKC) to RACKs requires PKC activators and is via a site on PKC that is distinct from the substrate binding site. Here, we examine the possibility that the C2 region in the regulatory domain of PKC is involved in binding of PKC to RACKs. The synaptic vesicle-specific p65 protein contains two regions homologous to the C2 region of PKC. We found that three p65 fragments, containing either one or two of these PKC C2 homologous regions, bound to highly purified RACKs. Binding of the p65 fragments and PKC to RACKs was mutually exclusive; preincubation of RACKs with the p65 fragments inhibited PKC binding, and preincubation of RACKs with PKC inhibited binding of the p65 fragments. Preincubation of the p65 fragments with a peptide resembling the PKC binding site on RACKs also inhibited p65 binding to RACKs, suggesting that PKC and p65 bind to the same or nearby regions on RACKs. Since the only homologous region between PKC and the p65 fragments is the C2 region, these results suggest that the C2 region on PKC contains at least part of the RACK binding site.
Grzesik, Paul; Kreuchwig, Annika; Rutz, Claudia; Furkert, Jens; Wiesner, Burkhard; Schuelein, Ralf; Kleinau, Gunnar; Gromoll, Joerg; Krause, Gerd
2015-01-01
The human lutropin (hLH)/choriogonadotropin (hCG) receptor (LHCGR) can be activated by binding two slightly different gonadotropic glycoprotein hormones, choriogonadotropin (CG) – secreted by the placenta, and lutropin (LH) – produced by the pituitary. They induce different signaling profiles at the LHCGR. This cannot be explained by binding to the receptor’s leucine-rich-repeat domain (LRRD), as this binding is similar for the two hormones. We therefore speculate that there are previously unknown differences in the hormone/receptor interaction at the extracellular hinge region, which might help to understand functional differences between the two hormones. We have therefore performed a detailed study of the binding and action of LH and CG at the LHCGR hinge region. We focused on a primate-specific additional exon in the hinge region, which is located between LRRD and the serpentine domain. The segment of the hinge region encoded by exon10 was previously reported to be only relevant to hLH signaling, as the exon10-deletion receptor exhibits decreased hLH signaling, but unchanged hCG signaling. We designed an advanced homology model of the hormone/LHCGR complex, followed by experimental characterization of relevant fragments in the hinge region. In addition, we examined predictions of a helical exon10-encoded conformation by block-wise polyalanine (helix supporting) mutations. These helix preserving modifications showed no effect on hormone-induced signaling. However, introduction of a structure-disturbing double-proline mutant LHCGR-Q303P/E305P within the exon10-helix has, in contrast to exon10-deletion, no impact on hLH, but only on hCG signaling. This opposite effect on signaling by hLH and hCG can be explained by distinct sites of hormone interaction in the hinge region. 
In conclusion, our analysis provides details of the differences between hLH- and hCG-induced signaling that are mainly determined in the L2-beta loop of the hormones and in the hinge region of the receptor. PMID:26441830
A Mixed QM/MM Scoring Function to Predict Protein-Ligand Binding Affinity
Hayik, Seth A.; Dunbrack, Roland; Merz, Kenneth M.
2010-01-01
Computational methods for predicting protein-ligand binding free energy continue to be popular as a potential cost-cutting method in the drug discovery process. However, accurate predictions are often difficult to make as estimates must be made for certain electronic and entropic terms in conventional force field based scoring functions. Mixed quantum mechanics/molecular mechanics (QM/MM) methods allow electronic effects for a small region of the protein to be calculated, treating the remaining atoms as a fixed charge background for the active site. Such a semi-empirical QM/MM scoring function has been implemented in AMBER using DivCon and tested on a set of 23 metalloprotein-ligand complexes, where QM/MM methods provide a particular advantage in the modeling of the metal ion. The binding affinity of this set of proteins can be calculated with an R² of 0.64 and a standard deviation of 1.88 kcal/mol without fitting, and an R² of 0.71 and a standard deviation of 1.69 kcal/mol with fitted weighting of the individual scoring terms. In this study we explore using various methods to calculate terms in the binding free energy equation, including entropy estimates and minimization standards. From these studies we found that using the rotatable bond estimate of ligand entropy results in a reasonable R² of 0.63 without fitting. We also found that using the ESCF energy of the proteins without minimization resulted in an R² of 0.57, when using the rotatable bond entropy estimate. PMID:21221417
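The R² values above summarize how well computed scores track measured affinities. A minimal sketch of one common definition, the squared Pearson correlation, is given below; the abstract does not specify which definition the authors used, and the function name and toy numbers are hypothetical.

```python
def pearson_r_squared(predicted, observed):
    """Squared Pearson correlation between predicted and observed values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    covariance = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return covariance ** 2 / (var_p * var_o)

# Hypothetical toy values: a perfectly linear relation and a weakly correlated one.
perfect = pearson_r_squared([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
weak = pearson_r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

Because this measure is invariant to linear rescaling, it can be compared between the unfitted and fitted scoring functions, while the standard deviation in kcal/mol reports the absolute error.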
Ashford, Paul; Moss, David S; Alex, Alexander; Yeap, Siew K; Povia, Alice; Nobeli, Irene; Williams, Mark A
2012-03-14
Protein structures provide a valuable resource for rational drug design. For a protein with no known ligand, computational tools can predict surface pockets that are of suitable size and shape to accommodate a complementary small-molecule drug. However, pocket prediction against single static structures may miss features of pockets that arise from proteins' dynamic behaviour. In particular, ligand-binding conformations can be observed as transiently populated states of the apo protein, so it is possible to gain insight into ligand-bound forms by considering conformational variation in apo proteins. This variation can be explored by considering sets of related structures: computationally generated conformers, solution NMR ensembles, multiple crystal structures, homologues or homology models. It is non-trivial to compare pockets, either from different programs or across sets of structures. For a single structure, difficulties arise in defining a particular pocket's boundaries. For a set of conformationally distinct structures the challenge is how to make reasonable comparisons between them given that a perfect structural alignment is not possible. We have developed a computational method, Provar, that provides a consistent representation of predicted binding pockets across sets of related protein structures. The outputs are probabilities that each atom or residue of the protein borders a predicted pocket. These probabilities can be readily visualised on a protein using existing molecular graphics software. We show how Provar simplifies comparison of the outputs of different pocket prediction algorithms, of pockets across multiple simulated conformations and between homologous structures. 
We demonstrate the benefits of use of multiple structures for protein-ligand and protein-protein interface analysis on a set of complexes and consider three case studies in detail: i) analysis of a kinase superfamily highlights the conserved occurrence of surface pockets at the active and regulatory sites; ii) a simulated ensemble of unliganded Bcl2 structures reveals extensions of a known ligand-binding pocket not apparent in the apo crystal structure; iii) visualisations of interleukin-2 and its homologues highlight conserved pockets at the known receptor interfaces and regions whose conformation is known to change on inhibitor binding. Through post-processing of the output of a variety of pocket prediction software, Provar provides a flexible approach to the analysis and visualization of the persistence or variability of pockets in sets of related protein structures.
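The per-residue probabilities described in the Provar abstract above can be illustrated with a short sketch. This is not Provar's implementation: it assumes, hypothetically, that a pocket predictor has already produced one set of pocket-lining residue identifiers per structure, and it estimates each residue's probability as the fraction of structures in which it lines a pocket. The function name and the residue labels are illustrative only.

```python
from collections import Counter
from typing import Dict, Iterable, List


def pocket_lining_frequency(
    pocket_residues_per_structure: List[Iterable[str]],
) -> Dict[str, float]:
    """Fraction of structures in which each residue borders a predicted pocket.

    Each element of the input is the collection of residue identifiers that a
    (hypothetical) pocket predictor flagged as pocket-lining in one structure.
    """
    n_structures = len(pocket_residues_per_structure)
    if n_structures == 0:
        raise ValueError("at least one structure is required")

    counts: Counter = Counter()
    for residues in pocket_residues_per_structure:
        # Use a set so a residue is counted at most once per structure.
        counts.update(set(residues))

    return {res: n / n_structures for res, n in counts.items()}


# Hypothetical ensemble of four conformers of the same protein.
ensemble = [{"A10", "A11"}, {"A10"}, {"A10", "A12"}, {"A11"}]
frequencies = pocket_lining_frequency(ensemble)
```

Residues missing from the returned mapping never lined a pocket in any structure; a real workflow would also need a consistent residue numbering across homologues, which this sketch takes for granted.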
sNebula, a network-based algorithm to predict binding between human leukocyte antigens and peptides
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luo, Heng; Ye, Hao; Ng, Hui Wen
Understanding the binding between human leukocyte antigens (HLAs) and peptides is important to understand the functioning of the immune system. Since it is time-consuming and costly to measure the binding between large numbers of HLAs and peptides, computational methods including machine learning models and network approaches have been developed to predict HLA-peptide binding. However, there are several limitations for the existing methods. We developed a network-based algorithm called sNebula to address these limitations. We curated qualitative Class I HLA-peptide binding data and demonstrated the prediction performance of sNebula on this dataset using leave-one-out cross-validation and five-fold cross-validations. Furthermore, this algorithm can predict not only peptides of different lengths and different types of HLAs, but also the peptides or HLAs that have no existing binding data. We believe sNebula is an effective method to predict HLA-peptide binding and thus improve our understanding of the immune system.
sNebula, a network-based algorithm to predict binding between human leukocyte antigens and peptides
Luo, Heng; Ye, Hao; Ng, Hui Wen; Sakkiah, Sugunadevi; Mendrick, Donna L.; Hong, Huixiao
2016-01-01
Understanding the binding between human leukocyte antigens (HLAs) and peptides is important to understand the functioning of the immune system. Since it is time-consuming and costly to measure the binding between large numbers of HLAs and peptides, computational methods including machine learning models and network approaches have been developed to predict HLA-peptide binding. However, there are several limitations for the existing methods. We developed a network-based algorithm called sNebula to address these limitations. We curated qualitative Class I HLA-peptide binding data and demonstrated the prediction performance of sNebula on this dataset using leave-one-out cross-validation and five-fold cross-validations. This algorithm can predict not only peptides of different lengths and different types of HLAs, but also the peptides or HLAs that have no existing binding data. We believe sNebula is an effective method to predict HLA-peptide binding and thus improve our understanding of the immune system. PMID:27558848
sNebula, a network-based algorithm to predict binding between human leukocyte antigens and peptides
Luo, Heng; Ye, Hao; Ng, Hui Wen; ...
2016-08-25
Understanding the binding between human leukocyte antigens (HLAs) and peptides is important to understand the functioning of the immune system. Since it is time-consuming and costly to measure the binding between large numbers of HLAs and peptides, computational methods including machine learning models and network approaches have been developed to predict HLA-peptide binding. However, there are several limitations for the existing methods. We developed a network-based algorithm called sNebula to address these limitations. We curated qualitative Class I HLA-peptide binding data and demonstrated the prediction performance of sNebula on this dataset using leave-one-out cross-validation and five-fold cross-validations. Furthermore, this algorithm can predict not only peptides of different lengths and different types of HLAs, but also the peptides or HLAs that have no existing binding data. We believe sNebula is an effective method to predict HLA-peptide binding and thus improve our understanding of the immune system.
Holdsworth, Gill; Slocombe, Patrick; Doyle, Carl; Sweeney, Bernadette; Veverka, Vaclav; Le Riche, Kelly; Franklin, Richard J.; Compson, Joanne; Brookings, Daniel; Turner, James; Kennedy, Jeffery; Garlish, Rachael; Shi, Jiye; Newnham, Laura; McMillan, David; Muzylak, Mariusz; Carr, Mark D.; Henry, Alistair J.; Ceska, Thomas; Robinson, Martyn K.
2012-01-01
LRP5 and LRP6 are proteins predicted to contain four six-bladed β-propeller domains and both bind the bone-specific Wnt signaling antagonist sclerostin. Here, we report the crystal structure of the amino-terminal region of LRP6 and using NMR show that the ability of sclerostin to bind to this molecule is mediated by the central core of sclerostin and does not involve the amino- and carboxyl-terminal flexible arm regions. We show that this structured core region interacts with LRP5 and LRP6 via an NXI motif (found in the sequence PNAIG) within a flexible loop region (loop 2) within the central core region. This sequence is related closely to a previously identified motif in laminin that mediates its interaction with the β-propeller domain of nidogen. However, the NXI motif is not involved in the interaction of sclerostin with LRP4 (another β-propeller containing protein in the LRP family). A peptide derived from the loop 2 region of sclerostin blocked the interaction of sclerostin with LRP5/6 and also inhibited Wnt1 but not Wnt3A or Wnt9B signaling. This suggests that these Wnts interact with LRP6 in different ways. PMID:22696217
Binding Sites Analyser (BiSA): Software for Genomic Binding Sites Archiving and Overlap Analysis
Khushi, Matloob; Liddle, Christopher; Clarke, Christine L.; Graham, J. Dinny
2014-01-01
Genome-wide mapping of transcription factor binding and histone modification reveals complex patterns of interactions. Identifying overlaps in binding patterns by different factors is a major objective of genomic studies, but existing methods to archive large numbers of datasets in a personalised database lack sophistication and utility. Therefore we have developed transcription factor DNA binding site analyser software (BiSA), for archiving of binding regions and easy identification of overlap with or proximity to other regions of interest. Analysis results can be restricted by chromosome or base pair overlap between regions or maximum distance between binding peaks. BiSA is capable of reporting overlapping regions that share common base pairs; regions that are nearby; regions that are not overlapping; and average region sizes. BiSA can identify genes located near binding regions of interest, genomic features near a gene or locus of interest and statistical significance of overlapping regions can also be reported. Overlapping results can be visualized as Venn diagrams. A major strength of BiSA is that it is supported by a comprehensive database of publicly available transcription factor binding sites and histone modifications, which can be directly compared to user data. The documentation and source code are available on http://bisa.sourceforge.net PMID:24533055
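The base-pair overlap test that the BiSA abstract above describes can be sketched as follows. This is a minimal illustration, not BiSA's code: the `Region` type, the half-open coordinate convention, and the `min_bp` threshold parameter are all assumptions introduced here.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive


def overlap_bp(a: Region, b: Region) -> int:
    """Number of base pairs shared by two regions (0 if none)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def find_overlaps(
    queries: List[Region], targets: List[Region], min_bp: int = 1
) -> List[Tuple[Region, Region]]:
    """All (query, target) pairs sharing at least min_bp base pairs."""
    return [
        (q, t)
        for q in queries
        for t in targets
        if overlap_bp(q, t) >= min_bp
    ]


# Hypothetical binding regions for two factors.
factor_a = [Region("chr1", 100, 200), Region("chr2", 500, 600)]
factor_b = [Region("chr1", 150, 300), Region("chr2", 700, 800)]
pairs = find_overlaps(factor_a, factor_b)
```

The nested loop is quadratic in the number of regions; for genome-scale data a sorted sweep or an interval tree would be the usual choice, but the overlap arithmetic stays the same.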
Zhang, Hui; Wang, Yi-Jun; Zhang, Yun-Kai; Wang, De-Shen; Kathawala, Rishil J; Patel, Atish; Talele, Tanaji T; Chen, Zhe-Sheng; Fu, Li-Wu
2014-08-01
AST1306, an inhibitor of EGFR and ErbB2, is currently in phase I of clinical trials. We evaluated the effect of AST1306 on the reversal of multidrug resistance (MDR) induced by ATP-binding cassette (ABC) transporters. We found that AST1306 significantly sensitized the ABC subfamily G member 2 (ABCG2)-overexpressing cells to ABCG2 substrate chemotherapeutics. AST1306 significantly increased intracellular accumulation of [³H]-mitoxantrone in ABCG2-overexpressing cells by blocking ABCG2 efflux function. Moreover, AST1306 stimulated the ATPase activity of ABCG2. Homology modeling predicted the binding conformation of AST1306 to be within the transmembrane region of ABCG2. In conclusion, AST1306 could notably reverse ABCG2-mediated MDR. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Accurate modeling of defects in graphene transport calculations
NASA Astrophysics Data System (ADS)
Linhart, Lukas; Burgdörfer, Joachim; Libisch, Florian
2018-01-01
We present an approach for embedding defect structures modeled by density functional theory into large-scale tight-binding simulations. We extract local tight-binding parameters for the vicinity of the defect site using Wannier functions. In the transition region between the bulk lattice and the defect the tight-binding parameters are continuously adjusted to approach the bulk limit far away from the defect. This embedding approach allows for an accurate high-level treatment of the defect orbitals using as many as ten nearest neighbors while keeping a small number of nearest neighbors in the bulk to render the overall computational cost reasonable. As an example of our approach, we consider an extended graphene lattice decorated with Stone-Wales defects, flower defects, double vacancies, or silicon substitutes. We predict distinct scattering patterns mirroring the defect symmetries and magnitude that should be experimentally accessible.
RBind: computational network method to predict RNA binding sites.
Wang, Kaili; Jian, Yiren; Wang, Huiwen; Zeng, Chen; Zhao, Yunjie
2018-04-26
Non-coding RNA molecules play essential roles by interacting with other molecules to perform various biological functions. However, it is difficult to determine RNA structures due to their flexibility. At present, the number of experimentally solved RNA-ligand and RNA-protein structures is still insufficient. Therefore, binding site prediction for non-coding RNA is required to understand their functions. Current RNA binding site prediction algorithms produce many false positive nucleotides that are far from the binding sites. Here, we present a network approach, RBind, to predict the RNA binding sites. We benchmarked RBind in RNA-ligand and RNA-protein datasets. The average accuracy of 0.82 in RNA-ligand and 0.63 in RNA-protein testing showed that this network strategy has a reliable accuracy for binding site prediction. The codes and datasets are available at https://zhaolab.com.cn/RBind. Contact: yjzhaowh@mail.ccnu.edu.cn. Supplementary data are available at Bioinformatics online.
Gong, Wuming; Koyano-Nakagawa, Naoko; Li, Tongbin; Garry, Daniel J
2015-03-07
Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoter or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p < 10^-100), suggesting that a high LR score is a reliable indicator for functional TF binding sites. Our network inference model identified a region with an elevated LR score approximately -9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (-9435 to -8922 bp). 
TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP-CM transitions. We report a novel method to systematically integrate multi-dimensional -omics data and reconstruct the gene regulatory networks. This method will allow one to rapidly determine the cis-modules that regulate key genes during cardiac differentiation.
Rawat, Manmeet; Vijay, Sonam; Gupta, Yash; Tiwari, Pramod Kumar; Sharma, Arun
2013-01-01
Plasmepsin V (PM-V) has functionally conserved orthologues across the Plasmodium genus whose binding and antigenic processing at the PEXEL motifs for export of about 200-300 essential proteins is important for the virulence and viability of the causative Plasmodium species. This study was undertaken to determine P. vivax plasmepsin V Ind (PvPM-V-Ind) PEXEL motif export pathway for pathogenicity-related proteins/antigens export thereby altering plasmodium exportome during erythrocytic stages. We identify and characterize Plasmodium vivax plasmepsin-V-Ind (mutant) gene by cloning, sequence analysis, in silico bioinformatic protocols and structural modeling predictions based on docking studies on binding capacity with PEXEL motifs processing in terms of binding and accessibility of export proteins. Cloning and sequence analysis for genetic diversity demonstrates PvPM-V-Ind (mutant) gene is highly conserved among all isolates from different geographical regions of India. Imperfect duplicate insertion types of mutations (SVSE from 246-249 AA and SLSE from 266-269 AA) were identified among all Indian isolates in comparison to P.vivax Sal-1 (PvPM-V-Sal 1) isolate. In silico bioinformatics interaction studies of PEXEL peptide and active enzyme reveal that PvPM-V-Ind (mutant) is only active in endoplasmic reticulum lumen and membrane embedding is essential for activation of plasmepsin V. Structural modeling predictions based on docking studies with PEXEL motif show significant variation in substrate protein binding of these imperfect mutations with data mined PEXEL sequences. The predicted variation in the docking score and interacting amino acids of PvPM-V-Ind (mutant) proteins with PEXEL and lopinavir suggests a modulation in the activity of PvPM-V in terms of binding and accessibility at these sites. 
Our functional modeled validation of PvPM-V-Ind (mutant) imperfect duplicate insertions with data mined PEXEL sequences leading to altered binding and substrate accessibility of the enzyme makes it a plausible target to investigate export mechanisms for in silico virtual screening and novel pharmacophore designing.
Rawat, Manmeet; Vijay, Sonam; Gupta, Yash; Tiwari, Pramod Kumar; Sharma, Arun
2013-01-01
Introduction Plasmepsin V (PM-V) has functionally conserved orthologues across the Plasmodium genus whose binding and antigenic processing at the PEXEL motifs for export of about 200–300 essential proteins is important for the virulence and viability of the causative Plasmodium species. This study was undertaken to determine P. vivax plasmepsin V Ind (PvPM-V-Ind) PEXEL motif export pathway for pathogenicity-related proteins/antigens export thereby altering plasmodium exportome during erythrocytic stages. Method We identify and characterize Plasmodium vivax plasmepsin-V-Ind (mutant) gene by cloning, sequence analysis, in silico bioinformatic protocols and structural modeling predictions based on docking studies on binding capacity with PEXEL motifs processing in terms of binding and accessibility of export proteins. Results Cloning and sequence analysis for genetic diversity demonstrates PvPM-V-Ind (mutant) gene is highly conserved among all isolates from different geographical regions of India. Imperfect duplicate insertion types of mutations (SVSE from 246–249 AA and SLSE from 266–269 AA) were identified among all Indian isolates in comparison to P.vivax Sal-1 (PvPM-V-Sal 1) isolate. In silico bioinformatics interaction studies of PEXEL peptide and active enzyme reveal that PvPM-V-Ind (mutant) is only active in endoplasmic reticulum lumen and membrane embedding is essential for activation of plasmepsin V. Structural modeling predictions based on docking studies with PEXEL motif show significant variation in substrate protein binding of these imperfect mutations with data mined PEXEL sequences. The predicted variation in the docking score and interacting amino acids of PvPM-V-Ind (mutant) proteins with PEXEL and lopinavir suggests a modulation in the activity of PvPM-V in terms of binding and accessibility at these sites. 
Conclusion/Significance Our functional modeled validation of PvPM-V-Ind (mutant) imperfect duplicate insertions with data mined PEXEL sequences leading to altered binding and substrate accessibility of the enzyme makes it a plausible target to investigate export mechanisms for in silico virtual screening and novel pharmacophore designing. PMID:23555891
Predicting protein-binding RNA nucleotides with consideration of binding partners.
Tuvshinjargal, Narankhuu; Lee, Wook; Park, Byungkyu; Han, Kyungsook
2015-06-01
In recent years several computational methods have been developed to predict RNA-binding sites in protein. Most of these methods do not consider interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in protein, the problem of predicting protein-binding sites in RNA has received little attention mainly because it is much more difficult and shows a lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. In an effort to improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and Matthews correlation coefficient (MCC) of 0.69 in a 10-fold cross validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, a NPV of 84.2% and MCC of 0.48 in independent testing. For comparative purpose, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In a 10 fold-cross validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, a NPV of 92.2% and MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, a NPV of 85.2% and MCC of 0.45. 
In both cross-validations and independent testing, the new model that used both RNA and protein sequences showed a better performance than the model that used RNA sequence data alone in most performance measures. To the best of our knowledge, this is the first sequence-based prediction of protein-binding nucleotides in RNA which considers the binding partner of RNA. The new model will provide valuable information for designing biochemical experiments to find putative protein-binding sites in RNA with unknown structure. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
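The performance measures quoted in the abstract above (sensitivity, specificity, PPV, NPV and MCC) all derive from a binary confusion matrix. The sketch below shows the standard definitions; the counts in the example are made up for illustration and are not taken from the study, and the function name is hypothetical.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary classification measures from confusion-matrix counts."""

    def ratio(num: float, den: float) -> float:
        # Return 0.0 when a measure is undefined (empty denominator).
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Illustrative counts only: 100 binding and 100 non-binding nucleotides.
metrics = binary_metrics(tp=90, fp=20, tn=80, fn=10)
```

Because PPV and NPV depend on the ratio of positive to negative instances, the same classifier can report quite different values on balanced and unbalanced datasets, whereas sensitivity and specificity do not change with class ratio.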
Crystal Structure of the Heterotrimeric Integrin-Binding Region of Laminin-111.
Pulido, David; Hussain, Sadaf-Ahmahni; Hohenester, Erhard
2017-03-07
Laminins are cell-adhesive glycoproteins that are essential for basement membrane assembly and function. Integrins are important laminin receptors, but their binding site on the heterotrimeric laminins is poorly defined structurally. We report the crystal structure at 2.13 Å resolution of a minimal integrin-binding fragment of mouse laminin-111, consisting of ∼50 residues of α1β1γ1 coiled coil and the first three laminin G-like (LG) domains of the α1 chain. The LG domains adopt a triangular arrangement, with the C terminus of the coiled coil situated between LG1 and LG2. The critical integrin-binding glutamic acid residue in the γ1 chain tail is surface exposed and predicted to bind to the metal ion-dependent adhesion site in the integrin β1 subunit. Additional contacts to the integrin are likely to be made by the LG1 and LG2 surfaces adjacent to the γ1 chain tail, which are notably conserved and free of obstructing glycans. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Veronese, Mattia; Zanotti-Fregonara, Paolo; Rizzo, Gaia; Bertoldo, Alessandra; Innis, Robert B; Turkheimer, Federico E
2016-04-15
PET studies allow in vivo imaging of the density of brain receptor species. The PET signal, however, is the sum of the fraction of radioligand that is specifically bound to the target receptor and the non-displaceable fraction (i.e. the non-specifically bound radioligand plus the free ligand in tissue). Therefore, measuring the non-displaceable fraction, which is generally assumed to be constant across the brain, is a necessary step to obtain regional estimates of the specific fractions. The nondisplaceable binding can be directly measured if a reference region, i.e. a region devoid of any specific binding, is available. Many receptors are however widely expressed across the brain, and a true reference region is rarely available. In these cases, the nonspecific binding can be obtained after competitive pharmacological blockade, which is often contraindicated in humans. In this work we introduce the genomic plot for estimating the nondisplaceable fraction using baseline scans only. The genomic plot is a transformation of the Lassen graphical method in which the brain maps of mRNA transcripts of the target receptor obtained from the Allen brain atlas are used as a surrogate measure of the specific binding. Thus, the genomic plot allows the calculation of the specific and nondisplaceable components of radioligand uptake without the need of pharmacological blockade. We first assessed the statistical properties of the method with computer simulations. Then we sought ground-truth validation using human PET datasets of seven different neuroreceptor radioligands, where nonspecific fractions were either obtained separately using drug displacement or available from a true reference region. The population nondisplaceable fractions estimated by the genomic plot were very close to those measured by actual human blocking studies (mean relative difference between 2% and 7%). However, these estimates were valid only when mRNA expressions were predictive of protein levels (i.e. 
there were no significant post-transcriptional changes). This condition can be readily established a priori by assessing the correlation between PET and mRNA expression. Copyright © 2016 Elsevier Inc. All rights reserved.
Veronese, Mattia; Zanotti-Fregonara, Paolo; Rizzo, Gaia; Bertoldo, Alessandra; Innis, Robert B.; Turkheimer, Federico E.
2016-01-01
PET studies allow in vivo imaging of the density of brain receptor species. The PET signal, however, is the sum of the fraction of radioligand that is specifically bound to the target receptor and the non-displaceable fraction (i.e. the non-specifically bound radioligand plus the free ligand in tissue). Therefore, measuring the non-displaceable fraction, which is generally assumed to be constant across the brain, is a necessary step to obtain regional estimates of the specific fractions. The nondisplaceable binding can be directly measured if a reference region, i.e. a region devoid of any specific binding, is available. Many receptors are however widely expressed across the brain, and a true reference region is rarely available. In these cases, the nonspecific binding can be obtained after competitive pharmacological blockade, which is often contraindicated in humans. In this work we introduce the genomic plot for estimating the nondisplaceable fraction using baseline scans only. The genomic plot is a transformation of the Lassen graphical method in which the brain maps of mRNA transcripts of the target receptor obtained from the Allen brain atlas are used as a surrogate measure of the specific binding. Thus, the genomic plot allows the calculation of the specific and nondisplaceable components of radioligand uptake without the need of pharmacological blockade. We first assessed the statistical properties of the method with computer simulations. Then we sought ground-truth validation using human PET datasets of seven different neuroreceptor radioligands, where nonspecific fractions were either obtained separately using drug displacement or available from a true reference region. The population nondisplaceable fractions estimated by the genomic plot were very close to those measured by actual human blocking studies (mean relative difference between 2% and 7%). However, these estimates were valid only when mRNA expressions were predictive of protein levels (i.e. 
there were no significant post-transcriptional changes). This condition can be readily established a priori by assessing the correlation between PET and mRNA expression. PMID:26850512
Kirby, Marie K; Ramaker, Ryne C; Roberts, Brian S; Lasseigne, Brittany N; Gunther, David S; Burwell, Todd C; Davis, Nicholas S; Gulzar, Zulfiqar G; Absher, Devin M; Cooper, Sara J; Brooks, James D; Myers, Richard M
2017-04-17
Current diagnostic tools for prostate cancer lack specificity and sensitivity for detecting very early lesions. DNA methylation is a stable genomic modification that is detectable in peripheral patient fluids such as urine and blood plasma that could serve as a non-invasive diagnostic biomarker for prostate cancer. We measured genome-wide DNA methylation patterns in 73 clinically annotated fresh-frozen prostate cancers and 63 benign-adjacent prostate tissues using the Illumina Infinium HumanMethylation450 BeadChip array. We overlaid the most significantly differentially methylated sites in the genome with transcription factor binding sites measured by the Encyclopedia of DNA Elements consortium. We used logistic regression and receiver operating characteristic curves to assess the performance of candidate diagnostic models. We identified methylation patterns that have a high predictive power for distinguishing malignant prostate tissue from benign-adjacent prostate tissue, and these methylation signatures were validated using data from The Cancer Genome Atlas Project. Furthermore, by overlaying ENCODE transcription factor binding data, we observed an enrichment of enhancer of zeste homolog 2 binding in gene regulatory regions with higher DNA methylation in malignant prostate tissues. DNA methylation patterns are greatly altered in prostate cancer tissue in comparison to benign-adjacent tissue. We have discovered patterns of DNA methylation marks that can distinguish prostate cancers with high specificity and sensitivity in multiple patient tissue cohorts, and we have identified transcription factors binding in these differentially methylated regions that may play important roles in prostate cancer development.
A close relative of the nuclear, chromosomal high-mobility group protein HMG1 in yeast mitochondria.
Diffley, J F; Stillman, B
1991-01-01
ABF2 (ARS-binding factor 2), a small, basic DNA-binding protein that binds specifically to the autonomously replicating sequence ARS1, is located primarily in the mitochondria of the yeast Saccharomyces cerevisiae. The abundance of ABF2 and the phenotype of abf2- null mutants argue that this protein plays a key role in the structure, maintenance, and expression of the yeast mitochondrial genome. The predicted amino acid sequence of ABF2 is closely related to the high-mobility group proteins HMG1 and HMG2 from vertebrate cell nuclei and to several other DNA-binding proteins. Additionally, ABF2 and the other HMG-related proteins are related to a globular domain from the heat shock protein hsp70 family. ABF2 interacts with DNA both nonspecifically and in a specific manner within regulatory regions, suggesting a mechanism whereby it may aid in compacting the mitochondrial genome without interfering with expression. PMID:1881919
Molecular cloning and characterization of the promoter region of the porcine apolipoprotein E gene.
Xia, Jihan; Hu, Bingjun; Mu, Yulian; Xin, Leilei; Yang, Shulin; Li, Kui
2014-05-01
Apolipoprotein E (APOE), a component of lipoproteins, plays an important role in the transport and metabolism of cholesterol, and is associated with hyperlipoproteinemia and Alzheimer's disease. To further characterize the APOE gene, the promoter of the APOE gene of Landrace pigs was analyzed in the present study. The genomic structure and amino acid sequence in pigs were analyzed and found to share high similarity with those of human, but low similarity in the promoter region. Real-time PCR revealed the APOE gene expression pattern of pigs in diverse tissues. The highest expression level was observed in liver, with relatively low expression in other tissues, especially in stomach and muscle. Furthermore, the promoter was significantly better at driving luciferase expression in Hepa 1-6 cells than in C2C12 cells. After analysis of porcine APOE gene promoter regions, potential transcription factor binding sites were predicted, and two GC signals and a TATA box were indicated. Results of promoter activity analysis indicated that one of the potential regulatory elements was located in the region -669 to -259, which was essential for high expression of the APOE gene. Promoter mutation and deletion analysis further suggested that the C/EBPA binding site within the APOE promoter was responsible for the regulation of APOE transcription. Electrophoretic mobility shift assays also showed the binding site of the transcription factor C/EBPA. This study advances our knowledge of the promoter of the porcine APOE gene.
Kel, Alexander E
2017-02-01
Computational analysis of master regulators through the search for transcription factor binding sites followed by analysis of signal transduction networks of a cell is a new approach to causal analysis of multi-omics data. This paper contains results of the analysis of multi-omics data that include transcriptomics, proteomics and epigenomics data of a methotrexate (MTX) resistant colon cancer cell line. The data were used for analysis of mechanisms of resistance and for prediction of potential drug targets and promising compounds for reverting the MTX resistance of these cancer cells. We present all results of the analysis including the lists of identified transcription factors and their binding sites in the genome and the list of predicted master regulators - potential drug targets. These data were generated in the study recently published in the article "Multi-omics "Upstream Analysis" of regulatory genomic regions helps identifying targets against methotrexate resistance of colon cancer" (Kel et al., 2016) [4]. These data are of interest for researchers from the field of multi-omics data analysis and for biologists who are interested in identification of novel drug targets against MTX resistance.
Rydzak, Joanna; Kaczmarek, Radoslaw; Czerwinski, Marcin; Lukasiewicz, Jolanta; Tyborowska, Jolanta; Szewczyk, Boguslaw; Jaskiewicz, Ewa
2015-01-01
The erythrocyte binding ligand 140 (EBA-140) is a member of the Plasmodium falciparum DBL family of erythrocyte binding proteins, which are considered as prospective candidates for malaria vaccine development. The EBA-140 ligand is a paralogue of the well-characterized P. falciparum EBA-175 protein. They share homology of domain structure, including Region II, which consists of two homologous F1 and F2 domains and is responsible for ligand-erythrocyte receptor interaction during invasion. In this report we describe, for the first time, the glycophorin C specificity of the recombinant, baculovirus-expressed binding region (Region II) of P. falciparum EBA-140 ligand. It was found that the recombinant EBA-140 Region II binds to the endogenous and recombinant glycophorin C, but does not bind to Gerbich-type glycophorin C, neither normal nor recombinant, which lacks amino acid residues 36–63 of its polypeptide chain. Our results emphasize the crucial role of this glycophorin C region in EBA-140 ligand binding. Moreover, the EBA-140 Region II did not bind either to glycophorin D, the truncated form of glycophorin C lacking the N-glycan or to desialylated GPC. These results draw attention to the role of glycophorin C glycans in EBA-140 binding. The full identification of the EBA-140 binding site on glycophorin C molecule, consisting most likely of its glycans and peptide backbone, may help to design therapeutics or vaccines that target the erythrocyte binding merozoite ligands. PMID:25588042
Back, C R; Douglas, S K; Emerson, J E; Nobbs, A H; Jenkinson, H F
2015-10-01
Streptococcus gordonii SspA and SspB proteins, members of the antigen I/II (AgI/II) family of Streptococcus adhesins, mediate adherence to cysteine-rich scavenger glycoprotein gp340 and cells of other oral microbial species. In this article we investigated further the mechanism of coaggregation between S. gordonii DL1 and Actinomyces oris T14V. Previous mutational analysis of S. gordonii suggested that SspB was necessary for coaggregation with A. oris T14V. We have confirmed this by showing that Lactococcus lactis surrogate host cells expressing SspB coaggregated with A. oris T14V and PK606 cells, while L. lactis cells expressing SspA did not. Coaggregation occurred independently of expression of A. oris type 1 (FimP) or type 2 (FimA) fimbriae. Polysaccharide was prepared from cells of A. oris T14V and found to contain 1,4-, 4,6- and 3,4-linked glucose, 1,4-linked mannose, and 2,4-linked galactose residues. When immobilized onto plastic wells this polysaccharide supported binding of L. lactis expressing SspB, but not binding of L. lactis expressing other AgI/II family proteins. Purified recombinant NAVP region of SspB, comprising amino acid (aa) residues 41-847, bound A. oris polysaccharide but the C-domain (932-1470 aa residues) did not. A site-directed deletion of 29 aa residues (Δ691-718) close to the predicted binding cleft within the SspB V-region ablated binding of the NAVP region to polysaccharide. These results infer that the V-region head of SspB recognizes an actinomyces polysaccharide ligand, so further characterizing a lectin-like coaggregation mechanism occurring between two important primary colonizers. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Ribonucleoprotein complexes in neurologic diseases.
Ule, Jernej
2008-10-01
Ribonucleoprotein (RNP) complexes regulate the tissue-specific RNA processing and transport that increases the coding capacity of our genome and the ability to respond quickly and precisely to the diverse set of signals. This review focuses on three proteins that are part of RNP complexes in most cells of our body: TAR DNA-binding protein (TDP-43), the survival motor neuron protein (SMN), and fragile-X mental retardation protein (FMRP). In particular, the review asks the question why these ubiquitous proteins are primarily associated with defects in specific regions of the central nervous system? To understand this question, it is important to understand the role of genetic and cellular environment in causing the defect in the protein, as well as how the defective protein leads to misregulation of specific target RNAs. Two approaches for comprehensive analysis of defective RNA-protein interactions are presented. The first approach defines the RNA code or the collection of proteins that bind to a certain cis-acting RNA site in order to lead to a predictable outcome. The second approach defines the RNA map or the summary of positions on target RNAs where binding of a particular RNA-binding protein leads to a predictable outcome. As we learn more about the RNA codes and maps that guide the action of the dynamic RNP world in our brain, possibilities for new treatments of neurologic diseases are bound to emerge.
The Receptor-Binding Domain in the VP1u Region of Parvovirus B19.
Leisi, Remo; Di Tommaso, Chiarina; Kempf, Christoph; Ros, Carlos
2016-02-24
Parvovirus B19 (B19V) is known as the human pathogen causing the mild childhood disease erythema infectiosum. B19V shows an extraordinary narrow tissue tropism for erythroid progenitor cells in the bone marrow, which is determined by a highly restricted uptake. We have previously shown that the specific internalization is mediated by the interaction of the viral protein 1 unique region (VP1u) with a yet unknown cellular receptor. To locate the receptor-binding domain (RBD) within the VP1u, we analyzed the effect of truncations and mutations on the internalization capacity of the recombinant protein into UT7/Epo cells. Here we report that the N-terminal amino acids 5-80 of the VP1u are necessary and sufficient for cellular binding and internalization; thus, this N-terminal region represents the RBD required for B19V uptake. Using site-directed mutagenesis, we further identified a cluster of important amino acids playing a critical role in VP1u internalization. In silico predictions and experimental results suggest that the RBD is structured as a rigid fold of three α-helices. Finally, we found that dimerization of the VP1u leads to a considerably enhanced cellular binding and internalization. Taken together, we identified the RBD that mediates B19V uptake and mapped functional and structural motifs within this sequence. The findings reveal insights into the uptake process of B19V, which contribute to understand the pathogenesis of the infection and the neutralization of the virus by the immune system.
The Receptor-Binding Domain in the VP1u Region of Parvovirus B19
Leisi, Remo; Di Tommaso, Chiarina; Kempf, Christoph; Ros, Carlos
2016-01-01
Parvovirus B19 (B19V) is known as the human pathogen causing the mild childhood disease erythema infectiosum. B19V shows an extraordinary narrow tissue tropism for erythroid progenitor cells in the bone marrow, which is determined by a highly restricted uptake. We have previously shown that the specific internalization is mediated by the interaction of the viral protein 1 unique region (VP1u) with a yet unknown cellular receptor. To locate the receptor-binding domain (RBD) within the VP1u, we analyzed the effect of truncations and mutations on the internalization capacity of the recombinant protein into UT7/Epo cells. Here we report that the N-terminal amino acids 5–80 of the VP1u are necessary and sufficient for cellular binding and internalization; thus, this N-terminal region represents the RBD required for B19V uptake. Using site-directed mutagenesis, we further identified a cluster of important amino acids playing a critical role in VP1u internalization. In silico predictions and experimental results suggest that the RBD is structured as a rigid fold of three α-helices. Finally, we found that dimerization of the VP1u leads to a considerably enhanced cellular binding and internalization. Taken together, we identified the RBD that mediates B19V uptake and mapped functional and structural motifs within this sequence. The findings reveal insights into the uptake process of B19V, which contribute to understand the pathogenesis of the infection and the neutralization of the virus by the immune system. PMID:26927158
Cooperative Allosteric Ligand Binding in Calmodulin
NASA Astrophysics Data System (ADS)
Nandigrami, Prithviraj
Conformational dynamics is often essential for a protein's function. For example, proteins are able to communicate the effect of binding at one site to a distal region of the molecule through changes in its conformational dynamics. This so-called allosteric coupling fine tunes the sensitivity of ligand binding to changes in concentration. A conformational change between a "closed" (apo) and an "open" (holo) conformation upon ligation often produces this coupling between binding sites. Enhanced sensitivity between the unbound and bound ensembles leads to a sharper binding curve. There are two basic conceptual frameworks that guide our visualization about ligand binding mechanisms. First, a ligand can stabilize the unstable "open" state from a dynamic ensemble of conformations within the unbound basin. This binding mechanism is called conformational selection. Second, a ligand can weakly bind to the low-affinity "closed" state followed by a conformational transition to the "open" state. In this dissertation, I focus on molecular dynamics simulations to understand microscopic origins of ligand binding cooperativity. A minimal model of allosteric binding transitions must include ligand binding/unbinding events, while capturing the transition mechanism between two distinct meta-stable free energy basins. Due in part to computational timescale limitations, work in this dissertation describes large-scale conformational transitions through a simplified, coarse-grained model based on the energy basins defined by the open and closed conformations of the protein Calmodulin (CaM). CaM is a ubiquitous calcium-binding protein consisting of two structurally similar globular domains connected by a flexible linker. The two domains of CaM, the N-terminal domain (nCaM) and the C-terminal domain (cCaM), each consist of two helix-loop-helix motifs (the EF-hands) connected by a flexible linker. Each domain of CaM contains two binding loops and binds 2 calcium ions. 
The intact domain binds up to 4 calcium ions. The simulations use a coupled molecular dynamics/Monte Carlo scheme where the protein dynamics is simulated explicitly, while ligand binding/unbinding are treated implicitly. In the model, ligand binding/unbinding events are coupled with a conformational change of the protein within the grand canonical ensemble. Here, ligand concentration is controlled through the chemical potential (μ). This allows us to use a simple thermodynamic model to analyze the simulated data and quantify binding cooperativity. Simulated binding titration curves are calculated through equilibrium simulations at different values of μ. First, I study domain opening transitions of isolated nCaM and cCaM in the absence of calcium. This work is motivated by results from a recent analytic variational model that predicts distinct domain opening transition mechanisms for the domains of CaM. This is a surprising result because the domains have the same folded state topology. In the simulations, I find the two domains of CaM have distinct transition mechanisms over a broad range of temperature, in harmony with the analytic predictions. In particular, the simulated transition mechanism of nCaM follows a two-state behavior, while domain opening in cCaM involves global unfolding and refolding of the tertiary structure. The unfolded intermediate also appears in the landscape of nCaM, but at a higher temperature than it appears in cCaM's energy landscape. This is consistent with nCaM's higher thermal stability. Under approximate physiological conditions, the majority of the sampled transitions in cCaM involve unfolding and refolding during conformational change. Kinetically, the transient unfolding and refolding in cCaM significantly slows the domain opening and closing rates in cCaM. Second, I investigate the structural origins of binding affinity and allosteric cooperativity of binding 2 calcium-ions to each domain of CaM. 
In my work, I predict the order of binding strength of CaM's loops. I analyze simulated binding curves within the framework of the classic Monod-Wyman-Changeux (MWC) model of allostery to extract the binding free energies to the closed and open ensembles. The simulations predict that cCaM binds calcium with higher affinity and greater cooperativity than nCaM. Where it is possible to compare, these predictions are in good agreement with experimental results. The analysis of the simulations offers a rationale for why the two domains differ in cooperativity: the higher cooperativity of cCaM is due to larger difference in affinity of its binding loops. Third, I extend the work to investigate structural origins of binding cooperativity of 4 calcium-ions to intact CaM. I characterize the microscopic cooperativities of each ligation state and provide a kinetic description of the binding mechanism. Due to the heterogeneous nature of CaM's loops, as predicted in our simulations of isolated domains, I focus on investigating the influence of this heterogeneity on the kinetic flux of binding pathways as a function of concentration. The formalism developed for Network Models of protein folding kinetics, is used to evaluate the directed flux of all possible pathways between unligated and fully loaded CaM. (Abstract shortened by ProQuest.).
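As a rough illustration of the MWC analysis mentioned in the abstract above, the following Python sketch evaluates the textbook MWC fractional saturation for a two-site domain and estimates its local Hill slope. It is not the dissertation's code: the function names and all parameter values (`k_open`, `k_closed`, `l0`) are hypothetical and chosen only to make the curve visibly cooperative.

```python
import math


def mwc_saturation(conc, k_open, k_closed, l0, n_sites=2):
    """Fractional saturation of an n-site protein in the textbook MWC model.

    conc     : free ligand concentration
    k_open   : per-site dissociation constant in the open (high-affinity) state
    k_closed : per-site dissociation constant in the closed (low-affinity) state
    l0       : [closed]/[open] population ratio with no ligand bound
    """
    a = conc / k_open          # reduced concentration
    c = k_open / k_closed      # affinity ratio between the two states
    bound = a * (1 + a) ** (n_sites - 1) + l0 * c * a * (1 + c * a) ** (n_sites - 1)
    total = (1 + a) ** n_sites + l0 * (1 + c * a) ** n_sites
    return bound / total


def hill_slope(conc, step=1.01, **params):
    """Finite-difference slope of ln(Y / (1 - Y)) against ln(conc)."""
    def log_odds(x):
        y = mwc_saturation(x, **params)
        return math.log(y / (1 - y))
    return (log_odds(conc * step) - log_odds(conc)) / math.log(step)


# Illustrative (assumed) parameters: the closed state dominates without ligand.
params = dict(k_open=1.0, k_closed=100.0, l0=1000.0, n_sites=2)
for x in (1.0, 10.0, 30.0, 100.0, 1000.0):
    y = mwc_saturation(x, **params)
    print(f"conc={x:8.1f}  Y={y:.3f}  slope={hill_slope(x, **params):.2f}")
```

A slope above 1 near the midpoint of the curve indicates positive cooperativity; setting `l0=0` removes the closed state and the curve reduces to simple non-cooperative binding with slope 1.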
Cang, Zixuan; Wei, Guo-Wei
2018-02-01
Protein-ligand binding is a fundamental biological process that is paramount to many other biological processes, such as signal transduction, metabolic pathways, enzyme construction, cell secretion, and gene expression. Accurate prediction of protein-ligand binding affinities is vital to rational drug design and the understanding of protein-ligand binding and binding induced function. Existing binding affinity prediction methods are inundated with geometric detail and involve excessively high dimensions, which undermines their predictive power for massive binding data. Topology provides the ultimate level of abstraction and thus incurs too much reduction in geometric information. Persistent homology embeds geometric information into topological invariants and bridges the gap between complex geometry and abstract topology. However, it oversimplifies biological information. This work introduces element specific persistent homology (ESPH) or multicomponent persistent homology to retain crucial biological information during topological simplification. The combination of ESPH and machine learning gives rise to a powerful paradigm for macromolecular analysis. Tests on 2 large data sets indicate that the proposed topology-based machine-learning paradigm outperforms other existing methods in protein-ligand binding affinity predictions. ESPH reveals protein-ligand binding mechanism that can not be attained from other conventional techniques. The present approach reveals that protein-ligand hydrophobic interactions are extended to 40Å away from the binding site, which has a significant ramification to drug and protein design. Copyright © 2017 John Wiley & Sons, Ltd.
Sakkal, Leon A; Rajkowski, Kyle Z; Armen, Roger S
2017-06-05
Following insights from recent crystal structures of the muscarinic acetylcholine receptor, binding modes of Positive Allosteric Modulators (PAMs) were predicted under the assumption that PAMs should bind to the extracellular surface of the active state. A series of well-characterized PAMs for adenosine (A1R, A2AR, A3R) and muscarinic acetylcholine (M1R, M5R) receptors were modeled using both rigid and flexible receptor CHARMM-based molecular docking. Studies of adenosine receptors investigated the molecular basis of the probe-dependence of PAM activity by modeling in complex with specific agonist radioligands. Consensus binding modes map common pharmacophore features of several chemical series to specific binding interactions. These models provide a rationalization of how PAM binding slows agonist radioligand dissociation kinetics. M1R PAMs were predicted to bind in the analogous M2R PAM LY2119620 binding site. The M5R NAM (ML-375) was predicted to bind in the PAM (ML-380) binding site with a unique induced-fit receptor conformation. © 2017 Wiley Periodicals, Inc.
Datta, Deepshikha; Vaidehi, Nagarajan; Floriano, Wely B; Kim, Kwang S; Prasadarao, Nemani V; Goddard, William A
2003-02-01
Escherichia coli, the most common gram-negative bacterium, can penetrate the brain microvascular endothelial cells (BMECs) during the neonatal period to cause meningitis with significant morbidity and mortality. Experimental studies have shown that outer-membrane protein A (OmpA) of E. coli plays a key role in the initial steps of the invasion process by binding to specific sugar moieties present on the glycoproteins of BMEC. These experiments also show that polymers of chitobiose (GlcNAcbeta1-4GlcNAc) block the invasion, while epitopes substituted with the L-fucosyl group do not. We used the HierDock computational technique, which consists of a hierarchy of coarse-grain docking methods with molecular dynamics (MD), to predict the binding sites and energies of interactions of GlcNAcbeta1-4GlcNAc and other sugars with OmpA. The results suggest two important binding sites for the interaction of carbohydrate epitopes of BMEC glycoproteins with OmpA. We identify one site as the binding pocket for chitobiose (GlcNAcbeta1-4GlcNAc) in OmpA, while the second region (including loops 1 and 2) may be important for recognition of specific sugars. We find that the site involving loops 1 and 2 has relative binding energies that correlate well with experimental observations. This theoretical study elucidates the interaction sites of chitobiose with OmpA, and the binding site predictions made in this article are testable either by mutation studies or invasion assays. These results can be further extended in suggesting possible peptide antagonists and drug design for therapeutic strategies. Copyright 2002 Wiley-Liss, Inc.
Kumar, Charanya; Williams, Gregory M; Havens, Brett; Dinicola, Michelle K; Surtees, Jennifer A
2013-06-12
In Saccharomyces cerevisiae, repair of insertion/deletion loops is carried out by Msh2-Msh3-mediated mismatch repair (MMR). Msh2-Msh3 is also required for 3' non-homologous tail removal (3' NHTR) in double-strand break repair. In both pathways, Msh2-Msh3 binds double-strand/single-strand junctions and initiates repair in an ATP-dependent manner. However, the kinetics of the two processes appear different; MMR is likely rapid in order to coordinate with the replication fork, whereas 3' NHTR has been shown to be a slower process. To understand the molecular requirements in both repair pathways, we performed an in vivo analysis of well-conserved residues in Msh3 that are hypothesized to be required for MMR and/or 3' NHTR. These residues are predicted to be involved in either communication between the DNA-binding and ATPase domains within the complex or nucleotide binding and/or exchange within Msh2-Msh3. We identified a set of aromatic residues within the FLY motif of the predicted Msh3 nucleotide binding pocket that are essential for Msh2-Msh3-mediated MMR but are largely dispensable for 3' NHTR. In contrast, mutations in other regions gave similar phenotypes in both assays. Based on these results, we suggest that the two pathways have distinct requirements with respect to the position of the bound ATP within Msh3. We propose that the differences are related, at least in part, to the kinetics of each pathway. Proper binding and positioning of ATP is required to induce rapid conformational changes at the replication fork, but is less important when more time is available for repair, as in 3' NHTR. Copyright © 2013 Elsevier Ltd. All rights reserved.
Kumar, Charanya; Williams, Gregory M.; Havens, Brett; Dinicola, Michelle; Surtees, Jennifer A.
2013-01-01
In Saccharomyces cerevisiae, repair of insertion/deletion loops is carried out by Msh2-Msh3-mediated mismatch repair (MMR). Msh2-Msh3 is also required for 3’ non-homologous tail removal (3’NHTR) in double-strand break repair. In both pathways, Msh2-Msh3 binds double-strand/single-strand junctions and initiates repair in an ATP-dependent manner. However, the kinetics of the two processes appear different; MMR is likely rapid in order to coordinate with the replication fork, whereas 3’ NHTR has been shown to be a slower process. To understand the molecular requirements in both repair pathways, we performed an in vivo analysis of well conserved residues in Msh3 that are hypothesized to be required for MMR and/or 3’NHTR. These residues are predicted to be involved in either communication between the DNA-binding and ATPase domains within the complex or nucleotide binding and/or exchange within Msh2-Msh3. We identified a set of aromatic residues within the FLY motif of the predicted Msh3 nucleotide binding pocket that are essential for Msh2-Msh3-mediated MMR but are largely dispensable for 3’NHTR. In contrast, mutations in other regions gave similar phenotypes in both assays. Based on these results, we suggest the two pathways have distinct requirements with respect to the position of the bound ATP within Msh3. We propose that the differences are related, at least in part, to the kinetics of each pathway. Proper binding and positioning of ATP is required to induce rapid conformational changes at the replication fork, but is less important when more time is available for repair, as in 3’ NHTR. PMID:23458407
In Silico Prediction Analysis of Idiotope-Driven T–B Cell Collaboration in Multiple Sclerosis
Høglund, Rune A.; Lossius, Andreas; Johansen, Jorunn N.; Homan, Jane; Benth, Jūratė Šaltytė; Robins, Harlan; Bogen, Bjarne; Bremel, Robert D.; Holmøy, Trygve
2017-01-01
Memory B cells acting as antigen-presenting cells are believed to be important in multiple sclerosis (MS), but the antigen they present remains unknown. We hypothesized that B cells may activate CD4+ T cells in the central nervous system of MS patients by presenting idiotopes from their own immunoglobulin variable regions on human leukocyte antigen (HLA) class II molecules. Here, we use bioinformatics prediction analysis of B cell immunoglobulin variable regions from 11 MS patients and 6 controls with other inflammatory neurological disorders (OINDs), to assess whether the prerequisites for such idiotope-driven T–B cell collaboration are present. Our findings indicate that idiotopes from the complementarity determining region (CDR) 3 of MS patients on average have high predicted affinities for disease associated HLA-DRB1*15:01 molecules and are predicted to be endosomally processed by cathepsin S and L in positions that allow such HLA binding to occur. Additionally, complementarity determining region 3 sequences from cerebrospinal fluid (CSF) B cells from MS patients contain on average more rare T cell-exposed motifs that could potentially escape tolerance and stimulate CD4+ T cells than CSF B cells from OIND patients. Many of these features were associated with preferential use of the IGHV4 gene family by CSF B cells from MS patients. This is the first study to combine high-throughput sequencing of patient immune repertoires with large-scale prediction analysis and provides key indicators for future in vitro and in vivo analyses. PMID:29038659
In Silico Prediction Analysis of Idiotope-Driven T-B Cell Collaboration in Multiple Sclerosis.
Høglund, Rune A; Lossius, Andreas; Johansen, Jorunn N; Homan, Jane; Benth, Jūratė Šaltytė; Robins, Harlan; Bogen, Bjarne; Bremel, Robert D; Holmøy, Trygve
2017-01-01
Memory B cells acting as antigen-presenting cells are believed to be important in multiple sclerosis (MS), but the antigen they present remains unknown. We hypothesized that B cells may activate CD4+ T cells in the central nervous system of MS patients by presenting idiotopes from their own immunoglobulin variable regions on human leukocyte antigen (HLA) class II molecules. Here, we use bioinformatics prediction analysis of B cell immunoglobulin variable regions from 11 MS patients and 6 controls with other inflammatory neurological disorders (OINDs), to assess whether the prerequisites for such idiotope-driven T-B cell collaboration are present. Our findings indicate that idiotopes from the complementarity determining region (CDR) 3 of MS patients on average have high predicted affinities for disease associated HLA-DRB1*15:01 molecules and are predicted to be endosomally processed by cathepsin S and L in positions that allow such HLA binding to occur. Additionally, complementarity determining region 3 sequences from cerebrospinal fluid (CSF) B cells from MS patients contain on average more rare T cell-exposed motifs that could potentially escape tolerance and stimulate CD4+ T cells than CSF B cells from OIND patients. Many of these features were associated with preferential use of the IGHV4 gene family by CSF B cells from MS patients. This is the first study to combine high-throughput sequencing of patient immune repertoires with large-scale prediction analysis and provides key indicators for future in vitro and in vivo analyses.
NASA Astrophysics Data System (ADS)
Salaeh, Salsabila; Chong, Wei Lim; Dokmaisrijan, Supaporn; Payaka, Apirak; Yana, Janchai; Nimmanpipug, Piyarat; Lee, Vannajan Sanghiran; Dumri, Kanchana; Anh, Dau Hung
2014-10-01
Cyanine dyes have been widely used as a fluorescence probe for biomolecules and protein labeling. The mostly used cyanine dyes for nucleic acids labeling are DiSC2(3), DiSC2(5), and DiSC2(7). The possible structures and binding energies of RNA-RNA/Cyanine dyes were predicted theoretically using AutoDock Vina. The results showed that cyanine dyes and bases of RNA-RNA have the van der Waals and pi-pi interactions. The maximum absorption wavelength in the visible region obtained from the TD-DFT calculations of all cyanine dyes in the absence of the RNA-RNA double strand showed the bathochromic shift.
Hsiao, Hao-Ching; Gonzalez, Kim L.; Catanese, Daniel J.; Jordy, Kristopher E.; Matthews, Kathleen S.; Bondos, Sarah E.
2014-01-01
Interactions between structured proteins require a complementary topology and surface chemistry to form sufficient contacts for stable binding. However, approximately one third of protein interactions are estimated to involve intrinsically disordered regions of proteins. The dynamic nature of disordered regions before and, in some cases, after binding calls into question the role of partner topology in forming protein interactions. To understand how intrinsically disordered proteins identify the correct interacting partner proteins, we evaluated interactions formed by the Drosophila melanogaster Hox transcription factor Ultrabithorax (Ubx), which contains both structured and disordered regions. Ubx binding proteins are enriched in specific folds: 23 of its 39 partners include one of 7 folds, out of the 1195 folds recognized by SCOP. For the proteins harboring the two most populated folds, DNA-RNA binding 3-helical bundles and α-α superhelices, the regions of the partner proteins that exhibit these preferred folds are sufficient for Ubx binding. Three disorder-containing regions in Ubx are required to bind these partners. These regions are either alternatively spliced or multiply phosphorylated, providing a mechanism for cellular processes to regulate Ubx-partner interactions. Indeed, partner topology correlates with the ability of individual partner proteins to bind Ubx spliceoforms. Partners bind different disordered regions within Ubx to varying extents, creating the potential for competition between partners and cooperative binding by partners. The ability of partners to bind regions of Ubx that activate transcription and regulate DNA binding provides a mechanism for partners to modulate transcription regulation by Ubx, and suggests that one role of disorder in Ubx is to coordinate multiple molecular functions in response to tissue-specific cues. PMID:25286318
A split motor domain in a cytoplasmic dynein
Straube, Anne; Enard, Wolfgang; Berner, Alexandra; Wedlich-Söldner, Roland; Kahmann, Regine; Steinberg, Gero
2001-01-01
The heavy chain of dynein forms a globular motor domain that tightly couples the ATP-cleavage region and the microtubule-binding site to transform chemical energy into motion along the cytoskeleton. Here we show that, in the fungus Ustilago maydis, two genes, dyn1 and dyn2, encode the dynein heavy chain. The putative ATPase region is provided by dyn1, while dyn2 includes the predicted microtubule-binding site. Both genes are located on different chromosomes, are transcribed into independent mRNAs and are translated into separate polypeptides. Both Dyn1 and Dyn2 co-immunoprecipitated and co-localized within growing cells, and Dyn1–Dyn2 fusion proteins partially rescued mutant phenotypes, suggesting that both polypeptides interact to form a complex. In cell extracts the Dyn1–Dyn2 complex dissociated, and microtubule affinity purification indicated that Dyn1 or associated polypeptides bind microtubules independently of Dyn2. Both Dyn1 and Dyn2 were essential for cell survival, and conditional mutants revealed a common role in nuclear migration, cell morphogenesis and microtubule organization, indicating that the Dyn1–Dyn2 complex serves multiple cellular functions. PMID:11566874
Xu, Yingying; Lee, Jinhyuk; Lü, Zhi-Rong; Mu, Hang; Zhang, Qian; Park, Yong-Doo
2016-07-01
Understanding the mechanism of acetaldehyde dehydrogenase 1 (ALDH1) folding is important because this enzyme is directly involved in several types of cancers and other diseases. We investigated the urea-mediated unfolding of ALDH1 by integrating kinetic inhibition studies with computational molecular dynamics (MD) simulations. Conformational changes in the enzyme structure were also analyzed using intrinsic and 1-anilinonaphthalene-8-sulfonate (ANS)-binding fluorescence measurements. Kinetic studies revealed that the direct binding of urea to ALDH1 induces inactivation of ALDH1 in a manner of mixed-type inhibition. Tertiary structural changes associated with regional hydrophobic exposure of the active site were observed. The urea binding regions on ALDH1 were predicted by docking simulations and were partly shared with active site residues of ALDH1 and with interface residues of the oligomerization domain for tetramer formation. The docking results suggest that urea prevents formation of the ALDH1 normal shape for the tetramer state as well as entrance of the substrate into the active site. Our study provides insight into the structural changes that accompany urea-mediated unfolding of ALDH1 and the catalytic role associated with conformational changes.
Structure-Templated Predictions of Novel Protein Interactions from Sequence Information
Betel, Doron; Breitkreuz, Kevin E; Isserlin, Ruth; Dewar-Darch, Danielle; Tyers, Mike; Hogue, Christopher W. V
2007-01-01
The multitude of functions performed in the cell are largely controlled by a set of carefully orchestrated protein interactions often facilitated by specific binding of conserved domains in the interacting proteins. Interacting domains commonly exhibit distinct binding specificity to short and conserved recognition peptides called binding profiles. Although many conserved domains are known in nature, only a few have well-characterized binding profiles. Here, we describe a novel predictive method known as domain–motif interactions from structural topology (D-MIST) for elucidating the binding profiles of interacting domains. A set of domains and their corresponding binding profiles were derived from extant protein structures and protein interaction data and then used to predict novel protein interactions in yeast. A number of the predicted interactions were verified experimentally, including new interactions of the mitotic exit network, RNA polymerases, nucleotide metabolism enzymes, and the chaperone complex. These results demonstrate that new protein interactions can be predicted exclusively from sequence information. PMID:17892321
The identification and functional annotation of RNA structures conserved in vertebrates.
Seemann, Stefan E; Mirza, Aashiq H; Hansen, Claus; Bang-Berthelsen, Claus H; Garde, Christian; Christensen-Dalsgaard, Mikkel; Torarinsson, Elfar; Yao, Zizhen; Workman, Christopher T; Pociot, Flemming; Nielsen, Henrik; Tommerup, Niels; Ruzzo, Walter L; Gorodkin, Jan
2017-08-01
Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human-mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3' ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality. © 2017 Seemann et al.; Published by Cold Spring Harbor Laboratory Press.
SCOWLP classification: Structural comparison and analysis of protein binding regions
Teyra, Joan; Paszkowski-Rogacz, Maciej; Anders, Gerd; Pisabarro, M Teresa
2008-01-01
Background: Detailed information about protein interactions is critical for our understanding of the principles governing protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands bound either in the same or different binding regions. Thus, the structural interactome requires the development of tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others and also facilitate a detailed comparative analysis of the interacting information for specific protein binding regions at atomic level. Such classification might be of potential use for deciphering protein interaction networks, understanding protein function, rational engineering and design. Description: Protein binding regions (PBRs) might be ideally described as well-defined separated regions that share no interacting residues with one another. However, PBRs are often irregular, discontinuous and can share a wide range of interacting residues among them. The criteria to define an individual binding region can often be arbitrary and may differ from other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed. We extract detailed interaction information of protein domains, peptides and interfacial solvent from the SCOWLP database and we classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlapping of interacting residues mapped in pair-wise structural alignments. We perform our classification with agglomerative hierarchical clustering using the complete-linkage method. Our classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature especially useful for protein families with conflicting binding regions.
The hierarchical classification of PBRs is implemented in the SCOWLP database and extends the SCOP classification with three additional family sub-levels: Binding Region, Interface and Contacting Domains. SCOWLP contains 9,334 binding regions distributed within 2,561 families. In 65% of the cases we observe families containing more than one binding region. In addition, 22% of the regions form complexes with more than one protein family. Conclusion: The current SCOWLP classification and its web application represent a framework for the study of protein interfaces and comparative analysis of protein family binding regions. This comparison can be performed at atomic level and allows the user to study interactome conservation and variability. The new SCOWLP classification may be of great utility for reconstruction of protein complexes, understanding protein networks and ligand design. SCOWLP will be updated with every SCOP release. The web application is available at . PMID:18182098
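The clustering step described in this abstract can be illustrated with a minimal sketch. It assumes the similarity index is a Jaccard-style overlap of interacting residue sets; the abstract does not give the actual formula, so that choice, the function names and the toy residue numbers are all hypothetical.

```python
from itertools import combinations


def overlap_similarity(a, b):
    """Jaccard overlap of two residue sets (an assumed stand-in for the SCOWLP index)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def complete_linkage(regions, cutoff):
    """Agglomerative clustering; merge while the least similar cross pair is >= cutoff."""
    clusters = [[name] for name in regions]
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster similarity is the minimum pairwise similarity.
            sim = min(overlap_similarity(regions[x], regions[y])
                      for x in clusters[i] for y in clusters[j])
            if sim > best:
                best, best_pair = sim, (i, j)
        if best < cutoff:
            break
        i, j = best_pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical binding regions, each given as a set of aligned residue positions.
regions = {
    "A": {10, 11, 12, 13},
    "B": {11, 12, 13, 14},
    "C": {40, 41, 42},
    "D": {41, 42, 43},
}
print(complete_linkage(regions, cutoff=0.5))
```

Running the same data at several cut-offs mirrors the idea of computing the classification at different similarity thresholds: a stricter cut-off leaves borderline regions in separate clusters.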
Isvoran, Adriana; Craciun, Dana; Martiny, Virginie; Sperandio, Olivier; Miteva, Maria A
2013-06-14
Protein-Protein Interactions (PPIs) are key for many cellular processes. The characterization of PPI interfaces and the prediction of putative ligand binding sites and hot spot residues are essential to design efficient small-molecule modulators of PPI. Terphenyl and its derivatives are small organic molecules known to mimic one face of protein-binding alpha-helical peptides. In this work we focus on several PPIs mediated by alpha-helical peptides. We performed computational sequence- and structure-based analyses in order to evaluate several key physicochemical and surface properties of proteins known to interact with alpha-helical peptides and/or terphenyl and its derivatives. Sequence-based analysis revealed low sequence identity between some of the analyzed proteins binding alpha-helical peptides. Structure-based analysis was performed to calculate the volume, the fractal dimension roughness and the hydrophobicity of the binding regions. Besides the overall hydrophobic character of the binding pockets, some specificities were detected. We showed that the hydrophobicity is not uniformly distributed in different alpha-helix binding pockets that can help to identify key hydrophobic hot spots. The presence of hydrophobic cavities at the protein surface with a more complex shape than the entire protein surface seems to be an important property related to the ability of proteins to bind alpha-helical peptides and low molecular weight mimetics. Characterization of similarities and specificities of PPI binding sites can be helpful for further development of small molecules targeting alpha-helix binding proteins.
Chang, Yi-Wen; Su, Ying-Jhen; Hsiao, Michael; Wei, Kuo-Chen; Lin, Wei-Hsin; Liang, Chi-Lung; Chen, Shin-Cheh; Lee, Jia-Lin
2015-08-15
Wnt signaling contributes to the reprogramming and maintenance of cancer stem cell (CSC) states that are activated by epithelial-mesenchymal transition (EMT). However, the mechanistic relationship between EMT and the Wnt pathway in CSC is not entirely clear. Chromatin immunoprecipitation with high-throughput sequencing (ChIP-seq) indicated that EMT induces a switch from the β-catenin/E-cadherin/Sox15 complex to the β-catenin/Twist1/TCF4 complex, the latter of which then binds to CSC-related gene promoters. Tandem coimmunoprecipitation and re-ChIP experiments with epithelial-type cells further revealed that Sox15 associates with the β-catenin/E-cadherin complex, which then binds to the proximal promoter region of CASP3. Through this mechanism, Twist1 cleavage is triggered to regulate a β-catenin-elicited promotion of the CSC phenotype. During EMT, we documented that Twist1 binding to β-catenin enhanced the transcriptional activity of the β-catenin/TCF4 complex, including by binding to the proximal promoter region of ABCG2, a CSC marker. In terms of clinical application, our definition of a five-gene CSC signature (nuclear β-catenin(High)/nuclear Twist1(High)/E-cadherin(Low)/Sox15(Low)/CD133(High)) may provide a useful prognostic marker for human lung cancer. ©2015 American Association for Cancer Research.
On the Selective Packaging of Genomic RNA by HIV-1.
Comas-Garcia, Mauricio; Davis, Sean R; Rein, Alan
2016-09-12
Like other retroviruses, human immunodeficiency virus type 1 (HIV-1) selectively packages genomic RNA (gRNA) during virus assembly. However, in the absence of the gRNA, cellular messenger RNAs (mRNAs) are packaged. While the gRNA is selected because of its cis-acting packaging signal, the mechanism of this selection is not understood. The affinity of Gag (the viral structural protein) for cellular RNAs at physiological ionic strength is not much higher than that for the gRNA. However, binding to the gRNA is more salt-resistant, implying that it has a higher non-electrostatic component. We have previously studied the spacer 1 (SP1) region of Gag and showed that it can undergo a concentration-dependent conformational transition. We proposed that this transition represents the first step in assembly, i.e., the conversion of Gag to an assembly-ready state. To explain selective packaging of gRNA, we suggest here that binding of Gag to gRNA, with its high non-electrostatic component, triggers this conversion more readily than binding to other RNAs; thus we predict that a Gag-gRNA complex will nucleate particle assembly more efficiently than other Gag-RNA complexes. New data shows that among cellular mRNAs, those with long 3'-untranslated regions (UTR) are selectively packaged. It seems plausible that the 3'-UTR, a stretch of RNA not occupied by ribosomes, offers a favorable binding site for Gag.
Domain architectures of the Scm3p protein provide insights into centromere function and evolution.
Aravind, L; Iyer, Lakshminarayan M; Wu, Carl
2007-10-15
Recently, Scm3p has been shown to be a nonhistone component of centromeric chromatin that binds stoichiometrically to CenH3-H4 histones, and to be required for the assembly of kinetochores in Saccharomyces cerevisiae. Scm3p is conserved across fungi, and displays a remarkable variation in protein size, ranging from approximately 200 amino acids in S. cerevisiae to approximately 1300 amino acids in Neurospora crassa. This is primarily due to a variable C-terminal segment that is linked to a conserved N-terminal, CenH3-interacting domain. We have discovered that the extended C-terminal region of Scm3p is strikingly characterized by lineage-specific fusions of single or multiple predicted DNA-binding domains (different versions of the MYB and C2H2 zinc finger domains, AT-hooks, and a novel cysteine-rich metal-chelating cluster) that are absent from the small versions of Scm3. Instead, S. cerevisiae point centromeres are recognized by components of the CBF3 DNA binding complex, which are conserved amongst close relatives of budding yeast, but are correspondingly absent from more distant fungi that possess regional centromeres. Hence, the C-terminal DNA binding motifs found in large Scm3p proteins may, along with CenH3, serve as a key epigenetic signal by recognizing and accommodating the lineage-specific diversity of centromere DNA in the course of evolution.
DOE Office of Scientific and Technical Information (OSTI.GOV)
van der Graaf, M.; van Mierlo, C.P.M.; Hemminga, M.A.
1991-06-11
The first 25 amino acids of the coat protein of cowpea chlorotic mottle virus are essential for binding the encapsidated RNA. Although an α-helical conformation has been predicted for this highly positively charged N-terminal region, no experimental evidence for this conformation has been presented so far. In this study, two-dimensional proton NMR experiments were performed on a chemically synthesized pentacosapeptide containing the first 25 amino acids of this coat protein. All resonances could be assigned by a combined use of two-dimensional correlated spectroscopy and nuclear Overhauser enhancement spectroscopy carried out at four different temperatures. Various NMR parameters indicate the presence of a conformational ensemble consisting of helical structures rapidly converting into more extended states. Differences in chemical shifts and nuclear Overhauser effects indicate that lowering the temperature induces a shift of the dynamic equilibrium toward more helical structures. At 10 °C, a perceptible fraction of the conformational ensemble consists of structures with an α-helical conformation between residues 9 and 17, likely starting with a turnlike structure around Thr9 and Arg10. Both the conformation and the position of this helical region agree well with the secondary structure predictions mentioned above.
Gill, Samuel C; Lim, Nathan M; Grinaway, Patrick B; Rustenburg, Ariën S; Fass, Josh; Ross, Gregory A; Chodera, John D; Mobley, David L
2018-05-31
Accurately predicting protein-ligand binding affinities and binding modes is a major goal in computational chemistry, but even the prediction of ligand binding modes in proteins poses major challenges. Here, we focus on solving the binding mode prediction problem for rigid fragments. That is, we focus on computing the dominant placement, conformation, and orientations of a relatively rigid, fragment-like ligand in a receptor, and the populations of the multiple binding modes which may be relevant. This problem is important in its own right, but is even more timely given the recent success of alchemical free energy calculations. Alchemical calculations are increasingly used to predict binding free energies of ligands to receptors. However, the accuracy of these calculations is dependent on proper sampling of the relevant ligand binding modes. Unfortunately, ligand binding modes may often be uncertain, hard to predict, and/or slow to interconvert on simulation time scales, so proper sampling with current techniques can require prohibitively long simulations. We need new methods which dramatically improve sampling of ligand binding modes. Here, we develop and apply a nonequilibrium candidate Monte Carlo (NCMC) method to improve sampling of ligand binding modes. In this technique, the ligand is rotated and subsequently allowed to relax in its new position through alchemical perturbation before accepting or rejecting the rotation and relaxation as a nonequilibrium Monte Carlo move. When applied to a T4 lysozyme model binding system, this NCMC method shows over 2 orders of magnitude improvement in binding mode sampling efficiency compared to a brute force molecular dynamics simulation. This is a first step toward applying this methodology to pharmaceutically relevant binding of fragments and, eventually, drug-like molecules. 
We are making this approach available via our new Binding modes of ligands using enhanced sampling (BLUES) package which is freely available on GitHub.
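The accept/reject step mentioned in this abstract can be sketched as a Metropolis-style test on the work accumulated during the rotation and relaxation. This is a minimal illustration under the assumption of a symmetric proposal, where the acceptance probability is min(1, exp(-w/kT)); it is not the BLUES implementation, and the function names are hypothetical.

```python
import math
import random


def acceptance_probability(work, kT):
    """Acceptance probability min(1, exp(-work / kT)) for a symmetric proposal."""
    if work <= 0.0:
        return 1.0  # moves that cost no work are always accepted
    return math.exp(-work / kT)


def accept_move(work, kT, rng=random):
    """Draw a uniform random number and accept the move with the probability above."""
    return rng.random() < acceptance_probability(work, kT)


# Hypothetical protocol work values, in the same energy units as kT.
for w in (-1.0, 0.6, 3.0):
    print(w, round(acceptance_probability(w, kT=0.6), 4))
```

The relaxation phase matters because it lowers the work of a proposed rotation, and the acceptance probability falls off exponentially with that work.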
Munday, J; Kerr, S; Ni, J; Cornish, A L; Zhang, J Q; Nicoll, G; Floyd, H; Mattei, M G; Moore, P; Liu, D; Crocker, P R
2001-01-01
Here we characterize Siglec-10 as a new member of the Siglec family of sialic acid-binding Ig-like lectins. A full-length cDNA was isolated from a human spleen library and the corresponding gene identified. Siglec-10 is predicted to contain five extracellular Ig-like domains and a cytoplasmic tail containing three putative tyrosine-based signalling motifs. Siglec-10 exhibited a high degree of sequence similarity to CD33-related Siglecs and mapped to the same region, on chromosome 19q13.3. The expressed protein was able to mediate sialic acid-dependent binding to human erythrocytes and soluble sialoglycoconjugates. Using specific antibodies, Siglec-10 was detected on subsets of human leucocytes including eosinophils, monocytes and a minor population of natural killer-like cells. The molecular properties and expression pattern suggest that Siglec-10 may function as an inhibitory receptor within the innate immune system. PMID:11284738
Walz, Antje-Christine; Demel, Rudy A; de Kruijff, Ben; Mutzel, Rupert
2002-01-01
sn-Glycerol-3-phosphate dehydrogenase (GlpD) from Escherichia coli is a peripheral membrane enzyme involved in respiratory electron transfer. For it to display its enzymic activity, binding to the inner membrane is required. The way the enzyme interacts with the membrane and how this controls activity has not been elucidated. In the present study we provide evidence for direct protein-lipid interaction. Using the monolayer technique, we observed insertion of GlpD into lipid monolayers with a clear preference for anionic phospholipids. GlpD variants with point mutations in their predicted amphipathic helices showed a decreased ability to penetrate anionic phospholipid monolayers. From these data we propose that membrane binding of GlpD occurs by insertion of an amphipathic helix into the acyl-chain region of lipids mediated by negatively charged phospholipids. PMID:11955283
NASA Technical Reports Server (NTRS)
Mehandru, S. P.; Anderson, A. B.; Ross, P. N.
1985-01-01
The CO adsorption on a 40 atom cluster model of the (111) surface and a 36 atom cluster model of the (100) surface of the Pt3Ti alloy was studied. Parallel binding to high coordinate sites associated with Ti and low CO bond scission barriers are predicted for both surfaces. The binding of CO to Pt sites occurs in an upright orientation. These orientations are a consequence of the nature of the CO pi donation interactions with the surface. On the Ti sites the orbitals donate to the nearly empty Ti 3d band and the antibonding counterpart orbitals are empty. On the Pt sites, however, they are in the filled Pt 5d region of the alloy band, which causes CO to bond in a vertical orientation by 5 sigma donation from the carbon end.
Isolation and characterization of a novel calmodulin-binding protein from potato
NASA Technical Reports Server (NTRS)
Reddy, Anireddy S N.; Day, Irene S.; Narasimhulu, S. B.; Safadi, Farida; Reddy, Vaka S.; Golovkin, Maxim; Harnly, Melissa J.
2002-01-01
Tuberization in potato is controlled by hormonal and environmental signals. Ca(2+), an important intracellular messenger, and calmodulin (CaM), one of the primary Ca(2+) sensors, have been implicated in controlling diverse cellular processes in plants including tuberization. The regulation of cellular processes by CaM involves its interaction with other proteins. To understand the role of Ca(2+)/CaM in tuberization, we have screened an expression library prepared from developing tubers with biotinylated CaM. This screening resulted in isolation of a cDNA encoding a novel CaM-binding protein (potato calmodulin-binding protein (PCBP)). Ca(2+)-dependent binding of the cDNA-encoded protein to CaM is confirmed by (35)S-labeled CaM. The full-length cDNA is 5 kb long and encodes a protein of 1309 amino acids. The deduced amino acid sequence showed significant similarity with a hypothetical protein from another plant, Arabidopsis. However, no homologs of PCBP are found in nonplant systems, suggesting that it is likely to be specific to plants. Using truncated versions of the protein and a synthetic peptide in CaM binding assays we mapped the CaM-binding region to a 20-amino acid stretch (residues 1216-1237). The bacterially expressed protein containing the CaM-binding domain interacted with three CaM isoforms (CaM2, CaM4, and CaM6). PCBP is encoded by a single gene and is expressed differentially in the tissues tested. The expression of CaM, PCBP, and another CaM-binding protein is similar in different tissues and organs. The predicted protein contained seven putative nuclear localization signals and several strong PEST motifs. Fusion of the N-terminal region of the protein containing six of the seven nuclear localization signals to the reporter gene beta-glucuronidase targeted the reporter gene to the nucleus, suggesting a nuclear role for PCBP.
Urate is a ligand for the transcriptional regulator PecS.
Perera, Inoka C; Grove, Anne
2010-09-24
PecS is a member of the MarR (multiple antibiotic resistance regulator) family, which has been shown in Erwinia to regulate the expression of virulence genes. MarR homologs typically bind a small molecule ligand, resulting in attenuated DNA binding. For PecS, the natural ligand has not been identified. We have previously shown that urate is a ligand for the Deinococcus radiodurans-encoded MarR homolog HucR (hypothetical uricase regulator) and identified residues responsible for ligand binding. We show here that all four residues involved in urate binding and propagation of conformational changes to DNA recognition helices are conserved in PecS homologs, suggesting that urate is the ligand for PecS. Consistent with this prediction, Agrobacterium tumefaciens PecS specifically binds urate, and urate attenuates DNA binding in vitro. PecS binds two operator sites in the intergenic region between the divergent pecS gene and pecM genes, one of which features two partially overlapping repeats to which PecS binds as a dimer on opposite faces of the duplex. Notably, urate dissociates PecS from cognate DNA, allowing transcription of both genes in vivo. Taken together, our data show that urate is a ligand for PecS and suggest that urate serves a novel function in signaling the colonization of a host plant. Copyright © 2010 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sershen, H.; Reith, M.E.; Hashim, A.
1985-06-01
In a continuing study of nicotine binding sites, the authors determined the relative amount of nicotine binding and acetylcholine binding in various brain regions of C57/BL and of DBA mice. Although midbrain showed the highest and cerebellum the lowest binding for both [³H]nicotine and [³H]acetylcholine, the ratio of nicotine to acetylcholine binding showed a three-fold regional variation. Acetylcholine inhibition of [³H]nicotine binding indicated that a portion of nicotine binding was not inhibited by acetylcholine. These results indicate important differences between the binding of (±)-[³H]nicotine and that of [³H]acetylcholine.
A Prediction Method of Binding Free Energy of Protein and Ligand
NASA Astrophysics Data System (ADS)
Yang, Kun; Wang, Xicheng
2010-05-01
Predicting the binding free energy is an important problem in biomolecular simulation. Such prediction would be of great benefit in understanding protein functions, and may be useful for computational prediction of ligand binding strengths, e.g., in discovering pharmaceutical drugs. Free energy perturbation (FEP)/thermodynamic integration (TI) is a classical method to explicitly predict free energy. However, this method needs a great deal of time to collect data and can handle only simple systems and small changes of molecular structure. Another method for estimating ligand binding affinities is the linear interaction energy (LIE) method. This method employs averages of interaction potential energy terms from molecular dynamics simulations or other thermal conformational sampling techniques. Incorporation of systematic deviations from electrostatic linear response, derived from free energy perturbation studies, into the absolute binding free energy expression significantly enhances the accuracy of the approach. However, it is also time-consuming. In this paper, a new prediction method based on steered molecular dynamics (SMD) with direction optimization is developed to compute binding free energy. Jarzynski's equality is used to derive the potential of mean force (PMF), or free energy. The results for two numerical examples are presented, showing that the method has good accuracy and efficiency. The novel method can also simulate the whole binding process and give important structural information for the development of new drugs.
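Jarzynski's equality relates the free-energy difference to an exponential average over nonequilibrium work values, ΔF = -kT ln⟨exp(-W/kT)⟩. The sketch below shows that estimator in isolation. The work values are hypothetical, and the direction-optimization step of the paper's method is not reproduced.

```python
import math


def jarzynski_free_energy(works, kT):
    """Estimate dF = -kT * ln(mean(exp(-W / kT))) from repeated pulling simulations."""
    if not works:
        raise ValueError("at least one work value is required")
    # Shift by the minimum work so the exponentials cannot overflow (log-sum-exp trick).
    w_min = min(works)
    shifted_sum = sum(math.exp(-(w - w_min) / kT) for w in works)
    return w_min - kT * math.log(shifted_sum / len(works))


# Hypothetical work values (same energy units as kT) from several SMD trajectories.
works = [5.1, 4.3, 6.0, 4.8, 5.5]
print(jarzynski_free_energy(works, kT=0.6))
```

Because the average is dominated by the rare low-work trajectories, the estimate always lies at or below the mean work and converges slowly when the work distribution is broad.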
Arcon, Juan Pablo; Defelipe, Lucas A; Modenutti, Carlos P; López, Elias D; Alvarez-Garcia, Daniel; Barril, Xavier; Turjanski, Adrián G; Martí, Marcelo A
2017-04-24
One of the most important biological processes at the molecular level is the formation of protein-ligand complexes. Therefore, determining their structure and underlying key interactions is of paramount relevance and has direct applications in drug development. Because of their low cost relative to their experimental counterparts, molecular dynamics (MD) simulations in the presence of different solvent probes mimicking specific types of interactions have been increasingly used to analyze protein binding sites and reveal protein-ligand interaction hot spots. However, a systematic comparison of different probes and their real predictive power from a quantitative and thermodynamic point of view is still missing. In the present work, we have performed MD simulations of 18 different proteins in pure water as well as water mixtures of ethanol, acetamide, acetonitrile and methylammonium acetate, leading to a total of 5.4 μs simulation time. For each system, we determined the corresponding solvent sites, defined as space regions adjacent to the protein surface where the probability of finding a probe atom is higher than that in the bulk solvent. Finally, we compared the identified solvent sites with 121 different protein-ligand complexes and used them to perform molecular docking and ligand binding free energy estimates. Our results show that combining solely water and ethanol sites allows sampling over 70% of all possible protein-ligand interactions, especially those that coincide with ligand-based pharmacophoric points. Most important, we also show how the solvent sites can be used to significantly improve ligand docking in terms of both accuracy and precision, and that accurate predictions of ligand binding free energies, along with relative ranking of ligand affinity, can be performed.
Koebnik, Ralf; Krüger, Antje; Thieme, Frank; Urban, Alexander; Bonas, Ulla
2006-11-01
The pathogenicity of the plant-pathogenic bacterium Xanthomonas campestris pv. vesicatoria depends on a type III secretion system which is encoded by the 23-kb hrp (hypersensitive response and pathogenicity) gene cluster. Expression of the hrp operons is strongly induced in planta and in a special minimal medium and depends on two regulatory proteins, HrpG and HrpX. In this study, DNA affinity enrichment was used to demonstrate that the AraC-type transcriptional activator HrpX binds to a conserved cis-regulatory element, the plant-inducible promoter (PIP) box (TTCGC-N(15)-TTCGC), present in the promoter regions of four hrp operons. No binding of HrpX was observed when DNA fragments lacking a PIP box were used. HrpX also bound to a DNA fragment containing an imperfect PIP box (TTCGC-N(8)-TTCGT). Dinucleotide replacements in each half-site of the PIP box strongly decreased binding of HrpX, while simultaneous dinucleotide replacements in both half-sites completely abolished binding. Based on the complete genome sequence of Xanthomonas campestris pv. vesicatoria, putative plant-inducible promoters consisting of a PIP box and a -10 promoter motif were identified in the promoter regions of almost all HrpX-activated genes. Bioinformatic analyses and reverse transcription-PCR experiments revealed novel HrpX-dependent genes, among them a NUDIX hydrolase gene and several genes with a predicted role in the degradation of the plant cell wall. We conclude that HrpX is the most downstream component of the hrp regulatory cascade, which is proposed to directly activate most genes of the hrpX regulon via binding to corresponding PIP boxes.
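The PIP box consensus given in this abstract, TTCGC-N(15)-TTCGC, lends itself to a simple pattern search. The sketch below scans only the forward strand of a made-up sequence. The function name and its parameters are hypothetical, and the authors' genome-wide search (which also required a -10 promoter motif) is not reproduced.

```python
import re


def find_pip_boxes(seq, spacer=15, second_half="TTCGC"):
    """Return (start, match) for each TTCGC-N(spacer)-second_half site on the given strand."""
    pattern = re.compile(r"(?=(TTCGC[ACGT]{%d}%s))" % (spacer, re.escape(second_half)))
    # The lookahead lets overlapping sites be reported; positions are 0-based.
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Made-up promoter fragments for illustration only.
perfect = "AAA" + "TTCGC" + "A" * 15 + "TTCGC" + "GG"
imperfect = "CC" + "TTCGC" + "G" * 8 + "TTCGT" + "AA"

print(find_pip_boxes(perfect))
print(find_pip_boxes(imperfect, spacer=8, second_half="TTCGT"))
```

Changing `spacer` and `second_half` covers the imperfect box (TTCGC-N(8)-TTCGT) that the abstract reports as still bound by HrpX.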
Madeja, Michael; Steffen, Wibke; Mesic, Ivana; Garic, Bojan; Zhorov, Boris S.
2010-01-01
Kv2.1 channels, which are expressed in brain, heart, pancreas, and other organs and tissues, are important targets for drug design. Flecainide and propafenone are known to block Kv2.1 channels more potently than other Kv channels. Here, we sought to explore structural determinants of this selectivity. We demonstrated that flecainide reduced the K+ currents through Kv2.1 channels expressed in Xenopus laevis oocytes in a voltage- and time-dependent manner. By systematically exchanging various segments of Kv2.1 with those from Kv1.2, we determined flecainide-sensing residues in the P-helix and inner helix S6. These residues are not exposed to the inner pore, a conventional binding region of open channel blockers. The flecainide-sensing residues also contribute to propafenone binding, suggesting overlapping receptors for the drugs. Indeed, propafenone and flecainide compete for binding in Kv2.1. We further used Monte Carlo-energy minimizations to map the receptors of the drugs. Flecainide docking in the Kv1.2-based homology model of Kv2.1 predicts the ligand ammonium group in the central cavity and the benzamide moiety in a niche between S6 and the P-helix. Propafenone also binds in the niche. Its carbonyl group accepts an H-bond from the P-helix, the amino group donates an H-bond to the P-loop turn, whereas the propyl group protrudes in the pore and blocks the access to the selectivity filter. Thus, besides the binding region in the central cavity, certain K+ channel ligands can expand in the subunit interface whose residues are less conserved between K+ channels and hence may be targets for design of highly desirable subtype-specific K+ channel drugs. PMID:20709754
Wu, R.; Wilton, R.; Cuff, M. E.; ...
2017-02-07
The tandem Per-Arnt-Sim (PAS) like sensors are commonly found in signal transduction proteins. The periplasmic solute binding protein (SBP) domains are found ubiquitously and are generally involved in solute transport. These domains are widely observed as parts of separate proteins but not within the same polypeptide chain. We report the structural and biochemical characterization of the extracellular ligand-binding receptor, Dret_0059 from Desulfohalobium retbaense DSM 5692, an organism isolated from the Retba salt lake in Senegal. The structure of Dret_0059 consists of a novel combination of SBP and TPAS sensor domains. The N-terminal region forms an SBP domain and the C-terminal region folds into a tandem PAS-like domain structure. A ketoleucine moiety is bound to the SBP, whereas a cytosine molecule is bound in the distal PAS domain of the TPAS. The differential scanning fluorimetry studies in solution support the ligands observed in the crystal structure. There are only two other proteins with this structural architecture in the non-redundant sequence database and we predict that they too bind the same substrates. There is significant interaction between the SBP and TPAS domains, and it is quite conceivable that the binding of one ligand will have an effect on the binding of the other. Our attempts to remove the ligands bound to the protein during expression were not successful; therefore, it is not clear what the relative effects are. The genomic context of this receptor does not contain any protein components expected for transport function; hence, we suggest that Dret_0059 is likely involved in signal transduction and not in solute transport.
Ma, Xin; Guo, Jing; Sun, Xiao
2015-01-01
The prediction of RNA-binding proteins is one of the most challenging problems in computational biology. Although some studies have investigated this problem, the accuracy of prediction is still not sufficient. In this study, a highly accurate method was developed to predict RNA-binding proteins from amino acid sequences using random forests with the minimum redundancy maximum relevance (mRMR) method, followed by incremental feature selection (IFS). We incorporated conjoint triad features and three novel features: binding propensity (BP), nonbinding propensity (NBP), and evolutionary information combined with physicochemical properties (EIPP). The results showed that these novel features have important roles in improving the performance of the predictor. Using the mRMR-IFS method, our predictor achieved the best performance (86.62% accuracy and 0.737 Matthews correlation coefficient). The high prediction accuracy suggests that our method can be a useful approach to identify RNA-binding proteins from sequence information.
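The two figures quoted in this abstract, accuracy and the Matthews correlation coefficient (MCC), are both computed from a binary confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and are not the paper's data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts: binding proteins are the positive class.
tp, tn, fp, fn = 80, 90, 10, 20
print(accuracy(tp, tn, fp, fn), matthews_cc(tp, tn, fp, fn))
```

MCC is commonly reported alongside accuracy because it stays near zero for a predictor that merely favors the majority class, which accuracy alone can hide.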
MoFvAb: Modeling the Fv region of antibodies
Bujotzek, Alexander; Fuchs, Angelika; Qu, Changtao; Benz, Jörg; Klostermann, Stefan; Antes, Iris; Georges, Guy
2015-01-01
Knowledge of the 3-dimensional structure of the antigen-binding region of antibodies enables numerous useful applications regarding the design and development of antibody-based drugs. We present a knowledge-based antibody structure prediction methodology that incorporates concepts that have arisen from an applied antibody engineering environment. The protocol exploits the rich and continuously growing supply of experimentally derived antibody structures available to predict CDR loop conformations and the packing of heavy and light chain quickly and without user intervention. The homology models are refined by a novel antibody-specific approach to adapt and rearrange sidechains based on their chemical environment. The method achieves very competitive all-atom root mean square deviation values in the order of 1.5 Å on different evaluation datasets consisting of both known and previously unpublished antibody crystal structures. PMID:26176812
The feasibility of an efficient drug design method with high-performance computers.
Yamashita, Takefumi; Ueda, Akihiko; Mitsui, Takashi; Tomonaga, Atsushi; Matsumoto, Shunji; Kodama, Tatsuhiko; Fujitani, Hideaki
2015-01-01
In this study, we propose a supercomputer-assisted drug design approach involving all-atom molecular dynamics (MD)-based binding free energy prediction after the traditional design/selection step. Because this prediction is more accurate than the empirical binding affinity scoring of the traditional approach, the compounds selected by the MD-based prediction should be better drug candidates. In this study, we discuss the applicability of the new approach using two examples. Although the MD-based binding free energy prediction has a huge computational cost, it is feasible with the latest 10 petaflop-scale computer. The supercomputer-assisted drug design approach also involves two important feedback procedures: The first feedback is generated from the MD-based binding free energy prediction step to the drug design step. While the experimental feedback usually provides binding affinities of tens of compounds at one time, the supercomputer allows us to simultaneously obtain the binding free energies of hundreds of compounds. Because the number of calculated binding free energies is sufficiently large, the compounds can be classified into different categories whose properties will aid in the design of the next generation of drug candidates. The second feedback, which occurs from the experiments to the MD simulations, is important to validate the simulation parameters. To demonstrate this, we compare the binding free energies calculated with various force fields to the experimental ones. The results indicate that the prediction will not be very successful, if we use an inaccurate force field. By improving/validating such simulation parameters, the next prediction can be made more accurate.
Plazinska, Anita; Plazinski, Wojciech; Jozwiak, Krzysztof
2014-04-30
A computational approach applicable to molecular dynamics (MD)-based techniques is proposed to predict ligand-protein binding affinities that depend on the ligand stereochemistry. All possible stereoconfigurations are expressed in terms of one set of force-field parameters [stereoconfiguration-independent potential (SIP)], which allows all relative free energies to be calculated from a single simulation. SIP can be used for studying diverse, stereoconfiguration-dependent phenomena by means of various enhanced-sampling computational techniques. The method has been successfully tested on the β2-adrenergic receptor (β2-AR) binding the four fenoterol stereoisomers by both metadynamics simulations and replica-exchange MD. Both methods gave very similar results, fully confirming the presence of stereoselective effects in the fenoterol-β2-AR interactions. However, the metadynamics-based approach offered much better sampling efficiency, which allows for a significant reduction of the unphysical region in SIP. Copyright © 2014 Wiley Periodicals, Inc.
Highlander, S K; Wickersham, E A; Garza, O; Weinstock, G M
1993-01-01
Multicopy and single-copy chromosomal fusions between the Pasteurella haemolytica leukotoxin regulatory region and the Escherichia coli beta-galactosidase gene have been constructed. These fusions were used as reporters to identify and isolate regulators of leukotoxin expression from a P. haemolytica cosmid library. A cosmid clone, which inhibited leukotoxin expression from multicopy and single-copy protein fusions, was isolated and found to contain the complete leukotoxin gene cluster plus additional upstream sequences. The locus responsible for inhibition of expression from leukotoxin-beta-galactosidase fusions was mapped within these upstream sequences, by transposon mutagenesis with Tn5, and its DNA sequence was determined. The inhibitory activity was found to be associated with a predicted 440-amino-acid reading frame (lapA) that lies within a four-gene arginine transport locus. LapA is predicted to be the nucleotide-binding component of this transport system and shares homology with the Clp family of proteases. PMID:8359916
TEMPLE: analysing population genetic variation at transcription factor binding sites.
Litovchenko, Maria; Laurent, Stefan
2016-11-01
Genetic variation occurring at the level of regulatory sequences can affect phenotypes and fitness in natural populations. This variation can be analysed in a population genetic framework to study how genetic drift and selection affect the evolution of these functional elements. However, doing this requires a good understanding of the location and nature of regulatory regions, which has long been a major hurdle. The current proliferation of genomewide profiling experiments of transcription factor occupancies greatly improves our ability to identify genomic regions involved in specific DNA-protein interactions. Although software exists for predicting transcription factor binding sites (TFBS), and the effects of genetic variants on TFBS specificity, there are no tools currently available for inferring this information jointly with the genetic variation at TFBS in natural populations. We developed the software Transcription Elements Mapping at the Population LEvel (TEMPLE), which predicts TFBS, evaluates the effects of genetic variants on TFBS specificity and summarizes the genetic variation occurring at TFBS in intraspecific sequence alignments. We demonstrate that TEMPLE's TFBS prediction algorithm gives identical results to PATSER, a software distribution commonly used in the field. We also illustrate the unique features of TEMPLE by analysing TFBS diversity for the TF Senseless (SENS) in one ancestral and one cosmopolitan population of the fruit fly Drosophila melanogaster. TEMPLE can be used to localize TFBS that are characterized by strong genetic differentiation across natural populations. This will be particularly useful for studies aiming to identify adaptive mutations. TEMPLE is a Java-based cross-platform software that easily maps the genetic diversity at predicted TFBSs using a graphical interface, or from the Unix command line. © 2016 John Wiley & Sons Ltd.
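As a rough illustration of the kind of matrix-based TFBS scoring that tools such as PATSER perform, the following Python sketch builds a log-odds position weight matrix from a count matrix and scans one strand of a sequence. The count matrix, the uniform background, the pseudocount and all function names are hypothetical and chosen only for illustration; this is not TEMPLE's or PATSER's actual implementation.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per motif position).
COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
# Assumed uniform background frequencies and pseudocount.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
PSEUDOCOUNT = 1.0


def log_odds_matrix(counts, background, pseudocount):
    """Convert per-position counts into log2(probability / background)."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + pseudocount
        matrix.append({
            base: math.log2(
                (column[base] + pseudocount * background[base])
                / total / background[base]
            )
            for base in "ACGT"
        })
    return matrix


def score(site, matrix):
    """Sum the log-odds of each base of a candidate site."""
    return sum(column[base] for base, column in zip(site, matrix))


def best_hit(sequence, matrix):
    """Return (score, start) of the highest-scoring window on one strand."""
    width = len(matrix)
    hits = [
        (score(sequence[i:i + width], matrix), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(hits)


pwm = log_odds_matrix(COUNTS, BACKGROUND, PSEUDOCOUNT)
top_score, start = best_hit("TTACGTGG", pwm)
# Effect of a single-nucleotide variant (G -> T at motif position 3).
delta = score("ACTT", pwm) - score("ACGT", pwm)
print(start, round(top_score, 2), round(delta, 2))
```

The score difference between the reference and variant site is one simple way to express how a genetic variant changes predicted binding specificity; a fuller treatment would also scan the reverse complement and apply a score threshold.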
Deep convolutional neural networks for pan-specific peptide-MHC class I binding prediction.
Han, Youngmahn; Kim, Dongsup
2017-12-28
Computational scanning of peptide candidates that bind to a specific major histocompatibility complex (MHC) can speed up the peptide-based vaccine development process and therefore various methods are being actively developed. Recently, machine-learning-based methods have generated successful results by training on large amounts of experimental data. However, many machine-learning-based methods are generally less sensitive in recognizing locally-clustered interactions, which can synergistically stabilize peptide binding. The deep convolutional neural network (DCNN) is a deep learning method inspired by the visual recognition process of the animal brain, and it is known to be able to capture meaningful local patterns from 2D images. Once the peptide-MHC interactions can be encoded into image-like array (ILA) data, DCNN can be employed to build a predictive model for peptide-MHC binding prediction. In this study, we demonstrated that DCNN is able to not only reliably predict peptide-MHC binding, but also sensitively detect locally-clustered interactions. Nonapeptide-HLA-A and -B binding data were encoded into ILA data. A DCNN, as a pan-specific prediction model, was trained on the ILA data. The DCNN showed higher performance than other prediction tools for the latest benchmark datasets, which consist of 43 datasets for 15 HLA-A alleles and 25 datasets for 10 HLA-B alleles. In particular, the DCNN outperformed other tools for alleles belonging to the HLA-A3 supertype. The F1 scores of the DCNN were 0.86, 0.94, and 0.67 for HLA-A*31:01, HLA-A*03:01, and HLA-A*68:01 alleles, respectively, which were significantly higher than those of other tools. We found that the DCNN was able to recognize locally-clustered interactions that could synergistically stabilize peptide binding. We developed ConvMHC, a web server to provide user-friendly web interfaces for peptide-MHC class I binding predictions using the DCNN. The ConvMHC web server is accessible via http://jumong.kaist.ac.kr:8080/convmhc.
We developed a novel method for peptide-HLA-I binding predictions using DCNN trained on ILA data that encode peptide binding data and demonstrated the reliable performance of the DCNN in nonapeptide binding predictions through the independent evaluation on the latest IEDB benchmark datasets. Our approaches can be applied to characterize locally-clustered patterns in molecular interactions, such as protein/DNA, protein/RNA, and drug/protein interactions.
Delk, Nikkí A.; Johnson, Keith A.; Chowdhury, Naweed I.; Braam, Janet
2005-01-01
Changes in intracellular calcium (Ca2+) levels serve to signal responses to diverse stimuli. Ca2+ signals are likely perceived through proteins that bind Ca2+, undergo conformation changes following Ca2+ binding, and interact with target proteins. The 50-member calmodulin-like (CML) Arabidopsis (Arabidopsis thaliana) family encodes proteins containing the predicted Ca2+-binding EF-hand motif. The functions of virtually all these proteins are unknown. CML24, also known as TCH2, shares over 40% amino acid sequence identity with calmodulin, has four EF hands, and undergoes Ca2+-dependent changes in hydrophobic interaction chromatography and migration rate through denaturing gel electrophoresis, indicating that CML24 binds Ca2+ and, as a consequence, undergoes conformational changes. CML24 expression occurs in all major organs, and transcript levels are increased from 2- to 15-fold in plants subjected to touch, darkness, heat, cold, hydrogen peroxide, abscisic acid (ABA), and indole-3-acetic acid. However, CML24 protein accumulation changes were not detectable. The putative CML24 regulatory region confers reporter expression at sites of predicted mechanical stress; in regions undergoing growth; in vascular tissues and various floral organs; and in stomata, trichomes, and hydathodes. CML24-underexpressing transgenics are resistant to ABA inhibition of germination and seedling growth, are defective in long-day induction of flowering, and have enhanced tolerance to CoCl2, molybdic acid, ZnSO4, and MgCl2. MgCl2 tolerance is not due to reduced uptake or to elevated Ca2+ accumulation. Together, these data present evidence that CML24, a gene expressed in diverse organs and responsive to diverse stimuli, encodes a potential Ca2+ sensor that may function to enable responses to ABA, daylength, and presence of various salts. PMID:16113225
Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.
Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook
2014-11-01
As many structures of protein-DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. The other SVM model, which uses both DNA and protein sequences, achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data only.
To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
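For readers comparing the accuracy, F-measure and MCC values quoted above, the following Python sketch shows how the three metrics are derived from a binary confusion matrix. The counts used in the example are hypothetical and are not taken from the paper; they were chosen only to show that, on a balanced set, an accuracy near 67% corresponds to an MCC near 0.34.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, F-measure and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"accuracy": accuracy, "f_measure": f_measure, "mcc": mcc}


# Hypothetical balanced example: 100 binding and 100 non-binding nucleotides.
metrics = binary_metrics(tp=67, fp=33, tn=67, fn=33)
print({name: round(value, 3) for name, value in metrics.items()})
```

Because MCC uses all four cells of the confusion matrix, it stays informative on unbalanced data, where accuracy alone can look high for a classifier that mostly predicts the majority class.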
In, K H; Asano, K; Beier, D; Grobholz, J; Finn, P W; Silverman, E K; Silverman, E S; Collins, T; Fischer, A R; Keith, T P; Serino, K; Kim, S W; De Sanctis, G T; Yandava, C; Pillari, A; Rubin, P; Kemp, J; Israel, E; Busse, W; Ledford, D; Murray, J J; Segal, A; Tinkleman, D; Drazen, J M
1997-01-01
Five lipoxygenase (5-LO) is the first committed enzyme in the metabolic pathway leading to the synthesis of the leukotrienes. We examined genomic DNA isolated from 25 normal subjects and 31 patients with asthma (6 of whom had aspirin-sensitive asthma) for mutations in the known transcription factor binding regions and the protein encoding region of the 5-LO gene. A family of mutations in the G + C-rich transcription factor binding region was identified consisting of the deletion of one, deletion of two, or addition of one zinc finger (Sp1/Egr-1) binding sites in the region 176 to 147 bp upstream from the ATG translation start site where there are normally 5 Sp1 binding motifs in tandem. Reporter gene activity directed by any of the mutant forms of the transcription factor binding region was significantly (P < 0.05) less effective than the activity driven by the wild type transcription factor binding region. Electrophoretic mobility shift assays (EMSAs) demonstrated the capacity of wild type and mutant transcription factor binding regions to bind nuclear extracts from human umbilical vein endothelial cells (HUVECs). These data are consistent with a family of mutations in the 5-LO gene that can modify reporter gene transcription possibly through differences in Sp1 and Egr-1 transactivation. PMID:9062372
In, K H; Asano, K; Beier, D; Grobholz, J; Finn, P W; Silverman, E K; Silverman, E S; Collins, T; Fischer, A R; Keith, T P; Serino, K; Kim, S W; De Sanctis, G T; Yandava, C; Pillari, A; Rubin, P; Kemp, J; Israel, E; Busse, W; Ledford, D; Murray, J J; Segal, A; Tinkleman, D; Drazen, J M
1997-03-01
Five lipoxygenase (5-LO) is the first committed enzyme in the metabolic pathway leading to the synthesis of the leukotrienes. We examined genomic DNA isolated from 25 normal subjects and 31 patients with asthma (6 of whom had aspirin-sensitive asthma) for mutations in the known transcription factor binding regions and the protein encoding region of the 5-LO gene. A family of mutations in the G + C-rich transcription factor binding region was identified consisting of the deletion of one, deletion of two, or addition of one zinc finger (Sp1/Egr-1) binding sites in the region 176 to 147 bp upstream from the ATG translation start site where there are normally 5 Sp1 binding motifs in tandem. Reporter gene activity directed by any of the mutant forms of the transcription factor binding region was significantly (P < 0.05) less effective than the activity driven by the wild type transcription factor binding region. Electrophoretic mobility shift assays (EMSAs) demonstrated the capacity of wild type and mutant transcription factor binding regions to bind nuclear extracts from human umbilical vein endothelial cells (HUVECs). These data are consistent with a family of mutations in the 5-LO gene that can modify reporter gene transcription possibly through differences in Sp1 and Egr-1 transactivation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, R.; Wilton, R.; Cuff, M. E.
The tandem Per-Arnt-Sim (PAS) like sensors are commonly found in signal transduction proteins. The periplasmic solute binding protein (SBP) domains are found ubiquitously and are generally involved in solute transport. These domains are widely observed as parts of separate proteins but not within the same polypeptide chain. We report the structural and biochemical characterization of the extracellular ligand-binding receptor, Dret_0059 from Desulfohalobium retbaense DSM 5692, an organism isolated from the Retba salt lake in Senegal. The structure of Dret_0059 consists of a novel combination of SBP and TPAS sensor domains. The N-terminal region forms an SBP domain and the C-terminal region folds into a tandem PAS-like domain structure. A ketoleucine moiety is bound to the SBP, whereas a cytosine molecule is bound in the distal PAS domain of the TPAS. The differential scanning fluorimetry studies in solution support the ligands observed in the crystal structure. There are only two other proteins with this structural architecture in the non-redundant sequence database and we predict that they too bind the same substrates. There is significant interaction between the SBP and TPAS domains, and it is quite conceivable that the binding of one ligand will have an effect on the binding of the other. Our attempts to remove the ligands bound to the protein during expression were not successful; therefore, it is not clear what the relative effects are. The genomic context of this receptor does not contain any protein components expected for transport function; hence, we suggest that Dret_0059 is likely involved in signal transduction and not in solute transport.
Liu, Ying; Matthews, Kathleen S.; Bondos, Sarah E.
2008-01-01
During animal development, distinct tissues, organs, and appendages are specified through differential gene transcription by Hox transcription factors. However, the conserved Hox homeodomains bind DNA with high affinity yet low specificity. We have therefore explored the structure of the Drosophila melanogaster Hox protein Ultrabithorax and the impact of its nonhomeodomain regions on DNA binding properties. Computational and experimental approaches identified several conserved, intrinsically disordered regions outside the homeodomain of Ultrabithorax that impact DNA binding by the homeodomain. Full-length Ultrabithorax bound to target DNA 2.5-fold weaker than its isolated homeodomain. Using N-terminal and C-terminal deletion mutants, we demonstrate that the YPWM region and the disordered microexons (termed the I1 region) inhibit DNA binding ∼2-fold, whereas the disordered I2 region inhibits homeodomain-DNA interaction a further ∼40-fold. Binding is restored almost to homeodomain affinity by the mostly disordered N-terminal 174 amino acids (R region) in a length-dependent manner. Both the I2 and R regions contain portions of the activation domain, functionally linking DNA binding and transcription regulation. Given that (i) the I1 region and a portion of the R region alter homeodomain-DNA binding as a function of pH and (ii) an internal deletion within I1 increases Ultrabithorax-DNA affinity, I1 must directly impact homeodomain-DNA interaction energetics. However, I2 appears to indirectly affect DNA binding in a manner countered by the N terminus. The amino acid sequences of I2 and much of the I1 and R regions vary significantly among Ultrabithorax orthologues, potentially diversifying Hox-DNA interactions. PMID:18508761
Prediction of binding hot spot residues by using structural and evolutionary parameters.
Higa, Roberto Hiroshi; Tozzi, Clésio Luis
2009-07-01
In this work, we present a method for predicting hot spot residues by using a set of structural and evolutionary parameters. Unlike previous studies, we use a set of parameters which do not depend on the structure of the protein in complex, so that the predictor can also be used when the interface region is unknown. Despite the fact that no information concerning proteins in complex is used for prediction, the application of the method to a compiled dataset described in the literature achieved a performance of 60.4%, as measured by F-Measure, corresponding to a recall of 78.1% and a precision of 49.5%. This result is higher than those reported by previous studies using the same data set.
PatchSurfers: Two methods for local molecular property-based binding ligand prediction.
Shin, Woong-Hee; Bures, Mark Gregory; Kihara, Daisuke
2016-01-15
Protein function prediction is an active area of research in computational biology. Function prediction can help biologists make hypotheses for characterization of genes and help interpret biological assays, and thus is a productive area for collaboration between experimental and computational biologists. Among various function prediction methods, predicting binding ligand molecules for a target protein is an important class because ligand binding events for a protein are usually closely intertwined with the protein's biological function, and also because predicted binding ligands can often be directly tested by biochemical assays. Binding ligand prediction methods can be classified into two types: those which are based on protein-protein (or pocket-pocket) comparison, and those that compare a target pocket directly to ligands. Recently, our group proposed two computational binding ligand prediction methods, Patch-Surfer, which is a pocket-pocket comparison method, and PL-PatchSurfer, which compares a pocket to ligand molecules. The two programs apply surface patch-based descriptions to calculate similarity or complementarity between molecules. A surface patch is characterized by physicochemical properties such as shape, hydrophobicity, and electrostatic potentials. These properties on the surface are represented using three-dimensional Zernike descriptors (3DZD), which are based on a series expansion of a three-dimensional function. Utilizing 3DZD for describing the physicochemical properties has two main advantages: (1) rotational invariance and (2) fast comparison. Here, we introduce Patch-Surfer and PL-PatchSurfer with an emphasis on PL-PatchSurfer, which is more recently developed. Illustrative examples of PL-PatchSurfer performance on binding ligand prediction as well as virtual drug screening are also provided. Copyright © 2015 Elsevier Inc. All rights reserved.
Binding site and affinity prediction of general anesthetics to protein targets using docking.
Liu, Renyu; Perez-Aguilar, Jose Manuel; Liang, David; Saven, Jeffery G
2012-05-01
The protein targets for general anesthetics remain unclear. A tool to predict anesthetic binding for potential binding targets is needed. In this study, we explored whether a computational method, AutoDock, could serve as such a tool. High-resolution crystal data of water-soluble proteins (cytochrome C, apoferritin, and human serum albumin), and a membrane protein (a pentameric ligand-gated ion channel from Gloeobacter violaceus [GLIC]) were used. Isothermal titration calorimetry (ITC) experiments were performed to determine anesthetic affinity in solution conditions for apoferritin. Docking calculations were performed using DockingServer with the Lamarckian genetic algorithm and the Solis and Wets local search method (http://www.dockingserver.com/web). Twenty general anesthetics were docked into apoferritin. The predicted binding constants were compared with those obtained from ITC experiments for potential correlations. In the case of apoferritin, details of the binding site and their interactions were compared with recent cocrystallization data. Docking calculations for 6 general anesthetics currently used in clinical settings (isoflurane, sevoflurane, desflurane, halothane, propofol, and etomidate) with known 50% effective concentration (EC(50)) values were also performed in all tested proteins. The binding constants derived from docking experiments were compared with known EC(50) values and octanol/water partition coefficients for the 6 general anesthetics. All 20 general anesthetics docked unambiguously into the anesthetic binding site identified in the crystal structure of apoferritin. The binding constants for 20 anesthetics obtained from the docking calculations correlate significantly with those obtained from ITC experiments (P = 0.04). In the case of GLIC, the identified anesthetic binding sites in the crystal structure are among the docking predicted binding sites, but not the top ranked site. 
Docking calculations suggest a most probable binding site located in the extracellular domain of GLIC. The predicted affinities correlated significantly with the known EC(50) values for the 6 frequently used anesthetics in GLIC for the site identified in the experimental crystal data (P = 0.006). However, predicted affinities in apoferritin, human serum albumin, and cytochrome C did not correlate with these 6 anesthetics' known experimental EC(50) values. A weak correlation between the predicted affinities and the octanol/water partition coefficients was observed for the sites in GLIC. We demonstrated that anesthetic binding sites and relative affinities can be predicted using docking calculations in an automatic docking server (AutoDock) for both water-soluble and membrane proteins. Correlation of predicted affinity and EC(50) for 6 frequently used general anesthetics was only observed in GLIC, a member of a protein family relevant to anesthetic mechanism.
Binding Site and Affinity Prediction of General Anesthetics to Protein Targets Using Docking
Liu, Renyu; Perez-Aguilar, Jose Manuel; Liang, David; Saven, Jeffery G.
2012-01-01
Background: The protein targets for general anesthetics remain unclear. A tool to predict anesthetic binding for potential binding targets is needed. In this study, we explore whether a computational method, AutoDock, could serve as such a tool. Methods: High-resolution crystal data of water soluble proteins (cytochrome C, apoferritin and human serum albumin), and a membrane protein (a pentameric ligand-gated ion channel from Gloeobacter violaceus, GLIC) were used. Isothermal titration calorimetry (ITC) experiments were performed to determine anesthetic affinity in solution conditions for apoferritin. Docking calculations were performed using DockingServer with the Lamarckian genetic algorithm and the Solis and Wets local search method (https://www.dockingserver.com/web). Twenty general anesthetics were docked into apoferritin. The predicted binding constants were compared with those obtained from ITC experiments for potential correlations. In the case of apoferritin, details of the binding site and their interactions were compared with recent co-crystallization data. Docking calculations for six general anesthetics currently used in clinical settings (isoflurane, sevoflurane, desflurane, halothane, propofol, and etomidate) with known EC50 were also performed in all tested proteins. The binding constants derived from docking experiments were compared with known EC50s and octanol/water partition coefficients for the six general anesthetics. Results: All 20 general anesthetics docked unambiguously into the anesthetic binding site identified in the crystal structure of apoferritin. The binding constants for 20 anesthetics obtained from the docking calculations correlate significantly with those obtained from ITC experiments (p=0.04). In the case of GLIC, the identified anesthetic binding sites in the crystal structure are among the docking predicted binding sites, but not the top ranked site.
Docking calculations suggest a most probable binding site located in the extracellular domain of GLIC. The predicted affinities correlated significantly with the known EC50s for the six commonly used anesthetics in GLIC for the site identified in the experimental crystal data (p=0.006). However, predicted affinities in apoferritin, human serum albumin, and cytochrome C did not correlate with these six anesthetics’ known experimental EC50s. A weak correlation between the predicted affinities and the octanol/water partition coefficients was observed for the sites in GLIC. Conclusion: We demonstrated that anesthetic binding sites and relative affinities can be predicted using docking calculations in an automatic docking server (AutoDock) for both water soluble and membrane proteins. Correlation of predicted affinity and EC50 for six commonly used general anesthetics was only observed in GLIC, a member of a protein family relevant to anesthetic mechanism. PMID:22392968
Importance of ligand reorganization free energy in protein-ligand binding-affinity prediction.
Yang, Chao-Yie; Sun, Haiying; Chen, Jianyong; Nikolovska-Coleska, Zaneta; Wang, Shaomeng
2009-09-30
Accurate prediction of the binding affinities of small-molecule ligands to their biological targets is fundamental for structure-based drug design but remains a very challenging task. In this paper, we have performed computational studies to predict the binding models of 31 small-molecule Smac (the second mitochondria-derived activator of caspase) mimetics to their target, the XIAP (X-linked inhibitor of apoptosis) protein, and their binding affinities. Our results showed that computational docking was able to reliably predict the binding models, as confirmed by experimentally determined crystal structures of some Smac mimetics complexed with XIAP. However, all the computational methods we have tested, including an empirical scoring function, two knowledge-based scoring functions, and MM-GBSA (molecular mechanics and generalized Born surface area), yield poor to modest prediction for binding affinities. The linear correlation coefficient (r(2)) value between the predicted affinities and the experimentally determined affinities was found to be between 0.21 and 0.36. Inclusion of ensemble protein-ligand conformations obtained from molecular dynamic simulations did not significantly improve the prediction. However, major improvement was achieved when the free-energy change for ligands between their free- and bound-states, or "ligand-reorganization free energy", was included in the MM-GBSA calculation, and the r(2) value increased from 0.36 to 0.66. The prediction was validated using 10 additional Smac mimetics designed and evaluated by an independent group. This study demonstrates that ligand reorganization free energy plays an important role in the overall binding free energy between Smac mimetics and XIAP. This term should be evaluated for other ligand-protein systems and included in the development of new scoring functions. 
To our best knowledge, this is the first computational study to demonstrate the importance of ligand reorganization free energy for the prediction of protein-ligand binding free energy.
Near-infrared fluorophores as biomolecular probes
NASA Astrophysics Data System (ADS)
Patonay, Gabor; Beckford, Garfield; Strekowski, Lucjan; Henary, Maged; Merid, Yonathan
2010-02-01
Near-Infrared (NIR) fluorescence has been valuable in analytical and bioanalytical chemistry. NIR probes and labels have been used for several applications, including probing the hydrophobicity of protein binding sites, DNA sequencing, immunoassays, and CE separations. The NIR region (700-1100 nm) has advantages for the spectroscopist due to the inherently lower background interference from the biological matrix and the high molar absorptivities of NIR chromophores. During the studies we report here, several NIR dyes were prepared to determine the role of the hydrophobicity of NIR dyes and their charge in binding to amino acids and proteins, e.g., serum albumins. We synthesized NIR dye homologs containing the same chromophore but substituents of varying hydrophobicity. Hydrophobic moieties were represented by alkyl and aryl groups. These NIR dyes of varying hydrophobicity exhibited varying degrees of H-aggregation in aqueous solution, indicating that the degree of H-aggregation could be used as an indicator to predict binding characteristics to serum albumins. In order to understand what factors may be important in the binding process, the spectral behavior of these dyes of varying hydrophobicity was examined in the presence of amino acids. Typical dye structures that exhibit large binding constants to biomolecules were compared in order to optimize applications utilizing non-covalent interactions.
Structure of Drosophila Oskar reveals a novel RNA binding protein
Yang, Na; Yu, Zhenyu; Hu, Menglong; Wang, Mingzhu; Lehmann, Ruth; Xu, Rui-Ming
2015-01-01
Oskar (Osk) protein plays critical roles during Drosophila germ cell development, yet its functions in germ-line formation and body patterning remain poorly understood. This situation contrasts sharply with the vast knowledge about the function and mechanism of osk mRNA localization. Osk is predicted to have an N-terminal LOTUS domain (Osk-N), which has been suggested to bind RNA, and a C-terminal hydrolase-like domain (Osk-C) of unknown function. Here, we report the crystal structures of Osk-N and Osk-C. Osk-N shows a homodimer of winged-helix–fold modules, but without detectable RNA-binding activity. Osk-C has a lipase-fold structure but lacks critical catalytic residues at the putative active site. Surprisingly, we found that Osk-C binds the 3′UTRs of osk and nanos mRNA in vitro. Mutational studies identified a region of Osk-C important for mRNA binding. These results suggest possible functions of Osk in the regulation of stability, regulation of translation, and localization of relevant mRNAs through direct interaction with their 3′UTRs, and provide structural insights into a novel protein–RNA interaction motif involving a hydrolase-related domain. PMID:26324911
MotifMark: Finding regulatory motifs in DNA sequences.
Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L; Wang, May D
2017-07-01
The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short in precise characterization of motifs, and as a result, require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning, that can find binding sites on candidate probes and rank their specificity in regard to the underlying transcription factor. We developed a pipeline to analyze experimental data derived from compact universal protein binding microarrays and benchmarked it against two of the most accurate motif search methods. Our results indicate that MotifMark can be a viable alternative technique for prediction of motif from protein binding microarrays and possibly other related high-throughput techniques.
Atassi, M Zouhair; Dolimbek, Behzod Z; Steward, Lance E; Aoki, K Roger
2007-01-01
In studies from this laboratory, we localized the regions on the H chain of botulinum neurotoxin A (BoNT/A) that are recognized by anti-BoNT/A antibodies (Abs) and block the activity of the toxin in vivo. These Abs were obtained from cervical dystonia patients who had been treated with BoNT/A and had become unresponsive to the treatment, as well as blocking Abs raised in mouse, horse, and chicken. We also localized the regions involved in BoNT/A binding to mouse brain synaptosomes (snp). Comparison of spatial proximities in the three-dimensional structure of the Ab-binding regions and the snp binding showed that except for one, the Ab-binding regions either coincide or overlap with the snp regions. It should be fully expected that protective Abs, when bound to the toxin at sites that coincide or overlap with snp binding, would prevent the toxin from binding to the nerve synapse and therefore block toxin entry into the neuron. Thus, analysis of the locations of the Ab-binding and the snp-binding regions provides a molecular rationale for the ability of protecting Abs to block BoNT/A action in vivo.
Zhou, Jiyun; Lu, Qin; Xu, Ruifeng; He, Yulan; Wang, Hongpeng
2017-08-29
Prediction of DNA-binding residues is important for understanding the protein-DNA recognition mechanism. Many computational methods have been proposed for the prediction, but most of them do not consider the relationships of evolutionary information between residues. In this paper, we first propose a novel residue encoding method, referred to as the Position Specific Score Matrix (PSSM) Relation Transformation (PSSM-RT), to encode residues by utilizing the relationships of evolutionary information between residues. PDNA-62 and PDNA-224 are used to evaluate PSSM-RT and two existing PSSM encoding methods by five-fold cross-validation. Performance evaluations indicate that PSSM-RT is more effective than previous methods. This validates the point that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction. An ensemble learning classifier (EL_PSSM-RT) is also proposed by combining an ensemble learning model and PSSM-RT to better handle the imbalance between binding and non-binding residues in datasets. EL_PSSM-RT is evaluated by five-fold cross-validation using PDNA-62 and PDNA-224 as well as two independent datasets, TS-72 and TS-61. Performance comparisons with existing predictors on the four datasets demonstrate that EL_PSSM-RT is the best-performing method among all the predicting methods, with improvements of 0.02-0.07 for MCC, 4.18-21.47% for ST and 0.013-0.131 for AUC. Furthermore, we analyze the importance of the pair-relationships extracted by PSSM-RT, and the results validate the usefulness of PSSM-RT for encoding DNA-binding residues. We propose a novel method for the prediction of DNA-binding residues that incorporates the relationship of evolutionary information and ensemble learning. 
Performance evaluation shows that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction and ensemble learning can be used to address the data imbalance issue between binding and non-binding residues. A web service of EL_PSSM-RT ( http://hlt.hitsz.edu.cn:8080/PSSM-RT_SVM/ ) is provided for free access to the biological research community.
Expanding signaling-molecule wavefront model of cell polarization in the Drosophila wing primordium.
Wortman, Juliana C; Nahmad, Marcos; Zhang, Peng Cheng; Lander, Arthur D; Yu, Clare C
2017-07-01
In developing tissues, cell polarization and proliferation are regulated by morphogens and signaling pathways. Cells throughout the Drosophila wing primordium typically show subcellular localization of the unconventional myosin Dachs on the distal side of cells (nearest the center of the disc). Dachs localization depends on the spatial distribution of bonds between the protocadherins Fat (Ft) and Dachsous (Ds), which form heterodimers between adjacent cells; and the Golgi kinase Four-jointed (Fj), which affects the binding affinities of Ft and Ds. The Fj concentration forms a linear gradient while the Ds concentration is roughly uniform throughout most of the wing pouch with a steep transition region that propagates from the center to the edge of the pouch during the third larval instar. Although the Fj gradient is an important cue for polarization, it is unclear how the polarization is affected by cell division and the expanding Ds transition region, both of which can alter the distribution of Ft-Ds heterodimers around the cell periphery. We have developed a computational model to address these questions. In our model, the binding affinity of Ft and Ds depends on phosphorylation by Fj. We assume that the asymmetry of the Ft-Ds bond distribution around the cell periphery defines the polarization, with greater asymmetry promoting cell proliferation. Our model predicts that this asymmetry is greatest in the radially-expanding transition region that leaves polarized cells in its wake. These cells naturally retain their bond distribution asymmetry after division by rapidly replenishing Ft-Ds bonds at new cell-cell interfaces. Thus we predict that the distal localization of Dachs in cells throughout the pouch requires the movement of the Ds transition region and the simple presence, rather than any specific spatial pattern, of Fj.
Evolution and Structural Organization of the C Proteins of Paramyxovirinae
Karlin, David G.
2014-01-01
The phosphoprotein (P) gene of most Paramyxovirinae encodes several proteins in overlapping frames: P and V, which share a common N-terminus (PNT), and C, which overlaps PNT. Overlapping genes are of particular interest because they encode proteins originated de novo, some of which have unknown structural folds, challenging the notion that nature utilizes only a limited, well-mapped area of fold space. The C proteins cluster in three groups, comprising measles, Nipah, and Sendai virus. We predicted that all C proteins have a similar organization: a variable, disordered N-terminus and a conserved, α-helical C-terminus. We confirmed this predicted organization by biophysically characterizing recombinant C proteins from Tupaia paramyxovirus (measles group) and human parainfluenza virus 1 (Sendai group). We also found that the C of the measles and Nipah groups have statistically significant sequence similarity, indicating a common origin. Although the C of the Sendai group lack sequence similarity with them, we speculate that they also have a common origin, given their similar genomic location and structural organization. Since C is dispensable for viral replication, unlike PNT, we hypothesize that C may have originated de novo by overprinting PNT in the ancestor of Paramyxovirinae. Intriguingly, in measles virus and Nipah virus, PNT encodes STAT1-binding sites that overlap different regions of the C-terminus of C, indicating they have probably originated independently. This arrangement, in which the same genetic region encodes simultaneously a crucial functional motif (a STAT1-binding site) and a highly constrained region (the C-terminus of C), seems paradoxical, since it should severely reduce the ability of the virus to adapt. The fact that it originated twice suggests that it must be balanced by an evolutionary advantage, perhaps from reducing the size of the genetic region vulnerable to mutations. PMID:24587180
In Silico Analysis of Epitope-Based Vaccine Candidates against Hepatitis B Virus Polymerase Protein
Zheng, Juzeng; Lin, Xianfan; Wang, Xiuyan; Zheng, Liyu; Lan, Songsong; Jin, Sisi; Ou, Zhanfan; Wu, Jinming
2017-01-01
Hepatitis B virus (HBV) infection has persisted as a major public health problem due to the lack of an effective treatment for those chronically infected. Therapeutic vaccination holds promise, and targeting HBV polymerase is pivotal for viral eradication. In this research, a computational approach was employed to predict suitable HBV polymerase targeting multi-peptides for vaccine candidate selection. We then performed in-depth computational analysis to evaluate the predicted epitopes’ immunogenicity, conservation, population coverage, and toxicity. Lastly, molecular docking and MHC-peptide complex stabilization assay were utilized to determine the binding energy and affinity of epitopes to the HLA-A0201 molecule. Criteria-based analysis provided four predicted epitopes, RVTGGVFLV, VSIPWTHKV, YMDDVVLGA and HLYSHPIIL. Assay results indicated the lowest binding energy and high affinity to the HLA-A0201 molecule for epitopes VSIPWTHKV and YMDDVVLGA and epitopes RVTGGVFLV and VSIPWTHKV, respectively. Regions 307 to 320 and 377 to 387 were considered to have the highest probability to be involved in B cell epitopes. The T cell and B cell epitopes identified in this study are promising targets for an epitope-focused, peptide-based HBV vaccine, and provide insight into HBV-induced immune response. PMID:28509875
Warfield, Becka M.
2017-01-01
RNA aptamers are oligonucleotides that bind with high specificity and affinity to target ligands. In the absence of bound ligand, secondary structures of RNA aptamers are generally stable, but single-stranded and loop regions, including ligand binding sites, lack defined structures and exist as ensembles of conformations. For example, the well-characterized theophylline-binding aptamer forms a highly stable binding site when bound to theophylline, but the binding site is unstable and disordered when theophylline is absent. Experimental methods have not revealed at atomic resolution the conformations that the theophylline aptamer explores in its unbound state. Consequently, in the present study we applied 21 microseconds of molecular dynamics simulations to structurally characterize the ensemble of conformations that the aptamer adopts in the absence of theophylline. Moreover, we apply Markov state modeling to predict the kinetics of transitions between unbound conformational states. Our simulation results agree with experimental observations that the theophylline binding site is found in many distinct binding-incompetent states and show that these states lack a binding pocket that can accommodate theophylline. The binding-incompetent states interconvert with binding-competent states through structural rearrangement of the binding site on the nanosecond to microsecond timescale. Moreover, we have simulated the complete theophylline binding pathway. Our binding simulations supplement prior experimental observations of slow theophylline binding kinetics by showing that the binding site must undergo a large conformational rearrangement after the aptamer and theophylline form an initial complex, most notably, a major rearrangement of the C27 base from a buried to solvent-exposed orientation. Theophylline appears to bind by a combination of conformational selection and induced fit mechanisms. 
Finally, our modeling indicates that when Mg2+ ions are present the population of binding-competent aptamer states increases more than twofold. This population change, rather than direct interactions between Mg2+ and theophylline, accounts for altered theophylline binding kinetics. PMID:28437473
Accurate Binding Free Energy Predictions in Fragment Optimization.
Steinbrecher, Thomas B; Dahlgren, Markus; Cappel, Daniel; Lin, Teng; Wang, Lingle; Krilov, Goran; Abel, Robert; Friesner, Richard; Sherman, Woody
2015-11-23
Predicting protein-ligand binding free energies is a central aim of computational structure-based drug design (SBDD)--improved accuracy in binding free energy predictions could significantly reduce costs and accelerate project timelines in lead discovery and optimization. The recent development and validation of advanced free energy calculation methods represents a major step toward this goal. Accurately predicting the relative binding free energy changes of modifications to ligands is especially valuable in the field of fragment-based drug design, since fragment screens tend to deliver initial hits of low binding affinity that require multiple rounds of synthesis to gain the requisite potency for a project. In this study, we show that a free energy perturbation protocol, FEP+, which was previously validated on drug-like lead compounds, is suitable for the calculation of relative binding strengths of fragment-sized compounds as well. We study several pharmaceutically relevant targets with a total of more than 90 fragments and find that the FEP+ methodology, which uses explicit solvent molecular dynamics and physics-based scoring with no parameters adjusted, can accurately predict relative fragment binding affinities. The calculations afford R(2)-values on average greater than 0.5 compared to experimental data and RMS errors of ca. 1.1 kcal/mol overall, demonstrating significant improvements over the docking and MM-GBSA methods tested in this work and indicating that FEP+ has the requisite predictive power to impact fragment-based affinity optimization projects.
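The two agreement measures quoted in this abstract, R(2) and RMS error, can be illustrated with a minimal sketch. It assumes R(2) means the squared Pearson correlation between predicted and experimental values, which the abstract does not state explicitly; the function names and the numbers below are hypothetical and are not data from the study.

```python
import math

def rmse(predicted, experimental):
    """Root-mean-square error between paired values (same units as the inputs)."""
    n = len(predicted)
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)

def r_squared(predicted, experimental):
    """Squared Pearson correlation coefficient between paired values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov * cov / (var_p * var_e)

# Hypothetical binding free energies in kcal/mol, for illustration only.
predicted = [-6.1, -7.4, -5.2, -8.0, -6.9]
experimental = [-5.8, -7.9, -5.5, -7.1, -6.4]
print(f"R^2 = {r_squared(predicted, experimental):.2f}")
print(f"RMSE = {rmse(predicted, experimental):.2f} kcal/mol")
```

Reporting both numbers is useful because a high correlation can coexist with a large systematic offset, which only the RMS error would reveal.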
Bulashevska, Alla; Stein, Martin; Jackson, David; Eils, Roland
2009-12-01
Accurate computational methods that can help to predict the biological function of a protein from its sequence are of great interest to research biologists and pharmaceutical companies. One approach to inferring the function of proteins is to predict the interactions between proteins and other molecules. In this work, we propose a machine learning method that uses the primary sequence of a domain to predict its propensity for interaction with small molecules. By curating the Pfam database with respect to the small molecule binding ability of its component domains, we have constructed a dataset of small molecule binding and non-binding domains. This dataset was then used as a training set to learn a Bayesian classifier, which should distinguish members of each class. The domain sequences of both classes are modelled with Markov chains. In a Jack-knife test, our classification procedure achieved predictive accuracies of 77.2% and 66.7% for the binding and non-binding classes, respectively. We demonstrate the applicability of our classifier by using it to identify previously unknown small molecule binding domains. Our predictions are available as supplementary material and can provide very useful information to drug discovery specialists. Given the ubiquitous and essential role small molecules play in biological processes, our method is important for identifying pharmaceutically relevant components of complete proteomes. The software is available from the author upon request.
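The general idea of a Bayesian classifier over class-specific Markov chains can be sketched as follows. This is a minimal illustration, not the authors' implementation: the chain order (first order), the add-one smoothing, the equal priors, and the toy training sequences are all assumptions made here for the example.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def train_markov(seqs, alphabet=ALPHABET, pseudo=1.0):
    """Estimate first-order transition log-probabilities with additive smoothing."""
    counts = {a: {b: pseudo for b in alphabet} for a in alphabet}
    for s in seqs:
        for a, b in zip(s, s[1:]):
            counts[a][b] += 1
    logp = {}
    for a in alphabet:
        total = sum(counts[a].values())
        logp[a] = {b: math.log(counts[a][b] / total) for b in alphabet}
    return logp

def log_likelihood(seq, logp):
    """Sum of transition log-probabilities along the sequence (standard residues only)."""
    return sum(logp[a][b] for a, b in zip(seq, seq[1:]))

def classify(seq, models, priors):
    """Return the class with the highest log-posterior (up to a shared constant)."""
    scores = {c: math.log(priors[c]) + log_likelihood(seq, models[c]) for c in models}
    return max(scores, key=scores.get), scores

# Hypothetical toy training data, for illustration only.
training = {
    "binding": ["GGGAGG", "GAGGGG", "GGAGGA"],
    "nonbinding": ["LLLKLL", "KLLLKL"],
}
models = {label: train_markov(seqs) for label, seqs in training.items()}
priors = {"binding": 0.5, "nonbinding": 0.5}  # assumed equal priors

label, _ = classify("GGAGG", models, priors)
print(label)
```

A jack-knife evaluation, as mentioned in the abstract, would repeat the training step once per sequence while holding that sequence out and then classify the held-out sequence.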
Paci, Alexandr; Liu, Xiao Hu; Huang, Hao; Lim, Abelyn; Houry, Walid A.; Zhao, Rongmin
2012-01-01
Pih1 is an unstable protein and a subunit of the R2TP complex that, in yeast Saccharomyces cerevisiae, also contains the helicases Rvb1, Rvb2, and the Hsp90 cofactor Tah1. Pih1 and the R2TP complex are required for the box C/D small nucleolar ribonucleoprotein (snoRNP) assembly and ribosomal RNA processing. Purified Pih1 tends to aggregate in vitro. Molecular chaperone Hsp90 and its cochaperone Tah1 are required for the stability of Pih1 in vivo. We had shown earlier that the C terminus of Pih1 destabilizes the protein and that the C terminus of Tah1 binds to the Pih1 C terminus to form a stable complex. Here, we analyzed the secondary structure of the Pih1 C terminus and identified two intrinsically disordered regions and five hydrophobic clusters. Site-directed mutagenesis indicated that one predicted intrinsically disordered region IDR2 is involved in Tah1 binding, and that the C terminus of Pih1 contains multiple destabilization or degron elements. Additionally, the Pih1 N-terminal domain, Pih1(1–230), was found to be able to complement the physiological role of full-length Pih1 at 37 °C. Pih1(1–230) as well as a shorter Pih1 N-terminal fragment Pih1(1–195) is able to bind Rvb1/Rvb2 heterocomplex. However, the sequence between the two disordered regions in Pih1 significantly enhances the Pih1 N-terminal domain binding to Rvb1/Rvb2. Based on these data, a model of protein-protein interactions within the R2TP complex is proposed. PMID:23139418
Gulliver, Emily L; Wright, Amy; Lucas, Deanna Deveson; Mégroz, Marianne; Kleifeld, Oded; Schittenhelm, Ralf B; Powell, David R; Seemann, Torsten; Bulitta, Jürgen B; Harper, Marina; Boyce, John D
2018-05-01
Pasteurella multocida is a Gram-negative bacterium responsible for many important animal diseases. While a number of P. multocida virulence factors have been identified, very little is known about how gene expression and protein production is regulated in this organism. Small RNA (sRNA) molecules are critical regulators that act by binding to specific mRNA targets, often in association with the RNA chaperone protein Hfq. In this study, transcriptomic analysis of the P. multocida strain VP161 revealed a putative sRNA with high identity to GcvB from Escherichia coli and Salmonella enterica serovar Typhimurium. High-throughput quantitative liquid proteomics was used to compare the proteomes of the P. multocida VP161 wild-type strain, a gcvB mutant, and a GcvB overexpression strain. These analyses identified 46 proteins that displayed significant differential production after inactivation of gcvB, 36 of which showed increased production. Of the 36 proteins that were repressed by GcvB, 27 were predicted to be involved in amino acid biosynthesis or transport. Bioinformatic analyses of putative P. multocida GcvB target mRNAs identified a strongly conserved 10 nucleotide consensus sequence, 5'-AACACAACAT-3', with the central eight nucleotides identical to the seed binding region present within GcvB mRNA targets in E. coli and S. Typhimurium. Using a defined set of seed region mutants, together with a two-plasmid reporter system that allowed for quantification of sRNA-mRNA interactions, this sequence was confirmed to be critical for the binding of the P. multocida GcvB to the target mRNA, gltA. © 2018 Gulliver et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
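Searching candidate target sequences for the consensus reported above can be sketched in a few lines. The consensus string and its central eight nucleotides come from the abstract; the sliding-window Hamming-distance scan, the function names, and the toy sequence are assumptions for illustration and do not reproduce the bioinformatic pipeline used in the study.

```python
CONSENSUS = "AACACAACAT"   # 10-nt consensus reported in the abstract
SEED = CONSENSUS[1:-1]     # central eight nucleotides: "ACACAACA"

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_sites(seq, motif, max_mismatches=0):
    """Return (start, window, mismatches) for every window within the mismatch limit."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA notation
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        mismatches = hamming(window, motif)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits

# Hypothetical toy sequence, for illustration only (not a real gltA sequence).
toy_mrna = "GGAUAACACAACAUGGC"
print(find_sites(toy_mrna, CONSENSUS))
print(find_sites(toy_mrna, SEED, max_mismatches=1))
```

Allowing a small number of mismatches is one simple way to model seed-region variants; a position weight matrix would be a more graded alternative.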
Real-Time Ligand Binding Pocket Database Search Using Local Surface Descriptors
Chikhi, Rayan; Sael, Lee; Kihara, Daisuke
2010-01-01
Due to the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of particular interest is prediction of ligand binding to a protein, as ligand molecule recognition is a major part of molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two-dimensional pseudo-Zernike moments or the 3D Zernike descriptors. These compact representations allow a fast real-time pocket searching against a database. Thorough benchmark studies employing two different datasets show that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed. PMID:20455259
Real-time ligand binding pocket database search using local surface descriptors.
Chikhi, Rayan; Sael, Lee; Kihara, Daisuke
2010-07-01
Because of the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of particular interest is prediction of ligand binding to a protein, as ligand molecule recognition is a major part of molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two-dimensional pseudo-Zernike moments or the three-dimensional Zernike descriptors. These compact representations allow a fast real-time pocket searching against a database. Thorough benchmark studies employing two different datasets show that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed.
Rhoden, John J.; Dyas, Gregory L.
2016-01-01
Despite the increasing number of multivalent antibodies, bispecific antibodies, fusion proteins, and targeted nanoparticles that have been generated and studied, the mechanism of multivalent binding to cell surface targets is not well understood. Here, we describe a conceptual and mathematical model of multivalent antibody binding to cell surface antigens. Our model predicts that properties beyond 1:1 antibody:antigen affinity to target antigens have a strong influence on multivalent binding. Predicted crucial properties include the structure and flexibility of the antibody construct, the target antigen(s) and binding epitope(s), and the density of antigens on the cell surface. For bispecific antibodies, the ratio of the expression levels of the two target antigens is predicted to be critical to target binding, particularly for the lower expressed of the antigens. Using bispecific antibodies of different valencies to cell surface antigens including MET and EGF receptor, we have experimentally validated our modeling approach and its predictions and observed several nonintuitive effects of avidity related to antigen density, target ratio, and antibody affinity. In some biological circumstances, the effect we have predicted and measured varied from the monovalent binding interaction by several orders of magnitude. Moreover, our mathematical framework affords us a mechanistic interpretation of our observations and suggests strategies to achieve the desired antibody-antigen binding goals. These mechanistic insights have implications in antibody engineering and structure/activity relationship determination in a variety of biological contexts. PMID:27022022
Nielsen, Morten; Lundegaard, Claus; Lund, Ole
2007-01-01
Background: Antigen presenting cells (APCs) sample the extracellular space and present peptides from here to T helper cells, which can be activated if the peptides are of foreign origin. The peptides are presented on the surface of the cells in complex with major histocompatibility class II (MHC II) molecules. Identification of peptides that bind MHC II molecules is thus a key step in rational vaccine design, and developing methods for accurate prediction of the peptide:MHC interactions plays a central role in epitope discovery. The MHC class II binding groove is open at both ends, making the correct alignment of a peptide in the binding groove a crucial part of identifying the core of an MHC class II binding motif. Here, we present a novel stabilization matrix alignment method, SMM-align, that allows for direct prediction of peptide:MHC binding affinities. The predictive performance of the method is validated on a large MHC class II benchmark data set covering 14 HLA-DR (human MHC) and three mouse H2-IA alleles. Results: The predictive performance of the SMM-align method was demonstrated to be superior to that of the Gibbs sampler, TEPITOPE, SVRMHC, and MHCpred methods. Cross validation between peptide data sets obtained from different sources demonstrated that direct incorporation of peptide length potentially results in over-fitting of the binding prediction method. Focusing on amino terminal peptide flanking residues (PFR), we demonstrate a consistent gain in predictive performance by favoring binding registers with a minimum PFR length of two amino acids. Visualizing the binding motif as obtained by the SMM-align and TEPITOPE methods highlights a series of fundamental discrepancies between the two predicted motifs. For the DRB1*1302 allele, for instance, the TEPITOPE method favors basic amino acids at most anchor positions, whereas the SMM-align method identifies a preference for hydrophobic or neutral amino acids at the anchors. 
Conclusion: The SMM-align method was shown to outperform other state-of-the-art MHC class II prediction methods. The method predicts quantitative peptide:MHC binding affinity values, making it ideally suited for rational epitope discovery. The method has been trained and evaluated on the, to our knowledge, largest benchmark data set publicly available and covers the nine HLA-DR supertypes suggested as well as three mouse H2-IA alleles. Both the peptide benchmark data set and the SMM-align prediction method (NetMHCII) are made publicly available. PMID:17608956
Nielsen, Morten; Lundegaard, Claus; Lund, Ole
2007-07-04
Antigen presenting cells (APCs) sample the extracellular space and present peptides from here to T helper cells, which can be activated if the peptides are of foreign origin. The peptides are presented on the surface of the cells in complex with major histocompatibility class II (MHC II) molecules. Identification of peptides that bind MHC II molecules is thus a key step in rational vaccine design, and developing methods for accurate prediction of the peptide:MHC interactions plays a central role in epitope discovery. The MHC class II binding groove is open at both ends, making the correct alignment of a peptide in the binding groove a crucial part of identifying the core of an MHC class II binding motif. Here, we present a novel stabilization matrix alignment method, SMM-align, that allows for direct prediction of peptide:MHC binding affinities. The predictive performance of the method is validated on a large MHC class II benchmark data set covering 14 HLA-DR (human MHC) and three mouse H2-IA alleles. The predictive performance of the SMM-align method was demonstrated to be superior to that of the Gibbs sampler, TEPITOPE, SVRMHC, and MHCpred methods. Cross validation between peptide data sets obtained from different sources demonstrated that direct incorporation of peptide length potentially results in over-fitting of the binding prediction method. Focusing on amino terminal peptide flanking residues (PFR), we demonstrate a consistent gain in predictive performance by favoring binding registers with a minimum PFR length of two amino acids. Visualizing the binding motif as obtained by the SMM-align and TEPITOPE methods highlights a series of fundamental discrepancies between the two predicted motifs. For the DRB1*1302 allele, for instance, the TEPITOPE method favors basic amino acids at most anchor positions, whereas the SMM-align method identifies a preference for hydrophobic or neutral amino acids at the anchors. 
The SMM-align method was shown to outperform other state-of-the-art MHC class II prediction methods. The method predicts quantitative peptide:MHC binding affinity values, making it ideally suited for rational epitope discovery. The method has been trained and evaluated on the, to our knowledge, largest benchmark data set publicly available and covers the nine HLA-DR supertypes suggested as well as three mouse H2-IA alleles. Both the peptide benchmark data set and the SMM-align prediction method (NetMHCII) are made publicly available.
Garcia, J A; Harrich, D; Soultanakis, E; Wu, F; Mitsuyasu, R; Gaynor, R B
1989-01-01
The human immunodeficiency virus (HIV) type 1 LTR is regulated at the transcriptional level by both cellular and viral proteins. Using HeLa cell extracts, multiple regions of the HIV LTR were found to serve as binding sites for cellular proteins. An untranslated region binding protein UBP-1 has been purified and fractions containing this protein bind to both the TAR and TATA regions. To investigate the role of cellular proteins binding to both the TATA and TAR regions and their potential interaction with other HIV DNA binding proteins, oligonucleotide-directed mutagenesis of both these regions was performed followed by DNase I footprinting and transient expression assays. In the TATA region, two direct repeats TC/AAGC/AT/AGCTGC surround the TATA sequence. Mutagenesis of both of these direct repeats or of the TATA sequence interrupted binding over the TATA region on the coding strand, but only a mutation of the TATA sequence affected in vivo assays for tat-activation. In addition to TAR serving as the site of binding of cellular proteins, RNA transcribed from TAR is capable of forming a stable stem-loop structure. To determine the relative importance of DNA binding proteins as compared to secondary structure, oligonucleotide-directed mutations in the TAR region were studied. Local mutations that disrupted either the stem or loop structure were defective in gene expression. However, compensatory mutations which restored base pairing in the stem resulted in complete tat-activation. This indicated a significant role for the stem-loop structure in HIV gene expression. To determine the role of TAR binding proteins, mutations were constructed which extensively changed the primary structure of the TAR region, yet left stem base pairing, stem energy and the loop sequence intact. These mutations resulted in decreased protein binding to TAR DNA and defects in tat-activation, and revealed factor binding specifically to the loop DNA sequence. 
Further mutagenesis which inverted this stem and loop mutation relative to the HIV LTR mRNA start site resulted in even larger decreases in tat-activation. This suggests that multiple determinants, including protein binding, the loop sequence, and RNA or DNA secondary structure, are important in tat-activation and suggests that tat may interact with cellular proteins binding to DNA to increase HIV gene expression. PMID:2721501
Doxey, Andrew C; Cheng, Zhenyu; Moffatt, Barbara A; McConkey, Brendan J
2010-08-03
Aromatic amino acids play a critical role in protein-glycan interactions. Clusters of surface aromatic residues and their features may therefore be useful in distinguishing glycan-binding sites as well as predicting novel glycan-binding proteins. In this work, a structural bioinformatics approach was used to screen the Protein Data Bank (PDB) for coplanar aromatic motifs similar to those found in known glycan-binding proteins. The proteins identified in the screen were significantly associated with carbohydrate-related functions according to gene ontology (GO) enrichment analysis, and predicted motifs were found frequently within novel folds and glycan-binding sites not included in the training set. In addition to numerous binding sites predicted in structural genomics proteins of unknown function, one novel prediction was a surface motif (W34/W36/W192) in the tobacco pathogenesis-related protein, PR-5d. Phylogenetic analysis revealed that the surface motif is exclusive to a subfamily of PR-5 proteins from the Solanaceae family of plants, and is absent completely in more distant homologs. To confirm PR-5d's insoluble-polysaccharide binding activity, a cellulose-pulldown assay of tobacco proteins was performed and PR-5d was identified in the cellulose-binding fraction by mass spectrometry. Based on the combined results, we propose that the putative binding site in PR-5d may be an evolutionary adaptation of Solanaceae plants including potato, tomato, and tobacco, towards defense against cellulose-containing pathogens such as species of the deadly oomycete genus, Phytophthora. More generally, the results demonstrate that coplanar aromatic clusters on protein surfaces are a structural signature of glycan-binding proteins, and can be used to computationally predict novel glycan-binding proteins from 3D structure.
Antonini, E; Ascenzi, P; Bolognesi, M; Menegatti, E; Guarneri, M
1983-04-25
The formation of the bovine beta-trypsin-bovine basic pancreatic trypsin inhibitor (Kunitz) (BPTI) complex was monitored, making use of three different signals: proflavine displacement, optical density changes in the ultraviolet region, and the loss of the catalytic activity. The rates of the reactions indicated by the three different signals were similar at neutral pH, but diverged at low pH. At pH 3.50, proflavine displacement precedes the optical density changes in the ultraviolet and the loss of enzyme activity by several orders of magnitude in time (Antonini, E., Ascenzi, P., Menegatti, E., and Guarneri, M. (1983) Biopolymers 22, 363-375). These data indicated that the bovine beta-trypsin-BPTI complex formation is a multistage process and led to the prediction that, at pH 3.50, BPTI addition to the bovine beta-trypsin-proflavine complex would remove proflavine inhibition and the enzyme would recover transiently its catalytic activity before being irreversibly inhibited by completion of BPTI binding. The kinetic evidence shown here verified this prediction, indicating that during the bovine beta-trypsin-BPTI complex formation one transient intermediate occurs, which is not able to bind proflavine but may bind and hydrolyze the substrate. Thus, the observed peculiar catalytic behavior is in line with the proposed reaction mechanism for the bovine beta-trypsin-BPTI complex formation, which postulates a sequence of distinct polar and apolar interactions at the contact area.
Radiation-induced oxidative damage to the DNA-binding domain of the lactose repressor
Gillard, Nathalie; Goffinont, Stephane; Buré, Corinne; Davidkova, Marie; Maurizot, Jean-Claude; Cadene, Martine; Spotheim-Maurizot, Melanie
2007-01-01
Understanding the cellular effects of radiation-induced oxidation requires the unravelling of key molecular events, particularly damage to proteins with important cellular functions. The Escherichia coli lactose operon is a classical model of gene regulation systems. Its functional mechanism involves the specific binding of a protein, the repressor, to a specific DNA sequence, the operator. We have shown previously that upon irradiation with γ-rays in solution, the repressor loses its ability to bind the operator. Water radiolysis generates hydroxyl radicals (OH· radicals) which attack the protein. Damage to the repressor DNA-binding domain, called the headpiece, is most likely to be responsible for this loss of function. Using CD, fluorescence spectroscopy and a combination of proteolytic cleavage with MS, we have examined the state of the irradiated headpiece. CD measurements revealed a dose-dependent conformational change involving metastable intermediate states. Fluorescence measurements showed a gradual degradation of tyrosine residues. MS was used to count the number of oxidations in different regions of the headpiece and to narrow down the parts of the sequence bearing oxidized residues. By calculating the relative probabilities of reaction of each amino acid with OH· radicals, we can predict the most probable oxidation targets. By comparing the experimental results with the predictions we conclude that Tyr7, Tyr12, Tyr17, Met42 and Tyr47 are the most likely hotspots of oxidation. The loss of repressor function is thus correlated with chemical modifications and conformational changes of the headpiece. PMID:17263689
BAF57 Modulation of Androgen Receptor Action and Prostate Cancer Progression
2007-12-01
mapped the AR binding site on BAF57 to the N-terminus (proline-rich region). Furthermore, the DBD and hinge region of AR also appear to play a...Accomplishments of Task 1: BAF57 binds to DNA binding domain ( DBD ) and hinge region of AR As outlined in the initial proposal, the first task...the above construct are the well-characterized zinc finger DNA binding domain ( DBD ) and the hinge region. Given the significant role of these two
BAF57 Modulation of Androgen Receptor Action and Prostate Cancer Progression
2006-12-01
has fine mapped the AR binding site on BAF57 to the N-terminus (proline-rich region). Furthermore, the DBD and hinge region of AR also appear to...Accomplishments of Task 1: BAF57 binds to DNA binding domain ( DBD ) and hinge region of AR As outlined in the initial proposal, the first task was to...construct are the well-characterized zinc finger DNA binding domain ( DBD ) and the hinge region. Given the significant role of these two domains in AR
A Large-Scale Assessment of Nucleic Acids Binding Site Prediction Programs
Miao, Zhichao; Westhof, Eric
2015-01-01
Computational prediction of nucleic acid binding sites in proteins is necessary to disentangle functional mechanisms in most biological processes and to explore the binding mechanisms. Several strategies have been proposed, but the state-of-the-art approaches display a great diversity in i) the definition of nucleic acid binding sites; ii) the training and test datasets; iii) the algorithmic methods for the prediction strategies; iv) the performance measures and v) the distribution and availability of the prediction programs. Here we report a large-scale assessment of 19 web servers and 3 stand-alone programs on 41 datasets including more than 5000 proteins derived from 3D structures of protein-nucleic acid complexes. Well-defined binary assessment criteria (specificity, sensitivity, precision, accuracy…) are applied. We found that i) the tools have been greatly improved over the years; ii) some of the approaches suffer from theoretical defects and there is still room for sorting out the essential mechanisms of binding; iii) RNA binding and DNA binding appear to follow similar driving forces and iv) dataset bias may exist in some methods. PMID:26681179
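The binary assessment criteria named in this abstract can be illustrated with a minimal sketch that derives them from confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and not taken from the paper; the Matthews correlation coefficient is included only because it is commonly reported alongside these measures, not because the abstract lists it.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return common binary assessment measures from confusion-matrix counts.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives (hypothetical inputs).
    """
    def ratio(num, den):
        # Avoid ZeroDivisionError when a class is absent.
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),   # fraction of true binders recovered
        "specificity": ratio(tn, tn + fp),   # fraction of non-binders rejected
        "precision": ratio(tp, tp + fp),     # positive predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


if __name__ == "__main__":
    # Made-up counts purely for illustration.
    for name, value in binary_metrics(tp=40, fp=10, tn=40, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Because binding residues are usually a minority class, accuracy alone can look high for a predictor that rarely predicts binding; reporting sensitivity, specificity and precision together gives a fuller picture.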
NASA Astrophysics Data System (ADS)
Rosenfeld, Robin J.; Goodsell, David S.; Musah, Rabi A.; Morris, Garrett M.; Goodin, David B.; Olson, Arthur J.
2003-08-01
The W191G cavity of cytochrome c peroxidase is useful as a model system for introducing small molecule oxidation in an artificially created cavity. A set of small, cyclic, organic cations was previously shown to bind in the buried, solvent-filled pocket created by the W191G mutation. We docked these ligands and a set of non-binders in the W191G cavity using AutoDock 3.0. For the ligands, we compared docking predictions with experimentally determined binding energies and X-ray crystal structure complexes. For the ligands, predicted binding energies differed from measured values by ± 0.8 kcal/mol. For most ligands, the docking simulation clearly predicted a single binding mode that matched the crystallographic binding mode within 1.0 Å RMSD. For 2 ligands, where the docking procedure yielded an ambiguous result, solutions matching the crystallographic result could be obtained by including an additional crystallographically observed water molecule in the protein model. For the remaining 2 ligands, docking indicated multiple binding modes, consistent with the original electron density, suggesting disordered binding of these ligands. Visual inspection of the atomic affinity grid maps used in docking calculations revealed two patches of high affinity for hydrogen bond donating groups. Multiple solutions are predicted as these two sites compete for polar hydrogens in the ligand during the docking simulation. Ligands could be distinguished, to some extent, from non-binders using a combination of two trends: predicted binding energy and level of clustering. In summary, AutoDock 3.0 appears to be useful in predicting key structural and energetic features of ligand binding in the W191G cavity.
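The comparison of a docked pose with a crystallographic binding mode "within 1.0 Å RMSD" can be sketched as follows. This is a minimal illustration, not AutoDock's own routine: it assumes both poses list the same atoms in the same order and are already in the same receptor coordinate frame, so no superposition is performed. The function names and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists.

    Each argument is a sequence of (x, y, z) tuples in angstroms.
    No fitting is done; the poses are assumed to share one frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def matches_crystal_pose(docked, crystal, cutoff=1.0):
    """True if the docked pose lies within `cutoff` angstroms RMSD of the crystal pose."""
    return rmsd(docked, crystal) <= cutoff


if __name__ == "__main__":
    # Toy three-atom ligand; coordinates are invented for illustration.
    crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    docked = [(0.2, 0.1, 0.0), (1.6, 0.2, 0.1), (1.4, 1.6, 0.1)]
    print(f"RMSD = {rmsd(docked, crystal):.2f} A")
    print("matches crystal pose:", matches_crystal_pose(docked, crystal))
```

For ligands with internal symmetry, a symmetry-aware atom mapping would be needed before this calculation; that step is omitted here.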
Velazquez, Hector A; Riccardi, Demian; Xiao, Zhousheng; Quarles, Leigh Darryl; Yates, Charless Ryan; Baudry, Jerome; Smith, Jeremy C
2018-02-01
Ensemble docking is now commonly used in early-stage in silico drug discovery and can be used to attack difficult problems such as finding lead compounds which can disrupt protein-protein interactions. We give an example of this methodology here, as applied to fibroblast growth factor 23 (FGF23), a protein hormone that is responsible for regulating phosphate homeostasis. The first small-molecule antagonists of FGF23 were recently discovered by combining ensemble docking with extensive experimental target validation data (Science Signaling, 9, 2016, ra113). Here, we provide a detailed account of how ensemble-based high-throughput virtual screening was used to identify the antagonist compounds discovered in reference (Science Signaling, 9, 2016, ra113). Moreover, we perform further calculations, redocking those antagonist compounds identified in reference (Science Signaling, 9, 2016, ra113) that performed well on drug-likeness filters, to predict possible binding regions. These predicted binding modes are rescored with the molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) approach to calculate the most likely binding site. Our findings suggest that the antagonist compounds antagonize FGF23 through the disruption of protein-protein interactions between FGF23 and fibroblast growth factor receptor (FGFR). © 2017 John Wiley & Sons A/S.
Xu, Jingting; Hu, Hong; Dai, Yang
The identification of enhancers is a challenging task. Various types of epigenetic information including histone modification have been utilized in the construction of enhancer prediction models based on a diverse panel of machine learning schemes. However, DNA methylation profiles generated from the whole genome bisulfite sequencing (WGBS) have not been fully explored for their potential in enhancer prediction despite the fact that low methylated regions (LMRs) have been implied to be distal active regulatory regions. In this work, we propose a prediction framework, LMethyR-SVM, using LMRs identified from cell-type-specific WGBS DNA methylation profiles and a weighted support vector machine learning framework. In LMethyR-SVM, the set of cell-type-specific LMRs is further divided into three sets: reliable positive, likely positive and likely negative, according to their resemblance to a small set of experimentally validated enhancers in the VISTA database based on an estimated non-parametric density distribution. Then, the prediction model is obtained by solving a weighted support vector machine. We demonstrate the performance of LMethyR-SVM by using the WGBS DNA methylation profiles derived from the human embryonic stem cell type (H1) and the fetal lung fibroblast cell type (IMR90). The predicted enhancers are highly conserved with a reasonable validation rate based on a set of commonly used positive markers including transcription factors, p300 binding and DNase-I hypersensitive sites. In addition, we show evidence that a large fraction of the LMethyR-SVM predicted enhancers are not predicted by ChromHMM in H1 cell type and they are more enriched for the FANTOM5 enhancers. Our work suggests that low methylated regions detected from the WGBS data are useful as complementary resources to histone modification marks in developing models for the prediction of cell-type-specific enhancers.
Heterogeneity of D2 dopamine receptors in different brain regions.
Leonard, M N; Macey, C A; Strange, P G
1987-01-01
The binding of [3H]spiperone has been examined in membranes derived from different regions of bovine brain. In caudate nucleus, nucleus accumbens, olfactory tubercle and putamen binding is to D2 dopamine and 5HT2 serotonin receptors, whereas in cingulate cortex only serotonin 5HT2 receptor binding can be detected. D2 dopamine receptors were examined in detail in caudate nucleus, olfactory tubercle and putamen using [3H]spiperone binding in the presence of 0.3 microM-mianserin (to block 5HT2 serotonin receptors). No evidence for heterogeneity among D2 dopamine receptors either between brain regions or within a brain region was found from the displacements of [3H]spiperone binding by a range of antagonists, including dibenzazepines and substituted benzamides. Regulation of agonist binding by guanine nucleotides did, however, differ between regions. In caudate nucleus a population of agonist binding sites appeared resistant to guanine nucleotide regulation, whereas this was not the case in olfactory tubercle and putamen. PMID:2963621
Characterisation of single domain ATP-binding cassette protein homologues of Theileria parva.
Kibe, M K; Macklin, M; Gobright, E; Bishop, R; Urakawa, T; ole-MoiYoi, O K
2001-09-01
Two distinct genes encoding single domain, ATP-binding cassette transport protein homologues of Theileria parva were cloned and sequenced. Neither of the genes is tandemly duplicated. One gene, TpABC1, encodes a predicted protein of 593 amino acids with an N-terminal hydrophobic domain containing six potential membrane-spanning segments. A single discontinuous ATP-binding element was located in the C-terminal region of TpABC1. The second gene, TpABC2, also contains a single C-terminal ATP-binding motif. Copies of TpABC2 were present at four loci in the T. parva genome on three different chromosomes. TpABC1 exhibited allelic polymorphism between stocks of the parasite. Comparison of cDNA and genomic sequences revealed that TpABC1 contained seven short introns, between 29 and 84 bp in length. The full-length TpABC1 protein was expressed in insect cells using the baculovirus system. Application of antibodies raised against the recombinant antigen to western blots of T. parva piroplasm lysates detected an 85 kDa protein in this life-cycle stage.
[F-18]-AV-1451 binding correlates with postmortem neurofibrillary tangle Braak staging.
Marquié, Marta; Siao Tick Chong, Michael; Antón-Fernández, Alejandro; Verwer, Eline E; Sáez-Calveras, Nil; Meltzer, Avery C; Ramanan, Prianca; Amaral, Ana C; Gonzalez, Jose; Normandin, Marc D; Frosch, Matthew P; Gómez-Isla, Teresa
2017-10-01
[F-18]-AV-1451, a PET tracer specifically developed to detect brain neurofibrillary tau pathology, has the potential to facilitate accurate diagnosis of Alzheimer's disease (AD), staging of brain tau burden and monitoring disease progression. Recent PET studies show that patients with mild cognitive impairment and AD dementia exhibit significantly higher in vivo [F-18]-AV-1451 retention than cognitively normal controls. Importantly, PET patterns of [F-18]-AV-1451 correlate well with disease severity and seem to match the predicted topographic Braak staging of neurofibrillary tangles (NFTs) in AD, although this awaits confirmation. We studied the correlation of autoradiographic binding patterns of [F-18]-AV-1451 and the stereotypical spatiotemporal pattern of progression of NFTs using legacy postmortem brain samples representing different Braak NFT stages (I-VI). We performed [F-18]-AV-1451 phosphor-screen autoradiography and quantitative tau measurements (stereologically based NFT counts and biochemical analysis of tau pathology) in three brain regions (entorhinal cortex, superior temporal sulcus and visual cortex) in a total of 22 cases: low Braak (I-II, n = 6), intermediate Braak (III-IV, n = 7) and high Braak (V-VI, n = 9). Strong and selective [F-18]-AV-1451 binding was detected in all tangle-containing regions matching precisely the observed pattern of PHF-tau immunostaining across the different Braak stages. As expected, no signal was detected in the white matter or other non-tangle containing regions. Quantification of [F-18]-AV-1451 binding was very significantly correlated with the number of NFTs present in each brain region and with the total tau and phospho-tau content as reported by Western blot and ELISA. [F-18]-AV-1451 is a promising biomarker for in vivo quantification of brain tau burden in AD. Neuroimaging-pathologic studies conducted on postmortem material from individuals imaged while alive are now needed to confirm these observations.
2014-01-01
Background: Binding free energy and binding hot spots at protein-protein interfaces are two important research areas for understanding protein interactions. Computational methods have been developed previously for accurate prediction of binding free energy change upon mutation for interfacial residues. However, a large number of interrupted and unimportant atomic contacts are used in the training phase, which causes accuracy loss. Results: This work proposes a new method, βACV_ASA, to predict the change of binding free energy after alanine mutations. βACV_ASA integrates accessible surface area (ASA) and our newly defined β contacts together into an atomic contact vector (ACV). A β contact between two atoms is a direct contact without being interrupted by any other atom between them. A β contact's potential contribution to protein binding is also supposed to be inversely proportional to its ASA to follow the water exclusion hypothesis of binding hot spots. Tested on a dataset of 396 alanine mutations, our method is found to be superior in classification performance to many other methods, including Robetta, FoldX, HotPOINT, an ACV method of β contacts without ASA integration, and ACV_ASA methods (similar to βACV_ASA but based on distance-cutoff contacts). Based on our data analysis and results, we can draw the conclusions that: (i) our method is powerful in the prediction of binding free energy change after alanine mutation; (ii) β contacts are better than distance-cutoff contacts for modeling the well-organized protein-binding interfaces; (iii) β contacts usually are only a small fraction of the distance-based contacts; and (iv) water exclusion is a necessary condition for a residue to become a binding hot spot. Conclusions: βACV_ASA is designed using the advantages of both β contacts and water exclusion. It is an excellent tool to predict binding free energy changes and binding hot spots after alanine mutation. PMID:24568581
NASA Astrophysics Data System (ADS)
Sippl, Wolfgang
2000-08-01
One of the major challenges in computational approaches to drug design is the accurate prediction of binding affinity of biomolecules. In the present study several prediction methods for a published set of estrogen receptor ligands are investigated and compared. The binding modes of 30 ligands were determined using the docking program AutoDock and were compared with available X-ray structures of estrogen receptor-ligand complexes. On the basis of the docking results an interaction energy-based model, which uses the information of the whole ligand-receptor complex, was generated. Several parameters were modified in order to analyze their influence on the correlation between binding affinities and calculated ligand-receptor interaction energies. The highest correlation coefficient (r² = 0.617, q²_LOO = 0.570) was obtained considering protein flexibility during the interaction energy evaluation. The second prediction method uses a combination of receptor-based and 3D quantitative structure-activity relationships (3D QSAR) methods. The ligand alignment obtained from the docking simulations was taken as basis for a comparative field analysis applying the GRID/GOLPE program. Using the interaction field derived with a water probe and applying the smart region definition (SRD) variable selection, a significant and robust model was obtained (r² = 0.991, q²_LOO = 0.921). The predictive ability of the established model was further evaluated by using a test set of six additional compounds. The comparison with the generated interaction energy-based model and with a traditional CoMFA model obtained using a ligand-based alignment (r² = 0.951, q²_LOO = 0.796) indicates that the combination of receptor-based and 3D QSAR methods is able to improve the quality of the underlying model.
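The fitted r² and the leave-one-out cross-validated q² quoted in this abstract can be illustrated with a small sketch. It substitutes ordinary least squares for the PLS models built with GRID/GOLPE in the study, and it assumes the common convention q² = 1 − PRESS / SS_tot with SS_tot taken around the mean of all observed values; the function name and the data are hypothetical.

```python
import numpy as np


def r2_and_q2_loo(X, y):
    """Return (r2, q2_loo) for an ordinary least-squares model with intercept.

    X: descriptors, shape (n,) or (n, p); y: observed affinities, shape (n,).
    q2_loo = 1 - PRESS / SS_tot, where PRESS sums squared errors of
    predictions made for each sample by a model fitted without it.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])  # prepend intercept column

    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum((y - A @ coef) ** 2) / ss_tot

    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # leave sample i out
        c, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ c) ** 2
    q2 = 1.0 - press / ss_tot
    return r2, q2


if __name__ == "__main__":
    # Invented interaction energies (x) and binding affinities (y).
    x = [0.0, 1.0, 2.0, 3.0, 4.0]
    y = [0.1, 0.9, 2.2, 2.8, 4.1]
    r2, q2 = r2_and_q2_loo(x, y)
    print(f"r2 = {r2:.3f}, q2_LOO = {q2:.3f}")
```

For a least-squares fit, q²_LOO never exceeds r², so a large gap between the two is a common warning sign of over-fitting.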
Makeyev, A V; Liebhaber, S A
2000-08-01
We have identified two novel human genes encoding proteins with a high level of sequence identity to two previously characterized RNA-binding proteins, alphaCP-1 and alphaCP-2. Both of these novel genes, alphaCP-3 and alphaCP-4, are predicted to encode proteins with triplicated KH domains. The number and organization of the KH domains, their sequences, and the sequences of the contiguous regions are conserved among all four alphaCP proteins. The common evolutionary origin of these proteins is substantiated by conservation of exon-intron organization in the corresponding genes. The map positions of alphaCP-1 and alphaCP-2 (previously reported) and those of alphaCP-3 and alphaCP-4 (present report) reveal that the four alphaCP loci are dispersed in the human genome; alphaCP-3 and alphaCP-4 mapped to 21q22.3 and 3p21, and the respective mouse orthologues mapped to syntenic regions of the mouse genome, 10B5 and 9F1-F2, respectively. Two additional loci in the human genome were identified as alphaCP-2 processed pseudogenes (PCBP2P1, 21q22.3, and PCBP2P2, 8q21-q22). Although the overall levels of alphaCP-3 and alphaCP-4 mRNAs are substantially lower than those of alphaCP-1 and alphaCP-2, transcripts of alphaCP-3 and alphaCP-4 were found in all mouse tissues tested. These data establish a new subfamily of genes predicted to encode closely related KH-containing RNA-binding proteins with potential functions in posttranscriptional controls. Copyright 2000 Academic Press.
Binding free energy prediction in strongly hydrophobic biomolecular systems.
Charlier, Landry; Nespoulous, Claude; Fiorucci, Sébastien; Antonczak, Serge; Golebiowski, Jérome
2007-11-21
We present a comparison of various computational approaches aiming at predicting the binding free energy in ligand-protein systems where the ligand is located within a highly hydrophobic cavity. The relative binding free energy between similar ligands is obtained by means of the thermodynamic integration (TI) method and compared to experimental data obtained through isothermal titration calorimetry measurements. The absolute free energy of binding prediction was obtained on a similar system (a pyrazine derivative bound to a lipocalin) by TI, potential of mean force (PMF) and also by means of the MMPBSA protocols. Although the TI protocol performs poorly either with an explicit or an implicit solvation scheme, the PMF calculation using an implicit solvation scheme leads to encouraging results, with a prediction of the binding affinity being 2 kcal mol(-1) lower than the experimental value. The use of an implicit solvation scheme appears to be well suited for the study of such hydrophobic systems, due to the lack of water molecules within the binding site.
Yang, Kai-Chun; Ku, Hsiao-Lun; Wu, Chia-Liang; Wang, Shyh-Jen; Yang, Chen-Chang; Deng, Jou-Fang; Lee, Ming-Been; Chou, Yuan-Hwa
2011-12-30
Carbon monoxide poisoning (COP) after charcoal burning results in delayed neuropsychological sequelae (DNS), which show clinical resemblance to Parkinson's disease, without adequate predictors at present. This study examined the role of dopamine transporter (DAT) binding for the prediction of DNS. Twenty-seven suicide attempters with COP were recruited. Seven of them developed DNS, while the remainder did not. The striatal DAT binding was measured by single photon emission computed tomography with (99m)Tc-TRODAT. The specific uptake ratio was derived based on a ratio equilibrium model. Using a logistic regression model, multiple clinical variables were examined as potential predictors for DNS. COP patients with DNS had lower left striatal DAT binding than patients without DNS. Logistic regression analysis showed that a combination of initial loss of consciousness and lower left striatal DAT binding predicted the development of DNS. Our data indicate that the left striatal DAT binding could help to predict the development of DNS. This finding not only demonstrates the feasibility of brain imaging techniques for predicting the development of DNS but will also help clinicians to improve the quality of care for COP patients. 2011 Elsevier Ireland Ltd. All rights reserved.
A web server for analysis, comparison and prediction of protein ligand binding sites.
Singh, Harinder; Srivastava, Hemant Kumar; Raghava, Gajendra P S
2016-03-25
One of the major challenges in the field of systems biology is to understand the interaction between a wide range of proteins and ligands. In the past, methods have been developed for predicting binding sites in a protein for a limited number of ligands. In order to address this problem, we developed a web server named 'LPIcom' to facilitate users in understanding protein-ligand interaction. Analysis, comparison and prediction modules are available in the 'LPIcom' server to predict protein-ligand interacting residues for 824 ligands. Each ligand must have at least 30 protein binding sites in PDB. The analysis module of the server can identify residues preferred in interaction and the binding motif for a given ligand; for example, residues glycine, lysine and arginine are preferred in ATP binding sites. The comparison module of the server allows comparing protein-binding sites of multiple ligands to understand the similarity between ligands based on their binding site. This module indicates that ATP, ADP and GTP ligands are in the same cluster and thus their binding sites or interacting residues exhibit a high level of similarity. A propensity-based prediction module has been developed for predicting ligand-interacting residues in a protein for more than 800 ligands. In addition, a number of web-based tools have been integrated to facilitate users in creating web logos and two-sample logos comparing ligand-interacting and non-interacting residues. In summary, this manuscript presents a web server for analysis of ligand-interacting residues. This server is available for public use from URL http://crdd.osdd.net/raghava/lpicom .
Chen, Fu; Sun, Huiyong; Wang, Junmei; Zhu, Feng; Liu, Hui; Wang, Zhe; Lei, Tailong; Li, Youyong; Hou, Tingjun
2018-06-21
Molecular docking provides a computationally efficient way to predict the atomic structural details of protein-RNA interactions (PRI), but accurate prediction of the three-dimensional structures and binding affinities for PRI is still notoriously difficult, partly due to the unreliability of the existing scoring functions for PRI. MM/PBSA and MM/GBSA are more theoretically rigorous than most scoring functions for protein-RNA docking, but their prediction performance for protein-RNA systems remains unclear. Here, we systematically evaluated the capability of MM/PBSA and MM/GBSA to predict the binding affinities and recognize the near-native binding structures for protein-RNA systems with different solvent models and interior dielectric constants (ε_in). For predicting the binding affinities, the predictions given by MM/GBSA based on the minimized structures in explicit solvent and the GBGBn1 model with ε_in = 2 yielded the highest correlation with the experimental data. Moreover, the MM/GBSA calculations based on the minimized structures in implicit solvent and the GBGBn1 model distinguished the near-native binding structures within the top 10 decoys for 118 out of the 149 protein-RNA systems (79.2%). This performance is better than all docking scoring functions studied here. Therefore, the MM/GBSA rescoring is an efficient way to improve the prediction capability of scoring functions for protein-RNA systems. Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Accurate and sensitive quantification of protein-DNA binding affinity.
Rastogi, Chaitanya; Rube, H Tomas; Kribelbauer, Judith F; Crocker, Justin; Loker, Ryan E; Martini, Gabriella D; Laptenko, Oleg; Freed-Pastor, William A; Prives, Carol; Stern, David L; Mann, Richard S; Bussemaker, Harmen J
2018-04-17
Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. Copyright © 2018 the Author(s). Published by PNAS.
Accurate and sensitive quantification of protein-DNA binding affinity
Rastogi, Chaitanya; Rube, H. Tomas; Kribelbauer, Judith F.; Crocker, Justin; Loker, Ryan E.; Martini, Gabriella D.; Laptenko, Oleg; Freed-Pastor, William A.; Prives, Carol; Stern, David L.; Mann, Richard S.; Bussemaker, Harmen J.
2018-01-01
Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. PMID:29610332
Context influences on TALE–DNA binding revealed by quantitative profiling
Rogers, Julia M.; Barrera, Luis A.; Reyon, Deepak; Sander, Jeffry D.; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L.
2015-01-01
Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE–DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000–20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE–DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design. PMID:26067805
Context influences on TALE-DNA binding revealed by quantitative profiling.
Rogers, Julia M; Barrera, Luis A; Reyon, Deepak; Sander, Jeffry D; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L
2015-06-11
Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE-DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000-20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE-DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design.
Sequence-Based Prediction of RNA-Binding Residues in Proteins.
Walia, Rasna R; El-Manzalawy, Yasser; Honavar, Vasant G; Dobbs, Drena
2017-01-01
Identifying individual residues in the interfaces of protein-RNA complexes is important for understanding the molecular determinants of protein-RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein-RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein-RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner.
Sequence-Based Prediction of RNA-Binding Residues in Proteins
Walia, Rasna R.; EL-Manzalawy, Yasser; Honavar, Vasant G.; Dobbs, Drena
2017-01-01
Identifying individual residues in the interfaces of protein–RNA complexes is important for understanding the molecular determinants of protein–RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein–RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein–RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner. PMID:27787829
Luo, Heng; Ye, Hao; Ng, Hui; Shi, Leming; Tong, Weida; Mattes, William; Mendrick, Donna; Hong, Huixiao
2015-01-01
As the major histocompatibility complex (MHC), human leukocyte antigens (HLAs) are among the most polymorphic genes in humans. Patients carrying certain HLA alleles may develop adverse drug reactions (ADRs) after taking specific drugs. Peptides play an important role in HLA related ADRs as they are the necessary co-binders of HLAs with drugs. Many experimental data have been generated for understanding HLA-peptide binding. However, efficiently utilizing the data for understanding and accurately predicting HLA-peptide binding is challenging. Therefore, we developed a network analysis based method to understand and predict HLA-peptide binding. Qualitative Class I HLA-peptide binding data were harvested and prepared from four major databases. An HLA-peptide binding network was constructed from this dataset and modules were identified by the fast greedy modularity optimization algorithm. To examine the significance of signals in the yielded models, the modularity was compared with the modularity values generated from 1,000 random networks. The peptides and HLAs in the modules were characterized by similarity analysis. The neighbor-edges based and unbiased leverage algorithm (Nebula) was developed for predicting HLA-peptide binding. Leave-one-out (LOO) validations and two-fold cross-validations were conducted to evaluate the performance of Nebula using the constructed HLA-peptide binding network. Nine modules were identified from analyzing the HLA-peptide binding network, with a modularity higher than that of all the random networks. Peptide length and functional side chains of amino acids at certain positions of the peptides were different among the modules. HLA sequences were module dependent to some extent. Nebula achieved an overall prediction accuracy of 0.816 in the LOO validations and an average accuracy of 0.795 in the two-fold cross-validations and outperformed the method reported in the literature. 
Network analysis is a useful approach for analyzing large and sparse datasets such as the HLA-peptide binding dataset. The modules identified from the network analysis clustered peptides and HLAs with similar sequences and properties of amino acids. Nebula performed well in the predictions of HLA-peptide binding. We demonstrated that network analysis coupled with Nebula is an efficient approach to understand and predict HLA-peptide binding interactions and thus, could further our understanding of ADRs.
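The comparison of the observed modularity with 1,000 random networks can be illustrated with a small sketch. It assumes the standard Newman modularity for an undirected, unweighted graph and an add-one empirical p-value. The abstract does not say how the authors scored the HLA-peptide network or built the random networks, so both functions and the toy graph are illustrative assumptions.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> community id.
    Q = sum over communities of (intra_edges / m) - (degree_sum / (2 m))^2.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    intra = {}
    degree = {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())


def empirical_p_value(observed, null_values):
    """Add-one empirical p-value: share of null values at least as large as observed."""
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
parts = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = modularity(edges, parts)
print(round(q, 4))

# If the observed modularity exceeds all 1,000 random values, p = 1/1001.
print(empirical_p_value(q, [0.1] * 1000))
```

Under this estimator, an observed modularity higher than every one of 1,000 random networks, as reported above, gives an empirical p-value of about 0.001.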
2015-01-01
Background: As the major histocompatibility complex (MHC), human leukocyte antigens (HLAs) are among the most polymorphic genes in humans. Patients carrying certain HLA alleles may develop adverse drug reactions (ADRs) after taking specific drugs. Peptides play an important role in HLA related ADRs as they are the necessary co-binders of HLAs with drugs. Many experimental data have been generated for understanding HLA-peptide binding. However, efficiently utilizing the data for understanding and accurately predicting HLA-peptide binding is challenging. Therefore, we developed a network analysis based method to understand and predict HLA-peptide binding. Methods: Qualitative Class I HLA-peptide binding data were harvested and prepared from four major databases. An HLA-peptide binding network was constructed from this dataset and modules were identified by the fast greedy modularity optimization algorithm. To examine the significance of signals in the yielded models, the modularity was compared with the modularity values generated from 1,000 random networks. The peptides and HLAs in the modules were characterized by similarity analysis. The neighbor-edges based and unbiased leverage algorithm (Nebula) was developed for predicting HLA-peptide binding. Leave-one-out (LOO) validations and two-fold cross-validations were conducted to evaluate the performance of Nebula using the constructed HLA-peptide binding network. Results: Nine modules were identified from analyzing the HLA-peptide binding network, with a modularity higher than that of all the random networks. Peptide length and functional side chains of amino acids at certain positions of the peptides were different among the modules. HLA sequences were module dependent to some extent. Nebula achieved an overall prediction accuracy of 0.816 in the LOO validations and an average accuracy of 0.795 in the two-fold cross-validations and outperformed the method reported in the literature. 
Conclusions: Network analysis is a useful approach for analyzing large and sparse datasets such as the HLA-peptide binding dataset. The modules identified from the network analysis clustered peptides and HLAs with similar sequences and properties of amino acids. Nebula performed well in the predictions of HLA-peptide binding. We demonstrated that network analysis coupled with Nebula is an efficient approach to understand and predict HLA-peptide binding interactions and thus, could further our understanding of ADRs. PMID:26424483
Krajnak, Kristine; Rosewell, Katherine L; Duncan, Marilyn J; Wise, Phyllis M
2003-11-14
Estrogen-related changes in serotonergic neuronal transmission, including changes in the number of serotonin transporter (SERT) binding sites, have been cited as a possible cause for changes in mood, memory and sleep that occur during the menopausal transition. However, both aging and estradiol regulate SERT binding sites in the brain. The goal of this experiment was to determine how aging and estrogen interact to regulate SERT levels in the forebrain of young and reproductively senescent female Sprague-Dawley rats using [3H]paroxetine. The density of specific [3H]paroxetine binding in various brain regions was compared in young (2-4 months) and reproductively senescent (10-12 months) female rats at three times of day. In most brain regions examined, estrogen and aging independently increased the number of [3H]paroxetine binding sites. The only region that displayed a reduction in [3H]paroxetine binding with age was the suprachiasmatic nucleus (SCN). Time of day influenced [3H]paroxetine binding in the SCN and the paraventricular thalamus (PVT), two regions known to be involved in the regulation of circadian rhythms. Aging and/or estrogen also altered the pattern of binding in these regions. Thus, based on the results of this study, we conclude that aging and estrogen both act to regulate SERT binding sites in the forebrain of female rats, and that this regulation is region specific.
DuMond, Jenna F; He, Yi; Burg, Maurice B; Ferraris, Joan D
2015-11-01
Hypertonicity stimulates Nuclear Factor of Activated T-cells 5 (NFAT5) nuclear localization and transactivating activity. Many transcription factors are known to contain intrinsically disordered regions (IDRs) which become more structured with local environmental changes such as osmolality, temperature and tonicity. The transactivating domain (TAD) of NFAT5 is predicted to be intrinsically disordered under normal tonicity, and under high NaCl, the activity of this domain is increased. To study the binding of co-regulatory proteins at IDRs, a cDNA construct expressing the NFAT5 TAD was created and transformed into Escherichia coli cells. Transformed E. coli cells were mass produced by fermentation and extracted by cell lysis to release the NFAT5 TAD. The NFAT5 TAD was subsequently purified using a His-tag column, cation exchange chromatography as well as hydrophobic interaction chromatography and then characterized by mass spectrometry (MS). Published by Elsevier Inc.
Widespread Site-Dependent Buffering of Human Regulatory Polymorphism
Kutyavin, Tanya; Stamatoyannopoulos, John A.
2012-01-01
The average individual is expected to harbor thousands of variants within non-coding genomic regions involved in gene regulation. However, it is currently not possible to interpret reliably the functional consequences of genetic variation within any given transcription factor recognition sequence. To address this, we comprehensively analyzed heritable genome-wide binding patterns of a major sequence-specific regulator (CTCF) in relation to genetic variability in binding site sequences across a multi-generational pedigree. We localized and quantified CTCF occupancy by ChIP-seq in 12 related and unrelated individuals spanning three generations, followed by comprehensive targeted resequencing of the entire CTCF–binding landscape across all individuals. We identified hundreds of variants with reproducible quantitative effects on CTCF occupancy (both positive and negative). While these effects paralleled protein–DNA recognition energetics when averaged, they were extensively buffered by striking local context dependencies. In the significant majority of cases buffering was complete, resulting in silent variants spanning every position within the DNA recognition interface irrespective of level of binding energy or evolutionary constraint. The prevalence of complex partial or complete buffering effects severely constrained the ability to predict reliably the impact of variation within any given binding site instance. Surprisingly, 40% of variants that increased CTCF occupancy occurred at positions of human–chimp divergence, challenging the expectation that the vast majority of functional regulatory variants should be deleterious. Our results suggest that, even in the presence of “perfect” genetic information afforded by resequencing and parallel studies in multiple related individuals, genomic site-specific prediction of the consequences of individual variation in regulatory DNA will require systematic coupling with empirical functional genomic measurements. PMID:22457641
Recognition of functional sites in protein structures.
Shulman-Peleg, Alexandra; Nussinov, Ruth; Wolfson, Haim J
2004-06-04
Recognition of regions on the surface of one protein that are similar to a binding site of another is crucial for the prediction of molecular interactions and for functional classifications. We first describe a novel method, SiteEngine, that assumes no sequence or fold similarities and is able to recognize proteins that have similar binding sites and may perform similar functions. We achieve high efficiency and speed by introducing a low-resolution surface representation via chemically important surface points, by hashing triangles of physico-chemical properties and by application of hierarchical scoring schemes for a thorough exploration of global and local similarities. We proceed to rigorously apply this method to functional site recognition in three possible ways: first, we search a given functional site on a large set of complete protein structures. Second, a potential functional site on a protein of interest is compared with known binding sites, to recognize similar features. Third, a complete protein structure is searched for the presence of an a priori unknown functional site, similar to known sites. Our method is robust and efficient enough to allow computationally demanding applications such as the first and the third. From the biological standpoint, the first application may identify secondary binding sites of drugs that may lead to side-effects. The third application finds new potential sites on the protein that may provide targets for drug design. Each of the three applications may aid in assigning a function and in classification of binding patterns. We highlight the advantages and disadvantages of each type of search, provide examples of large-scale searches of the entire Protein Data Bank and make functional predictions.
Determination of Surface-Exposed, Functional Domains of Gonococcal Transferrin-Binding Protein A
Yost-Daljev, Mary Kate; Cornelissen, Cynthia Nau
2004-01-01
The gonococcal transferrin receptor is composed of two distinct proteins, TbpA and TbpB. TbpA is a member of the TonB-dependent family of integral outer membrane transporters, while TbpB is lipid modified and thought to be peripherally surface exposed. We previously proposed a hypothetical topology model for gonococcal TbpA that was based upon computer predictions and similarity with other TonB-dependent transporters for which crystal structures have been determined. In the present study, the hemagglutinin epitope was inserted into TbpA to probe the surface topology of this protein and secondarily to test the functional impacts of site-specific mutagenesis. Twelve epitope insertion mutants were constructed, five of which allowed us to confirm the surface exposure of loops 2, 3, 5, 7, and 10. In contrast to the predictions set forth by the hypothetical model, insertion into the plug region resulted in an epitope that was surface accessible, while epitope insertions into two putative loops (9 and 11) were not surface accessible. Insertions into putative loop 3 and β strand 9 abolished transferrin binding and utilization, and the plug insertion mutant exhibited decreased transferrin-binding affinity concomitant with an inability to utilize it. Insertion into putative β strand 16 generated a mutant that was able to bind transferrin normally but that was unable to mediate utilization. Mutants with insertions into putative loops 2, 9, and 11 maintained wild-type binding affinity but could utilize only transferrin in the presence of TbpB. This is the first demonstration of the ability of TbpB to compensate for a mutation in TbpA. PMID:14977987
Shows, Kathryn H; Shiang, Rita
2008-11-01
Treacher Collins syndrome is an autosomal-dominant mandibulofacial dysostosis caused by haploinsufficiency of the TCOF1 gene product treacle. Mouse Tcof1 protein is approximately 61% identical and 71% similar to treacle, and heterozygous knockout of Tcof1 causes craniofacial malformation. Tcof1 expression is high in developing neural crest, but much lower in other tissues. To investigate this dual regulation, highly conserved regions upstream of TCOF1 homologs were tested through deletion and mutation reporter assays, and conserved predicted transcription factor binding sites were assessed through chromatin binding studies. Assays were performed in mouse P19 embryonic carcinoma cells and in HEK293 cells to determine differential activation in cell types at different stages of differentiation. Binding of Cebpb, Zfp161, and Sp1 transcription factors was specific to the Tcof1 regulatory region in P19 cells. The Zfp161 binding site demonstrated P19 cell-specific repression, while the Sp1/Sp3 candidate site demonstrated HEK293 cell-specific activation. Moreover, presence of c-myb and Zfp161 transcripts was specific to P19 cells. A minimal promoter fragment from -253 to +43 bp directs constitutive expression in both cell types, and dual regulation of Tcof1 appears to be through differential repression of this minimal promoter. The CpG island at the transcription start site remains unmethylated in P19 cells, 11.5 dpc mouse embryonic tissue, and adult mouse ear, which supports constitutive activation of the Tcof1 promoter.
Suarez, Julio V.; Banks, Stephen; Thomas, Paul G.; Day, Anil
2014-01-01
The green alga Chlamydomonas reinhardtii provides a tractable genetic model to study herbicide mode of action using forward genetics. The herbicide norflurazon inhibits phytoene desaturase, which is required for carotenoid synthesis. Locating amino acid substitutions in mutant phytoene desaturases conferring norflurazon resistance provides a genetic approach to map the herbicide binding site. We isolated a UV-induced mutant able to grow in very high concentrations of norflurazon (150 µM). The phytoene desaturase gene in the mutant strain contained the first resistance mutation to be localised to the dinucleotide-binding Rossmann-like domain. A highly conserved phenylalanine amino acid at position 131 of the 564 amino acid precursor protein was changed to a valine in the mutant protein. F131, and two other amino acids whose substitution confers norflurazon resistance in homologous phytoene desaturase proteins, map to distant regions in the primary sequence of the C. reinhardtii protein (V472, L505) but in tertiary models these residues cluster together to a region close to the predicted FAD binding site. The mutant gene allowed direct 5 µM norflurazon based selection of transformants, which were tolerant to other bleaching herbicides including fluridone, flurtamone, and diflufenican but were more sensitive to beflubutamid than wild type cells. Norflurazon resistance and beflubutamid sensitivity allow either positive or negative selection against transformants expressing the mutant phytoene desaturase gene. PMID:24936791
Sim, B K; Orlandi, P A; Haynes, J D; Klotz, F W; Carter, J M; Camus, D; Zegans, M E; Chulay, J D
1990-11-01
The Plasmodium falciparum gene encoding erythrocyte binding antigen-175 (EBA-175), a putative receptor for red cell invasion (Camus, D., and T. J. Hadley. 1985. Science (Wash. DC). 230:553-556.), has been isolated and characterized. DNA sequencing demonstrated a single open reading frame encoding a translation product of 1,435 amino acid residues. Peptides corresponding to regions on the deduced amino acid sequence predicted to be B cell epitopes were assessed for immunogenicity. Immunization of mice and rabbits with EBA-peptide 4, a synthetic peptide encompassing amino acid residues 1,062-1,103, produced antibodies that recognized P. falciparum merozoites in an indirect fluorescent antibody assay. When compared to sera from rabbits immunized with the same adjuvant and carrier protein, sera from rabbits immunized with EBA-peptide 4 inhibited merozoite invasion of erythrocytes in vitro by 80% at a 1:5 dilution. Furthermore, these sera inhibited the binding of purified, authentic EBA-175 to erythrocytes, suggesting that their activity in inhibiting merozoite invasion of erythrocytes is mediated by blocking the binding of EBA-175 to erythrocytes. Since the nucleotide sequence of EBA-peptide 4 is conserved among seven strains of P. falciparum from throughout the world (Sim, B. K. L. 1990. Mol. Biochem. Parasitol. 41:293-296.), these data identify a region of the protein that should be a focus of vaccine development efforts.
Rostamian, Mosayeb; Mousavy, Seyed Jafar; Ebrahimi, Firouz; Ghadami, Seyyed Abolghasem; Sheibani, Nader; Minaei, Mohammad Ebrahim; Arefpour Torabi, Mohammad Ali
2012-01-01
Recently, botulinum neurotoxin (BoNT)-derived recombinant proteins have been suggested as potential botulism vaccines. Here, focusing on BoNT type E (BoNT/E), we studied two of these binding domain-based recombinant proteins: a multivalent chimer protein, which is composed of BoNT serotypes A, B and E binding subdomains, and a monovalent recombinant protein, which contains 93 amino acid residues from the recombinant C-terminal heavy chain of BoNT/E (rBoNT/E-HCC). Both proteins have an identical region (48 aa) that contains one of the most important BoNT/E epitopes (YLTHMRD sequence). The efficiency of the recombinant proteins in antibody production, their structural differences, and their BoNT/E-epitope location were compared by using ELISA, circular dichroism, computational modeling, and hydrophobicity predictions. Immunological studies indicated that the antibody yield against rBoNT/E-HCC was higher than that against the chimer protein. Cross ELISA confirmed that the antibodies against the chimer protein recognized rBoNT/E-HCC more efficiently. However, both antibody groups (anti-chimer and anti-rBoNT/E-HCC antibodies) were able to recognize other proteins. Structural studies with circular dichroism showed that chimer proteins have slightly more secondary structures than rBoNT/E-HCC. The immunological results suggested that the above-mentioned identical region in rBoNT/E-HCC is more exposed. Circular dichroism, computational protein modeling and hydrophobicity predictions indicated a more exposed location for the identical region in rBoNT/E-HCC than in the chimer protein, which is strongly in agreement with the immunological results.
Osman, Toba A M; Olsthoorn, René C L; Livieratos, Ioannis C
2014-09-22
Pepino mosaic virus (PepMV) is a mechanically-transmitted positive-strand RNA potexvirus, with a 6410 nt long single-stranded (ss) RNA genome flanked by a 5'-methylguanosine cap and a 3' poly-A tail. Computer-assisted folding of the 64 nt long PepMV 3'-untranslated region (UTR) resulted in the prediction of three stem-loop structures (hp1, hp2, and hp3 in the 3'-5' direction). The importance of these structures and/or sequences for promotion of negative-strand RNA synthesis and binding to the RNA dependent RNA polymerase (RdRp) was tested in vitro using a specific RdRp assay. Hp1, which is highly variable among different PepMV isolates, appeared dispensable for negative-strand synthesis. Hp2, which is characterized by a large U-rich loop, tolerated base-pair changes in its stem as long as they maintained the stem integrity but was very sensitive to changes in the U-rich loop. Hp3, which harbours the conserved potexvirus ACUUAA hexamer motif, was essential for template activity. Template-RNA polymerase binding competition experiments showed that the ACUUAA sequence represents a high-affinity RdRp binding element. Copyright © 2014 Elsevier B.V. All rights reserved.
Jeyapalan, Zina; Deng, Zhaoqun; Shatseva, Tatiana; Fang, Ling; He, Chengyan; Yang, Burton B
2011-04-01
The non-coding 3'-untranslated region (UTR) plays an important role in the regulation of microRNA (miRNA) functions, since it can bind and inactivate multiple miRNAs. Here, we show the 3'-UTR of CD44 is able to antagonize cytoplasmic miRNAs, and result in the increased translation of CD44 and downstream target mRNA, CDC42. A series of cell function assays in the human breast cancer cell line, MT-1, have shown that the CD44 3'-UTR inhibits proliferation, colony formation and tumor growth. Furthermore, it modulated endothelial cell activities, favored angiogenesis, induced tumor cell apoptosis and increased sensitivity to Docetaxel. These results are due to the interaction of the CD44 3'-UTR with multiple miRNAs. Computational algorithms have predicted three miRNAs, miR-216a, miR-330 and miR-608, can bind to both the CD44 and CDC42 3'-UTRs. This was confirmed with luciferase assays, western blotting and immunohistochemical staining and correlated with a series of siRNA assays. Thus, the non-coding CD44 3'-UTR serves as a competitor for miRNA binding and subsequently inactivates miRNA functions, by freeing the target mRNAs from being repressed.
Jeyapalan, Zina; Deng, Zhaoqun; Shatseva, Tatiana; Fang, Ling; He, Chengyan; Yang, Burton B.
2011-01-01
The non-coding 3′-untranslated region (UTR) plays an important role in the regulation of microRNA (miRNA) functions, since it can bind and inactivate multiple miRNAs. Here, we show the 3′-UTR of CD44 is able to antagonize cytoplasmic miRNAs, and result in the increased translation of CD44 and downstream target mRNA, CDC42. A series of cell function assays in the human breast cancer cell line, MT-1, have shown that the CD44 3′-UTR inhibits proliferation, colony formation and tumor growth. Furthermore, it modulated endothelial cell activities, favored angiogenesis, induced tumor cell apoptosis and increased sensitivity to Docetaxel. These results are due to the interaction of the CD44 3′-UTR with multiple miRNAs. Computational algorithms have predicted three miRNAs, miR-216a, miR-330 and miR-608, can bind to both the CD44 and CDC42 3′-UTRs. This was confirmed with luciferase assays, western blotting and immunohistochemical staining and correlated with a series of siRNA assays. Thus, the non-coding CD44 3′-UTR serves as a competitor for miRNA binding and subsequently inactivates miRNA functions, by freeing the target mRNAs from being repressed. PMID:21149267
Accurate Prediction of Inducible Transcription Factor Binding Intensities In Vivo
Siepel, Adam; Lis, John T.
2012-01-01
DNA sequence and local chromatin landscape act jointly to determine transcription factor (TF) binding intensity profiles. To disentangle these influences, we developed an experimental approach, called protein/DNA binding followed by high-throughput sequencing (PB–seq), that allows the binding energy landscape to be characterized genome-wide in the absence of chromatin. We applied our methods to the Drosophila Heat Shock Factor (HSF), which inducibly binds a target DNA sequence element (HSE) following heat shock stress. PB–seq involves incubating sheared naked genomic DNA with recombinant HSF, partitioning the HSF–bound and HSF–free DNA, and then detecting HSF–bound DNA by high-throughput sequencing. We compared PB–seq binding profiles with ones observed in vivo by ChIP–seq and developed statistical models to predict the observed departures from idealized binding patterns based on covariates describing the local chromatin environment. We found that DNase I hypersensitivity and tetra-acetylation of H4 were the most influential covariates in predicting changes in HSF binding affinity. We also investigated the extent to which DNA accessibility, as measured by digital DNase I footprinting data, could be predicted from MNase–seq data and the ChIP–chip profiles for many histone modifications and TFs, and found GAGA element associated factor (GAF), tetra-acetylation of H4, and H4K16 acetylation to be the most predictive covariates. Lastly, we generated an unbiased model of HSF binding sequences, which revealed distinct biophysical properties of the HSF/HSE interaction and a previously unrecognized substructure within the HSE. These findings provide new insights into the interplay between the genomic sequence and the chromatin landscape in determining transcription factor binding intensity. PMID:22479205
Tactics for preclinical validation of receptor-binding radiotracers
Lever, Susan Z.; Fan, Kuo-Hsien; Lever, John R.
2016-01-01
Introduction Aspects of radiopharmaceutical development are illustrated through preclinical studies of [125I]-(E)-1-(2-(2,3-dihydrobenzofuran-5-yl)ethyl)-4-(iodoallyl)piperazine ([125I]-E-IA-BF-PE-PIPZE), a radioligand for sigma-1 (σ1) receptors, coupled with examples from the recent literature. Findings are compared to those previously observed for [125I]-(E)-1-(2-(2,3-dimethoxy-5-yl)ethyl)-4-(iodoallyl)piperazine ([125I]-E-IA-DM-PE-PIPZE). Methods Syntheses of E-IA-BF-PE-PIPZE and [125I]-E-IA-BF-PE-PIPZE were accomplished by standard methods. In vitro receptor binding studies and autoradiography were performed, and binding potential was predicted. Measurements of lipophilicity and protein binding were obtained. In vivo studies were conducted in mice to evaluate radioligand stability, as well as specific binding to σ1 sites in brain, brain regions and peripheral organs in the presence and absence of potential blockers. Results E-IA-BF-PE-PIPZE exhibited high affinity and selectivity for σ1 receptors (Ki = 0.43 ± 0.03 nM, σ2 / σ1 = 173). [125I]-E-IA-BF-PE-PIPZE was prepared in good yield and purity, with high specific activity. Radioligand binding provided dissociation (koff) and association (kon) rate constants, along with a measured Kd of 0.24 ± 0.01 nM and Bmax of 472 ± 13 fmol / mg protein. The radioligand proved suitable for quantitative autoradiography in vitro using brain sections. Moderate lipophilicity, Log D7.4 2.69 ± 0.28, was determined, and protein binding was 71 ± 0.3%. In vivo, high initial whole brain uptake, > 6% injected dose / g, cleared slowly over 24 h. Specific binding represented 75% to 93% of total binding from 15 min to 24 h. Findings were confirmed and extended by regional brain biodistribution. Radiometabolites were not observed in brain (1%). Conclusions Substitution of dihydrobenzofuranylethyl for dimethoxyphenethyl increased radioligand affinity for σ1 receptors by 16-fold.
While high specific binding to σ1 receptors was observed for both radioligands in vivo, [125I]-E-IA-BF-PE-PIPZE displayed much slower clearance kinetics than [125I]-E-IA-DM-PE-PIPZE. Thus, minor structural modifications of σ1 receptor radioligands lead to major differences in binding properties in vitro and in vivo. PMID:27755986
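As a rough illustration of how the reported Kd and Bmax can be read, the sketch below assumes a simple one-site equilibrium model, in which specific binding at free radioligand concentration L is Bmax·L/(Kd + L). The Kd and Bmax values are those quoted in the abstract; the function names and the example concentrations are illustrative assumptions, and the abstract does not state which binding model the authors fitted.

```python
# One-site saturation binding sketch (an assumed model, not necessarily the authors' analysis).

KD_NM = 0.24   # measured Kd quoted in the abstract, in nM
BMAX = 472.0   # Bmax quoted in the abstract, in fmol / mg protein


def specific_binding(ligand_nm, kd_nm=KD_NM, bmax=BMAX):
    """Specific binding (fmol / mg protein) at a free ligand concentration given in nM."""
    return bmax * ligand_nm / (kd_nm + ligand_nm)


def fractional_occupancy(ligand_nm, kd_nm=KD_NM):
    """Fraction of receptors occupied at equilibrium under the one-site model."""
    return ligand_nm / (kd_nm + ligand_nm)


if __name__ == "__main__":
    # Hypothetical concentrations chosen as 0.1x, 1x and 10x Kd.
    for conc in (0.024, 0.24, 2.4):
        print(f"{conc:6.3f} nM -> {specific_binding(conc):6.1f} fmol/mg "
              f"({fractional_occupancy(conc):.0%} occupied)")
```

Under this model, half of the sites are occupied when the free radioligand concentration equals Kd, which is why a sub-nanomolar Kd corresponds to high affinity.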
Rhoden, John J; Dyas, Gregory L; Wroblewski, Victor J
2016-05-20
Despite the increasing number of multivalent antibodies, bispecific antibodies, fusion proteins, and targeted nanoparticles that have been generated and studied, the mechanism of multivalent binding to cell surface targets is not well understood. Here, we describe a conceptual and mathematical model of multivalent antibody binding to cell surface antigens. Our model predicts that properties beyond 1:1 antibody:antigen affinity to target antigens have a strong influence on multivalent binding. Predicted crucial properties include the structure and flexibility of the antibody construct, the target antigen(s) and binding epitope(s), and the density of antigens on the cell surface. For bispecific antibodies, the ratio of the expression levels of the two target antigens is predicted to be critical to target binding, particularly for the lower expressed of the antigens. Using bispecific antibodies of different valencies to cell surface antigens including MET and EGF receptor, we have experimentally validated our modeling approach and its predictions and observed several nonintuitive effects of avidity related to antigen density, target ratio, and antibody affinity. In some biological circumstances, the effect we have predicted and measured varied from the monovalent binding interaction by several orders of magnitude. Moreover, our mathematical framework affords us a mechanistic interpretation of our observations and suggests strategies to achieve the desired antibody-antigen binding goals. These mechanistic insights have implications in antibody engineering and structure/activity relationship determination in a variety of biological contexts. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
2011-01-01
Background Existing methods of predicting DNA-binding proteins used valuable features of physicochemical properties to design support vector machine (SVM) based classifiers. Generally, selection of physicochemical properties and determination of their corresponding feature vectors rely mainly on known properties of binding mechanism and experience of designers. However, there exists a troublesome problem for designers that some different physicochemical properties have similar vectors of representing 20 amino acids and some closely related physicochemical properties have dissimilar vectors. Results This study proposes a systematic approach (named Auto-IDPCPs) to automatically identify a set of physicochemical and biochemical properties in the AAindex database to design SVM-based classifiers for predicting and analyzing DNA-binding domains/proteins. Auto-IDPCPs consists of 1) clustering 531 amino acid indices in AAindex into 20 clusters using a fuzzy c-means algorithm, 2) utilizing an efficient genetic algorithm based optimization method IBCGA to select an informative feature set of size m to represent sequences, and 3) analyzing the selected features to identify related physicochemical properties which may affect the binding mechanism of DNA-binding domains/proteins. The proposed Auto-IDPCPs identified m=22 features of properties belonging to five clusters for predicting DNA-binding domains with a five-fold cross-validation accuracy of 87.12%, which is promising compared with the accuracy of 86.62% of the existing method PSSM-400. For predicting DNA-binding sequences, the accuracy of 75.50% was obtained using m=28 features, where PSSM-400 has an accuracy of 74.22%. Auto-IDPCPs and PSSM-400 have accuracies of 80.73% and 82.81%, respectively, applied to an independent test data set of DNA-binding domains. 
Some typical physicochemical properties discovered are hydrophobicity, secondary structure, charge, solvent accessibility, polarity, flexibility, normalized Van Der Waals volume, pK (pK-C, pK-N, pK-COOH and pK-a(RCOOH)), etc. Conclusions The proposed approach Auto-IDPCPs would help designers to investigate informative physicochemical and biochemical properties by considering both prediction accuracy and analysis of binding mechanism simultaneously. The Auto-IDPCPs approach can also be applied to predict and analyze other protein functions from sequences. PMID:21342579
PREDICTING ER BINDING AFFINITY FOR EDC RANKING AND PRIORITIZATION: A COMPARISON OF THREE MODELS
A comparative analysis of how three COREPA models for ER binding affinity performed when used to predict potential estrogen receptor (ER) ligands is presented. Models I and II were developed based on training sets of 232 and 279 rat ER binding affinity measurements, respectively....
Polymorphisms A387P in thrombospondin-4 and N700S in thrombospondin-1 perturb calcium binding sites.
Stenina, Olga I; Ustinov, Valentin; Krukovets, Irene; Marinic, Tina; Topol, Eric J; Plow, Edward F
2005-11-01
Recent genetic studies have associated members of the thrombospondin (TSP) gene family with premature cardiovascular disease. The disease-associated polymorphisms lead to single amino acid changes in TSP-4 (A387P) and TSP-1 (N700S). These substitutions reside in adjacent domains of these highly homologous proteins. Secondary structural predictive programs and the homology of the domains harboring these amino acid substitutions to those in other proteins pointed to potential alterations of putative Ca2+ binding sites that reside in close proximity to the polymorphic amino acids. Since Ca2+ binding is critical for the structure and function of TSP family members, direct evidence for differences in Ca2+ binding by the polymorphic forms was sought. Using synthetic peptides and purified recombinant variant fragments bearing the amino acid substitutions, we measured differences in Tb3+ luminescence as an index of Ca2+ binding. The Tb3+ binding constants placed the TSP-1 region affected by N700S polymorphism among other high-affinity Ca2+ binding sites. The affinity of Ca2+ binding was lower for peptides (3.5-fold) and recombinant fragments (10-fold) containing the S700 vs. the N700 form. In TSP-4, the P387 form acquired an additional Ca2+ binding site absent in the A387 form. The results of our study suggest that both substitutions (A387P in TSP-4 and N700S in TSP-1) alter Ca2+ binding properties. Since these substitutions exert the opposite effects on Ca2+ binding, a decrease in TSP-1 and an increase in TSP-4, the two TSP variants are likely to influence cardiovascular functions in distinct but yet pathogenic ways.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hsu, Hao-Chi; Tong, Simon; Zhou, Yuchen
Human FABP5 and FABP7 are intracellular endocannabinoid transporters. SBFI-26 is an α-truxillic acid 1-naphthyl monoester that competitively inhibits the activities of FABP5 and FABP7 and produces antinociceptive and anti-inflammatory effects in mice. The synthesis of SBFI-26 yields several stereoisomers, and it is not known how the inhibitor binds the transporters. Here we report co-crystal structures of SBFI-26 in complex with human FABP5 and FABP7 at 2.2 and 1.9 Å resolution, respectively. We found that only (S)-SBFI-26 was present in the crystal structures. The inhibitor largely mimics the fatty acid binding pattern, but it also has several unique interactions. Notably, the FABP7 complex corroborates key aspects of the ligand binding pose at the canonical site previously predicted by virtual screening. In FABP5, SBFI-26 was unexpectedly found to bind at the substrate entry portal region in addition to binding at the canonical ligand-binding pocket. Our structural and binding energy analyses indicate that both R and S forms appear to bind the transporter equally well. We suggest that the S enantiomer observed in the crystal structures may be a result of the crystallization process selectively incorporating the (S)-SBFI-26–FABP complexes into the growing lattice, or that the S enantiomer may bind to the portal site more rapidly than to the canonical site, leading to an increased local concentration of the S enantiomer for binding to the canonical site. Our work reveals two binding poses of SBFI-26 in its target transporters. This knowledge will guide the development of more potent FABP inhibitors based upon the SBFI-26 scaffold.
du Plessis, Mignon; Smith, Anthony M.; Klugman, Keith P.
1998-01-01
A seminested-PCR assay, based on the amplification of the pneumococcal penicillin-binding protein 2B gene (pbp2B), was developed for the detection of penicillin-resistant and -susceptible pneumococci in cerebrospinal fluid (CSF) specimens. Species-specific primers (P5 and P6) which amplified a 682-bp conserved region of the transpeptidase-encoding region of the pbp2B gene were used. Four “resistance” primers were designed to bind to altered areas of the pbp2B gene identified in penicillin-resistant South African wild-type strains. Together with the downstream primer P6, the upstream resistance primers amplified fragments which were used to detect the presence of penicillin resistance. This system identified all 35 of the S. pneumoniae isolates evaluated, including strains of 11 different serotypes and a range of penicillin-resistant and -susceptible strains. The specificity of the assay was demonstrated by its inability to amplify DNA from other bacterial species which commonly cause meningitis. It was possible to detect pneumococcal DNA from culture-negative CSF inoculated with 2.5 pg of purified DNA or 18 CFU. Analysis of 285 CSF specimens showed that PCR detected the pneumococcus in 18 samples positive by culture, including the identification of four penicillin-resistant isolates. The positive predictive value and the negative predictive value of the assay were each 100%. PMID:9466757
Tramontano, A; Bianchi, E; Venturini, S; Martin, F; Pessi, A; Sollazzo, M
1994-03-01
Conformationally constraining selectable peptides onto a suitable scaffold that enables their conformation to be predicted or readily determined by experimental techniques would considerably boost the drug discovery process by reducing the gap between the discovery of a peptide lead and the design of a peptidomimetic with a more desirable pharmacological profile. With this in mind, we designed the minibody, a 61-residue beta-protein aimed at retaining some desirable features of immunoglobulin variable domains, such as tolerance to sequence variability in selected regions of the protein and predictability of the main chain conformation of the same regions, based on the 'canonical structures' model. To test the ability of the minibody scaffold to support functional sites we also designed a metal binding version of the protein by suitably choosing the sequences of its loops. The minibody was produced both by chemical synthesis and expression in E. coli and characterized by size exclusion chromatography, UV CD (circular dichroism) spectroscopy and metal binding activity. All our data supported the model, but a more detailed structural characterization of the molecule was impaired by its low solubility. We were able to overcome this problem both by further mutagenesis of the framework and by addition of a solubilizing motif. The minibody is being used to select constrained human IL-6 peptidic ligands from a library displayed on the surface of the f1 bacteriophage.
McCullough, Christopher; Neumann, Terrence S.; Gone, Jayapal Reddy; He, Zhengjie; Herrild, Christian; Wondergem, Julie; Pandey, Rajesh K.; Donaldson, William A.; Sem, Daniel S.
2014-01-01
Various estrogen analogs were synthesized and tested for binding to human ERα using a fluorescence polarization displacement assay. Binding affinity and orientation were also predicted using docking calculations. Docking was able to accurately predict relative binding affinity and orientation for estradiol, but only if a tightly bound water molecule bridging Arg394/Glu353 is present. Di-hydroxyl compounds sometimes bind in two orientations, which are flipped in terms of relative positioning of their hydroxyl groups. Di-hydroxyl compounds were predicted to bind with their aliphatic hydroxyl group interacting with His524 in ERα. One nonsteroid-based di-hydroxyl compound was 1000-fold specific for ERβ over ERα, and was also 25-fold specific for agonist ERβ versus antagonist activity. Docking predictions suggest this specificity may be due to interaction of the aliphatic hydroxyl with His475 in the agonist form of ERβ, versus with Thr299 in the antagonist form. However, the presence of this aliphatic hydroxyl is not required in all compounds, since mono-hydroxyl (phenolic) compounds bind ERα with high affinity, via hydroxyl hydrogen bonding interactions with the ERα Arg394/Glu353/water triad, and van der Waals interactions with the rest of the molecule. PMID:24315190
Accuracy of binding mode prediction with a cascadic stochastic tunneling method.
Fischer, Bernhard; Basili, Serena; Merlitz, Holger; Wenzel, Wolfgang
2007-07-01
We investigate the accuracy of the binding modes predicted for 83 complexes of the high-resolution subset of the ASTEX/CCDC receptor-ligand database using the atomistic FlexScreen approach with a simple forcefield-based scoring function. The median RMS deviation between experimental and predicted binding mode was just 0.83 Å. Over 80% of the ligands dock within 2 Å of the experimental binding mode; for 60 complexes the docking protocol locates the correct binding mode in all of ten independent simulations. Most docking failures arise because (a) the experimental structure clashed in our forcefield and is thus unattainable in the docking process or (b) the ligand is stabilized by crystal water. © 2007 Wiley-Liss, Inc.
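The accuracy criterion used above (RMS deviation between predicted and experimental ligand poses, with 2 Å as the success threshold) can be sketched as follows. This is a minimal illustration, assuming both poses are given as equally ordered lists of heavy-atom coordinates in the same receptor frame, so no superposition is applied; the function names are hypothetical and not part of FlexScreen.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def is_docking_success(predicted, experimental, cutoff=2.0):
    """True if the predicted pose lies within `cutoff` (in the coordinate units) of the experimental pose."""
    return rmsd(predicted, experimental) <= cutoff


if __name__ == "__main__":
    # Hypothetical two-atom ligand, predicted pose shifted by 1 unit along z.
    experimental = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(predicted, experimental), is_docking_success(predicted, experimental))
```

Real evaluations usually also account for ligand symmetry when matching atoms, which this sketch does not attempt.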
A low-complexity region in the YTH domain protein Mmi1 enhances RNA binding.
Stowell, James A W; Wagstaff, Jane L; Hill, Chris H; Yu, Minmin; McLaughlin, Stephen H; Freund, Stefan M V; Passmore, Lori A
2018-06-15
Mmi1 is an essential RNA-binding protein in the fission yeast Schizosaccharomyces pombe that eliminates meiotic transcripts during normal vegetative growth. Mmi1 contains a YTH domain that binds specific RNA sequences, targeting mRNAs for degradation. The YTH domain of Mmi1 uses a noncanonical RNA-binding surface that includes contacts outside the conserved fold. Here, we report that an N-terminal extension that is proximal to the YTH domain enhances RNA binding. Using X-ray crystallography, NMR, and biophysical methods, we show that this low-complexity region becomes more ordered upon RNA binding. This enhances the affinity of the interaction of the Mmi1 YTH domain with specific RNAs by reducing the dissociation rate of the Mmi1-RNA complex. We propose that the low-complexity region influences RNA binding indirectly by reducing dynamic motions of the RNA-binding groove and stabilizing a conformation of the YTH domain that binds to RNA with high affinity. Taken together, our work reveals how a low-complexity region proximal to a conserved folded domain can adopt an ordered structure to aid nucleic acid binding. © 2018 Stowell et al.
Doppelt-Azeroual, Olivia; Delfaud, François; Moriaud, Fabrice; de Brevern, Alexandre G
2010-01-01
Ligand–protein interactions are essential for biological processes, and precise characterization of protein binding sites is crucial to understand protein functions. MED-SuMo is a powerful technology to localize similar local regions on protein surfaces. Its heuristic is based on a 3D representation of macromolecules using specific surface chemical features associating chemical characteristics with geometrical properties. MED-SMA is an automated and fast method to classify binding sites. It is based on MED-SuMo technology, which builds a similarity graph, and it uses the Markov Clustering algorithm. Purine binding sites are well studied as drug targets. Here, purine binding sites of the Protein DataBank (PDB) are classified. Proteins potentially inhibited or activated through the same mechanism are gathered. Results are analyzed according to PROSITE annotations and to carefully refined functional annotations extracted from the PDB. As expected, binding sites associated with related mechanisms are gathered, for example, the Small GTPases. Nevertheless, protein kinases from different Kinome families are also found together, for example, Aurora-A and CDK2 proteins which are inhibited by the same drugs. Representative examples of different clusters are presented. The effectiveness of the MED-SMA approach is demonstrated as it gathers binding sites of proteins with similar structure-activity relationships. Moreover, an efficient new protocol associates structures absent of cocrystallized ligands to the purine clusters enabling those structures to be associated with a specific binding mechanism. Applications of this classification by binding mode similarity include target-based drug design and prediction of cross-reactivity and therefore potential toxic side effects. PMID:20162627
Prediction of binding hot spot residues by using structural and evolutionary parameters
2009-01-01
In this work, we present a method for predicting hot spot residues by using a set of structural and evolutionary parameters. Unlike previous studies, we use a set of parameters which do not depend on the structure of the protein in complex, so that the predictor can also be used when the interface region is unknown. Despite the fact that no information concerning proteins in complex is used for prediction, the application of the method to a compiled dataset described in the literature achieved a performance of 60.4%, as measured by F-Measure, corresponding to a recall of 78.1% and a precision of 49.5%. This result is higher than those reported by previous studies using the same data set. PMID:21637529
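For readers unfamiliar with the F-Measure, the sketch below shows the standard harmonic-mean definition applied to the recall and precision quoted above. It is an illustration only: the function name is hypothetical, and the abstract does not say how the authors aggregated the metric. Plugging in the rounded figures gives roughly 0.606, slightly above the reported 60.4%; the gap may reflect rounding of the published values or a different averaging scheme, which cannot be determined from the abstract.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    beta_sq = beta * beta
    return (1.0 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    # Precision and recall as quoted in the abstract (rounded percentages).
    print(f"F1 = {f_measure(0.495, 0.781):.3f}")
```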
Bhhatarai, Barun; Wilson, Daniel M.; Price, Paul S.; Marty, Sue; Parks, Amanda K.; Carney, Edward
2016-01-01
Background: Integrative testing strategies (ITSs) for potential endocrine activity can use tiered in silico and in vitro models. Each component of an ITS should be thoroughly assessed. Objectives: We used the data from three in vitro ToxCast™ binding assays to assess OASIS, a quantitative structure-activity relationship (QSAR) platform covering both estrogen receptor (ER) and androgen receptor (AR) binding. For stronger binders (described here as AC50 < 1 μM), we also examined the relationship of QSAR predictions of ER or AR binding to the results from 18 ER and 10 AR transactivation assays, 72 ER-binding reference compounds, and the in vivo uterotrophic assay. Methods: NovaScreen binding assay data for ER (human, bovine, and mouse) and AR (human, chimpanzee, and rat) were used to assess the sensitivity, specificity, concordance, and applicability domain of two OASIS QSAR models. The binding strength relative to the QSAR-predicted binding strength was examined for the ER data. The relationship of QSAR predictions of binding to transactivation- and pathway-based assays, as well as to in vivo uterotrophic responses, was examined. Results: The QSAR models had both high sensitivity (> 75%) and specificity (> 86%) for ER as well as both high sensitivity (92–100%) and specificity (70–81%) for AR. For compounds within the domains of the ER and AR QSAR models that bound with AC50 < 1 μM, the QSAR models accurately predicted the binding for the parent compounds. The parent compounds were active in all transactivation assays where metabolism was incorporated and, except for those compounds known to require metabolism to manifest activity, all assay platforms where metabolism was not incorporated. Compounds in-domain and predicted to bind by the ER QSAR model that were positive in ToxCast™ ER binding at AC50 < 1 μM were active in the uterotrophic assay. 
Conclusions: We used the extensive ToxCast™ HTS binding data set to show that OASIS ER and AR QSAR models had high sensitivity and specificity when compounds were in-domain of the models. Based on this research, we recommend a tiered screening approach wherein a) QSAR is used to identify compounds in-domain of the ER or AR binding models and predicted to bind; b) those compounds are screened in vitro to assess binding potency; and c) the stronger binders (AC50 < 1 μM) are screened in vivo. This scheme prioritizes compounds for integrative testing and risk assessment. Importantly, compounds that are not in-domain, that are predicted either not to bind or to bind weakly, that are not active in in vitro, that require metabolism to manifest activity, or for which in vivo AR testing is in order, need to be assessed differently. Citation: Bhhatarai B, Wilson DM, Price PS, Marty S, Parks AK, Carney E. 2016. Evaluation of OASIS QSAR models using ToxCast™ in vitro estrogen and androgen receptor binding data and application in an integrated endocrine screening approach. Environ Health Perspect 124:1453–1461; http://dx.doi.org/10.1289/EHP184 PMID:27152837
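The sensitivity, specificity, and concordance figures used to assess the QSAR models can be computed from a confusion table of predicted versus measured binders. The sketch below is a minimal illustration, assuming concordance means overall agreement (accuracy); the counts are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and concordance from confusion-table counts.

    tp/fn: measured binders predicted to bind / not to bind.
    tn/fp: measured non-binders predicted not to bind / to bind.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one measured binder and one measured non-binder")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "concordance": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts for in-domain compounds only.
    for name, value in screening_metrics(tp=46, fp=19, tn=81, fn=4).items():
        print(f"{name}: {value:.1%}")
```

Restricting the counts to in-domain compounds mirrors the abstract's point that the reported performance applies only within the models' applicability domain.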
Predicting Nonspecific Ion Binding Using DelPhi
Petukh, Marharyta; Zhenirovskyy, Maxim; Li, Chuan; Li, Lin; Wang, Lin; Alexov, Emil
2012-01-01
Ions are an important component of the cell and affect the corresponding biological macromolecules either via direct binding or as a screening ion cloud. Although some ion binding is highly specific and frequently associated with the function of the macromolecule, other ions bind to the protein surface nonspecifically, presumably because the electrostatic attraction is strong enough to immobilize them. Here, we test such a scenario and demonstrate that experimentally identified surface-bound ions are located at a potential that facilitates binding, which indicates that the major driving force is the electrostatics. Without taking into consideration geometrical factors and structural fluctuations, we show that ions tend to be bound onto the protein surface at positions with strong potential but with polarity opposite to that of the ion. This observation is used to develop a method that uses a DelPhi-calculated potential map in conjunction with an in-house-developed clustering algorithm to predict nonspecific ion-binding sites. Although this approach distinguishes only the polarity of the ions, and not their chemical nature, it can predict nonspecific binding of positively or negatively charged ions with acceptable accuracy. One can use the predictions in the Poisson-Boltzmann approach by placing explicit ions in the predicted positions, which in turn will reduce the magnitude of the local potential and extend the limits of the Poisson-Boltzmann equation. In addition, one can use this approach to place the desired number of ions before conducting molecular-dynamics simulations to neutralize the net charge of the protein, because it was shown to perform better than standard screened Coulomb canned routines, or to predict ion-binding sites in proteins. This latter is especially true for proteins that are involved in ion transport, because such ions are loosely bound and very difficult to detect experimentally. PMID:22735539
Hou, Tingjun; Wang, Junmei; Li, Youyong; Wang, Wei
2011-01-24
The Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) and the Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) methods calculate binding free energies for macromolecules by combining molecular mechanics calculations and continuum solvation models. To systematically evaluate the performance of these methods, we report here an extensive study of 59 ligands interacting with six different proteins. First, we explored the effects of the length of the molecular dynamics (MD) simulation, ranging from 400 to 4800 ps, and the solute dielectric constant (1, 2, or 4) on the binding free energies predicted by MM/PBSA. The following three important conclusions could be drawn: (1) MD simulation length has an obvious impact on the predictions, and longer MD simulation is not always necessary to achieve better predictions. (2) The predictions are quite sensitive to the solute dielectric constant, and this parameter should be carefully determined according to the characteristics of the protein/ligand binding interface. (3) Conformational entropy often shows large fluctuations in MD trajectories, and a large number of snapshots are necessary to achieve stable predictions. Next, we evaluated the accuracy of the binding free energies calculated by three Generalized Born (GB) models. We found that the GB model developed by Onufriev and Case was the most successful model in ranking the binding affinities of the studied inhibitors. Finally, we evaluated the performance of MM/GBSA and MM/PBSA in predicting binding free energies. Our results showed that MM/PBSA performed better in calculating absolute, but not necessarily relative, binding free energies than MM/GBSA. Considering its computational efficiency, MM/GBSA can serve as a powerful tool in drug design, where correct ranking of inhibitors is often emphasized.
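The abstract describes MM/PBSA and MM/GBSA as combining molecular mechanics energies with continuum solvation terms. A minimal sketch of that bookkeeping is given below, assuming the commonly used decomposition in which the binding free energy is the snapshot average of (van der Waals + electrostatic + polar solvation + nonpolar solvation) differences minus the entropy term TΔS. The snapshot values, dictionary keys and function name are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def binding_free_energy(snapshots, t_delta_s=0.0):
    """Average per-snapshot energy differences (complex - receptor - ligand), in kcal/mol.

    Each snapshot is a dict with keys 'vdw', 'ele', 'polar_solv', 'nonpolar_solv'.
    t_delta_s is the entropy contribution T*dS (typically negative on binding).
    Returns (estimated dG_bind, standard deviation of the enthalpic part over snapshots).
    """
    per_snapshot = [
        s["vdw"] + s["ele"] + s["polar_solv"] + s["nonpolar_solv"] for s in snapshots
    ]
    spread = stdev(per_snapshot) if len(per_snapshot) > 1 else 0.0
    return mean(per_snapshot) - t_delta_s, spread


if __name__ == "__main__":
    # Hypothetical energy differences for two MD snapshots.
    frames = [
        {"vdw": -40.0, "ele": -20.0, "polar_solv": 35.0, "nonpolar_solv": -5.0},
        {"vdw": -42.0, "ele": -18.0, "polar_solv": 33.0, "nonpolar_solv": -5.0},
    ]
    dg, spread = binding_free_energy(frames, t_delta_s=-12.0)
    print(f"dG_bind = {dg:.1f} kcal/mol (snapshot s.d. {spread:.2f})")
```

Reporting the spread over snapshots alongside the mean reflects the abstract's observation that some terms fluctuate strongly and need many snapshots to converge.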
Tau PET in Alzheimer disease and mild cognitive impairment.
Cho, Hanna; Choi, Jae Yong; Hwang, Mi Song; Lee, Jae Hoon; Kim, You Jin; Lee, Hye Mi; Lyoo, Chul Hyoung; Ryu, Young Hoon; Lee, Myung Sik
2016-07-26
To investigate the topographical distribution of tau pathology and its effect on functional and structural changes in patients with Alzheimer disease (AD) and mild cognitive impairment (MCI) by using (18)F-AV-1451 PET. We included 20 patients with AD, 15 patients with MCI, and 20 healthy controls, and performed neuropsychological function tests, MRI, as well as (18)F-florbetaben (for amyloid) and (18)F-AV-1451 (for tau) PET scans. By using the regional volume-of-interest masks extracted from MRIs, regional binding values of standardized uptake value ratios and volumes were measured. We compared regional binding values among 3 diagnostic groups and identified correlations among the regional binding values, performance in each cognitive function test, and regional atrophy. (18)F-AV-1451 binding was increased only in the entorhinal cortex in patients with MCI, while patients with AD exhibited greater binding in most cortical regions. In the 35 patients with MCI and AD, (18)F-AV-1451 binding in most of the neocortex increased with a worsening of global cognitive function. The visual and verbal memory functions were associated with the extent of (18)F-AV-1451 binding, especially in the medial temporal regions. The (18)F-AV-1451 binding also correlated with the severity of regional atrophy of the cerebral cortex. Tau PET imaging with (18)F-AV-1451 could serve as an in vivo biomarker for the evaluation of AD-related tau pathology and monitoring disease progression. The accumulation of pathologic tau is more closely related to functional and structural deterioration in the AD spectrum than β-amyloid. © 2016 American Academy of Neurology.
Villoutreix, B O; Härdig, Y; Wallqvist, A; Covell, D G; García de Frutos, P; Dahlbäck, B
1998-06-01
C4b-binding protein (C4BP) contributes to the regulation of the classical pathway of the complement system and plays an important role in blood coagulation. The main human C4BP isoform is composed of one beta-chain and seven alpha-chains essentially built from three and eight complement control protein (CCP) modules, respectively, followed by a nonrepeat carboxy-terminal region involved in polymerization of the chains. C4BP is known to interact with heparin, C4b, complement factor I, serum amyloid P component, streptococcal Arp and Sir proteins, and factor VIII/VIIIa via its alpha-chains and with protein S through its beta-chain. The principal aim of the present study was to localize regions of C4BP involved in the interaction with C4b, Arp, and heparin. For this purpose, a computer model of the 8 CCP modules of C4BP alpha-chain was constructed, taking into account data from previous electron microscopy (EM) studies. This structure was investigated in the context of known and/or new experimental data. Analysis of the alpha-chain model, together with monoclonal antibody studies and heparin binding experiments, suggests that a patch of positively charged residues, at the interface between the first and second CCP modules, plays an important role in the interaction between C4BP and C4b/Arp/Sir/heparin. Putative binding sites, secondary-structure prediction for the central core, and an overall reevaluation of the size of the C4BP molecule are also presented. An understanding of these intermolecular interactions should contribute to the rational design of potential therapeutic agents aimed at interfering specifically with some of these protein-protein interactions.
Engineering Encodable Lanthanide-Binding Tags (LBTs) into Loop Regions of Proteins
Barthelmes, Katja; Reynolds, Anne M.; Peisach, Ezra; Jonker, Hendrik R. A.; DeNunzio, Nicholas J.; Allen, Karen N.; Imperiali, Barbara; Schwalbe, Harald
2011-01-01
Lanthanide-binding-tags (LBTs) are valuable tools for investigation of protein structure, function, and dynamics by NMR spectroscopy, X-ray crystallography and luminescence studies. We have inserted LBTs into three different loop positions (denoted L, R, and S) of the model protein interleukin-1β and varied the length of the spacer between the LBT and the protein (denoted 1-3). Luminescence studies demonstrate that all nine constructs bind Tb3+ tightly in the low nanomolar range. No significant change in the fusion protein occurs from insertion of the LBT, as shown by two X-ray crystallographic structures of the IL1β-S1 and IL1β-L3 constructs and for the remaining constructs by comparing 1H-15N-HSQC NMR spectra with wild-type IL1β. Additionally, binding of LBT-loop IL1β proteins to their native binding partner in vitro remains unaltered. X-ray crystallographic phasing was successful using only the signal from the bound lanthanide. Large residual dipolar couplings (RDCs) could be determined by NMR spectroscopy for all LBT-loop-constructs and revealed that the LBT-2 series were rigidly incorporated into the interleukin-1β structure. The paramagnetic NMR spectra of loop-LBT mutant IL1β-R2 were assigned and the Δχ tensor components were calculated based on RDCs and pseudocontact shifts (PCSs). A structural model of the IL1β-R2 construct was calculated using the paramagnetic restraints. The current data provide support that encodable LBTs serve as versatile biophysical tags when inserted into loop regions of proteins of known structure or predicted via homology modelling. PMID:21182275
Brzeska, Hanna; Pridham, Kevin; Chery, Godefroy; Titus, Margaret A.; Korn, Edward D.
2014-01-01
F-actin structures and their distribution are important determinants of the dynamic shapes and functions of eukaryotic cells. Actin waves are F-actin formations that move along the ventral cell membrane driven by actin polymerization. Dictyostelium myosin IB is associated with actin waves but its role in the wave is unknown. Myosin IB is a monomeric, non-filamentous myosin with a globular head that binds to F-actin and has motor activity, and a non-helical tail comprising a basic region, a glycine-proline-glutamine-rich region and an SH3-domain. The basic region binds to acidic phospholipids in the plasma membrane through a short basic-hydrophobic site and the Gly-Pro-Gln region binds F-actin. In the current work we found that both the basic-hydrophobic site in the basic region and the Gly-Pro-Gln region of the tail are required for the association of myosin IB with actin waves. This is the first evidence that the Gly-Pro-Gln region is required for localization of myosin IB to a specific actin structure in situ. The head is not required for myosin IB association with actin waves but binding of the head to F-actin strengthens the association of myosin IB with waves and stabilizes waves. Neither the SH3-domain nor motor activity is required for association of myosin IB with actin waves. We conclude that myosin IB contributes to anchoring actin waves to the plasma membranes by binding of the basic-hydrophobic site to acidic phospholipids in the plasma membrane and binding of the Gly-Pro-Gln region to F-actin in the wave. PMID:24747353
Tang, Yat T; Marshall, Garland R
2011-02-28
Binding affinity prediction is one of the most critical components of computer-aided structure-based drug design. Despite advances in first-principle methods for predicting binding affinity, empirical scoring functions that are fast and only relatively accurate are still widely used in structure-based drug design. With the increasing availability of X-ray crystallographic structures in the Protein Data Bank and continuing application of biophysical methods such as isothermal titration calorimetry to measure thermodynamic parameters contributing to binding free energy, sufficient experimental data exist that scoring functions can now be derived by separating enthalpic (ΔH) and entropic (TΔS) contributions to binding free energy (ΔG). PHOENIX, a scoring function to predict binding affinities of protein-ligand complexes, utilizes the increasing availability of experimental data to improve binding affinity predictions by the following: model training and testing using high-resolution crystallographic data to minimize structural noise, independent models of enthalpic and entropic contributions fitted to thermodynamic parameters assumed to be thermodynamically biased to calculate binding free energy, and use of shape and volume descriptors to better capture entropic contributions. A set of 42 descriptors and 112 protein-ligand complexes were used to derive functions using partial least-squares for change of enthalpy (ΔH) and change of entropy (TΔS) to calculate change of binding free energy (ΔG), resulting in a predictive r² (r²_pred) of 0.55 and a standard error (SE) of 1.34 kcal/mol. External validation using the 2009 version of the PDBbind "refined set" (n = 1612) resulted in a Pearson correlation coefficient (R_p) of 0.575 and a mean error (ME) of 1.41 pK_d units. Enthalpy and entropy predictions were of limited accuracy individually. However, their difference resulted in a relatively accurate binding free energy.
While the development of an accurate and applicable scoring function was an objective of this study, the main focus was evaluation of the use of high-resolution X-ray crystal structures with high-quality thermodynamic parameters from isothermal titration calorimetry for scoring function development. With the increasing application of structure-based methods in molecular design, this study suggests that using high-resolution crystal structures, separating enthalpy and entropy contributions to binding free energy, and including descriptors to better capture entropic contributions may prove to be effective strategies toward rapid and accurate calculation of binding affinity.
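The way the separately modeled enthalpic and entropic terms combine into a binding free energy, and how that relates to the pK_d units used in the external validation, can be illustrated with a small sketch. It assumes the standard relations ΔG = ΔH − TΔS and pK_d = −ΔG / (RT ln 10) at 298.15 K. The function names and input values are hypothetical and are not taken from PHOENIX.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(delta_h: float, t_delta_s: float) -> float:
    """Return dG = dH - T*dS, with all terms in kcal/mol."""
    return delta_h - t_delta_s


def pkd_from_delta_g(delta_g: float, temperature: float = 298.15) -> float:
    """Convert a binding free energy (kcal/mol) to pKd, assuming dG = RT ln(Kd)."""
    return -delta_g / (R_KCAL * temperature * math.log(10))


# Made-up example values, for illustration only (not PHOENIX predictions).
dg = binding_free_energy(delta_h=-12.0, t_delta_s=-4.0)
pkd = pkd_from_delta_g(dg)
print(f"dG = {dg:.2f} kcal/mol, pKd = {pkd:.2f}")
```

Under these assumptions one pK_d unit corresponds to roughly 1.36 kcal/mol at 298.15 K, which is useful when comparing errors reported in kcal/mol with errors reported in pK_d units.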
Identification of B cell epitopes of alcohol dehydrogenase allergen of Curvularia lunata.
Nair, Smitha; Kukreja, Neetu; Singh, Bhanu Pratap; Arora, Naveen
2011-01-01
Epitope identification assists in developing molecules for clinical applications and is useful in defining molecular features of allergens for understanding structure/function relationships. The present study aimed to identify the B cell epitopes of alcohol dehydrogenase (ADH) allergen from Curvularia lunata using in-silico methods and immunoassay. B cell epitopes of ADH were predicted by sequence and structure based methods and protein-protein interaction tools while T cell epitopes by inhibitory concentration and binding score methods. The epitopes were superimposed on a three dimensional model of ADH generated by homology modeling and analyzed for antigenic characteristics. Peptides corresponding to predicted epitopes were synthesized and immunoreactivity assessed by ELISA using individual and pooled patients' sera. The homology model showed a GroES-like catalytic domain joined to a Rossmann superfamily domain by an alpha helix. Stereochemical quality was confirmed by Procheck which showed 90% residues in the most favorable region of the Ramachandran plot while Errat gave a quality score of 92.733%. Six B cell (P1-P6) and four T cell (P7-P10) epitopes were predicted by a combination of methods. Peptide P2 (epitope P2) showed E(X)(2)GGP(X)(3)KKI conserved pattern among allergens of pathogenesis related family. It was predicted as a high-affinity binder based on electronegativity and low hydrophobicity. The computational methods employed were validated using Bet v 1 and Der p 2 allergens where 67% and 60% of the epitope residues were predicted correctly. Among B cell epitopes, Peptide P2 showed maximum IgE binding with individual and pooled patients' sera (mean OD 0.604±0.059 and 0.506±0.0035, respectively) followed by P1, P4 and P3 epitopes. All T cell epitopes showed lower IgE binding. Four B cell epitopes of C. lunata ADH were identified. Peptide P2 can serve as a potential candidate for diagnosis of allergic diseases.
Predicting nucleic acid binding interfaces from structural models of proteins
Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael
2011-01-01
The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared to patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. PMID:22086767
Predicting changes in cardiac myocyte contractility during early drug discovery with in vitro assays
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morton, M.J., E-mail: michael.morton@astrazeneca.com; Armstrong, D.; Abi Gerges, N.
2014-09-01
Cardiovascular-related adverse drug effects are a major concern for the pharmaceutical industry. Activity of an investigational drug at the L-type calcium channel could manifest in a number of ways, including changes in cardiac contractility. The aim of this study was to define which of the two assay technologies – radioligand-binding or automated electrophysiology – was most predictive of contractility effects in an in vitro myocyte contractility assay. The activity of reference and proprietary compounds at the L-type calcium channel was measured by radioligand-binding assays, conventional patch-clamp, automated electrophysiology, and by measurement of contractility in canine isolated cardiac myocytes. Activity in the radioligand-binding assay at the L-type Ca channel phenylalkylamine binding site was most predictive of an inotropic effect in the canine cardiac myocyte assay. The sensitivity was 73%, specificity 83% and predictivity 78%. The radioligand-binding assay may be run at a single test concentration and potency estimated. The least predictive assay was automated electrophysiology which showed a significant bias when compared with other assay formats. Given the importance of the L-type calcium channel, not just in cardiac function, but also in other organ systems, a screening strategy emerges whereby single concentration ligand-binding can be performed early in the discovery process with sufficient predictivity, throughput and turnaround time to influence chemical design and address a significant safety-related liability, at relatively low cost. - Highlights: • The L-type calcium channel is a significant safety liability during drug discovery. • Radioligand-binding to the L-type calcium channel can be measured in vitro. • The assay can be run at a single test concentration as part of a screening cascade. • This measurement is highly predictive of changes in cardiac myocyte contractility.
Accurate prediction of RNA-binding protein residues with two discriminative structural descriptors.
Sun, Meijian; Wang, Xia; Zou, Chuanxin; He, Zenghui; Liu, Wei; Li, Honglin
2016-06-07
RNA-binding proteins participate in many important biological processes concerning RNA-mediated gene regulation, and several computational methods have been recently developed to predict the protein-RNA interactions of RNA-binding proteins. Newly developed discriminative descriptors will help to improve the prediction accuracy of these prediction methods and provide further meaningful information for researchers. In this work, we designed two structural features (residue electrostatic surface potential and triplet interface propensity) and according to the statistical and structural analysis of protein-RNA complexes, the two features were powerful for identifying RNA-binding protein residues. Using these two features and other excellent structure- and sequence-based features, a random forest classifier was constructed to predict RNA-binding residues. The area under the receiver operating characteristic curve (AUC) of five-fold cross-validation for our method on training set RBP195 was 0.900, and when applied to the test set RBP68, the prediction accuracy (ACC) was 0.868, and the F-score was 0.631. The good prediction performance of our method revealed that the two newly designed descriptors could be discriminative for inferring protein residues interacting with RNAs. To facilitate the use of our method, a web-server called RNAProSite, which implements the proposed method, was constructed and is freely available at http://lilab.ecust.edu.cn/NABind .
Zheng, Wenjun
2017-01-10
Dynactin, a large multiprotein complex, binds with the cytoplasmic dynein-1 motor and various adaptor proteins to allow recruitment and transportation of cellular cargoes toward the minus end of microtubules. The structure of the dynactin complex is built around an actin-like minifilament with a defined length, which has been visualized in a high-resolution structure of the dynactin filament determined by cryo-electron microscopy (cryo-EM). To understand the energetic basis of dynactin filament assembly, we used molecular dynamics simulation to probe the intersubunit interactions among the actin-like proteins, various capping proteins, and four extended regions of the dynactin shoulder. Our simulations revealed stronger intersubunit interactions at the barbed and pointed ends of the filament and involving the extended regions (compared with the interactions within the filament), which may energetically drive filament termination by the capping proteins and recruitment of the actin-like proteins by the extended regions, two key features of the dynactin filament assembly process. Next, we modeled the unknown binding configuration among dynactin, dynein tails, and a number of coiled-coil adaptor proteins (including several Bicaudal-D and related proteins and three HOOK proteins), and predicted a key set of charged residues involved in their electrostatic interactions. Our modeling is consistent with previous findings of conserved regions, functional sites, and disease mutations in the adaptor proteins and will provide a structural framework for future functional and mutational studies of these adaptor proteins. In sum, this study yielded rich structural and energetic information about dynactin and associated adaptor proteins that cannot be directly obtained from the cryo-EM structures with limited resolutions.
SAAMBE: Webserver to Predict the Charge of Binding Free Energy Caused by Amino Acids Mutations.
Petukh, Marharyta; Dai, Luogeng; Alexov, Emil
2016-04-12
Predicting the effect of amino acid substitutions on protein-protein affinity (typically evaluated via the change of protein binding free energy) is important for both understanding the disease-causing mechanism of missense mutations and guiding protein engineering. In addition, researchers are also interested in understanding which energy components are mostly affected by the mutation and how the mutation affects the overall structure of the corresponding protein. Here we report a webserver, the Single Amino Acid Mutation based change in Binding free Energy (SAAMBE) webserver, which addresses the demand for tools for predicting the change of protein binding free energy. SAAMBE is an easy to use webserver, which only requires that a coordinate file be inputted and the user is provided with various, but easy to navigate, options. The user specifies the mutation position, wild type residue and type of mutation to be made. The server predicts the binding free energy change, the changes of the corresponding energy components and provides the energy minimized 3D structure of the wild type and mutant proteins for download. The SAAMBE protocol performance was tested by benchmarking the predictions against over 1300 experimentally determined changes of binding free energy and a Pearson correlation coefficient of 0.62 was obtained. How the predictions can be used for discriminating disease-causing from harmless mutations is discussed. The webserver can be accessed via http://compbio.clemson.edu/saambe_webserver/.
Prediction of 3- to 5-Month Outcomes from Signs of Acute Bilirubin Toxicity in Newborn Infants.
El Houchi, Salma Z; Iskander, Iman; Gamaleldin, Rasha; El Shenawy, Amira; Seoud, Iman; Abou-Youssef, Hazem; Wennberg, Richard P
2017-04-01
To evaluate the ability of the bilirubin-induced neurologic dysfunction (BIND) score to predict residual neurologic and auditory disability and to document the relationship of BIND score to total serum bilirubin (TSB) concentration. The BIND score (assessing mental status, muscle tone, and cry patterns) was obtained serially at 6- to 8-hour intervals in 220 near-term and full-term infants with severe hyperbilirubinemia. Neurologic and/or auditory outcomes at 3-5 months of age were correlated with the highest calculated BIND score. The BIND score was also correlated with TSB. Follow-up neurologic and auditory examinations were performed for 145/202 (72%) surviving infants. All infants with severe acute bilirubin encephalopathy (BIND scores 7-9) either died or suffered residual neurologic and auditory impairment. Of 24 cases with moderate encephalopathy (BIND 4-6), 15 (62.5%) resolved following aggressive intervention and were normal at follow-up. Three of 73 infants with mild encephalopathy (BIND scores 1-3) but severe jaundice (TSB ranging 33.5-38 mg/dL; 573-650 µmol/L) had residual neurologic and/or auditory impairment. A BIND score ≥4 had a specificity of 87.3% and a sensitivity of 97.4% for predicting poor neurologic outcomes (receiver operating characteristic analysis). BIND scores trended higher with severe hyperbilirubinemia (r² = 0.54, P < .005), but 5/39 (13%) infants with TSB ≥36.5 mg/dL (624 µmol/L) had BIND scores ≤3, and normal outcomes at 3-5 months. The BIND score can be used to evaluate the severity of acute bilirubin encephalopathy and predict residual neurologic and hearing dysfunction. Copyright © 2017 Elsevier Inc. All rights reserved.
Computational methods for prediction of RNA interactions with metal ions and small organic ligands.
Philips, Anna; Łach, Grzegorz; Bujnicki, Janusz M
2015-01-01
In recent years, it has become clear that a wide range of regulatory functions in bacteria are performed by riboswitches--regions of mRNA that change their structure upon external stimuli. Riboswitches are therefore attractive targets for drug design, molecular engineering, and fundamental research on regulatory circuitry of living cells. Several mechanisms are known for riboswitches controlling gene expression, but most of them perform their roles by ligand binding. As with other macromolecules, knowledge of the 3D structure of riboswitches is crucial for the understanding of their function. The development of experimental methods allowed for investigation of RNA structure and its complexes with ligands (which are either riboswitches' substrates or inhibitors) and metal cations (which stabilize the structure and are also known to be riboswitches' inhibitors). The experimental probing of different states of riboswitches is however time consuming, costly, and difficult to resolve without theoretical support. The natural consequence is the use of computational methods at least for initial research, such as the prediction of putative binding sites of ligands or metal ions. Here, we present a review on such methods, with a special focus on knowledge-based methods developed in our laboratory: LigandRNA--a scoring function for the prediction of RNA-small molecule interactions and MetalionRNA--a predictor of metal ion-binding sites in RNA structures. Both programs are available free of charge as Web servers, LigandRNA at http://ligandrna.genesilico.pl and MetalionRNA at http://metalionrna.genesilico.pl/. © 2015 Elsevier Inc. All rights reserved.
Kratochwil, Nicole A; Gatti-McArthur, Silvia; Hoener, Marius C; Lindemann, Lothar; Christ, Andreas D; Green, Luke G; Guba, Wolfgang; Martin, Rainer E; Malherbe, Pari; Porter, Richard H P; Slack, Jay P; Winnig, Marcel; Dehmlow, Henrietta; Grether, Uwe; Hertel, Cornelia; Narquizian, Robert; Panousis, Constantinos G; Kolczewski, Sabine; Steward, Lucinda
2011-01-01
G protein-coupled receptors (GPCRs) share a common architecture consisting of seven transmembrane (TM) domains. Various lines of evidence suggest that this fold provides a generic binding pocket within the TM region for hosting agonists, antagonists, and allosteric modulators. Hence, an automated method was developed that allows a fast analysis and comparison of these generic ligand binding pockets across the entire GPCR family by providing the relevant information for all GPCRs in the same format. This methodology compiles amino acids lining the TM binding pocket including parts of the ECL2 loop in a so-called 1D ligand binding pocket vector and translates these 1D vectors in a second step into 3D receptor pharmacophore models. It aims to support various aspects of GPCR drug discovery in the pharmaceutical industry. Applications of pharmacophore similarity analysis of these 1D LPVs include definition of receptor subfamilies, prediction of species differences within subfamilies in regard to in vitro pharmacology and identification of nearest neighbors for GPCRs of interest to generate starting points for GPCR lead identification programs. These aspects of GPCR research are exemplified in the field of melanopsins, trace amine-associated receptors and somatostatin receptor subtype 5. In addition, it is demonstrated how 3D pharmacophore models of the LPVs can support the prediction of amino acids involved in ligand recognition, the understanding of mutational data in a 3D context and the elucidation of binding modes for GPCR ligands and their evaluation. Furthermore, guidance through 3D receptor pharmacophore modeling for the synthesis of subtype-specific GPCR ligands will be reported. Illustrative examples are taken from the GPCR family class C, metabotropic glutamate receptors 1 and 5 and sweet taste receptors, and from the GPCR class A, e.g. nicotinic acid and 5-hydroxytryptamine 5A receptor. © 2011 Bentham Science Publishers
Nuclear factors that bind to the enhancer region of nondefective Friend murine leukemia virus.
Manley, N R; O'Connell, M A; Sharp, P A; Hopkins, N
1989-01-01
Nondefective Friend murine leukemia virus (MuLV) causes erythroleukemia when injected into newborn NFS mice, while Moloney MuLV causes T-cell lymphoma. Exchange of the Friend virus enhancer region, a sequence of about 180 nucleotides including the direct repeat and a short 3'-adjacent segment, for the corresponding region in Moloney MuLV confers the ability to cause erythroid disease on Moloney MuLV. We have used the electrophoretic mobility shift assay and methylation interference analysis to identify cellular factors which bind to the Friend virus enhancer region and compared these with factors, previously identified, that bind to the Moloney virus direct repeat (N. A. Speck and D. Baltimore, Mol. Cell. Biol. 7:1101-1110, 1987). We identified five binding sites for sequence-specific DNA-binding proteins in the Friend virus enhancer region. While some binding sites are present in both the Moloney and Friend virus enhancers, both viruses contain unique sites not present in the other. Although none of the factors identified in this report which bind to these unique sites are present exclusively in T cells or erythroid cells, they bind to three regions of the enhancer shown by genetic analysis to encode disease specificity and thus are candidates to mediate the tissue-specific expression and distinct disease specificities encoded by these virus enhancer elements. PMID:2778872
Pereira, L A; van der Knaap, J A; van den Boom, V; van den Heuvel, F A; Timmers, H T
2001-11-01
The human RNA polymerase II transcription factor B-TFIID consists of TATA-binding protein (TBP) and the TBP-associated factor (TAF) TAF(II)170 and can rapidly redistribute over promoter DNA. Here we report the identification of human TBP-binding regions in human TAF(II)170. We have defined the TBP interaction domain of TAF(II)170 within three amino-terminal regions: residues 2 to 137, 290 to 381, and 380 to 460. Each region contains a pair of Huntington-elongation-A subunit-Tor repeats and exhibits species-specific interactions with TBP family members. Remarkably, the altered-specificity TBP mutant (TBP(AS)) containing a triple mutation in the concave surface is defective for binding the TAF(II)170 amino-terminal region of residues 1 to 504. Furthermore, within this region the TAF(II)170 residues 290 to 381 can inhibit the interaction between Drosophila TAF(II)230 (residues 2 to 81) and TBP through competition for the concave surface of TBP. Biochemical analyses of TBP binding to the TATA box indicated that TAF(II)170 region 290-381 inhibits TBP-DNA complex formation. Importantly, the TBP(AS) mutant is less sensitive to TAF(II)170 inhibition. Collectively, our results support a mechanism in which TAF(II)170 induces high-mobility DNA binding by TBP through reversible interactions with its concave DNA binding surface.
Prediction of Ras-effector interactions using position energy matrices.
Kiel, Christina; Serrano, Luis
2007-09-01
One of the more challenging problems in biology is to determine the cellular protein interaction network. Progress has been made to predict protein-protein interactions based on structural information, assuming that structurally similar proteins interact in a similar way. In a previous publication, we have determined a genome-wide Ras-effector interaction network based on homology models, with a high accuracy of predicting binding and non-binding domains. However, for a prediction on a genome-wide scale, homology modelling is a time-consuming process. We therefore developed a faster method using position energy matrices, where based on different Ras-effector X-ray template structures, all amino acids in the effector binding domain are sequentially mutated to all other amino acid residues and the effect on binding energy is calculated. Those pre-calculated matrices can then be used to score any Ras or effector sequence for binding. Based on position energy matrices, the sequences of putative Ras-binding domains can be scanned quickly to calculate an energy sum value. By calibrating energy sum values using quantitative experimental binding data, thresholds can be defined and thus non-binding domains can be excluded quickly. Sequences which have energy sum values above this threshold are considered to be potential binding domains, and could be further analysed using homology modelling. This prediction method could be applied to other protein families sharing conserved interaction types, in order to rapidly determine large-scale cellular protein interaction networks. Thus, it could have an important impact on future in silico structural genomics approaches, in particular with regard to increasing structural proteomics efforts, aiming to determine all possible domain folds and interaction types. All matrices are deposited in the ADAN database (http://adan-embl.ibmc.umh.es/). Supplementary data are available at Bioinformatics online.
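The scoring step described in this abstract (look up a pre-calculated value for each residue at each position, sum the values, and compare the sum with a calibrated threshold) can be sketched as follows. This is a minimal illustration only: the four-position matrix, its values, the threshold, and the function names are all hypothetical and are not taken from the ADAN database. The sign convention simply follows the abstract's wording that sequences scoring above the threshold are kept as candidates.

```python
from typing import Dict, List

# One dict per sequence position, mapping residue -> pre-calculated value.
PositionEnergyMatrix = List[Dict[str, float]]

# Hypothetical toy matrix for a 4-residue window (illustrative values only).
TOY_MATRIX: PositionEnergyMatrix = [
    {"K": 1.0, "R": 0.8, "A": 0.0, "E": -1.2},
    {"L": 0.5, "I": 0.4, "A": 0.0, "D": -0.9},
    {"R": 1.1, "K": 0.9, "A": 0.0, "E": -1.0},
    {"V": 0.3, "T": 0.2, "A": 0.0, "G": -0.4},
]


def energy_sum(sequence: str, matrix: PositionEnergyMatrix,
               default: float = 0.0) -> float:
    """Sum the per-position matrix values for an aligned sequence.

    Residues missing from a column contribute `default` (an assumption
    made here for simplicity).
    """
    if len(sequence) != len(matrix):
        raise ValueError("sequence length must match the number of matrix positions")
    return sum(column.get(res, default) for res, column in zip(sequence, matrix))


def is_candidate(sequence: str, matrix: PositionEnergyMatrix,
                 threshold: float) -> bool:
    """Keep a sequence as a potential binder if its energy sum exceeds the threshold."""
    return energy_sum(sequence, matrix) > threshold


# Hypothetical threshold; in the abstract it is calibrated on experimental data.
for seq in ("KLRV", "EDEG"):
    print(seq, round(energy_sum(seq, TOY_MATRIX), 2), is_candidate(seq, TOY_MATRIX, 1.0))
```

Because each score is a table lookup and a sum, many sequences can be filtered cheaply before any are passed on to the slower homology modelling step.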
Brylinski, Michal; Skolnick, Jeffrey
2010-01-01
The rapid accumulation of gene sequences, many of which are hypothetical proteins with unknown function, has stimulated the development of accurate computational tools for protein function prediction with evolution/structure-based approaches showing considerable promise. In this paper, we present FINDSITE-metal, a new threading-based method designed specifically to detect metal binding sites in modeled protein structures. Comprehensive benchmarks using different quality protein structures show that weakly homologous protein models provide sufficient structural information for quite accurate annotation by FINDSITE-metal. Combining structure/evolutionary information with machine learning results in highly accurate metal binding annotations; for protein models constructed by TASSER, whose average Cα RMSD from the native structure is 8.9 Å, 59.5% (71.9%) of the best of top five predicted metal locations are within 4 Å (8 Å) from a bound metal in the crystal structure. For most of the targets, multiple metal binding sites are detected with the best predicted binding site at rank 1 and within the top 2 ranks in 65.6% and 83.1% of the cases, respectively. Furthermore, for iron, copper, zinc, calcium and magnesium ions, the binding metal can be predicted with high, typically 70-90%, accuracy. FINDSITE-metal also provides a set of confidence indexes that help assess the reliability of predictions. Finally, we describe the proteome-wide application of FINDSITE-metal that quantifies the metal binding complement of the human proteome. FINDSITE-metal is freely available to the academic community at http://cssb.biology.gatech.edu/findsite-metal/. PMID:21287609
Quantifying the Effect of DNA Packaging on Gene Expression Level
NASA Astrophysics Data System (ADS)
Kim, Harold
2010-10-01
Gene expression, the process by which the genetic code comes alive in the form of proteins, is one of the most important biological processes in living cells, and begins when transcription factors bind to specific DNA sequences in the promoter region upstream of a gene. The relationship between gene expression output and transcription factor input, which is termed the gene regulation function, is specific to each promoter, and predicting this gene regulation function from the locations of transcription factor binding sites is one of the challenges in biology. In eukaryotic organisms (for example, animals, plants, and fungi), DNA is highly compacted into nucleosomes, 147-bp segments of DNA tightly wrapped around a histone protein core, and therefore the accessibility of transcription factor binding sites depends on their locations with respect to nucleosomes: sites inside nucleosomes are less accessible than those outside nucleosomes. To understand how transcription factor binding sites contribute to gene expression in a quantitative manner, we obtain gene regulation functions of promoters with various configurations of transcription factor binding sites by using fluorescent protein reporters to measure transcription factor input and gene expression output in single yeast cells. In this talk, I will show that the affinity of a transcription factor binding site inside and outside the nucleosome controls different aspects of the gene regulation function, and explain this finding based on a mass-action kinetic model that includes competition between nucleosomes and transcription factors.
The Disordered C-Terminus of Yeast Hsf1 Contains a Cryptic Low-Complexity Amyloidogenic Region.
Pujols, Jordi; Santos, Jaime; Pallarès, Irantzu; Ventura, Salvador
2018-05-06
Response mechanisms to external stress rely on networks of proteins able to activate specific signaling pathways to ensure the maintenance of cell proteostasis. Many of the proteins mediating this kind of response contain intrinsically disordered regions, which lack a defined structure, but still are able to interact with a wide range of clients that modulate the protein function. Some of these interactions are mediated by specific short sequences embedded in the longer disordered regions. Because the physicochemical properties that promote functional and abnormal interactions are similar, it has been shown that, in globular proteins, aggregation-prone and binding regions tend to overlap. It could be that the same principle applies for disordered protein regions. In this context, we show here that a predicted low-complexity interacting region in the disordered C-terminus of the stress response master regulator heat shock factor 1 (Hsf1) protein corresponds to a cryptic amyloid region able to self-assemble into fibrillary structures resembling those found in neurodegenerative disorders.
Functional Specialization and Flexibility in Human Association Cortex
Yeo, B. T. Thomas; Krienen, Fenna M.; Eickhoff, Simon B.; Yaakub, Siti N.; Fox, Peter T.; Buckner, Randy L.; Asplund, Christopher L.; Chee, Michael W.L.
2015-01-01
The association cortex supports cognitive functions enabling flexible behavior. Here, we explored the organization of human association cortex by mathematically formalizing the notion that a behavioral task engages multiple cognitive components, which are in turn supported by multiple overlapping brain regions. Application of the model to a large data set of neuroimaging experiments (N = 10 449) identified complex zones of frontal and parietal regions that ranged from being highly specialized to highly flexible. The network organization of the specialized and flexible regions was explored with an independent resting-state fMRI data set (N = 1000). Cortical regions specialized for the same components were strongly coupled, suggesting that components function as partially isolated networks. Functionally flexible regions participated in multiple components to different degrees. This heterogeneous selectivity was predicted by the connectivity between flexible and specialized regions. Functionally flexible regions might support binding or integrating specialized brain networks that, in turn, contribute to the ability to execute multiple and varied tasks. PMID:25249407
Welch, Brett D; Paduch, Marcin; Leser, George P; Bergman, Zachary; Kors, Christopher A; Paterson, Reay G; Jardetzky, Theodore S; Kossiakoff, Anthony A; Lamb, Robert A
2014-10-01
Paramyxoviruses are enveloped negative-strand RNA viruses that are significant human and animal pathogens. Most paramyxoviruses infect host cells via the concerted action of a tetrameric attachment protein (variously called HN, H, or G) that binds either sialic acid or protein receptors on target cells and a trimeric fusion protein (F) that merges the viral envelope with the plasma membrane at neutral pH. F initially folds to a metastable prefusion conformation that becomes activated via a cleavage event during cellular trafficking. Upon receptor binding, the attachment protein, which consists of a globular head anchored to the membrane via a helical tetrameric stalk, triggers a major conformation change in F which results in fusion of virus and host cell membranes. We recently proposed a model for F activation in which the attachment protein head domains move following receptor binding to expose HN stalk residues critical for triggering F. To test the model in the context of wild-type viral glycoproteins, we used a restricted-diversity combinatorial Fab library and phage display to rapidly generate synthetic antibodies (sAbs) against multiple domains of the paramyxovirus parainfluenza 5 (PIV5) pre- and postfusion F and HN. As predicted by the model, sAbs that bind to the critical F-triggering region of the HN stalk do not disrupt receptor binding or neuraminidase (NA) activity but are potent inhibitors of fusion. An inhibitory prefusion F-specific sAb recognized a quaternary antigenic site and may inhibit fusion by preventing F refolding or by blocking the F-HN interaction. Importance: The paramyxovirus family of negative-strand RNA viruses causes significant disease in humans and animals. The viruses bind to cells via their receptor binding protein and then enter cells by fusion of their envelope with the host cell plasma membrane, a process mediated by a metastable viral fusion (F) protein.
To understand the steps in viral membrane fusion, a library of synthetic antibodies to F protein and the receptor binding protein was generated in bacteriophage. These antibodies bound to different regions of the F protein and the receptor binding protein, and the location of antibody binding affected different processes in viral entry into cells. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
The constant region affects antigen binding of antibodies to DNA by altering secondary structure.
Xia, Yumin; Janda, Alena; Eryilmaz, Ertan; Casadevall, Arturo; Putterman, Chaim
2013-11-01
We previously demonstrated an important role of the constant region in the pathogenicity of anti-DNA antibodies. To determine the mechanisms by which the constant region affects autoantibody binding, a panel of isotype-switch variants (IgG1, IgG2a, IgG2b) was generated from the murine PL9-11 IgG3 autoantibody. The affinity of the PL9-11 antibody panel for histone was measured by surface plasmon resonance (SPR). Tryptophan fluorescence was used to determine wavelength shifts of the antibody panel upon binding to DNA and histone. Finally, circular dichroism spectroscopy was used to measure changes in secondary structure. SPR analysis revealed significant differences in histone binding affinity between members of the PL9-11 panel. The wavelength shifts of tryptophan fluorescence emission were found to be dependent on the antibody isotype, while circular dichroism analysis determined that changes in antibody secondary structure content differed between isotypes upon antigen binding. Thus, the antigen binding affinity is dependent on the particular constant region expressed. Moreover, the effects of antibody binding to antigen were also constant region dependent. Alteration of secondary structures influenced by constant regions may explain differences in fine specificity of anti-DNA antibodies between antibodies with similar variable regions, as well as cross-reactivity of anti-DNA antibodies with non-DNA antigens. Copyright © 2013 Elsevier Ltd. All rights reserved.
Chung, C N; Hamaguchi, Y; Honjo, T; Kawaichi, M
1994-01-01
To map regions important for DNA binding of the mouse homologue of Suppressor of Hairless or RBP-J kappa protein, mutated mouse RBP-J kappa cDNAs were made by insertion of oligonucleotide linkers or base replacement. DNA binding assays using the mutated proteins expressed in COS cells showed that various mutations between 218Arg and 227Arg decreased the DNA binding activity drastically. The DNA binding activity was not affected by amino acid replacements within the integrase motif of the RBP-J kappa protein (230His-269His). Replacements between 291Arg and 323Tyr affected the DNA binding activity slightly but reproducibly. These results indicate that the region encompassing 218Arg-227Arg is critical for the DNA binding activity of RBP-J kappa. This region did not show any significant homology to motifs or domains of the previously described DNA binding proteins. Using a truncation mutant protein, RBP-J kappa was shown to associate with DNA as a monomer. PMID:8065905
The unfoldomics decade: an update on intrinsically disordered proteins.
Dunker, A Keith; Oldfield, Christopher J; Meng, Jingwei; Romero, Pedro; Yang, Jack Y; Chen, Jessica Walton; Vacic, Vladimir; Obradovic, Zoran; Uversky, Vladimir N
2008-09-16
Our first predictor of protein disorder was published just over a decade ago in the Proceedings of the IEEE International Conference on Neural Networks (Romero P, Obradovic Z, Kissinger C, Villafranca JE, Dunker AK (1997) Identifying disordered regions in proteins from amino acid sequence. Proceedings of the IEEE International Conference on Neural Networks, 1: 90-95). By now more than twenty other laboratory groups have joined the efforts to improve the prediction of protein disorder. While the various prediction methodologies used for protein intrinsic disorder resemble those methodologies used for secondary structure prediction, the two types of structures are entirely different. For example, the two structural classes have very different dynamic properties, with the irregular secondary structure class being much less mobile than the disorder class. The prediction of secondary structure has been useful. On the other hand, the prediction of intrinsic disorder has been revolutionary, leading to major modifications of the more than 100 year-old views relating protein structure and function. Experimentalists have been providing evidence over many decades that some proteins lack fixed structure or are disordered (or unfolded) under physiological conditions. In addition, experimentalists are also showing that, for many proteins, their functions depend on the unstructured rather than structured state; such results are in marked contrast to the greater than hundred year old views such as the lock and key hypothesis. Despite extensive data on many important examples, including disease-associated proteins, the importance of disorder for protein function has been largely ignored. Indeed, to our knowledge, current biochemistry books don't present even one acknowledged example of a disorder-dependent function, even though some reports of disorder-dependent functions are more than 50 years old. 
The results from genome-wide predictions of intrinsic disorder and the results from other bioinformatics studies of intrinsic disorder are demanding attention for these proteins. Disorder prediction has been important for showing that the relatively few experimentally characterized examples are members of a very large collection of related disordered proteins that are wide-spread over all three domains of life. Many significant biological functions are now known to depend directly on, or are importantly associated with, the unfolded or partially folded state. Here our goal is to review the key discoveries and to weave these discoveries together to support novel approaches for understanding sequence-function relationships. Intrinsically disordered protein is common across the three domains of life, but especially common among the eukaryotic proteomes. Signaling sequences and sites of posttranslational modifications are frequently, or very likely most often, located within regions of intrinsic disorder. Disorder-to-order transitions are coupled with the adoption of different structures with different partners. Also, the flexibility of intrinsic disorder helps different disordered regions to bind to a common binding site on a common partner. Such capacity for binding diversity plays important roles in both protein-protein interaction networks and likely also in gene regulation networks. Such disorder-based signaling is further modulated in multicellular eukaryotes by alternative splicing, for which such splicing events map to regions of disorder much more often than to regions of structure. Associating alternative splicing with disorder rather than structure alleviates theoretical and experimentally observed problems associated with the folding of different length, isomeric amino acid sequences. 
The combination of disorder and alternative splicing is proposed to provide a mechanism for easily "trying out" different signaling pathways, thereby providing the mechanism for generating signaling diversity and enabling the evolution of cell differentiation and multicellularity. Finally, several recent small molecules of interest as potential drugs have been shown to act by blocking protein-protein interactions based on intrinsic disorder of one of the partners. Study of these examples has led to a new approach for drug discovery, and bioinformatics analysis of the human proteome suggests that various disease-associated proteins are very rich in such disorder-based drug discovery targets.
Identification and therapeutic potential of a vitronectin binding region of meningococcal msf.
Hill, Darryl J; Griffiths, Natalie J; Borodina, Elena; Andreae, Clio A; Sessions, Richard B; Virji, Mumtaz
2015-01-01
The human pathogen Neisseria meningitidis (Nm) attains serum resistance via a number of mechanisms, one of which involves binding to the host complement regulator protein vitronectin. We have shown previously that the Meningococcal surface fibril (Msf), a trimeric autotransporter, binds to the activated form of vitronectin (aVn) to increase Nm survival in human serum. In this study, we aimed to identify the aVn-binding region of Msf to assess its potential as an antigen which can elicit antibodies that block aVn binding and/or possess bactericidal properties. Using several recombinant Msf fragments spanning its surface-exposed region, the smallest aVn-binding recombinants were found to span residues 1-86 and 39-124. The use of further deletion constructs and overlapping recombinant Msf fragments suggested that a region of Msf comprising residues 39-82 may be primarily important for aVn binding and that other regions may also be involved but to a lesser extent. Molecular modelling implicated K66 and K68, conserved in all available Msf sequences, to be involved in the interaction. Recombinant fragments which bound to aVn were able to reduce the survival advantage conveyed by aVn-interaction in serum bactericidal assays. Antibodies raised against one such fragment inhibited aVn binding to Msf. In addition, the antibodies enhanced specific killing of Msf-expressing Nm in a dose-dependent manner. Overall, this study identifies an aVn-binding region of Msf, an adhesin known to impart serum resistance properties to the pathogen; and shows that this region of Msf can elicit antibodies with dual properties which reduce pathogen survival within the host and thus has potential as a vaccine antigen.
Identification and Therapeutic Potential of a Vitronectin Binding Region of Meningococcal Msf
Hill, Darryl J.; Griffiths, Natalie J.; Borodina, Elena; Andreae, Clio A.; Sessions, Richard B.; Virji, Mumtaz
2015-01-01
The human pathogen Neisseria meningitidis (Nm) attains serum resistance via a number of mechanisms, one of which involves binding to the host complement regulator protein vitronectin. We have shown previously that the Meningococcal surface fibril (Msf), a trimeric autotransporter, binds to the activated form of vitronectin (aVn) to increase Nm survival in human serum. In this study, we aimed to identify the aVn-binding region of Msf to assess its potential as an antigen which can elicit antibodies that block aVn binding and/or possess bactericidal properties. Using several recombinant Msf fragments spanning its surface-exposed region, the smallest aVn-binding recombinants were found to span residues 1-86 and 39-124. The use of further deletion constructs and overlapping recombinant Msf fragments suggested that a region of Msf comprising residues 39-82 may be primarily important for aVn binding and that other regions may also be involved but to a lesser extent. Molecular modelling implicated K66 and K68, conserved in all available Msf sequences, to be involved in the interaction. Recombinant fragments which bound to aVn were able to reduce the survival advantage conveyed by aVn-interaction in serum bactericidal assays. Antibodies raised against one such fragment inhibited aVn binding to Msf. In addition, the antibodies enhanced specific killing of Msf-expressing Nm in a dose-dependent manner. Overall, this study identifies an aVn-binding region of Msf, an adhesin known to impart serum resistance properties to the pathogen; and shows that this region of Msf can elicit antibodies with dual properties which reduce pathogen survival within the host and thus has potential as a vaccine antigen. PMID:25826209
DOE Office of Scientific and Technical Information (OSTI.GOV)
MacArthur, Stewart; Li, Xiao-Yong; Li, Jingyi
2009-05-15
BACKGROUND: We previously established that six sequence-specific transcription factors that initiate anterior/posterior patterning in Drosophila bind to overlapping sets of thousands of genomic regions in blastoderm embryos. While regions bound at high levels include known and probable functional targets, more poorly bound regions are preferentially associated with housekeeping genes and/or genes not transcribed in the blastoderm, and are frequently found in protein coding sequences or in less conserved non-coding DNA, suggesting that many are likely non-functional. RESULTS: Here we show that an additional 15 transcription factors that regulate other aspects of embryo patterning show a similar quantitative continuum of function and binding to thousands of genomic regions in vivo. Collectively, the 21 regulators show a surprisingly high overlap in the regions they bind given that they belong to 11 DNA binding domain families, specify distinct developmental fates, and can act via different cis-regulatory modules. We demonstrate, however, that quantitative differences in relative levels of binding to shared targets correlate with the known biological and transcriptional regulatory specificities of these factors. CONCLUSIONS: It is likely that the overlap in binding of biochemically and functionally unrelated transcription factors arises from the high concentrations of these proteins in nuclei, which, coupled with their broad DNA binding specificities, directs them to regions of open chromatin. We suggest that most animal transcription factors will be found to show a similar broad overlapping pattern of binding in vivo, with specificity achieved by modulating the amount, rather than the identity, of bound factor.
ProBiS-ligands: a web server for prediction of ligands by examination of protein binding sites.
Konc, Janez; Janežič, Dušanka
2014-07-01
The ProBiS-ligands web server predicts binding of ligands to a protein structure. Starting with a protein structure or binding site, ProBiS-ligands first identifies template proteins in the Protein Data Bank that share similar binding sites. Based on the superimpositions of the query protein and the similar binding sites found, the server then transposes the ligand structures from those sites to the query protein. Such ligand prediction supports many activities, e.g. drug repurposing. The ProBiS-ligands web server, an extension of the ProBiS web server, is open and free to all users at http://probis.cmm.ki.si/ligands. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Margreitter, Christian; Mayrhofer, Patrick; Kunert, Renate; Oostenbrink, Chris
2016-06-01
Monoclonal antibodies represent the fastest growing class of biotherapeutic proteins. However, as they are often initially derived from rodent organisms, there is a severe risk of immunogenic reactions, hampering their applicability. The humanization of these antibodies remains a challenging task in the context of rational drug design. "Superhumanization" describes the direct transfer of the complementarity determining regions to a human germline framework, but this humanization approach often results in loss of binding affinity. In this study, we present a new approach for predicting promising backmutation sites using molecular dynamics simulations of the model antibody Ab2/3H6. The simulation method was developed in close conjunction with novel specificity experiments. Binding properties of mAb variants were evaluated directly from crude supernatants and confirmed using established binding affinity assays for purified antibodies. Our approach provides access to the dynamical features of the actual binding sites of an antibody, based solely on the antibody sequence. Thus we do not need structural data on the antibody-antigen complex and circumvent cumbersome methods to assess binding affinities. © 2016 The Authors Journal of Molecular Recognition Published by John Wiley & Sons Ltd.
Jian, Jhih-Wei; Elumalai, Pavadai; Pitti, Thejkiran; Wu, Chih Yuan; Tsai, Keng-Chang; Chang, Jeng-Yih; Peng, Hung-Pin; Yang, An-Suei
2016-01-01
Predicting ligand binding sites (LBSs) on protein structures, which are obtained either from experimental or computational methods, is a useful first step in functional annotation or structure-based drug design for the protein structures. In this work, the structure-based machine learning algorithm ISMBLab-LIG was developed to predict LBSs on protein surfaces with input attributes derived from the three-dimensional probability density maps of interacting atoms, which were reconstructed on the query protein surfaces and were relatively insensitive to local conformational variations of the tentative ligand binding sites. The prediction accuracy of the ISMBLab-LIG predictors is comparable to that of the best LBS predictors benchmarked on several well-established testing datasets. More importantly, the ISMBLab-LIG algorithm has substantial tolerance to the prediction uncertainties of computationally derived protein structure models. As such, the method is particularly useful for predicting LBSs not only on experimental protein structures without known LBS templates in the database but also on computationally predicted model protein structures with structural uncertainties in the tentative ligand binding sites. PMID:27513851
Analysis of Physicochemical and Structural Properties Determining HIV-1 Coreceptor Usage
Bozek, Katarzyna; Lengauer, Thomas; Sierra, Saleta; Kaiser, Rolf; Domingues, Francisco S.
2013-01-01
The relationship of HIV tropism with disease progression and the recent development of CCR5-blocking drugs underscore the importance of monitoring virus coreceptor usage. As an alternative to costly phenotypic assays, computational methods aim at predicting virus tropism based on the sequence and structure of the V3 loop of the virus gp120 protein. Here we present a numerical descriptor of the V3 loop encoding its physicochemical and structural properties. The descriptor allows for structure-based prediction of HIV tropism and identification of properties of the V3 loop that are crucial for coreceptor usage. Use of the proposed descriptor for prediction results in a statistically significant improvement over the prediction based solely on V3 sequence with 3 percentage points improvement in AUC and 7 percentage points in sensitivity at the specificity of the 11/25 rule (95%). We additionally assessed the predictive power of the new method on clinically derived ‘bulk’ sequence data and obtained a statistically significant improvement in AUC of 3 percentage points over sequence-based prediction. Furthermore, we demonstrated the capacity of our method to predict therapy outcome by applying it to 53 samples from patients undergoing Maraviroc therapy. The analysis of structural features of the loop informative of tropism indicates the importance of two loop regions and their physicochemical properties. The regions are located on opposite strands of the loop stem and the respective features are predominantly charge-, hydrophobicity- and structure-related. These regions are in close proximity in the bound conformation of the loop potentially forming a site determinant for the coreceptor binding. The method is available via server under http://structure.bioinf.mpi-inf.mpg.de/. PMID:23555214
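The abstract uses the 11/25 rule as its specificity reference point. As commonly stated, that rule predicts CXCR4 usage when position 11 or position 25 of the V3 loop carries a positively charged residue (arginine or lysine). The sketch below is an illustration under assumptions, not the authors' method: the function name is hypothetical, the example sequence is illustrative, and the input is assumed to be already aligned to the standard 35-residue V3 numbering.

```python
BASIC_RESIDUES = {"R", "K"}

def rule_11_25(v3_sequence):
    """Classify a V3 loop with the 11/25 rule.

    Returns "X4" if position 11 or 25 (1-based) is R or K, else "R5".
    Assumes the sequence follows the reference 35-residue V3 numbering;
    sequences with insertions or deletions must be aligned first.
    """
    if len(v3_sequence) < 25:
        raise ValueError("V3 sequence too short for the 11/25 rule")
    pos11 = v3_sequence[10]
    pos25 = v3_sequence[24]
    return "X4" if pos11 in BASIC_RESIDUES or pos25 in BASIC_RESIDUES else "R5"

# Illustrative 35-residue V3 sequence with S at position 11 and E at 25.
example_v3 = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
# Hypothetical variant carrying an S11R substitution.
variant_v3 = example_v3[:10] + "R" + example_v3[11:]

example_call = rule_11_25(example_v3)
variant_call = rule_11_25(variant_v3)
```

A rule this simple looks at only two positions, which is the motivation the abstract gives for descriptors that encode the physicochemical and structural properties of the whole loop.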
Chen, Chunhong; Newell, Kim; Lawrence, Gregory J.; Ellis, Jeffrey G.; Anderson, Peter A.; Dodds, Peter N.
2016-01-01
NOD-like receptors (NLRs) are central components of the plant immune system. L6 is a Toll/interleukin-1 receptor (TIR) domain-containing NLR from flax (Linum usitatissimum) conferring immunity to the flax rust fungus. Comparison of L6 to the weaker allele L7 identified two polymorphic regions in the TIR and the nucleotide binding (NB) domains that regulate both effector ligand-dependent and -independent cell death signaling as well as nucleotide binding to the receptor. This suggests that a negative functional interaction between the TIR and NB domains holds L7 in an inactive/ADP-bound state more tightly than L6, hence decreasing its capacity to adopt the active/ATP-bound state and explaining its weaker activity in planta. L6 and L7 variants with a more stable ADP-bound state failed to bind to AvrL567 in yeast two-hybrid assays, while binding was detected to the signaling active variants. This contrasts with current models predicting that effectors bind to inactive receptors to trigger activation. Based on the correlation between nucleotide binding, effector interaction, and immune signaling properties of L6/L7 variants, we propose that NLRs exist in an equilibrium between ON and OFF states and that effector binding to the ON state stabilizes this conformation, thereby shifting the equilibrium toward the active form of the receptor to trigger defense signaling. PMID:26744216
Hot spot analysis for driving the development of hits into leads in fragment based drug discovery
Hall, David R.; Ngan, Chi Ho; Zerbe, Brandon S.; Kozakov, Dima; Vajda, Sandor
2011-01-01
Fragment based drug design (FBDD) starts with finding fragment-sized compounds that are highly ligand efficient and can serve as a core moiety for developing high affinity leads. Although the core-bound structure of a protein facilitates the construction of leads, effective design is far from straightforward. We show that protein mapping, a computational method developed to find binding hot spots and implemented as the FTMap server, provides information that complements the fragment screening results and can drive the evolution of core fragments into larger leads with a minimal loss or, in some cases, even a gain in ligand efficiency. The method places small molecular probes, the size of organic solvents, on a dense grid around the protein, and identifies the hot spots as consensus clusters formed by clusters of several probes. The hot spots are ranked based on the number of probe clusters, which predicts the binding propensity of the subsites and hence their importance for drug design. Accordingly, with a single exception the main hot spot identified by FTMap binds the core compound found by fragment screening. The most useful information is provided by the neighboring secondary hot spots, indicating the regions where the core can be extended to increase its affinity. To quantify this information, we calculate the density of probes from mapping, which describes the binding propensity at each point, and show that the change in the correlation between a ligand position and the probe density upon extending or repositioning the core moiety predicts the expected change in ligand efficiency. PMID:22145575
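The ranking step described in this abstract (hot spots ordered by the number of probe clusters they contain) can be sketched as a simple count. This is a toy illustration, not the FTMap implementation: the function name, the site labels, and the probe assignments are hypothetical, and the clustering that produces the assignments is assumed to have been done already.

```python
from collections import Counter

def rank_hot_spots(cluster_assignments):
    """Rank consensus sites by how many probe clusters fall in each.

    `cluster_assignments` is an iterable of (probe_name, site_id) pairs,
    one pair per probe cluster. Returns (site_id, count) pairs, with the
    most populated site first.
    """
    counts = Counter(site for _probe, site in cluster_assignments)
    return counts.most_common()

# Hypothetical assignments of probe clusters to three consensus sites.
assignments = [
    ("ethanol", "A"), ("acetone", "A"), ("urea", "A"),
    ("ethanol", "B"), ("benzene", "B"),
    ("acetone", "C"),
]

ranking = rank_hot_spots(assignments)
main_hot_spot = ranking[0][0]
```

In the abstract's terms, site "A" would be the main hot spot expected to bind the core fragment, and "B" and "C" would be secondary hot spots indicating where the core might be extended.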
Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord
José-Edwards, Diana S.; Oda-Ishii, Izumi; Kugler, Jamie E.; Passamaneck, Yale J.; Katikala, Lavanya; Nibu, Yutaka; Di Gregorio, Anna
2015-01-01
A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs), and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC) microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs. PMID:26684323
Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord.
José-Edwards, Diana S; Oda-Ishii, Izumi; Kugler, Jamie E; Passamaneck, Yale J; Katikala, Lavanya; Nibu, Yutaka; Di Gregorio, Anna
2015-12-01
A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs), and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC) microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs.
NASA Technical Reports Server (NTRS)
Hsieh, H. L.; Tong, C. G.; Thomas, C.; Roux, S. J.
1996-01-01
A cDNA encoding a 47 kDa nucleoside triphosphatase (NTPase) that is associated with the chromatin of pea nuclei has been cloned and sequenced. The translated sequence of the cDNA includes several domains predicted by known biochemical properties of the enzyme, including five motifs characteristic of the ATP-binding domain of many proteins, several potential casein kinase II phosphorylation sites, a helix-turn-helix region characteristic of DNA-binding proteins, and a potential calmodulin-binding domain. The deduced primary structure also includes an N-terminal sequence that is a predicted signal peptide and an internal sequence that could serve as a bipartite-type nuclear localization signal. Both in situ immunocytochemistry of pea plumules and immunoblots of purified cell fractions indicate that most of the immunodetectable NTPase is within the nucleus, a compartment proteins typically reach through nuclear pores rather than through the endoplasmic reticulum pathway. The translated sequence has some similarity to that of human lamin C, but not high enough to account for the earlier observation that IgG against human lamin C binds to the NTPase in immunoblots. Northern blot analysis shows that the NTPase mRNA is strongly expressed in etiolated plumules, but only poorly or not at all in the leaf and stem tissues of light-grown plants. Accumulation of NTPase mRNA in etiolated seedlings is stimulated by brief treatments with both red and far-red light, as is characteristic of very low-fluence phytochrome responses. Southern blotting with pea genomic DNA indicates the NTPase is likely to be encoded by a single gene.
Design principles of a microtubule polymerase
Geyer, Elisabeth A; Miller, Matthew P; Brautigam, Chad A; Biggins, Sue
2018-01-01
Stu2/XMAP215 microtubule polymerases use multiple tubulin-binding TOG domains and a lattice-binding basic region to processively promote faster elongation. How the domain composition and organization of these proteins dictate polymerase activity, end localization, and processivity is unknown. We show that polymerase activity does not require different kinds of TOGs, nor are there strict requirements for how the TOGs are linked. We identify an unexpected antagonism between the tubulin-binding TOGs and the lattice-binding basic region: lattice binding by the basic region is weak when at least two TOGs engage tubulins, strong when TOGs are empty. End-localization of Stu2 requires unpolymerized tubulin, at least two TOGs, and polymerase competence. We propose a ‘ratcheting’ model for processivity: transfer of tubulin from TOGs to the lattice activates the basic region, retaining the polymerase at the end for subsequent rounds of tubulin binding and incorporation. These results clarify design principles of the polymerase. PMID:29897335
Revised Model of Calcium and Magnesium Binding to the Bacterial Cell Wall
Thomas, Kieth J.; Rice, Charles V.
2014-01-01
Metals bind to the bacterial cell wall yet the binding mechanisms and affinity constants are not fully understood. The cell wall of gram positive bacteria is characterized by a thick layer of peptidoglycan and anionic teichoic acids anchored in the cytoplasmic membrane (lipoteichoic acid) or covalently bound to the cell wall (wall teichoic acid). The polyphosphate groups of teichoic acid provide one-half of the metal binding sites for calcium and magnesium, contradicting previous reports that calcium binding is 100% dependent on teichoic acid. The remaining binding sites are formed with the carboxyl units of peptidoglycan. In this work we report equilibrium association constants and total metal binding capacities for the interaction of calcium and magnesium ions with the bacterial cell wall. Metal binding is much stronger than previously reported. Curvature of Scatchard plots from the binding data and the resulting two regions of binding affinity suggest the presence of negative cooperative binding, meaning that the binding affinity decreases as more ions become bound to the sample. For Ca2+, Region I has a KA = (1.0 ± 0.2) × 10^6 M^-1 and Region II has a KA = (0.075 ± 0.058) × 10^6 M^-1. For Mg2+, KA1 = (1.5 ± 0.1) × 10^6 and KA2 = (0.17 ± 0.10) × 10^6. A binding capacity (η) is reported for both regions. However, since binding is still occurring in Region II, the total binding capacity is denoted by η2, which are 0.70 ± 0.04 µmol/mg and 0.67 ± 0.03 µmol/mg for Ca2+ and Mg2+ respectively. These data contradict the current paradigm of there being a single metal affinity value that is constant over a range of concentrations. We also find that measurement of equilibrium binding constants is highly sample dependent, suggesting a role for diffusion of metals through heterogeneous cell wall fragments. As a result, we are able to reconcile many contradictory theories that describe binding affinity and the binding mode of divalent metal cations. PMID:25315444
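The curved Scatchard plot described in this abstract can be illustrated with a small Python sketch. It is not the authors' analysis: it swaps in a simpler model of two independent site classes, which also yields a concave-up Scatchard plot (curvature alone does not distinguish site heterogeneity from negative cooperativity). The two association constants and the total capacity for Ca2+ are taken from the abstract; the even split of capacity between the two classes and the names `SITES`, `bound`, `scatchard_points` and `local_slopes` are assumptions made only for illustration.

```python
# Illustrative two-class binding sketch (NOT the authors' fitting procedure).
K1 = 1.0e6      # M^-1, Region I association constant for Ca2+ (from the abstract)
K2 = 0.075e6    # M^-1, Region II association constant for Ca2+ (from the abstract)
N_TOTAL = 0.70  # umol/mg, total Ca2+ binding capacity (from the abstract)

# ASSUMPTION: capacity is split evenly between the two classes; the abstract
# does not give the per-region capacities.
SITES = [(N_TOTAL / 2, K1), (N_TOTAL / 2, K2)]


def bound(free, sites=SITES):
    """Bound metal (umol/mg) at a free concentration (M), summed over site classes."""
    return sum(n * k * free / (1.0 + k * free) for n, k in sites)


def scatchard_points(free_concs, sites=SITES):
    """Return (bound, bound/free) pairs, the two axes of a Scatchard plot."""
    return [(bound(c, sites), bound(c, sites) / c) for c in free_concs]


def local_slopes(points):
    """Finite-difference slope between consecutive Scatchard points."""
    return [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(points, points[1:])]


# Free concentrations from 1e-8 M to 1e-3 M, four points per decade.
free = [10 ** (-8 + 0.25 * i) for i in range(21)]
points = scatchard_points(free)
slopes = local_slopes(points)

# A single site class would give one constant slope (-K); here the slope
# flattens as occupancy rises, which is the curvature the abstract refers to.
print("steepest slope:", slopes[0])
print("shallowest slope:", slopes[-1])
```

With a single affinity the two printed slopes would coincide, so the difference between them is a quick check for the two-region behaviour reported above.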
Lee, Hui Sun; Jo, Sunhwan; Lim, Hyun-Suk; Im, Wonpil
2012-07-23
Molecular docking is widely used to obtain binding modes and binding affinities of a molecule to a given target protein. Despite considerable efforts, however, prediction of both properties by docking remains challenging mainly due to protein's structural flexibility and inaccuracy of scoring functions. Here, an integrated approach has been developed to improve the accuracy of binding mode and affinity prediction and tested for small molecule MDM2 and MDMX antagonists. In this approach, initial candidate models selected from docking are subjected to equilibration MD simulations to further filter the models. Free energy perturbation molecular dynamics (FEP/MD) simulations are then applied to the filtered ligand models to enhance the ability in predicting the near-native ligand conformation. The calculated binding free energies for MDM2 complexes are overestimated compared to experimental measurements mainly due to the difficulties in sampling highly flexible apo-MDM2. Nonetheless, the FEP/MD binding free energy calculations are more promising for discriminating binders from nonbinders than docking scores. In particular, the comparison between the MDM2 and MDMX results suggests that apo-MDMX has lower flexibility than apo-MDM2. In addition, the FEP/MD calculations provide detailed information on the different energetic contributions to ligand binding, leading to a better understanding of the sensitivity and specificity of protein-ligand interactions.
Fu, Junjie; Xia, Amy; Dai, Yao; Qi, Xin
2016-01-01
Discovering molecules capable of binding to HIV trans-activation responsive region (TAR) RNA, thereby disrupting its interaction with Tat protein, is an attractive strategy for developing novel antiviral drugs. Computational docking is considered a useful tool for predicting binding affinity and conducting virtual screening. Although great progress in predicting protein-ligand interactions has been achieved in the past few decades, modeling RNA-ligand interactions is still largely unexplored due to the highly flexible nature of RNA. In this work, we performed a molecular docking study with HIV TAR RNA using the previously identified cyclic peptide L22 and its analogues with varying affinities toward HIV-1 TAR RNA. Furthermore, a sarcosine scan was conducted to generate derivatives of CGP64222, a peptide-peptoid hybrid with inhibitory activity on Tat/TAR RNA interaction. Each compound was docked using CDOCKER, Surflex-Dock and FlexiDock to compare the effectiveness of each method. It was found that FlexiDock energy values correlated well with the experimental Kd values and could be used to predict the affinity of the ligands toward HIV-1 TAR RNA with superior accuracy. Our results based on comparative analysis of different docking methods in RNA-ligand modeling will facilitate the structure-based discovery of HIV TAR RNA ligands for antiviral therapy.
Zuo, Zhili; Gandhi, Neha S; Mancera, Ricardo L
2010-12-27
The leucine zipper region of activator protein-1 (AP-1) comprises the c-Jun and c-Fos proteins and constitutes a well-known coiled coil protein-protein interaction motif. We have used molecular dynamics (MD) simulations in conjunction with the molecular mechanics/Poisson-Boltzmann generalized-Born surface area [MM/PB(GB)SA] methods to predict the free energy of interaction of these proteins. In particular, the influence of the choice of solvation model, protein force field, and water potential on the stability and dynamic properties of the c-Fos-c-Jun complex were investigated. Use of the AMBER polarizable force field ff02 in combination with the polarizable POL3 water potential was found to result in increased stability of the c-Fos-c-Jun complex. MM/PB(GB)SA calculations revealed that MD simulations using the POL3 water potential give the lowest predicted free energies of interaction compared to other nonpolarizable water potentials. In addition, the calculated absolute free energy of binding was predicted to be closest to the experimental value using the MM/GBSA method with independent MD simulation trajectories using the POL3 water potential and the polarizable ff02 force field, while all other binding affinities were overestimated.
Martínez-Sernández, Victoria; Mezo, Mercedes; González-Warleta, Marta; Perteguer, María J; Gárate, Teresa; Romarís, Fernanda; Ubeira, Florencio M
2017-05-26
MF6p/FhHDM-1 is a small protein secreted by the parasitic flatworm (trematode) Fasciola hepatica that belongs to a broad family of heme-binding proteins (MF6p/helminth defense molecules (HDMs)). MF6p/HDMs are of interest for understanding heme homeostasis in trematodes and as potential targets for the development of new flukicides. Moreover, interest in these molecules has also increased because of their immunomodulatory properties. Here we have extended our previous findings on the mechanism of MF6p/HDM-heme interactions and mapped the protein regions required for heme binding and for other biological functions. Our data revealed that MF6p/FhHDM-1 forms high-molecular-weight complexes when associated with heme and that these complexes are reorganized by a stacking procedure to form fibril-like and granular nanostructures. Furthermore, we showed that MF6p/FhHDM-1 is a transitory heme-binding protein as protein·heme complexes can be disrupted by contact with an apoprotein (e.g. apomyoglobin) with higher affinity for heme. We also demonstrated that (i) the heme-binding region is located in the MF6p/FhHDM-1 C-terminal moiety, which also inhibits the peroxidase-like activity of heme, and (ii) MF6p/HDMs from other trematodes, such as Opisthorchis viverrini and Paragonimus westermani, also bind heme. Finally, we observed that the N-terminal, but not the C-terminal, moiety of MF6p/HDMs has a predicted structural analogy with cell-penetrating peptides and that both the entire protein and the peptide corresponding to the N-terminal moiety of MF6p/FhHDM-1 interact in vitro with cell membranes in hemin-preconditioned erythrocytes. Our findings suggest that MF6p/HDMs can transport heme in trematodes and thereby shield the parasite from the harmful effects of heme. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Yoder-Himes, Deborah R.; Kroos, Lee
2006-01-01
The bacterium Myxococcus xanthus employs extracellular signals to coordinate aggregation and sporulation during multicellular development. Extracellular, contact-dependent signaling that involves the CsgA protein (called C-signaling) activates FruA, a putative response regulator that governs a branched signaling pathway inside cells. One branch regulates cell movement, leading to aggregation. The other branch regulates gene expression, leading to sporulation. C-signaling is required for full expression of most genes induced after 6 h into development, including the gene identified by Tn5 lac insertion Ω4400. To determine if FruA is a direct regulator of Ω4400 transcription, a combination of in vivo and in vitro experiments was performed. Ω4400 expression was abolished in a fruA mutant. The DNA-binding domain of FruA bound specifically to DNA upstream of the promoter −35 region in vitro. Mutations between bp −86 and −77 greatly reduced binding. One of these mutations had been shown previously to reduce Ω4400 expression in vivo and make it independent of C-signaling. For the first time, chromatin immunoprecipitation (ChIP) experiments were performed on M. xanthus. The ChIP experiments demonstrated that FruA is associated with the Ω4400 promoter region late in development, even in the absence of C-signaling. Based on these results, we propose that FruA directly activates Ω4400 transcription to a moderate level prior to C-signaling and, in response to C-signaling, binds near bp −80 and activates transcription to a higher level. Also, the highly localized effects of mutations between bp −86 and −77 on DNA binding in vitro, together with recently published footprints, allow us to predict a consensus binding site of GTCG/CGA/G for the FruA DNA-binding domain. PMID:16816188
Ito, Takeshi; Ninokura, Satoshi; Kitazumi, Yuki; Mezic, Katherine G.; Cress, Brady F.; Koffas, Mattheos A. G.; Morgan, Joel E.; Barquera, Blanca; Miyoshi, Hideto
2017-01-01
The Na+-pumping NADH-quinone oxidoreductase (Na+-NQR) is the first enzyme of the respiratory chain and the main ion transporter in many marine and pathogenic bacteria, including Vibrio cholerae. The V. cholerae Na+-NQR has been extensively studied, but its binding sites for ubiquinone and inhibitors remain controversial. Here, using a photoreactive ubiquinone PUQ-3 as well as two aurachin-type inhibitors [125I]PAD-1 and [125I]PAD-2 and photoaffinity labeling experiments on the isolated enzyme, we demonstrate that the ubiquinone ring binds to the NqrA subunit in the regions Leu-32–Met-39 and Phe-131–Lys-138, encompassing the rear wall of a predicted ubiquinone-binding cavity. The quinolone ring and alkyl side chain of aurachin bound to the NqrB subunit in the regions Arg-43–Lys-54 and Trp-23–Gly-89, respectively. These results indicate that the binding sites for ubiquinone and aurachin-type inhibitors are in close proximity but do not overlap one another. Unexpectedly, although the inhibitory effects of PAD-1 and PAD-2 were almost completely abolished by certain mutations in NqrB (i.e. G140A and E144C), the binding reactivities of [125I]PAD-1 and [125I]PAD-2 to the mutated enzymes were unchanged compared with those of the wild-type enzyme. We also found that photoaffinity labeling by [125I]PAD-1 and [125I]PAD-2, rather than being competitively suppressed in the presence of other inhibitors, is enhanced under some experimental conditions. To explain these apparently paradoxical results, we propose models for the catalytic reaction of Na+-NQR and its interactions with inhibitors on the basis of the biochemical and biophysical results reported here and in previous work. PMID:28298441
Affholter, J A; Cascieri, M A; Bayne, M L; Brange, J; Casaretto, M; Roth, R A
1990-08-21
Insulin-degrading enzyme (IDE) hydrolyzes insulin at a limited number of sites. Although the positions of these cleavages are known, the residues of insulin important in its binding to IDE have not been defined. To this end, we have studied the binding of a variety of insulin analogues to the protease in a solid-phase binding assay using immunoimmobilized IDE. Since IDE binds insulin with 600-fold greater affinity than it does insulin-like growth factor I (25 nM and approximately 16,000 nM, respectively), the first set of analogues studied were hybrid molecules of insulin and IGF I. IGF I mutants [insB1-17,17-70]IGF I, [Tyr55,Gln56]IGF I, and [Phe23,Phe24,Tyr25]IGF I have been synthesized and share the property of having insulin-like amino acids at positions corresponding to primary sites of cleavage of insulin by IDE. Whereas the first two exhibit affinities for IDE similar to that of wild type IGF I, the [Phe23,Phe24,Tyr25]IGF I analogue has a 32-fold greater affinity for the immobilized enzyme. Replacement of Phe-23 by Ser eliminates this increase. Removal of the eight amino acid D-chain region of IGF I (which has been predicted to interfere with binding to the 23-25 region) results in a 25-fold increase in affinity for IDE, confirming the importance of residues 23-25 in the high-affinity recognition of IDE. A similar role for the corresponding (B24-26) residues of insulin is supported by the use of site-directed mutant and semisynthetic insulin analogues. Insulin mutants [B25-Asp]insulin and [B25-His]insulin display 16- and 20-fold decreases in IDE affinity versus wild-type insulin.(ABSTRACT TRUNCATED AT 250 WORDS)
Fuzzy regions in an intrinsically disordered protein impair protein-protein interactions.
Gruet, Antoine; Dosnon, Marion; Blocquel, David; Brunel, Joanna; Gerlier, Denis; Das, Rahul K; Bonetti, Daniela; Gianni, Stefano; Fuxreiter, Monika; Longhi, Sonia; Bignon, Christophe
2016-02-01
Despite the partial disorder-to-order transition that intrinsically disordered proteins often undergo upon binding to their partners, a considerable amount of residual disorder may be retained in the bound form, resulting in a fuzzy complex. Fuzzy regions flanking molecular recognition elements may enable partner fishing through non-specific, transient contacts, thereby facilitating binding, but may also disfavor binding through various mechanisms. So far, few computational or experimental studies have addressed the effect of fuzzy appendages on partner recognition by intrinsically disordered proteins. In order to shed light onto this issue, we used the interaction between the intrinsically disordered C-terminal domain of the measles virus (MeV) nucleoprotein (NTAIL) and the X domain (XD) of the viral phosphoprotein as model system. After binding to XD, the N-terminal region of NTAIL remains conspicuously disordered, with α-helical folding taking place only within a short molecular recognition element. To study the effect of the N-terminal fuzzy region on NTAIL/XD binding, we generated N-terminal truncation variants of NTAIL, and assessed their binding abilities towards XD. The results revealed that binding increases with shortening of the N-terminal fuzzy region, with this also being observed with hsp70 (another MeV NTAIL binding partner), and for the homologous NTAIL/XD pairs from the Nipah and Hendra viruses. Finally, similar results were obtained when the MeV NTAIL fuzzy region was replaced with a highly dissimilar artificial disordered sequence, supporting a sequence-independent inhibitory effect of the fuzzy region. © 2015 Federation of European Biochemical Societies.
Automated benchmarking of peptide-MHC class I binding predictions.
Trolle, Thomas; Metushi, Imir G; Greenbaum, Jason A; Kim, Yohan; Sidney, John; Lund, Ole; Sette, Alessandro; Peters, Bjoern; Nielsen, Morten
2015-07-01
Numerous in silico methods predicting peptide binding to major histocompatibility complex (MHC) class I molecules have been developed over the last decades. However, the multitude of available prediction tools makes it non-trivial for the end-user to select which tool to use for a given task. To provide a solid basis on which to compare different prediction tools, we here describe a framework for the automated benchmarking of peptide-MHC class I binding prediction tools. The framework runs weekly benchmarks on data that are newly entered into the Immune Epitope Database (IEDB), giving the public access to frequent, up-to-date performance evaluations of all participating tools. To overcome potential selection bias in the data included in the IEDB, a strategy was implemented that suggests a set of peptides for which different prediction methods give divergent predictions as to their binding capability. Upon experimental binding validation, these peptides entered the benchmark study. The benchmark has run for 15 weeks and includes evaluation of 44 datasets covering 17 MHC alleles and more than 4000 peptide-MHC binding measurements. Inspection of the results allows the end-user to make educated selections between participating tools. Of the four participating servers, NetMHCpan performed the best, followed by ANN, SMM and finally ARB. Up-to-date performance evaluations of each server can be found online at http://tools.iedb.org/auto_bench/mhci/weekly. All prediction tool developers are invited to participate in the benchmark. Sign-up instructions are available at http://tools.iedb.org/auto_bench/mhci/join. Contact: mniel@cbs.dtu.dk or bpeters@liai.org Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Automated benchmarking of peptide-MHC class I binding predictions
Trolle, Thomas; Metushi, Imir G.; Greenbaum, Jason A.; Kim, Yohan; Sidney, John; Lund, Ole; Sette, Alessandro; Peters, Bjoern; Nielsen, Morten
2015-01-01
Motivation: Numerous in silico methods predicting peptide binding to major histocompatibility complex (MHC) class I molecules have been developed over the last decades. However, the multitude of available prediction tools makes it non-trivial for the end-user to select which tool to use for a given task. To provide a solid basis on which to compare different prediction tools, we here describe a framework for the automated benchmarking of peptide-MHC class I binding prediction tools. The framework runs weekly benchmarks on data that are newly entered into the Immune Epitope Database (IEDB), giving the public access to frequent, up-to-date performance evaluations of all participating tools. To overcome potential selection bias in the data included in the IEDB, a strategy was implemented that suggests a set of peptides for which different prediction methods give divergent predictions as to their binding capability. Upon experimental binding validation, these peptides entered the benchmark study. Results: The benchmark has run for 15 weeks and includes evaluation of 44 datasets covering 17 MHC alleles and more than 4000 peptide-MHC binding measurements. Inspection of the results allows the end-user to make educated selections between participating tools. Of the four participating servers, NetMHCpan performed the best, followed by ANN, SMM and finally ARB. Availability and implementation: Up-to-date performance evaluations of each server can be found online at http://tools.iedb.org/auto_bench/mhci/weekly. All prediction tool developers are invited to participate in the benchmark. Sign-up instructions are available at http://tools.iedb.org/auto_bench/mhci/join. Contact: mniel@cbs.dtu.dk or bpeters@liai.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25717196
ERIC Educational Resources Information Center
Hughes, Gethin; Desantis, Andrea; Waszak, Florian
2013-01-01
Sensory processing of action effects has been shown to differ from that of externally triggered stimuli, with respect both to the perceived timing of their occurrence (intentional binding) and to their intensity (sensory attenuation). These phenomena are normally attributed to forward action models, such that when action prediction is consistent…
Cornes, Belinda K; Brody, Jennifer A; Nikpoor, Naghmeh; Morrison, Alanna C; Chu, Huan; Ahn, Byung Soo; Wang, Shuai; Dauriz, Marco; Barzilay, Joshua I; Dupuis, Josée; Florez, Jose C; Coresh, Josef; Gibbs, Richard A; Kao, W H Linda; Liu, Ching-Ti; McKnight, Barbara; Muzny, Donna; Pankow, James S; Reid, Jeffrey G; White, Charles C; Johnson, Andrew D; Wong, Tien Y; Psaty, Bruce M; Boerwinkle, Eric; Rotter, Jerome I; Siscovick, David S; Sladek, Robert; Meigs, James B
2014-06-01
Common variation at the 11p11.2 locus, encompassing MADD, ACP2, NR1H3, MYBPC3, and SPI1, has been associated in genome-wide association studies with fasting glucose and insulin (FI). In the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study, we sequenced 5 gene regions at 11p11.2 to identify rare, potentially functional variants influencing fasting glucose or FI levels. Sequencing (mean depth, 38×) across 16.1 kb in 3566 individuals without diabetes mellitus identified 653 variants, 79.9% of which were rare (minor allele frequency <1%) and novel. We analyzed rare variants in 5 gene regions with FI or fasting glucose using the sequence kernel association test. At NR1H3, 53 rare variants were jointly associated with FI (P=2.73×10^-3); of these, 7 were predicted to have regulatory function and showed association with FI (P=1.28×10^-3). Conditioning on 2 previously associated variants at MADD (rs7944584, rs10838687) did not attenuate this association, suggesting that there are >2 independent signals at 11p11.2. One predicted regulatory variant, chr11:47227430 (hg18; minor allele frequency=0.00068), contributed 20.6% to the overall sequence kernel association test score at NR1H3, lies in intron 2 of NR1H3, and is a predicted binding site for forkhead box A1 (FOXA1), a transcription factor associated with insulin regulation. In human HepG2 hepatoma cells, the rare chr11:47227430 A allele disrupted FOXA1 binding and reduced FOXA1-dependent transcriptional activity. Sequencing at 11p11.2-NR1H3 identified rare variation associated with FI. One variant, chr11:47227430, seems to be functional, with the rare A allele reducing transcription factor FOXA1 binding and FOXA1-dependent transcriptional activity. © 2014 American Heart Association, Inc.
Hara, Hirokazu; Takeda, Tatsuya; Yamamoto, Nozomi; Furuya, Keisuke; Hirose, Kazuya; Kamiya, Tetsuro; Adachi, Tetsuo
2013-07-01
Bim is a member of the pro-apoptotic BH3-only Bcl-2 family of proteins. The Bim gene undergoes alternative splicing to produce three predominant splicing variants (BimEL, BimL and BimS). The smallest variant, BimS, is the most potent inducer of apoptosis. Zinc (Zn2+) has been reported to stimulate apoptosis in various cell types. In this study, we examined whether Zn2+ affects the expression of Bim in human neuroblastoma SH-SY5Y cells. Zn2+ triggered alterations in Bim splicing and induced preferential generation of BimS, but not BimEL and BimL, in a dose- and time-dependent manner. Other metals (cadmium, cobalt and copper) and stresses (oxidative, endoplasmic reticulum and genotoxic stresses) had little or no effect on the expression of BimS. To address the mechanism of Zn2+-induced preferential generation of BimS, which lacks exon 4, we developed a Bim mini-gene construct. Deletion analysis using the Bim mini-gene revealed that predicted binding sites of the SR protein SRSF6, also known as SRp55, are located in the intronic region adjacent to exon 4. We also found that mutations in the predicted SRSF6-binding sites abolished generation of BimS mRNA from the mutated Bim mini-gene. In addition, a UV cross-linking assay followed by Western blotting showed that SRSF6 directly bound to the predicted binding site and Zn2+ suppressed this binding. Moreover, Zn2+ stimulated SRSF6 hyper-phosphorylation. TG003, a cdc2-like kinase inhibitor, partially prevented Zn2+-induced generation of BimS and SRSF6 hyper-phosphorylation. Taken together, our findings suggest that Zn2+ inhibits the activity of SRSF6 and promotes elimination of exon 4, leading to preferential generation of BimS. © 2013 FEBS.
Kulp, John L.; Cloudsdale, Ian S.; Kulp, John L.
2017-01-01
Chemically diverse fragments tend to collectively bind at localized sites on proteins, which is a cornerstone of fragment-based techniques. A central question is how general are these strategies for predicting a wide variety of molecular interactions such as small molecule-protein, protein-protein and protein-nucleic acid for both experimental and computational methods. To address this issue, we recently proposed three governing principles, (1) accurate prediction of fragment-macromolecule binding free energy, (2) accurate prediction of water-macromolecule binding free energy, and (3) locating sites on a macromolecule that have high affinity for a diversity of fragments and low affinity for water. To test the generality of these concepts we used the computational technique of Simulated Annealing of Chemical Potential to design one small fragment to break the RecA-RecA protein-protein interaction and three fragments that inhibit peptide-deformylase via water-mediated multi-body interactions. Experiments confirm the predictions that 6-hydroxydopamine potently inhibits RecA and that PDF inhibition quantitatively tracks the water-mediated binding predictions. Additionally, the principles correctly predict the essential bound waters in HIV Protease, the surprisingly extensive binding site of elastase, the pinpoint location of electron transfer in dihydrofolate reductase, the HIV TAT-TAR protein-RNA interactions, and the MDM2-MDM4 differential binding to p53. The experimental confirmations of highly non-obvious predictions combined with the precise characterization of a broad range of known phenomena lend strong support to the generality of fragment-based methods for characterizing molecular recognition. PMID:28837642
Kulp, John L; Cloudsdale, Ian S; Kulp, John L; Guarnieri, Frank
2017-01-01
Chemically diverse fragments tend to collectively bind at localized sites on proteins, which is a cornerstone of fragment-based techniques. A central question is how general are these strategies for predicting a wide variety of molecular interactions such as small molecule-protein, protein-protein and protein-nucleic acid for both experimental and computational methods. To address this issue, we recently proposed three governing principles, (1) accurate prediction of fragment-macromolecule binding free energy, (2) accurate prediction of water-macromolecule binding free energy, and (3) locating sites on a macromolecule that have high affinity for a diversity of fragments and low affinity for water. To test the generality of these concepts we used the computational technique of Simulated Annealing of Chemical Potential to design one small fragment to break the RecA-RecA protein-protein interaction and three fragments that inhibit peptide-deformylase via water-mediated multi-body interactions. Experiments confirm the predictions that 6-hydroxydopamine potently inhibits RecA and that PDF inhibition quantitatively tracks the water-mediated binding predictions. Additionally, the principles correctly predict the essential bound waters in HIV Protease, the surprisingly extensive binding site of elastase, the pinpoint location of electron transfer in dihydrofolate reductase, the HIV TAT-TAR protein-RNA interactions, and the MDM2-MDM4 differential binding to p53. The experimental confirmations of highly non-obvious predictions combined with the precise characterization of a broad range of known phenomena lend strong support to the generality of fragment-based methods for characterizing molecular recognition.
Simultaneous prediction of binding free energy and specificity for PDZ domain-peptide interactions
NASA Astrophysics Data System (ADS)
Crivelli, Joseph J.; Lemmon, Gordon; Kaufmann, Kristian W.; Meiler, Jens
2013-12-01
Interactions between protein domains and linear peptides underlie many biological processes. Among these interactions, the recognition of C-terminal peptides by PDZ domains is one of the most ubiquitous. In this work, we present a mathematical model for PDZ domain-peptide interactions capable of predicting both affinity and specificity of binding based on X-ray crystal structures and comparative modeling with Rosetta. We developed our mathematical model using a large phage display dataset describing binding specificity for a wild type PDZ domain and 91 single mutants, as well as binding affinity data for a wild type PDZ domain binding to 28 different peptides. Structural refinement was carried out through several Rosetta protocols, the most accurate of which included flexible peptide docking and several iterations of side chain repacking and backbone minimization. Our findings emphasize the importance of backbone flexibility and the energetic contributions of side chain-side chain hydrogen bonds in accurately predicting interactions. We also determined that predicting PDZ domain-peptide interactions became increasingly challenging as the length of the peptide increased in the N-terminal direction. In the training dataset, predicted binding energies correlated with those derived through calorimetry and specificity switches introduced through single mutations at interface positions were recapitulated. In independent tests, our best performing protocol was capable of predicting dissociation constants well within one order of magnitude of the experimental values and specificity profiles at the level of accuracy of previous studies. To our knowledge, this approach represents the first integrated protocol for predicting both affinity and specificity for PDZ domain-peptide interactions.
MSPocket: an orientation-independent algorithm for the detection of ligand binding pockets.
Zhu, Hongbo; Pisabarro, M Teresa
2011-02-01
Identification of ligand binding pockets on proteins is crucial for the characterization of protein functions. It provides valuable information for protein-ligand docking and rational engineering of small molecules that regulate protein functions. Many current algorithms for predicting ligand binding pockets are based on a cubic grid representation of proteins and, thus, the results are often protein orientation dependent. We present the MSPocket program for detecting pockets on the solvent excluded surface of proteins. The core algorithm of the MSPocket approach does not use any cubic grid system to represent proteins and is therefore independent of protein orientations. We demonstrate that MSPocket is able to achieve an accuracy of 75% in predicting ligand binding pockets on a test dataset used for evaluating several existing methods. The accuracy is 92% if the top three predictions are considered. Comparison to one of the recently published best performing methods shows that MSPocket reaches similar performance with the additional feature of being protein orientation independent. Interestingly, some of the predictions are different, meaning that the two methods can be considered complementary and combined to achieve better prediction accuracy. MSPocket also provides a graphical user interface for interactive investigation of the predicted ligand binding pockets. In addition, we show that the overlap criterion is a better strategy for the evaluation of predicted ligand binding pockets than the single point distance criterion. The MSPocket source code can be downloaded from http://appserver.biotec.tu-dresden.de/MSPocket/. MSPocket is also available as a PyMOL plugin with a graphical user interface.
Location of the synaptosome-binding regions on botulinum neurotoxin B.
Dolimbek, Behzod Z; Steward, Lance E; Aoki, K Roger; Atassi, M Zouhair
2012-01-10
The regions of botulinum neurotoxin B (BoNT/B) involved in binding to mouse brain synaptosomes (snps) were localized. Sixty 19-residue overlapping peptides (peptide C31 consisted of 24 residues) encompassing BoNT/B H chain (residues 442-1291) were synthesized and used to inhibit binding of (125)I-labeled BoNT/B to snps. Synaptosome-binding regions were noncompeting and existed on both H(N) and H(C) domains of neurotoxin. At 37 °C, inhibitory activities on H(N) resided, in decreasing order, in peptides 638-656 (26.7%), 596-614 (18.2%), 512-530 (13.9%), 778-796 (13.8%), and 526-544 (11.6%). On H(C), activity resided in decreasing order in peptides 1170-1188 (44.6%), 1128-1146 (21.6%), 1184-1202 (18.6%), 1156-1174 (13.0%), 946-964 (11.8%), 1114-1132 (11.2%), 1100-1118 (6.2%), 876-894 (6.1%), 1268-1291 (4.6%), and 1226-1244 (4.3%). The 45 remaining H(N) and H(C) peptides had no activity. At 4 °C, peptide C24 (1170-1188) remained quite active (inhibiting, 31.2%), while activities of peptides N15, C21, and C25 were little under 10%. The snp-binding regions contained sites that bind synaptotagmin II and gangliosides. Despite the low degree of sequence homology, BoNT/B and BoNT/A display significant structural homology and appeared to bind in part to the same snp-binding regions. Binding of each labeled toxin to snps was inhibited ~50% by the other toxin, 70-72% by its correlate H(C), and by the H(C) of the other toxin [29% (BoNT/A by H(C) of B) or 32% (BoNT/B by H(C) of A)]. In the three-dimensional structure of BoNT/B, the greater part of H(C), one H(N) face, and part of the belt on the same side interact with snps. Thus, BoNT/B binds to snps through the H(C) head and employs regions on one H(N) face and the belt, reserving flexibility for the belt's unbound part to release the light chain. Most snp-binding regions coincide or overlap with blocking antibody (Ab)-binding regions explaining how such Abs prevent BoNT/B toxicity.
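The peptide layout described above can be reconstructed with a short sketch. The 14-residue step between peptide starts is not stated in the abstract; it is inferred here from the quoted ranges (for example 512-530 followed by 526-544), so it should be read as an assumption, as should the function and parameter names.

```python
def overlapping_peptides(first=442, last=1291, length=19, step=14, count=60):
    """Return (start, end) residue ranges for a tiled peptide set.

    The step of 14 (a 5-residue overlap) is an inferred assumption.
    The final peptide is extended to `last`, matching the 24-residue
    peptide mentioned in the abstract.
    """
    peptides = []
    for i in range(count):
        start = first + i * step
        peptides.append((start, start + length - 1))
    # Extend the last peptide so the set ends at the final residue.
    final_start, _ = peptides[-1]
    peptides[-1] = (final_start, last)
    return peptides


peptides = overlapping_peptides()
print(len(peptides), peptides[0], peptides[-1])
```

With these assumed parameters the tiling reproduces the ranges quoted in the abstract, including 638-656, 1170-1188 and the longer final peptide 1268-1291.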
Genome-scale prediction of proteins with long intrinsically disordered regions.
Peng, Zhenling; Mizianty, Marcin J; Kurgan, Lukasz
2014-01-01
Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super-fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition, as its inputs. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster compared to the best currently available disorder predictor. Utilizing our time-efficient predictor, we characterized abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs with majority of proteomes having between 25 and 40%, where higher abundance is characteristic to proteomes that have larger proteins. Our first-of-its-kind large-scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/. Copyright © 2013 Wiley Periodicals, Inc.
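The definition of a long disordered region used above (30 or more consecutive disordered residues) can be made concrete with a small sketch. Note that SLIDER itself predicts at the protein level with logistic regression; the code below only illustrates the definition, applied to a hypothetical per-residue annotation encoded as a string of '1' (disordered) and '0' (ordered). The encoding and the function names are assumptions.

```python
import re


def has_long_disordered_region(disorder_mask, min_length=30):
    """True if the mask contains a run of at least `min_length`
    consecutive '1' characters (disordered residues)."""
    return re.search("1{%d,}" % min_length, disorder_mask) is not None


def fraction_with_ldr(masks, min_length=30):
    """Fraction of proteins in a collection that contain a long
    disordered region; returns 0.0 for an empty collection."""
    if not masks:
        return 0.0
    hits = sum(has_long_disordered_region(m, min_length) for m in masks)
    return hits / len(masks)


# Toy proteome: only the first entry has a run of 30 disordered residues.
toy_proteome = ["0" * 10 + "1" * 30 + "0" * 5,
                "1" * 29 + "0" + "1" * 29,
                "0" * 50,
                "1" * 12]
print(fraction_with_ldr(toy_proteome))
```

The second toy entry shows why the run must be consecutive: 58 disordered residues split by one ordered residue do not qualify.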
Molecular basis of endosomal-membrane association for the dengue virus envelope protein
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rogers, David M.; Kent, Michael S.; Rempe, Susan B.
Dengue virus is coated by an icosahedral shell of 90 envelope protein dimers that convert to trimers at low pH and promote fusion of its membrane with the membrane of the host endosome. We provide the first estimates for the free energy barrier and minimum for two key steps in this process: host membrane bending and protein–membrane binding. Both are studied using complementary membrane elastic, continuum electrostatics and all-atom molecular dynamics simulations. The predicted host membrane bending required to form an initial fusion stalk presents a 22–30 kcal/mol free energy barrier according to a constrained membrane elastic model. Combined continuum and molecular dynamics results predict a 15 kcal/mol free energy decrease on binding of each trimer of dengue envelope protein to a membrane with 30% anionic phosphatidylglycerol lipid. The bending cost depends on the preferred curvature of the lipids composing the host membrane leaflets, while the free energy gained for protein binding depends on the surface charge density of the host membrane. The fusion loop of the envelope protein inserts exactly at the level of the interface between the membrane's hydrophobic and head-group regions. As a result, the methods used in this work provide a means for further characterization of the structures and free energies of protein-assisted membrane fusion.
Molecular basis of endosomal-membrane association for the dengue virus envelope protein
Rogers, David M.; Kent, Michael S.; Rempe, Susan B.
2015-01-02
Dengue virus is coated by an icosahedral shell of 90 envelope protein dimers that convert to trimers at low pH and promote fusion of its membrane with the membrane of the host endosome. We provide the first estimates for the free energy barrier and minimum for two key steps in this process: host membrane bending and protein–membrane binding. Both are studied using complementary membrane elastic, continuum electrostatics and all-atom molecular dynamics simulations. The predicted host membrane bending required to form an initial fusion stalk presents a 22–30 kcal/mol free energy barrier according to a constrained membrane elastic model. Combined continuum and molecular dynamics results predict a 15 kcal/mol free energy decrease on binding of each trimer of dengue envelope protein to a membrane with 30% anionic phosphatidylglycerol lipid. The bending cost depends on the preferred curvature of the lipids composing the host membrane leaflets, while the free energy gained for protein binding depends on the surface charge density of the host membrane. The fusion loop of the envelope protein inserts exactly at the level of the interface between the membrane's hydrophobic and head-group regions. As a result, the methods used in this work provide a means for further characterization of the structures and free energies of protein-assisted membrane fusion.
MIRNA-DISTILLER: A Stand-Alone Application to Compile microRNA Data from Databases.
Rieger, Jessica K; Bodan, Denis A; Zanger, Ulrich M
2011-01-01
MicroRNAs (miRNA) are small non-coding RNA molecules of ∼22 nucleotides which regulate large numbers of genes by binding to seed sequences at the 3'-untranslated region of target gene transcripts. The target mRNA is then usually degraded or translation is inhibited, thus resulting in posttranscriptional down regulation of gene expression at the mRNA and/or protein level. Due to the bioinformatic difficulties in predicting functional miRNA binding sites, several publicly available databases have been developed that predict miRNA binding sites based on different algorithms. The parallel use of different databases is currently indispensable, but highly inconvenient and time consuming, especially when working with numerous genes of interest. We have therefore developed a new stand-alone program, termed MIRNA-DISTILLER, which allows the compilation of miRNA data for given target genes from public databases. Currently implemented are TargetScan, microCosm, and miRDB, which may be queried independently, pairwise, or together to calculate the respective intersections. Data are stored locally for application of further analysis tools including freely definable biological parameter filters, customized output-lists for both miRNAs and target genes, and various graphical facilities. The software, a data example file and a tutorial are freely available at http://www.ikp-stuttgart.de/content/language1/html/10415.asp.
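The idea of querying several prediction databases independently, pairwise, or together and taking the respective intersections can be sketched with plain set operations. This is not MIRNA-DISTILLER's code or interface; the function name, the dictionary layout, and the placeholder miRNA identifiers are all hypothetical.

```python
from itertools import combinations


def database_intersections(predictions):
    """Map every non-empty combination of database names to the set of
    miRNAs predicted by all databases in that combination.

    `predictions` maps a database name to the set of miRNA identifiers
    it predicts for one target gene.
    """
    names = sorted(predictions)
    result = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            result[combo] = set.intersection(*(predictions[n] for n in combo))
    return result


# Placeholder identifiers, not real predictions for any gene.
example = {
    "TargetScan": {"miR-A", "miR-B", "miR-C"},
    "microCosm": {"miR-B", "miR-C", "miR-D"},
    "miRDB": {"miR-C", "miR-E"},
}
hits = database_intersections(example)
print(hits[("TargetScan", "miRDB", "microCosm")])
```

Keys are tuples of database names in sorted order, so the three-way intersection is stored under `("TargetScan", "miRDB", "microCosm")`.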
MIRNA-DISTILLER: A Stand-Alone Application to Compile microRNA Data from Databases
Rieger, Jessica K.; Bodan, Denis A.; Zanger, Ulrich M.
2011-01-01
MicroRNAs (miRNA) are small non-coding RNA molecules of ∼22 nucleotides which regulate large numbers of genes by binding to seed sequences at the 3′-untranslated region of target gene transcripts. The target mRNA is then usually degraded or translation is inhibited, thus resulting in posttranscriptional down regulation of gene expression at the mRNA and/or protein level. Due to the bioinformatic difficulties in predicting functional miRNA binding sites, several publicly available databases have been developed that predict miRNA binding sites based on different algorithms. The parallel use of different databases is currently indispensable, but highly inconvenient and time consuming, especially when working with numerous genes of interest. We have therefore developed a new stand-alone program, termed MIRNA-DISTILLER, which allows the compilation of miRNA data for given target genes from public databases. Currently implemented are TargetScan, microCosm, and miRDB, which may be queried independently, pairwise, or together to calculate the respective intersections. Data are stored locally for application of further analysis tools including freely definable biological parameter filters, customized output-lists for both miRNAs and target genes, and various graphical facilities. The software, a data example file and a tutorial are freely available at http://www.ikp-stuttgart.de/content/language1/html/10415.asp PMID:22303335
Neutralization of Plasmodium falciparum merozoites by antibodies against PfRH5
Douglas, Alexander D.; Williams, Andrew R.; Knuepfer, Ellen; Illingworth, Joseph J.; Furze, Julie M.; Crosnier, Cécile; Choudhary, Prateek; Bustamante, Leyla Y.; Zakutansky, Sara E.; Awuah, Dennis K.; Alanine, Daniel G. W.; Theron, Michel; Worth, Andrew; Shimkets, Richard; Rayner, Julian C.; Holder, Anthony A.; Wright, Gavin J.; Draper, Simon J.
2013-01-01
There is intense interest in induction and characterization of strain-transcending neutralizing antibody against antigenically variable human pathogens. We have recently identified the human malaria parasite Plasmodium falciparum reticulocyte-binding protein homologue 5 (PfRH5) as a target of broadly-neutralizing antibodies, but there is little information regarding the functional mechanism(s) of antibody-mediated neutralization. Here, we report that vaccine-induced polyclonal anti-PfRH5 antibodies inhibit the tight attachment of merozoites to erythrocytes, and are capable of blocking the interaction of PfRH5 with its receptor basigin. Furthermore, by developing anti-PfRH5 monoclonal antibodies (mAbs), we provide evidence that i) the ability to block the PfRH5-basigin interaction in vitro is predictive of functional activity, but absence of blockade does not predict absence of functional activity; ii) neutralizing mAbs bind spatially-related epitopes on the folded protein, involving at least two defined regions of the PfRH5 primary sequence; iii) a brief exposure window of PfRH5 is likely to necessitate rapid binding of antibody to neutralize parasites; and iv) intact bivalent IgG contributes to but is not necessary for parasite neutralization. These data provide important insight into the mechanisms of broadly-neutralizing anti-malaria antibodies and further encourage anti-PfRH5 based malaria prevention efforts. PMID:24293631
Predicting nucleic acid binding interfaces from structural models of proteins.
Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael
2012-02-01
The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acids binding interface on structural models of proteins. Employing a combined patch approach we show that patches extracted from an ensemble of models better predicts the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Copyright © 2011 Wiley Periodicals, Inc.
DNA Methylation of Cellular Retinoic Acid-Binding Proteins in Cervical Cancer.
Arellano-Ortiz, Ana L; Salcedo-Vargas, Mauricio; Vargas-Requena, Claudia L; López-Díaz, José A; De la Mora-Covarrubias, Antonio; Silva-Espinoza, Juan C; Jiménez-Vega, Florinda
2016-01-01
This study determined the methylation status of cellular retinoic acid-binding protein (CRABP) gene promoters and associated them with demographic characteristics, habits, and the presence of human papilloma virus (HPV) in patients with cervical cancer (CC), low and high squamous intraepithelial lesions, and no intraepithelial lesion. Women (n = 158) were selected from the Colposcopy Clinic of Sanitary Jurisdiction II in Ciudad Juarez, Chihuahua, Mexico. Demographic characteristics and habit information were collected. Cervical biopsy and endocervical scraping were used to determine methylation in promoter regions by methylation-specific polymerase chain reaction technique. We found hemi-methylation patterns in the promoter regions of CRABP1 and CRABP2; there was 28.5% hemi-methylation in CRABP1 and 7.0% in that of CRABP2. Methylation in CRABP1 was associated with age (≥35 years, P = 0.002), family history of cancer (P = 0.032), the presence of HPV-16 (P = 0.013), and no alcohol intake (P = 0.035). These epigenetic changes could be involved in the CC process, and CRABP1 has the potential to be a predictive molecular marker of retinoid therapy response.
DNA Methylation of Cellular Retinoic Acid-Binding Proteins in Cervical Cancer
Arellano-Ortiz, Ana L.; Salcedo-Vargas, Mauricio; Vargas-Requena, Claudia L.; López-Díaz, José A.; De la Mora-Covarrubias, Antonio; Silva-Espinoza, Juan C.; Jiménez-Vega, Florinda
2016-01-01
This study determined the methylation status of cellular retinoic acid-binding protein (CRABP) gene promoters and associated them with demographic characteristics, habits, and the presence of human papilloma virus (HPV) in patients with cervical cancer (CC), low and high squamous intraepithelial lesions, and no intraepithelial lesion. Women (n = 158) were selected from the Colposcopy Clinic of Sanitary Jurisdiction II in Ciudad Juarez, Chihuahua, Mexico. Demographic characteristics and habit information were collected. Cervical biopsy and endocervical scraping were used to determine methylation in promoter regions by methylation-specific polymerase chain reaction technique. We found hemi-methylation patterns in the promoter regions of CRABP1 and CRABP2; there was 28.5% hemi-methylation in CRABP1 and 7.0% in that of CRABP2. Methylation in CRABP1 was associated with age (≥35 years, P = 0.002), family history of cancer (P = 0.032), the presence of HPV-16 (P = 0.013), and no alcohol intake (P = 0.035). These epigenetic changes could be involved in the CC process, and CRABP1 has the potential to be a predictive molecular marker of retinoid therapy response. PMID:27867303
Rungnim, Chompoonut; Rungrotmongkol, Thanyada; Kungwan, Nawee; Hannongbua, Supot
2016-09-01
Epidermal growth factor (EGF) was used as the targeting ligand to enhance the specificity of a cancer drug delivery system (DDS) via its specific interaction with the EGF receptor (EGFR) that is overexpressed on the surface of some cancer cells. To investigate the intermolecular interaction and binding affinity between the EGF-conjugated DDS and the EGFR, 50 ns molecular dynamics simulations were performed on the complex of tethered EGFR and EGF linked to single-wall carbon nanotube (SWCNT) through a biopolymer chitosan wrapping the tube outer surface (EGFR·EGF-CS-SWCNT-Drug complex), and compared to the EGFR·EGF complex and free EGFR. The binding pattern of the EGF-CS-SWCNT-Drug complex to the EGFR was broadly comparable to that for EGF, but the binding affinity of the EGF-CS-SWCNT-Drug complex was predicted to be somewhat better than that for EGF alone. Additionally, the chitosan chain could prevent undesired interactions of SWCNT at the binding pocket region. Therefore, EGF connected to SWCNT via a chitosan linker is a seemingly good formulation for developing a smart DDS served as part of an alternative cancer therapy.
mRNA stability in mammalian cells.
Ross, J
1995-01-01
This review concerns how cytoplasmic mRNA half-lives are regulated and how mRNA decay rates influence gene expression. mRNA stability influences gene expression in virtually all organisms, from bacteria to mammals, and the abundance of a particular mRNA can fluctuate manyfold following a change in the mRNA half-life, without any change in transcription. The processes that regulate mRNA half-lives can, in turn, affect how cells grow, differentiate, and respond to their environment. Three major questions are addressed. Which sequences in mRNAs determine their half-lives? Which enzymes degrade mRNAs? Which (trans-acting) factors regulate mRNA stability, and how do they function? The following specific topics are discussed: techniques for measuring eukaryotic mRNA stability and for calculating decay constants, mRNA decay pathways, mRNases, proteins that bind to sequences shared among many mRNAs [like poly(A)- and AU-rich-binding proteins] and proteins that bind to specific mRNAs (like the c-myc coding-region determinant-binding protein), how environmental factors like hormones and growth factors affect mRNA stability, and how translation and mRNA stability are linked. Some perspectives and predictions for future research directions are summarized at the end. PMID:7565413
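The relation between an mRNA half-life, its decay constant, and its abundance can be sketched as follows. The sketch assumes simple first-order decay and a constant synthesis rate, which is a common textbook simplification rather than a model taken from the review; the function names are introduced here for illustration.

```python
import math


def decay_constant(half_life):
    """First-order decay constant k = ln(2) / half-life."""
    return math.log(2) / half_life


def fraction_remaining(elapsed, half_life):
    """Fraction of an mRNA pool left after `elapsed` time units once
    synthesis has stopped, assuming first-order decay."""
    return math.exp(-decay_constant(half_life) * elapsed)


def steady_state_fold_change(old_half_life, new_half_life):
    """Fold change in steady-state abundance when only the half-life
    changes. With constant synthesis rate s, abundance = s / k, so it
    scales linearly with the half-life."""
    return decay_constant(old_half_life) / decay_constant(new_half_life)


print(fraction_remaining(2.0, 2.0))
print(steady_state_fold_change(0.5, 2.0))
```

Under these assumptions, lengthening a half-life from 0.5 h to 2 h raises the steady-state level fourfold with no change in transcription, which is the kind of fluctuation the review describes.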
Epigenetic functions enriched in transcription factors binding to mouse recombination hotspots.
Wu, Min; Kwoh, Chee-Keong; Przytycka, Teresa M; Li, Jing; Zheng, Jie
2012-06-21
The regulatory mechanism of recombination is a fundamental problem in genomics, with wide applications in genome-wide association studies, birth-defect diseases, molecular evolution, cancer research, etc. In mammalian genomes, recombination events cluster into short genomic regions called "recombination hotspots". Recently, a 13-mer motif enriched in hotspots is identified as a candidate cis-regulatory element of human recombination hotspots; moreover, a zinc finger protein, PRDM9, binds to this motif and is associated with variation of recombination phenotype in human and mouse genomes, thus is a trans-acting regulator of recombination hotspots. However, this pair of cis and trans-regulators covers only a fraction of hotspots, thus other regulators of recombination hotspots remain to be discovered. In this paper, we propose an approach to predicting additional trans-regulators from DNA-binding proteins by comparing their enrichment of binding sites in hotspots. Applying this approach on newly mapped mouse hotspots genome-wide, we confirmed that PRDM9 is a major trans-regulator of hotspots. In addition, a list of top candidate trans-regulators of mouse hotspots is reported. Using GO analysis we observed that the top genes are enriched with function of histone modification, highlighting the epigenetic regulatory mechanisms of recombination hotspots.
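One common way to quantify enrichment of binding sites in hotspots is a one-sided hypergeometric test, sketched below. The abstract does not say which statistic the authors used, so this is an assumed stand-in rather than their method; the function name, parameter names, and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_regions, hotspot_regions,
                           bound_regions, bound_hotspots):
    """One-sided p-value P(X >= bound_hotspots), where X is the number
    of hotspots among the regions bound by a protein, if hotspots were
    distributed at random among all regions."""
    upper = min(hotspot_regions, bound_regions)
    denominator = comb(total_regions, hotspot_regions)
    tail = sum(
        comb(bound_regions, k)
        * comb(total_regions - bound_regions, hotspot_regions - k)
        for k in range(bound_hotspots, upper + 1)
    )
    return tail / denominator


# Toy counts: 10 regions, 5 hotspots, 5 bound regions, all 5 in hotspots.
p_value = hypergeom_enrichment_p(10, 5, 5, 5)
print(p_value)
```

Ranking DNA-binding proteins by such a p-value, after correction for multiple testing, would be one way to produce a list of candidate trans-regulators.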
Epigenetic functions enriched in transcription factors binding to mouse recombination hotspots
2012-01-01
The regulatory mechanism of recombination is a fundamental problem in genomics, with wide applications in genome-wide association studies, birth-defect diseases, molecular evolution, cancer research, etc. In mammalian genomes, recombination events cluster into short genomic regions called "recombination hotspots". Recently, a 13-mer motif enriched in hotspots is identified as a candidate cis-regulatory element of human recombination hotspots; moreover, a zinc finger protein, PRDM9, binds to this motif and is associated with variation of recombination phenotype in human and mouse genomes, thus is a trans-acting regulator of recombination hotspots. However, this pair of cis and trans-regulators covers only a fraction of hotspots, thus other regulators of recombination hotspots remain to be discovered. In this paper, we propose an approach to predicting additional trans-regulators from DNA-binding proteins by comparing their enrichment of binding sites in hotspots. Applying this approach on newly mapped mouse hotspots genome-wide, we confirmed that PRDM9 is a major trans-regulator of hotspots. In addition, a list of top candidate trans-regulators of mouse hotspots is reported. Using GO analysis we observed that the top genes are enriched with function of histone modification, highlighting the epigenetic regulatory mechanisms of recombination hotspots. PMID:22759569
Abundance of intrinsic structural disorder in the histone H1 subtypes.
Kowalski, Andrzej
2015-12-01
The intrinsically disordered proteins consist of partially structured regions linked to the unstructured stretches, which consequently form the transient and dynamic conformational ensembles. They undergo disorder to order transition upon binding their partners. Intrinsic disorder is attributed to histones H1, perceived as assemblers of chromatin structure and the regulators of DNA and proteins activity. In this work, the comparison of intrinsic disorder abundance in the histone H1 subtypes was performed both by the analysis of their amino acid composition and by the prediction of disordered stretches, as well as by identifying molecular recognition features (MoRFs) and ANCHOR protein binding regions (APBR) that are responsible for recognition and binding. Both human and model organisms (animals, plants, fungi and protists) have H1 histone subtypes with the properties typical of disordered state. They possess a significantly higher content of hydrophilic and charged amino acid residues, arranged in the long regions, covering over half of the whole amino acid residues in chain. Almost complete disorder corresponds to histone H1 terminal domains, including MoRFs and ANCHOR. Those motifs were also identified in a more ordered histone H1 globular domain. Compared to the control (globular and fibrous) proteins, H1 histones demonstrate the increased folding rate and a higher proportion of low-complexity segments. The results of this work indicate that intrinsic disorder is an inherent structural property of histone H1 subtypes and it is essential for establishing a protein conformation which defines functional outcomes affecting DNA- and/or partner protein-dependent cell processes. Copyright © 2015 Elsevier Ltd. All rights reserved.
Structure of the Intermediate Filament-Binding Region of Desmoplakin
Kang, Hyunook; Weiss, Thomas M.; Bang, Injin; ...
2016-01-25
Here, desmoplakin (DP) is a cytoskeletal linker protein that connects the desmosomal cadherin/plakoglobin/plakophilin complex to intermediate filaments (IFs). The C-terminal region of DP (DPCT) mediates IF binding, and contains three plakin repeat domains (PRDs), termed PRD-A, PRD-B and PRD-C. Previous crystal structures of PRDs B and C revealed that each is formed by 4.5 copies of a plakin repeat (PR) and has a conserved positively charged groove on its surface. Although PRDs A and B are linked by just four amino acids, B and C are separated by a 154 residue flexible linker, which has hindered crystallographic analysis of the full DPCT. Here we present the crystal structure of a DPCT fragment spanning PRDs A and B, and elucidate the overall architecture of DPCT by small angle X-ray scattering (SAXS) analysis. The structure of PRD-A is similar to that of PRD-B, and the two domains are arranged in a quasi-linear arrangement, and separated by a 4 amino acid linker. Analysis of the B-C linker region using secondary structure prediction and the crystal structure of a homologous linker from the cytolinker periplakin suggests that the N-terminal ~100 amino acids of the linker form two PR-like motifs. SAXS analysis of DPCT indicates an elongated but non-linear shape with Rg = 51.5 Å and Dmax = 178 Å. These data provide the first structural insights into an IF binding protein containing multiple PRDs and provide a foundation for studying the molecular basis of DP-IF interactions.
Kumar, Sandeep; Mitchell, Mark A; Rup, Bonita; Singh, Satish K
2012-08-01
Aggregation and unwanted immunogenicity are hurdles to avoid in successful commercial development of antibody-based therapeutics. In this article, the relationship between aggregation-prone regions (APRs), capable of forming cross-β motifs/amyloid fibrils, and major histocompatibility complex class II-restricted human leukocyte antigen (HLA)-DR-binding T-cell immune epitopes (TcIEs) is analyzed using amino acid sequences of 25 therapeutic antibodies, 55 TcIEs recognized by T-regulatory cells (tregitopes), 1000 randomly generated 15-residue-long peptides, 2257 human self-TcIEs (autoantigens), and 11 peptides in HLA-peptide cocrystal structures. Sequence analyses from these diverse sources consistently show a high level of correlation between APRs and TcIEs: approximately one-third of TcIEs contain APRs, but the majority of APRs occur within TcIE regions (TcIERs). Tregitopes also contain APRs. Most APR-containing TcIERs can bind multiple HLA-DR alleles, suggesting that aggregation-driven adverse immune responses could impact a broad segment of patient population. This article has identified common molecular sequence-structure loci that potentially contribute toward both manufacturability and safety profiles of the therapeutic antibodies, thereby laying a foundation for simultaneous optimization of these attributes in novel and follow-on candidates. Incidence of APRs within TcIERs is not special to biotherapeutics, self-TcIEs from human proteins, involved in various diseases, also contain predicted APRs and experimentally proven amyloid-fibril-forming peptide sequence portions. Copyright © 2012 Wiley Periodicals, Inc.
Miller, Clint L; Haas, Ulrike; Diaz, Roxanne; Leeper, Nicholas J; Kundu, Ramendra K; Patlolla, Bhagat; Assimes, Themistocles L; Kaiser, Frank J; Perisic, Ljubica; Hedin, Ulf; Maegdefessel, Lars; Schunkert, Heribert; Erdmann, Jeanette; Quertermous, Thomas; Sczakiel, Georg
2014-03-01
Genome-wide association studies (GWAS) have identified chromosomal loci that affect risk of coronary heart disease (CHD) independent of classical risk factors. One such association signal has been identified at 6q23.2 in both Caucasians and East Asians. The lead CHD-associated polymorphism in this region, rs12190287, resides in the 3' untranslated region (3'-UTR) of TCF21, a basic-helix-loop-helix transcription factor, and is predicted to alter the seed binding sequence for miR-224. Allelic imbalance studies in circulating leukocytes and human coronary artery smooth muscle cells (HCASMC) showed significant imbalance of the TCF21 transcript that correlated with genotype at rs12190287, consistent with this variant contributing to allele-specific expression differences. 3' UTR reporter gene transfection studies in HCASMC showed that the disease-associated C allele has reduced expression compared to the protective G allele. Kinetic analyses in vitro revealed faster RNA-RNA complex formation and greater binding of miR-224 with the TCF21 C allelic transcript. In addition, in vitro probing with Pb2+ and RNase T1 revealed structural differences between the TCF21 variants in proximity of the rs12190287 variant, which are predicted to provide greater access to the C allele for miR-224 binding. miR-224 and TCF21 expression levels were anti-correlated in HCASMC, and miR-224 modulates the transcriptional response of TCF21 to transforming growth factor-β (TGF-β) and platelet derived growth factor (PDGF) signaling in an allele-specific manner. Lastly, miR-224 and TCF21 were localized in human coronary artery lesions and anti-correlated during atherosclerosis. Together, these data suggest that miR-224 interaction with the TCF21 transcript contributes to allelic imbalance of this gene, thus partly explaining the genetic risk for coronary heart disease associated at 6q23.2. 
These studies implicating rs12190287 in the miRNA-dependent regulation of TCF21, in conjunction with previous studies showing that this variant modulates transcriptional regulation through activator protein 1 (AP-1), suggest a unique bimodal level of complexity previously unreported for disease-associated variants.
Characterization of cDNAs and genomic DNAs for human threonyl- and cysteinyl-tRNA synthetases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cruzen, M.E.
1993-01-01
Techniques of molecular biology were used to clone, sequence and map two human aminoacyl-tRNA synthetase (aaRS) cDNAs: threonyl-tRNA synthetase (ThrRS), a class II enzyme, and cysteinyl-tRNA synthetase (CysRS), a class I enzyme. The predicted protein sequence of human ThrRS is highly homologous to that of lower eukaryotic and prokaryotic ThrRSs, particularly in the regions containing the three structural motifs common to all class II synthetases. Signature regions 1 and 2, which characterize the class IIa subgroup (SerRS, ThrRS and HisRS), are highly conserved from bacteria to human. Structural predictions for human ThrRS based on the known structure of the closely related SerRS from E. coli implicate strongly conserved residues in the signature sequences as important in substrate binding. The amino terminal 100 residues of the deduced amino acid sequence of ThrRS share structural similarity with SerRS, consistent with forming an antiparallel helix implicated in tRNA binding. The 5' untranslated sequence of the human ThrRS gene shares short stretches of common sequence with the gene for hamster HisRS, including a binding site for the promoter specific transcription factor sp-1. The deduced amino acid sequence of human CysRS has a high degree of sequence identity to E. coli CysRS. Human CysRS possesses the classic characteristics of a class I synthetase and is most closely related to the MetRS subgroup. The amino terminal half of human CysRS can be modeled as a nucleotide binding fold and shares significant sequence and structural similarity to the other enzymes in this subgroup. The CysRS structural gene (CARS) was mapped to human chromosome 11p15.5 by fluorescent in situ hybridization. CARS is the first aaRS gene to be mapped to chromosome 11. Steady-state levels of both CysRS and ThrRS mRNA were quantitated in several human tissues. Message levels for these enzymes appear to be subject to differential regulation in different cell types.
The identification of cis-regulatory elements: A review from a machine learning perspective.
Li, Yifeng; Chen, Chih-Yu; Kaye, Alice M; Wasserman, Wyeth W
2015-12-01
The majority of the human genome consists of non-coding regions that have been called junk DNA. However, recent studies have unveiled that these regions contain cis-regulatory elements, such as promoters, enhancers, silencers, insulators, etc. These regulatory elements can play crucial roles in controlling gene expressions in specific cell types, conditions, and developmental stages. Disruption to these regions could contribute to phenotype changes. Precisely identifying regulatory elements is key to deciphering the mechanisms underlying transcriptional regulation. Cis-regulatory events are complex processes that involve chromatin accessibility, transcription factor binding, DNA methylation, histone modifications, and the interactions between them. The development of next-generation sequencing techniques has allowed us to capture these genomic features in depth. Applied analysis of genome sequences for clinical genetics has increased the urgency for detecting these regions. However, the complexity of cis-regulatory events and the deluge of sequencing data require accurate and efficient computational approaches, in particular, machine learning techniques. In this review, we describe machine learning approaches for predicting transcription factor binding sites, enhancers, and promoters, primarily driven by next-generation sequencing data. Data sources are provided in order to facilitate testing of novel methods. The purpose of this review is to attract computational experts and data scientists to advance this field. Crown Copyright © 2015. Published by Elsevier Ireland Ltd. All rights reserved.
Necci, Marco; Piovesan, Damiano; Tosatto, Silvio C E
2016-12-01
Intrinsic disorder (ID) in proteins has been extensively described for the last decade; a large-scale classification of ID in proteins is mostly missing. Here, we provide an extensive analysis of ID in the protein universe on the UniProt database derived from sequence-based predictions in MobiDB. Almost half the sequences contain an ID region of at least five residues. About 9% of proteins have a long ID region of over 20 residues which are more abundant in Eukaryotic organisms and most frequently cover less than 20% of the sequence. A small subset of about 67,000 (out of over 80 million) proteins is fully disordered and mostly found in Viruses. Most proteins have only one ID, with short ID evenly distributed along the sequence and long ID overrepresented in the center. The charged residue composition of Das and Pappu was used to classify ID proteins by structural propensities and corresponding functional enrichment. Swollen Coils seem to be used mainly as structural components and in biosynthesis in both Prokaryotes and Eukaryotes. In Bacteria, they are confined in the nucleoid and in Viruses provide DNA binding function. Coils & Hairpins seem to be specialized in ribosome binding and methylation activities. Globules & Tadpoles bind antigens in Eukaryotes but are involved in killing other organisms and cytolysis in Bacteria. The Undefined class is used by Bacteria to bind toxic substances and mediate transport and movement between and within organisms in Viruses. Fully disordered proteins behave similarly, but are enriched for glycine residues and extracellular structures. © 2016 The Protein Society.
Necci, Marco; Piovesan, Damiano
2016-01-01
Abstract Intrinsic disorder (ID) in proteins has been extensively described for the last decade; a large‐scale classification of ID in proteins is mostly missing. Here, we provide an extensive analysis of ID in the protein universe on the UniProt database derived from sequence‐based predictions in MobiDB. Almost half the sequences contain an ID region of at least five residues. About 9% of proteins have a long ID region of over 20 residues which are more abundant in Eukaryotic organisms and most frequently cover less than 20% of the sequence. A small subset of about 67,000 (out of over 80 million) proteins is fully disordered and mostly found in Viruses. Most proteins have only one ID, with short ID evenly distributed along the sequence and long ID overrepresented in the center. The charged residue composition of Das and Pappu was used to classify ID proteins by structural propensities and corresponding functional enrichment. Swollen Coils seem to be used mainly as structural components and in biosynthesis in both Prokaryotes and Eukaryotes. In Bacteria, they are confined in the nucleoid and in Viruses provide DNA binding function. Coils & Hairpins seem to be specialized in ribosome binding and methylation activities. Globules & Tadpoles bind antigens in Eukaryotes but are involved in killing other organisms and cytolysis in Bacteria. The Undefined class is used by Bacteria to bind toxic substances and mediate transport and movement between and within organisms in Viruses. Fully disordered proteins behave similarly, but are enriched for glycine residues and extracellular structures. PMID:27636733
Shiang, Rita
2008-01-01
Treacher Collins syndrome is an autosomal-dominant mandibulofacial dysostosis caused by haploinsufficiency of the TCOF1 gene product treacle. Mouse Tcof1 protein is approximately 61% identical and 71% similar to treacle, and heterozygous knockout of Tcof1 causes craniofacial malformation. Tcof1 expression is high in developing neural crest, but much lower in other tissues. To investigate this dual regulation, highly conserved regions upstream of TCOF1 homologs were tested through deletion and mutation reporter assays, and conserved predicted transcription factor binding sites were assessed through chromatin binding studies. Assays were performed in mouse P19 embryonic carcinoma cells and in HEK293 cells to determine differential activation in cell types at different stages of differentiation. Binding of Cebpb, Zfp161, and Sp1 transcription factors was specific to the Tcof1 regulatory region in P19 cells. The Zfp161 binding site demonstrated P19 cell–specific repression, while the Sp1/Sp3 candidate site demonstrated HEK293 cell–specific activation. Moreover, presence of c-myb and Zfp161 transcripts was specific to P19 cells. A minimal promoter fragment from −253 to +43 bp directs constitutive expression in both cell types, and dual regulation of Tcof1 appears to be through differential repression of this minimal promoter. The CpG island at the transcription start site remains unmethylated in P19 cells, 11.5 dpc mouse embryonic tissue, and adult mouse ear, which supports constitutive activation of the Tcof1 promoter. PMID:18771418
Niu, Qian; Ybe, Joel A.
2008-01-01
Summary Huntington’s disease is a genetic neurological disorder that is triggered by the dissociation of the huntingtin protein (htt) from its obligate interaction partner Huntingtin-interacting protein 1 (HIP1). The release of htt permits HIP-protein interactor (HIPPI) to bind to its recognition site on HIP1 to form a HIPPI/HIP1 complex that recruits Procaspase-8 to begin the process of apoptosis. The interaction module between HIPPI and HIP1 was predicted to resemble a death-effector domain (DED). Our 2.8 Å crystal structure of the HIP1 371-481 sub-fragment that includes F432 and K474 important for HIPPI binding is not a DED, but is a partially opened coiled-coil. The HIP1 371-481 model reveals a basic surface we hypothesize is suitable for binding HIPPI. There is an opened region next to the putative HIPPI site that is highly negatively charged. The acidic residues in this region are highly conserved in HIP1 and a related protein, HIP1R from different organisms, but are not conserved in the yeast homolog of HIP1, sla2p. We have modeled ∼85% of the coiled-coil domain by joining our new HIP1 371-481 structure to the HIP1 482-586 model (PDB code: 2NO2). Finally, the middle of this coiled-coil domain may be intrinsically flexible and suggests a new interaction model where HIPPI binds to a “U” shaped HIP1 molecule. PMID:18155047
Miao, Zhichao; Westhof, Eric
2016-07-08
RBscore&NBench combines a web server, RBscore and a database, NBench. RBscore predicts RNA-/DNA-binding residues in proteins and visualizes the prediction scores and features on protein structures. The scoring scheme of RBscore directly links feature values to nucleic acid binding probabilities and illustrates the nucleic acid binding energy funnel on the protein surface. To avoid dataset, binding site definition and assessment metric biases, we compared RBscore with 18 web servers and 3 stand-alone programs on 41 datasets, which demonstrated the high and stable accuracy of RBscore. A comprehensive comparison led us to develop a benchmark database named NBench. The web server is available on: http://ahsoka.u-strasbg.fr/rbscorenbench/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Ribay, Kathryn; Kim, Marlene T; Wang, Wenyi; Pinolini, Daniel; Zhu, Hao
2016-03-01
Estrogen receptors (ERα) are a critical target for drug design as well as a potential source of toxicity when activated unintentionally. Thus, evaluating potential ERα binding agents is critical in both drug discovery and chemical toxicity areas. Using computational tools, e.g., Quantitative Structure-Activity Relationship (QSAR) models, can predict potential ERα binding agents before chemical synthesis. The purpose of this project was to develop enhanced predictive models of ERα binding agents by utilizing advanced cheminformatics tools that can integrate publicly available bioassay data. The initial ERα binding agent data set, consisting of 446 binders and 8307 non-binders, was obtained from the Tox21 Challenge project organized by the NIH Chemical Genomics Center (NCGC). After removing the duplicates and inorganic compounds, this data set was used to create a training set (259 binders and 259 non-binders). This training set was used to develop QSAR models using chemical descriptors. The resulting models were then used to predict the binding activity of 264 external compounds, which were available to us after the models were developed. The cross-validation results of training set [Correct Classification Rate (CCR) = 0.72] were much higher than the external predictivity of the unknown compounds (CCR = 0.59). To improve the conventional QSAR models, all compounds in the training set were used to search PubChem and generate a profile of their biological responses across thousands of bioassays. The most important bioassays were prioritized to generate a similarity index that was used to calculate the biosimilarity score between each two compounds. The nearest neighbors for each compound within the set were then identified and its ERα binding potential was predicted by its nearest neighbors in the training set. 
The hybrid model performance (CCR = 0.94 for cross validation; CCR = 0.68 for external prediction) showed significant improvement over the original QSAR models, particularly for the activity cliffs that induce prediction errors. The results of this study indicate that the response profile of chemicals from public data provides useful information for modeling and evaluation purposes. The public big data resources should be considered along with chemical structure information when predicting new compounds, such as unknown ERα binding agents.
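The Correct Classification Rate (CCR) quoted in the abstract above is not defined there. In QSAR work it is commonly taken as the mean of sensitivity and specificity (balanced accuracy), and the sketch below assumes that definition. The function name and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
def correct_classification_rate(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity and specificity (assumed definition of CCR)."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one binder and one non-binder")
    sensitivity = tp / (tp + fn)  # fraction of true binders predicted as binders
    specificity = tn / (tn + fp)  # fraction of true non-binders predicted as non-binders
    return 0.5 * (sensitivity + specificity)


# Hypothetical counts for a balanced set of 259 binders and 259 non-binders.
print(correct_classification_rate(tp=200, fn=59, tn=180, fp=79))
```

Because both classes contribute equally, this measure is not inflated by a large majority class, which matters for a source set of 446 binders against 8307 non-binders.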
NASA Astrophysics Data System (ADS)
Peng, Lanfang; Liu, Paiyu; Feng, Xionghan; Wang, Zimeng; Cheng, Tao; Liang, Yuzhen; Lin, Zhang; Shi, Zhenqing
2018-03-01
Predicting the kinetics of heavy metal adsorption and desorption in soil requires consideration of multiple heterogeneous soil binding sites and variations of reaction chemistry conditions. Although chemical speciation models have been developed for predicting the equilibrium of metal adsorption on soil organic matter (SOM) and important mineral phases (e.g. Fe and Al (hydr)oxides), there is still a lack of modeling tools for predicting the kinetics of metal adsorption and desorption reactions in soil. In this study, we developed a unified model for the kinetics of heavy metal adsorption and desorption in soil based on the equilibrium models WHAM 7 and CD-MUSIC, which specifically consider metal kinetic reactions with multiple binding sites of SOM and soil minerals simultaneously. For each specific binding site, metal adsorption and desorption rate coefficients were constrained by the local equilibrium partition coefficients predicted by WHAM 7 or CD-MUSIC, and, for each metal, the desorption rate coefficients of various binding sites were constrained by their metal binding constants with those sites. The model had only one fitting parameter for each soil binding phase, and all other parameters were derived from WHAM 7 and CD-MUSIC. A stirred-flow method was used to study the kinetics of Cd, Cu, Ni, Pb, and Zn adsorption and desorption in multiple soils under various pH and metal concentrations, and the model successfully reproduced most of the kinetic data. We quantitatively elucidated the significance of different soil components and important soil binding sites during the adsorption and desorption kinetic processes. Our model has provided a theoretical framework to predict metal adsorption and desorption kinetics, which can be further used to predict the dynamic behavior of heavy metals in soil under various natural conditions by coupling other important soil processes.
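One plausible reading of the constraint described in the abstract above is that, for each binding site, the ratio of adsorption to desorption rate coefficients equals the equilibrium partition coefficient, leaving a single free rate parameter. The Python sketch below illustrates that idea for one site at a fixed solution concentration. The function name, the first-order rate law, the constant-concentration assumption and all numerical values are illustrative assumptions, not the published model.

```python
def simulate_site(k_des: float, K_p: float, c_solution: float,
                  dt: float = 0.1, steps: int = 2000) -> float:
    """Explicit Euler integration of dq/dt = k_ads*c - k_des*q for one site.

    q is the sorbed amount, c the (constant) solution concentration.
    """
    k_ads = k_des * K_p  # assumed constraint: k_ads / k_des = K_p
    q = 0.0
    for _ in range(steps):
        q += dt * (k_ads * c_solution - k_des * q)
    return q


# Hypothetical values: the sorbed amount should relax toward K_p * c = 100.
print(simulate_site(k_des=0.1, K_p=50.0, c_solution=2.0))
```

With the ratio fixed by the equilibrium model, only `k_des` needs to be fitted, and the long-time limit always reproduces the equilibrium partitioning.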
Lebedev, Konstantin; Mafé, Salvador; Stroeve, Pieter
2006-04-15
We study theoretically the transport and kinetic processes underlying the operation of a biosensor (particularly the surface plasmon sensor "Biacore") used to study the surface binding kinetics of biomolecules in solution to immobilized receptors. Unlike previous studies, we concentrate mainly on the modeling of system-specific phenomena rather than on the influence of mass transport limitations on the intrinsic kinetic rate constants determined from binding data. In the first problem, the case of two-site binding where each receptor unit on the surface can accommodate two analyte molecules on two different sites is considered. One analyte molecule always binds first to a specific site. Subsequently, the second analyte molecule can bind to the adjacent unoccupied site. In the second problem, two different analytes compete for one binding site on the same surface receptor. Finally, the third problem considers the case of positive cooperativity among bound molecules in the hydrogel using a simple mean-field approach. The transport in both the flow channel and the hydrogel phases of the biosensor is taken into account in this case (with few exceptions, most previous studies assume a simpler model in which the hydrogel is treated as a planar surface with the receptors). We consider simultaneously diffusion and convection through the flow channel together with diffusion and cooperativity binding on the surface and in the hydrogel. In each case, typical results for the concentration contours of the free and bound molecules in the flow channel and hydrogel regions are presented together with the time-dependent association/dissociation curves and reaction rates. For binding site competition, the analysis predicts overshoot phenomena.
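The overshoot predicted for binding-site competition can be reproduced in a much simpler setting than the one studied above. The Python sketch below assumes a well-mixed system with constant analyte concentrations and no transport in the flow channel or hydrogel, so it is only a caricature of the full model. The function name and all rate constants are hypothetical.

```python
def simulate_competition(ka1: float, kd1: float, ka2: float, kd2: float,
                         c1: float = 1.0, c2: float = 1.0, r_total: float = 1.0,
                         dt: float = 0.001, steps: int = 200_000):
    """Two analytes competing for one receptor site (explicit Euler).

    Returns (peak bound analyte 1, final bound analyte 1, final bound analyte 2).
    """
    b1 = b2 = 0.0
    peak_b1 = 0.0
    for _ in range(steps):
        free = r_total - b1 - b2            # unoccupied receptor sites
        db1 = ka1 * c1 * free - kd1 * b1    # fast-on, fast-off analyte
        db2 = ka2 * c2 * free - kd2 * b2    # slow-on, very slow-off analyte
        b1 += dt * db1
        b2 += dt * db2
        peak_b1 = max(peak_b1, b1)
    return peak_b1, b1, b2


peak, final_b1, final_b2 = simulate_competition(ka1=10.0, kd1=1.0, ka2=1.0, kd2=0.001)
print(peak, final_b1, final_b2)
```

Analyte 1 occupies most sites early because it associates faster, then is displaced as the tighter-binding analyte 2 accumulates, so its bound amount passes through a maximum well above its equilibrium value.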
Rimac, Hrvoje; Dufour, Claire; Debeljak, Željko; Zorc, Branka; Bojić, Mirza
2017-07-11
Human serum albumin (HSA) binds a variety of xenobiotics, including flavonoids and warfarin. The binding of another ligand to the IIA binding site on HSA can cause warfarin displacement and potentially the elevation of its free concentration in blood. Studies dealing with flavonoid-induced warfarin displacement from HSA provided controversial results: estimated risk of displacement ranged from none to serious. To resolve these controversies, in vitro study of simultaneous binding of warfarin and eight different flavonoid aglycons and glycosides to HSA was carried out by fluorescence spectroscopy as well as molecular docking. Results show that warfarin and flavonoids do not share the same binding region in binding to HSA. Interactions were only observed at high warfarin concentrations not attainable under recommended dosing regimes. Docking experiments show that flavonoid aglycons and glycosides do not bind at warfarin high affinity sites, but rather to different regions within the IIA HSA subdomain. Thus, the risk of clinically significant warfarin-flavonoid interaction in binding to HSA should be regarded as negligible.
Grinter, Sam Z; Yan, Chengfei; Huang, Sheng-You; Jiang, Lin; Zou, Xiaoqin
2013-08-26
In this study, we use the recently released 2012 Community Structure-Activity Resource (CSAR) data set to evaluate two knowledge-based scoring functions, ITScore and STScore, and a simple force-field-based potential (VDWScore). The CSAR data set contains 757 compounds, most with known affinities, and 57 crystal structures. With the help of the script files for docking preparation, we use the full CSAR data set to evaluate the performances of the scoring functions on binding affinity prediction and active/inactive compound discrimination. The CSAR subset that includes crystal structures is used as well, to evaluate the performances of the scoring functions on binding mode and affinity predictions. Within this structure subset, we investigate the importance of accurate ligand and protein conformational sampling and find that the binding affinity predictions are less sensitive to non-native ligand and protein conformations than the binding mode predictions. We also find the full CSAR data set to be more challenging in making binding mode predictions than the subset with structures. The script files used for preparing the CSAR data set for docking, including scripts for canonicalization of the ligand atoms, are offered freely to the academic community.
NASA Astrophysics Data System (ADS)
Duan, Rui; Xu, Xianjin; Zou, Xiaoqin
2018-01-01
D3R 2016 Grand Challenge 2 focused on predictions of binding modes and affinities for 102 compounds against the farnesoid X receptor (FXR). In this challenge, two distinct methods, a docking-based method and a template-based method, were employed by our team for the binding mode prediction. For the new template-based method, 3D ligand similarities were calculated for each query compound against the ligands in the co-crystal structures of FXR available in Protein Data Bank. The binding mode was predicted based on the co-crystal protein structure containing the ligand with the best ligand similarity score against the query compound. For the FXR dataset, the template-based method achieved a better performance than the docking-based method on the binding mode prediction. For the binding affinity prediction, an in-house knowledge-based scoring function ITScore2 and MM/PBSA approach were employed. Good performance was achieved for MM/PBSA, whereas the performance of ITScore2 was sensitive to ligand composition, e.g. the percentage of carbon atoms in the compounds. The sensitivity to ligand composition could be a clue for the further improvement of our knowledge-based scoring function.
Selb, R.; Eckl-Dorna, J.; Vrtala, S.; Valenta, R.; Niederberger, V.
2017-01-01
Background: It has been shown that birch pollen immunotherapy can induce IgG antibodies which enhance IgE binding to Bet v 1. We aimed to develop a serological assay to predict the development of antibodies which enhance IgE binding to Bet v 1 during immunotherapy. Methods: In 18 patients treated by Bet v 1-fragment-specific immunotherapy, the effects of IgG antibodies specific for the fragments on the binding of IgE antibodies to Bet v 1 were measured by ELISA. Blocking and possible enhancing effects on IgE binding were compared with skin sensitivity to Bet v 1 after treatment. Results: We found that fragment-specific IgG enhanced IgE binding to Bet v 1 in two patients who also showed an increase of skin sensitivity to Bet v 1. Conclusion: Our results indicate that it may be possible to develop serological tests which predict the induction of unfavourable IgG antibodies enhancing the binding of IgE to Bet v 1 during immunotherapy. PMID:23998344
Kuang, Zheng; Ji, Zhicheng
2018-01-01
Biological processes are usually associated with genome-wide remodeling of transcription driven by transcription factors (TFs). Identifying key TFs and their spatiotemporal binding patterns is indispensable to understanding how dynamic processes are programmed. However, most methods are designed to predict TF binding sites only. We present a computational method, dynamic motif occupancy analysis (DynaMO), to infer important TFs and their spatiotemporal binding activities in dynamic biological processes using chromatin profiling data from multiple biological conditions such as time-course histone modification ChIP-seq data. In the first step, DynaMO predicts TF binding sites with a random forests approach. Next and uniquely, DynaMO infers dynamic TF binding activities at predicted binding sites using their local chromatin profiles from multiple biological conditions. Another distinguishing feature of DynaMO is that it identifies key TFs in a dynamic process using a clustering and enrichment analysis of dynamic TF binding patterns. Application of DynaMO to the yeast ultradian cycle, mouse circadian clock and human neural differentiation demonstrates its accuracy and versatility. We anticipate DynaMO will be generally useful for elucidating transcriptional programs in dynamic processes. PMID:29325176
Roche, Daniel Barry; Brackenridge, Danielle Allison; McGuffin, Liam James
2015-12-15
Elucidating the biological and biochemical roles of proteins, and subsequently determining their interacting partners, can be difficult and time consuming using in vitro and/or in vivo methods, and consequently the majority of newly sequenced proteins will have unknown structures and functions. However, in silico methods for predicting protein-ligand binding sites and protein biochemical functions offer an alternative practical solution. The characterisation of protein-ligand binding sites is essential for investigating new functional roles, which can impact the major biological research spheres of health, food, and energy security. In this review we discuss the role in silico methods play in 3D modelling of protein-ligand binding sites, along with their role in predicting biochemical functionality. In addition, we describe in detail some of the key alternative in silico prediction approaches that are available, as well as discussing the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and the Continuous Automated Model EvaluatiOn (CAMEO) projects, and their impact on developments in the field. Furthermore, we discuss the importance of protein function prediction methods for tackling 21st century problems.
Hsin, Kun-Yi; Ghosh, Samik; Kitano, Hiroaki
2013-01-01
Increased availability of bioinformatics resources is creating opportunities for the application of network pharmacology to predict drug effects and toxicity resulting from multi-target interactions. Here we present a high-precision computational prediction approach that combines two elaborately built machine learning systems and multiple molecular docking tools to assess binding potentials of a test compound against proteins involved in a complex molecular network. One of the two machine learning systems is a re-scoring function to evaluate binding modes generated by docking tools. The second is a binding mode selection function to identify the most predictive binding mode. Results from a series of benchmark validations and a case study show that this approach surpasses the prediction reliability of other techniques and that it also identifies either primary or off-targets of kinase inhibitors. Integrating this approach with molecular network maps makes it possible to address drug safety issues by comprehensively investigating network-dependent effects of a drug or drug candidate. PMID:24391846
Legendre-Guillemin, Valerie; Metzler, Martina; Lemaire, Jean-Francois; Philie, Jacynthe; Gan, Lu; Hayden, Michael R; McPherson, Peter S
2005-02-18
Huntingtin interacting protein 1 (HIP1) is a component of clathrin coats. We previously demonstrated that HIP1 promotes clathrin assembly through its central helical domain, which binds directly to clathrin light chains (CLCs). To better understand the relationship between CLC binding and clathrin assembly we sought to dissect this interaction. Using C-terminal deletion constructs of the HIP1 helical domain, we identified a region between residues 450 and 456 that is required for CLC binding. Within this region, point mutations showed the importance of residues Leu-451, Leu-452, and Arg-453. Mutants that fail to bind CLC are unable to promote clathrin assembly in vitro but still mediate HIP1 homodimerization and heterodimerization with the family member HIP12/HIP1R. Moreover, HIP1 binding to CLC is necessary for HIP1 targeting to clathrin-coated pits and clathrin-coated vesicles. Interestingly, HIP1 binds to a highly conserved region of CLC previously demonstrated to regulate clathrin assembly. These results suggest a role for HIP1/CLC interactions in the regulation of clathrin assembly.
Predicting bioactive conformations and binding modes of macrocycles
NASA Astrophysics Data System (ADS)
Anighoro, Andrew; de la Vega de León, Antonio; Bajorath, Jürgen
2016-10-01
Macrocyclic compounds experience increasing interest in drug discovery. It is often thought that these large and chemically complex molecules provide promising candidates to address difficult targets and interfere with protein-protein interactions. From a computational viewpoint, these molecules are difficult to treat. For example, flexible docking of macrocyclic compounds is hindered by the limited ability of current docking approaches to optimize conformations of extended ring systems for pose prediction. Herein, we report predictions of bioactive conformations of macrocycles using conformational search and binding modes using docking. Conformational ensembles generated using specialized search technique of about 70 % of the tested macrocycles contained accurate bioactive conformations. However, these conformations were difficult to identify on the basis of conformational energies. Moreover, docking calculations with limited ligand flexibility starting from individual low energy conformations rarely yielded highly accurate binding modes. In about 40 % of the test cases, binding modes were approximated with reasonable accuracy. However, when conformational ensembles were subjected to rigid body docking, an increase in meaningful binding mode predictions to more than 50 % of the test cases was observed. Electrostatic effects did not contribute to these predictions in a positive or negative manner. Rather, achieving shape complementarity at macrocycle-target interfaces was a decisive factor. In summary, a combined computational protocol using pre-computed conformational ensembles of macrocycles as a starting point for docking shows promise in modeling binding modes of macrocyclic compounds.
NASA Astrophysics Data System (ADS)
McCarrick, Margaret A.; Kollman, Peter A.
1999-03-01
The relative binding free energies in HIV protease of haloperidol thioketal (THK) and three of its derivatives were examined with free energy calculations. THK is a weak inhibitor (IC50 = 15 μM) for which two cocrystal structures with HIV type 1 proteases have been solved [Rutenber, E. et al., J. Biol. Chem., 268 (1993) 15343]. A THK derivative with a phenyl group on C2 of the piperidine ring was expected to be a poor inhibitor based on experiments with haloperidol ketal and its 2- phenyl derivative (Caldera, P., personal communication). Our calculations predict that a 5-phenyl THK derivative, suggested based on examination of the crystal structure, will bind significantly better than THK. Although there are large error bars as estimated from hysteresis, the calculations predict that the 5-phenyl substituent is clearly favored over the 2-phenyl derivative as well as the parent compound. The unfavorable free energies of solvation of both phenyl THK derivatives relative to the parent compound contributed to their predicted binding free energies. In a third simulation, the change in binding free energy for 5-benzyl THK relative to THK was calculated. Although this derivative has a lower free energy in the protein, its decreased free energy of solvation increases the predicted ΔΔG(bind) to the same range as that of the 2-phenyl derivative.
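The interplay described above between the free energy change in the protein and the free energy of solvation follows from the usual thermodynamic cycle for relative binding free energies: the change on mutating ligand A into ligand B in the bound state, minus the same change in solution. The Python sketch below shows only that bookkeeping. The function name and the numbers are hypothetical and are not values from the study.

```python
def relative_binding_free_energy(dg_protein: float, dg_solvent: float) -> float:
    """ddG_bind(A -> B) = dG(A -> B, in protein) - dG(A -> B, in solvent).

    A negative result means ligand B is predicted to bind better than ligand A.
    """
    return dg_protein - dg_solvent


# Hypothetical values in kcal/mol: B is stabilized in the protein (-2.0) but is
# stabilized even more in water (-3.0), so its predicted binding is weaker.
print(relative_binding_free_energy(dg_protein=-2.0, dg_solvent=-3.0))
```

This is why a derivative can have a lower free energy in the protein and still show a less favorable predicted ΔΔG(bind): the solvent leg of the cycle enters with the opposite sign.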
Kanuru, Madhavi; Samuel, Jebakumar J; Balivada, Lavanya M; Aradhyam, Gopala K
2009-05-01
Calnuc is a novel, highly modular, EF-hand containing, Ca(2+)-binding, Golgi resident protein whose functions are not clear. Using amino acid sequences, we demonstrate that Calnuc is a highly conserved protein among various organisms, from Ciona intestinalis to humans. Maximum homology among all sequences is found in the region that binds to G-proteins. In humans, it is known to be expressed in a variety of tissues, and it interacts with several important protein partners. Among other proteins, Calnuc is known to interact with heterotrimeric G-proteins, specifically with the alpha-subunit. Herein, we report the structural implications of Ca(2+) and Mg(2+) binding, and illustrate that Calnuc functions as a downstream effector for G-protein alpha-subunit. Our results show that Ca(2+) binds with an affinity of 7 μM and causes structural changes. Although Mg(2+) binds to Calnuc with very weak affinity, the structural changes that it causes are further enhanced by Ca(2+) binding. Furthermore, isothermal titration calorimetry results show that Calnuc and the G-protein bind with an affinity of 13 nM. We also predict a probable function for Calnuc, that of maintaining Ca(2+) homeostasis in the cell. Using Stains-all and terbium as Ca(2+) mimic probes, we demonstrate that the Ca(2+)-binding ability of Calnuc is governed by the activity-based conformational state of the G-protein. We propose that Calnuc adopts structural sites similar to the ones seen in proteins such as annexins, C2 domains or chromogranin A, and therefore binds more calcium ions upon binding to Gialpha. With the number of organelle-targeted G-protein-coupled receptors increasing, intracellular communication mediated by G-proteins could become a new paradigm. In this regard, we propose that Calnuc could be involved in the downstream signaling of G-proteins.
Gaines, J C; Acebes, S; Virrueta, A; Butler, M; Regan, L; O'Hern, C S
2018-05-01
We compare side chain prediction and packing of core and non-core regions of soluble proteins, protein-protein interfaces, and transmembrane proteins. We first identified or created comparable databases of high-resolution crystal structures of these 3 protein classes. We show that the solvent-inaccessible cores of the 3 classes of proteins are equally densely packed. As a result, the side chains of core residues at protein-protein interfaces and in the membrane-exposed regions of transmembrane proteins can be predicted by the hard-sphere plus stereochemical constraint model with the same high prediction accuracies (>90%) as core residues in soluble proteins. We also find that for all 3 classes of proteins, as one moves away from the solvent-inaccessible core, the packing fraction decreases as the solvent accessibility increases. However, the side chain predictability remains high (80% within 30°) up to a relative solvent accessibility, rSASA≲0.3, for all 3 protein classes. Our results show that ≈40% of the interface regions in protein complexes are "core", that is, densely packed with side chain conformations that can be accurately predicted using the hard-sphere model. We propose packing fraction as a metric that can be used to distinguish real protein-protein interactions from designed, non-binding, decoys. Our results also show that cores of membrane proteins are the same as cores of soluble proteins. Thus, the computational methods we are developing for the analysis of the effect of hydrophobic core mutations in soluble proteins will be equally applicable to analyses of mutations in membrane proteins. © 2018 Wiley Periodicals, Inc.
Hamada, K; Gleason, S L; Levi, B Z; Hirschfeld, S; Appella, E; Ozato, K
1989-11-01
Transcription of major histocompatibility complex (MHC) class I genes is regulated by the conserved MHC class I regulatory element (CRE). The CRE has two factor-binding sites, region I and region II, both of which elicit enhancer function. By screening a mouse lambda gt 11 library with the CRE as a probe, we isolated a cDNA clone that encodes a protein capable of binding to region II of the CRE. This protein, H-2RIIBP (H-2 region II binding protein), bound to the native region II sequence, but not to other MHC cis-acting sequences or to mutant region II sequences, similar to the naturally occurring region II factor in mouse cells. The deduced amino acid sequence of H-2RIIBP revealed two putative zinc fingers homologous to the DNA-binding domain of steroid/thyroid hormone receptors. Although sequence similarity in other regions was minimal, H-2RIIBP has apparent modular domains characteristic of the nuclear hormone receptors. Further analyses showed that both H-2RIIBP and the natural region II factor bind to the estrogen response element (ERE) of the vitellogenin A2 gene. The ERE is composed of a palindrome, and half of this palindrome resembles the region II binding site of the MHC CRE. These results indicate that H-2RIIBP (i) is a member of the superfamily of nuclear hormone receptors and (ii) may regulate not only MHC class I genes but also genes containing the ERE and related sequences. Sequences homologous to the H-2RIIBP gene are widely conserved in the animal kingdom. H-2RIIBP mRNA is expressed in many mouse tissues, in agreement with the distribution of the natural region II factor.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fallon, Jennifer L.; Baker, Mariah R.; Xiong, Liangwen
2009-11-10
Voltage-dependent calcium channels (Ca(V)) open in response to changes in membrane potential, but their activity is modulated by Ca(2+) binding to calmodulin (CaM). Structural studies of this family of channels have focused on CaM bound to the IQ motif; however, the minimal differences between structures cannot adequately describe CaM's role in the regulation of these channels. We report a unique crystal structure of a 77-residue fragment of the Ca(V)1.2 alpha(1) subunit carboxyl terminus, which includes a tandem of the pre-IQ and IQ domains, in complex with Ca(2+).CaM in 2 distinct binding modes. The structure of the Ca(V)1.2 fragment is an unusual dimer of 2 coiled-coiled pre-IQ regions bridged by 2 Ca(2+).CaMs interacting with the pre-IQ regions and a canonical Ca(V)1-IQ-Ca(2+).CaM complex. Native Ca(V)1.2 channels are shown to be a mixture of monomers/dimers and a point mutation in the pre-IQ region predicted to abolish the coiled-coil structure significantly reduces Ca(2+)-dependent inactivation of heterologously expressed Ca(V)1.2 channels.
NASA Astrophysics Data System (ADS)
van Schaik, Joris W. J.; Kleja, Dan B.; Gustafsson, Jon Petter
2010-02-01
Vast amounts of knowledge about the proton- and metal-binding properties of dissolved organic matter (DOM) in natural waters have been obtained in studies on isolated humic and fulvic (hydrophobic) acids. Although macromolecular hydrophilic acids normally make up about one-third of DOM, their proton- and metal-binding properties are poorly known. Here, we investigated the acid-base and Cu-binding properties of the hydrophobic (fulvic) acid fraction and two hydrophilic fractions isolated from a soil solution. Proton titrations revealed a higher total charge for the hydrophilic acid fractions than for the hydrophobic acid fraction. The most hydrophilic fraction appeared to be dominated by weak acid sites, as evidenced by increased slope of the curve of surface charge versus pH at pH values above 6. The titration curves were poorly predicted by both Stockholm Humic Model (SHM) and NICA-Donnan model calculations using generic parameter values, but could be modelled accurately after optimisation of the proton-binding parameters (pH ⩽ 9). Cu-binding isotherms for the three fractions were determined at pH values of 4, 6 and 9. With the optimised proton-binding parameters, the SHM model predictions for Cu binding improved, whereas the NICA-Donnan predictions deteriorated. After optimisation of Cu-binding parameters, both models described the experimental data satisfactorily. Iron(III) and aluminium competed strongly with Cu for binding sites at both pH 4 and pH 6. The SHM model predicted this competition reasonably well, but the NICA-Donnan model underestimated the effects significantly at pH 6. Overall, the Cu-binding behaviour of the two hydrophilic acid fractions was very similar to that of the hydrophobic acid fraction, despite the differences observed in proton-binding characteristics. These results show that for modelling purposes, it is essential to include the hydrophilic acid fraction in the pool of 'active' humic substances.
Hu, Xiuzhen; Dong, Qiwen; Yang, Jianyi; Zhang, Yang
2016-11-01
More than half of proteins require binding of metal and acid radical ions for their structure and function. Identification of the ion-binding locations is important for understanding the biological functions of proteins. Due to the small size and high versatility of the metal and acid radical ions, however, computational prediction of their binding sites remains difficult. We proposed a new ligand-specific approach devoted to the binding site prediction of the 13 metal ion (Zn2+, Cu2+, Fe2+, Fe3+, Ca2+, Mg2+, Mn2+, Na+, K+) and acid radical ion (CO3 2-, NO2-, SO4 2-, PO4 3-) ligands that are most frequently seen in protein databases. A sequence-based ab initio model is first trained on sequence profiles, where a modified AdaBoost algorithm is extended to balance binding and non-binding residue samples. A composite method, IonCom, is then developed to combine the ab initio model with multiple threading alignments to further improve the robustness of the binding site predictions. The pipeline was tested using 5-fold cross validations on a comprehensive set of 2,100 non-redundant proteins bound with 3,075 small ion ligands. A significant advantage was demonstrated over state-of-the-art ligand-binding methods, including COACH and TargetS, for high-accuracy ion-binding site identification. Detailed data analyses show that the major advantage of IonCom lies in the integration of complementary ab initio and template-based components. Ion-specific feature design and binding library selection also contribute to the improvement of small ion ligand binding predictions. http://zhanglab.ccmb.med.umich.edu/IonCom. Contact: hxz@imut.edu.cn or zhng@umich.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Ichikawa, Osamu; Fujimoto, Kazushi; Yamada, Atsushi; Okazaki, Susumu; Yamazaki, Kazuto
2016-01-01
The efficacy and bias of signal transduction induced by a drug at a target protein are closely associated with the benefits and side effects of the drug. In particular, partial agonist activity and G-protein/β-arrestin-biased agonist activity for the G-protein-coupled receptor (GPCR) family, the family with the most target proteins of launched drugs, are key issues in drug discovery. However, designing GPCR drugs with appropriate efficacy and bias is challenging because the dynamic mechanism of signal transduction induced by ligand-receptor interactions is complicated. Here, we identified the G-protein/β-arrestin-linked fluctuating network, which initiates large-scale conformational changes, using sub-microsecond molecular dynamics (MD) simulations of the β2-adrenergic receptor (β2AR) with a diverse collection of ligands and correlation analysis of their G-protein/β-arrestin efficacy. The G-protein-linked fluctuating network extends from the ligand-binding site to the G-protein-binding site through the connector region, and the β-arrestin-linked fluctuating network consists of the NPxxY motif and adjacent regions. We confirmed that the averaged values of fluctuation in the fluctuating network detected are good quantitative indexes for explaining G-protein/β-arrestin efficacy. These results indicate that short-term MD simulation is a practical method to predict the efficacy and bias of any compound for GPCRs. PMID:27187591
Ashraf, Naeem Mahmood; Bilal, Muhammad; Mahmood, Malik Siddique; Hussain, Aadil; Mehboob, Muhammad Zubair
2016-09-01
The mounting burden of HCV-infected individuals and the soaring cost of treatment are a serious source of concern for developing countries. A number of approaches have been attempted to develop a vaccine against HCV, but the majority of them proved ineffective. Developing a vaccine that takes into account the geographical distribution of HCV genotypes and host genetics shows potential. In this research article, we have tried to predict the most likely HCV epitopes that are efficiently restricted by the most common HLA alleles in the Pakistani population, using different computational algorithms. Thirteen selected, experimentally identified epitope sequences were used to derive consensus sequences across all genotypes of HCV. The consensus sequences obtained were used to predict their binding affinities with the most prevalent HLA alleles in the Pakistani population. Two Class-I epitopes from the NS4B region, one Class-I epitope from NS5A and one Class-II epitope from the NS3 region showed effective binding and proved highly likely to boost the immune response. A cocktail of these four was checked for population coverage and gave 75.53% for the Pakistani Asian and 70.77% for the Pakistani Mixed populations, with no allergenic response. Computational algorithms are a robust way to shortlist potential candidate epitopes for vaccine development, but further in vivo and in vitro studies are required to confirm their immunogenic properties. Copyright © 2016 Elsevier B.V. All rights reserved.
ProBiS-CHARMMing: Web Interface for Prediction and Optimization of Ligands in Protein Binding Sites.
Konc, Janez; Miller, Benjamin T; Štular, Tanja; Lešnik, Samo; Woodcock, H Lee; Brooks, Bernard R; Janežič, Dušanka
2015-11-23
Proteins often exist only as apo structures (unligated) in the Protein Data Bank, with their corresponding holo structures (with ligands) unavailable. However, apoproteins may not represent the amino-acid residue arrangement upon ligand binding well, which is especially problematic for molecular docking. We developed the ProBiS-CHARMMing web interface by connecting the ProBiS ( http://probis.cmm.ki.si ) and CHARMMing ( http://www.charmming.org ) web servers into one functional unit that enables prediction of protein-ligand complexes and allows for their geometry optimization and interaction energy calculation. The ProBiS web server predicts ligands (small compounds, proteins, nucleic acids, and single-atom ligands) that may bind to a query protein. This is achieved by comparing its surface structure against a nonredundant database of protein structures and finding those that have binding sites similar to that of the query protein. Existing ligands found in the similar binding sites are then transposed to the query according to predictions from ProBiS. The CHARMMing web server enables, among other things, minimization and potential energy calculation for a wide variety of biomolecular systems, and it is used here to optimize the geometry of the predicted protein-ligand complex structures using the CHARMM force field and to calculate their interaction energies with the corresponding query proteins. We show how ProBiS-CHARMMing can be used to predict ligands and their poses for a particular binding site, and minimize the predicted protein-ligand complexes to obtain representations of holoproteins. The ProBiS-CHARMMing web interface is freely available for academic users at http://probis.nih.gov.
NASA Astrophysics Data System (ADS)
Grudinin, Sergei; Kadukova, Maria; Eisenbarth, Andreas; Marillet, Simon; Cazals, Frédéric
2016-09-01
The 2015 D3R Grand Challenge provided an opportunity to test our new model for the binding free energy of small molecules, as well as to assess our protocol to predict binding poses for protein-ligand complexes. Our pose predictions were ranked 3-9 for the HSP90 dataset, depending on the assessment metric. For the MAP4K dataset the ranks are widely dispersed, ranging from 2 to 35 depending on the assessment metric, which does not provide any insight into the accuracy of the method. The main success of our pose prediction protocol was the re-scoring stage using the recently developed Convex-PL potential. We make a thorough analysis of our docking predictions made with AutoDock Vina and discuss the effect of the choice of rigid receptor templates, the number of flexible residues in the binding pocket, the binding pocket size, and the benefits of re-scoring. However, the main challenge was to predict experimentally determined binding affinities for two blind test sets. Our affinity prediction model consisted of two terms, a pairwise-additive enthalpy and a non-pairwise-additive entropy. We trained the free parameters of the model with a regularized regression using affinity and structural data from the PDBBind database. Our model performed very well on the training set but failed on the two test sets. We explain the drawbacks and pitfalls of our model, in particular in terms of the relative coverage of the test set by the training set and the dynamical properties missed from crystal structures, and discuss different routes to improve it.
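The regularized fit of a two-term affinity model can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which regularizer was used, so an L2 (ridge) penalty is assumed here, and the two feature columns (an enthalpy-like and an entropy-like term), the function name and the toy numbers are all hypothetical.

```python
from typing import List, Tuple


def ridge_fit_2d(X: List[Tuple[float, float]], y: List[float], lam: float) -> Tuple[float, float]:
    """Closed-form ridge regression for two features and no intercept.

    Solves (X^T X + lam * I) w = X^T y for w = (w0, w1).
    Assumption: an L2 penalty stands in for the unspecified regularizer.
    """
    a = sum(x0 * x0 for x0, _ in X) + lam      # entry (0, 0) of X^T X + lam*I
    b = sum(x0 * x1 for x0, x1 in X)           # off-diagonal entry
    d = sum(x1 * x1 for _, x1 in X) + lam      # entry (1, 1)
    r0 = sum(x0 * t for (x0, _), t in zip(X, y))
    r1 = sum(x1 * t for (_, x1), t in zip(X, y))
    det = a * d - b * b
    if det == 0:
        raise ValueError("singular system; increase lam or add data")
    w0 = (d * r0 - b * r1) / det
    w1 = (a * r1 - b * r0) / det
    return w0, w1


# Hypothetical toy data: each row is (enthalpy-like term, entropy-like term).
X_toy = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y_toy = [2.0, 3.0, 5.0, 7.0]                   # generated as 2*x0 + 3*x1

w_unreg = ridge_fit_2d(X_toy, y_toy, lam=0.0)  # recovers the generating weights
w_ridge = ridge_fit_2d(X_toy, y_toy, lam=1.0)  # shrunk towards zero
```

A larger `lam` shrinks the weight vector, which trades training-set fit for stability. That trade-off does not by itself address the poor test-set coverage the abstract identifies.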
Mangold, Sabine; Norwood, Suzanne J.; Yap, Alpha S.; Collins, Brett M.
2012-01-01
We recently identified the atypical myosin, Myosin VI, as a component of epithelial cell-cell junctions that interacts with E-cadherin. Recombinant proteins bearing the cargo-binding domain of Myosin VI (Myo VI-CBD) or the cytoplasmic tail of E-cadherin can interact directly with one another. In this report we further investigate the molecular requirements of the interaction between Myo VI-CBD and E-cadherin combining truncation mutation analysis with in vitro binding assays. We report that a short (28 amino acid) juxtamembrane region of the cadherin cytoplasmic tail is sufficient to bind Myo VI-CBD. However, central regions of the cadherin tail adjacent to the juxtamembrane sequence also display binding activity for Myo VI-CBD. It is therefore possible that the cadherin tail bears two binding sites for Myosin VI, or an extended binding site that includes the juxtamembrane region. Nevertheless, our biochemical data highlight the capacity for the juxtamembrane region to interact with functionally-significant cytoplasmic proteins. PMID:23007415
Protein unfolding as a switch from self-recognition to high-affinity client binding
Groitl, Bastian; Horowitz, Scott; Makepeace, Karl A. T.; Petrotchenko, Evgeniy V.; Borchers, Christoph H.; Reichmann, Dana; Bardwell, James C. A.; Jakob, Ursula
2016-01-01
Stress-specific activation of the chaperone Hsp33 requires the unfolding of a central linker region. This activation mechanism suggests an intriguing functional relationship between the chaperone's own partial unfolding and its ability to bind other partially folded client proteins. However, identifying where Hsp33 binds its clients has remained a major gap in our understanding of Hsp33's working mechanism. By using site-specific Fluorine-19 nuclear magnetic resonance experiments guided by in vivo crosslinking studies, we now reveal that the partial unfolding of Hsp33's linker region facilitates client binding to an amphipathic docking surface on Hsp33. Furthermore, our results provide experimental evidence for the direct involvement of conditionally disordered regions in unfolded protein binding. The observed structural similarities between Hsp33's own metastable linker region and client proteins present a possible model for how Hsp33 uses protein unfolding as a switch from self-recognition to high-affinity client binding. PMID:26787517
NASA Astrophysics Data System (ADS)
Bhakat, Soumendranath; Söderhjelm, Pär
2017-01-01
The funnel metadynamics method enables rigorous calculation of the potential of mean force along an arbitrary binding path and thereby evaluation of the absolute binding free energy. A problem of such physical paths is that the mechanism characterizing the binding process is not always obvious. In particular, it might involve reorganization of the solvent in the binding site, which is not easily captured with a few geometrically defined collective variables that can be used for biasing. In this paper, we propose and test a simple method to resolve this trapped-water problem by dividing the process into an artificial host-desolvation step and an actual binding step. We show that, under certain circumstances, the contribution from the desolvation step can be calculated without introducing further statistical errors. We apply the method to the problem of predicting host-guest binding free energies in the SAMPL5 blind challenge, using two octa-acid hosts and six guest molecules. For one of the hosts, well-converged results are obtained and the prediction of relative binding free energies is the best among all the SAMPL5 submissions. For the other host, which has a narrower binding pocket, the statistical uncertainties are slightly higher; longer simulations would therefore be needed to obtain conclusive results.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shin, Dae-Seop; Park, Myoung Joo; Lee, Hyang-Ae
2014-02-01
Nefazodone was used widely as an antidepressant until it was withdrawn from the U.S. market in 2004 due to hepatotoxicity. We have investigated methods to predict various toxic effects of drug candidates to reduce the failure rate of drug discovery. An electrophysiological method was used to assess the cardiotoxicity of drug candidates. Small molecules, including withdrawn drugs, were evaluated using a patch-clamp method to establish a database of hERG inhibition. Nefazodone inhibited hERG channel activity in our system. However, nefazodone-induced hERG inhibition indicated only a theoretical risk of cardiotoxicity. Nefazodone inhibited the hERG channel in a concentration-dependent manner with an IC50 of 45.3 nM in HEK-293 cells. Nefazodone accelerated both the recovery from inactivation and its onset. Nefazodone also accelerated steady-state inactivation, although it did not modify the voltage-dependent character. Alanine mutants of hERG S6 and pore region residues were used to identify the nefazodone-binding site on hERG. The hERG S6 point mutants Y652A and F656A largely abolished the inhibition by nefazodone. The pore region mutant S624A mildly reduced the inhibition by nefazodone but T623A had little effect. A docking study showed that the aromatic rings of nefazodone interact with Y652 and F656 via π–π interactions, while an amine interacted with the S624 residue in the pore region. In conclusion, Y652 and F656 in the S6 domain play critical roles in nefazodone binding. Highlights: • Nefazodone inhibits hERG channels with an IC50 of 45.3 nM in HEK-293 cells. • Nefazodone blocks hERG channels by binding to the open channels. • Y652 and F656 are important for binding of nefazodone. • The aromatic rings of nefazodone interact with Y652 and F656 via π–π interactions.
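Concentration-dependent inhibition with a reported IC50 is commonly summarised with a Hill-type dose-response curve. The sketch below shows that relationship for the 45.3 nM value quoted above. The Hill coefficient is not given in the abstract, so the value of 1.0 is an assumption, and the function name is hypothetical.

```python
def fraction_inhibited(conc_nM: float, ic50_nM: float = 45.3, hill: float = 1.0) -> float:
    """Fraction of current blocked at a given drug concentration.

    Uses the Hill equation: 1 / (1 + (IC50 / c)^n).
    Assumption: hill = 1.0, since the abstract does not report a Hill coefficient.
    """
    if conc_nM < 0:
        raise ValueError("concentration must be non-negative")
    if conc_nM == 0:
        return 0.0
    return 1.0 / (1.0 + (ic50_nM / conc_nM) ** hill)


half_block = fraction_inhibited(45.3)      # at the IC50, half of the current is blocked
ten_fold_block = fraction_inhibited(453.0) # ten times the IC50
```

By definition the curve passes through 0.5 at the IC50. With a Hill coefficient of 1, a ten-fold higher concentration gives roughly 91% block.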
Long-range coupling between ATP-binding and lever-arm regions in myosin via dielectric allostery
NASA Astrophysics Data System (ADS)
Sato, Takato; Ohnuki, Jun; Takano, Mitsunori
2017-12-01
A protein molecule is a dielectric substance, so the binding of a ligand is expected to induce a dielectric response in the protein molecule, considering that ligands are charged or polar in general. We previously reported that binding of adenosine triphosphate (ATP) to the molecular motor myosin actually induces such a dielectric response in myosin due to the net negative charge of ATP. Through this dielectric response, referred to as "dielectric allostery," two spatially separated regions in myosin, the ATP-binding region and the actin-binding region, are allosterically coupled. In this study, from statistically stringent analyses of extensive molecular dynamics simulation data obtained in the ATP-free and the ATP-bound states, we show that dielectric allostery also transmits the signal of ATP binding toward the distant lever-arm region. The ATP-binding-induced electrostatic potential change observed on the surface of the main domain induced a movement of the converter subdomain from which the lever arm extends. The dielectric response was found to be caused by an underlying large-scale concerted rearrangement of the electrostatic bond network, in which highly conserved charged/polar residues are involved. Our study suggests the importance of the dielectric property for molecular machines in exerting their function.
Xie, P; Yuan, C; Wang, C; Zou, X-T; Po, Z; Tong, H-B; Zou, J-M
2014-01-01
1. Peroxisome proliferator-activated receptors (PPAR) are involved in lipid metabolism through transcriptional regulation of target gene expression. The objective of the current study was to clone and characterise the PPARα and PPARγ genes in pigeon. 2. The full-length of 1941-bp PPARα and 1653-bp PPARγ were cloned from pigeons. The two genes were predicted to encode 468 and 475 amino acids, respectively. Both proteins contained two C4-type zinc fingers, a nuclear hormone receptor DNA-binding region signature and a HOLI domain (ligand binding domain of hormone receptors), and had high identities with other corresponding avian genes. 3. Using quantitative real-time PCR, pigeon PPARα gene expression was shown to be high in kidney, liver, gizzard and duodenum whereas PPARγ was predominantly expressed in adipose tissue.
Surtees, Jennifer A; Alani, Eric
2006-07-14
Genetic studies in Saccharomyces cerevisiae predict that the mismatch repair (MMR) factor MSH2-MSH3 binds and stabilizes branched recombination intermediates that form during single strand annealing and gene conversion. To test this model, we constructed a series of DNA substrates that are predicted to form during these recombination events. We show in an electrophoretic mobility shift assay that S. cerevisiae MSH2-MSH3 specifically binds branched DNA substrates containing 3' single-stranded DNA and that ATP stimulates its release from these substrates. Chemical footprinting analyses indicate that MSH2-MSH3 specifically binds at the double-strand/single-strand junction of branched substrates, alters its conformation and opens up the junction. Therefore, MSH2-MSH3 binding to its substrates creates a unique nucleoprotein structure that may signal downstream steps in repair that include interactions with MMR and nucleotide excision repair factors.
HMMBinder: DNA-Binding Protein Prediction Using HMM Profile Based Features.
Zaman, Rianon; Chowdhury, Shahana Yasmin; Rashid, Mahmood A; Sharma, Alok; Dehzangi, Abdollah; Shatabda, Swakkhar
2017-01-01
DNA-binding proteins often play an important role in various processes within the cell. Over the last decade, a wide range of classification algorithms and feature extraction techniques have been used to predict DNA-binding proteins. In this paper, we propose a novel DNA-binding protein prediction method called HMMBinder. HMMBinder uses monogram and bigram features extracted from the HMM profiles of the protein sequences. To the best of our knowledge, this is the first application of HMM profile based features for the DNA-binding protein prediction problem. We applied Support Vector Machines (SVM) as a classification technique in HMMBinder. Our method was tested on standard benchmark datasets. We experimentally show that our method outperforms the state-of-the-art methods found in the literature.
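Monogram and bigram features of a sequence profile can be sketched as below. The abstract does not give the exact formulas, so this follows a common convention for profile-based features (column means for monograms, averaged products of adjacent rows for bigrams), and that choice is an assumption. The function names and the tiny 3 x 2 toy profile are hypothetical; a real HMM profile would have 20 amino-acid columns, and the resulting vector would then be passed to an SVM.

```python
from typing import List

Profile = List[List[float]]  # one row per sequence position, one column per amino acid


def monogram(profile: Profile) -> List[float]:
    """Column-wise mean of the profile: one feature per column."""
    n_rows, n_cols = len(profile), len(profile[0])
    return [sum(row[j] for row in profile) / n_rows for j in range(n_cols)]


def bigram(profile: Profile) -> List[float]:
    """Feature (i, j): sum over k of P[k][i] * P[k+1][j], divided by the length.

    Assumption: normalising by the number of rows follows a common convention,
    not a formula stated in the abstract.
    """
    n_rows, n_cols = len(profile), len(profile[0])
    feats = []
    for i in range(n_cols):
        for j in range(n_cols):
            total = sum(profile[k][i] * profile[k + 1][j] for k in range(n_rows - 1))
            feats.append(total / n_rows)
    return feats


def feature_vector(profile: Profile) -> List[float]:
    """Concatenate monogram and bigram features (20 + 400 values for 20 columns)."""
    return monogram(profile) + bigram(profile)


# Hypothetical toy profile with 3 positions and 2 columns.
toy_profile = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
toy_features = feature_vector(toy_profile)
```

The feature length depends only on the number of columns, so proteins of different lengths map to vectors of the same size, which is what a fixed-input classifier such as an SVM needs.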
Moreira, Gustavo M. S. G.; Conceição, Fabricio R.; McBride, Alan J. A.; Pinto, Luciano da S.
2013-01-01
Bauhinia variegata lectins (BVL-I and BVL-II) are single chain lectins isolated from the plant Bauhinia variegata. Single chain lectins undergo post-translational processing on their N-terminal and C-terminal regions, which determines their physiological targeting, carbohydrate binding activity and pattern of quaternary association. These two lectins are isoforms, BVL-I being highly glycosylated, and thus far, it has not been possible to determine their structures. The present study used prediction and validation algorithms to elucidate the likely structures of BVL-I and -II. The program Bhageerath-H was chosen from among three different structure prediction programs due to its better overall reliability. In order to predict the C-terminal region cleavage sites, other lectins known to have this modification were analysed and three rules were created: (1) the first amino acid of the excised peptide is small or hydrophobic; (2) the cleavage occurs after an acid, polar, or hydrophobic residue, but not after a basic one; and (3) the cleavage spot is located 5-8 residues after a conserved Leu amino acid. These rules predicted that BVL-I and -II would have fifteen C-terminal residues cleaved, and this was confirmed experimentally by Edman degradation sequencing of BVL-I. Furthermore, the C-terminal analyses predicted that only BVL-II underwent α-helical folding in this region, similar to that seen in SBA and DBL. Conversely, BVL-I and -II contained four conserved regions of a GS-I association, providing evidence of a previously undescribed X4+unusual oligomerisation between the truncated BVL-I and the intact BVL-II. This is the first report on the structural analysis of lectins from Bauhinia spp. and therefore is important for the characterisation of C-terminal cleavage and patterns of quaternary association of single chain lectins. PMID:24260572
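The three cleavage rules stated above can be written as a small rule-based filter. This is a minimal sketch, not the authors' procedure: the residue groupings (which amino acids count as small, hydrophobic, polar and so on), the reading of rule 3 as "the last retained residue lies 5 to 8 positions after the conserved Leu", the function name and the toy sequence are all assumptions made for illustration.

```python
from typing import List

# Assumed residue classes; the abstract does not define them.
SMALL = set("GAST")
HYDROPHOBIC = set("AVLIMFWP")
ACIDIC = set("DE")
POLAR = set("STNQYC")
BASIC = set("KRH")


def candidate_cleavage_sites(seq: str, leu_index: int) -> List[int]:
    """Return 0-based indices of the first excised residue for every
    position that satisfies all three rules.

    leu_index is the position of the conserved Leu in seq.
    """
    if seq[leu_index] != "L":
        raise ValueError("leu_index must point at a leucine")
    sites = []
    for offset in range(5, 9):                 # rule 3: 5-8 residues after the Leu
        last_kept = leu_index + offset         # cleavage occurs after this residue
        first_excised = last_kept + 1
        if first_excised >= len(seq):
            break
        # Rule 1: first residue of the excised peptide is small or hydrophobic.
        rule1 = seq[first_excised] in (SMALL | HYDROPHOBIC)
        # Rule 2: cleavage after an acidic, polar or hydrophobic residue, never a basic one.
        rule2 = (seq[last_kept] in (ACIDIC | POLAR | HYDROPHOBIC)
                 and seq[last_kept] not in BASIC)
        if rule1 and rule2:
            sites.append(first_excised)
    return sites


# Hypothetical toy sequence with the conserved Leu at index 2.
toy_sites = candidate_cleavage_sites("AALKKKKDSKAKK", leu_index=2)
```

In the toy sequence only the position after the aspartate at index 7 passes all three rules, so the function returns a single candidate.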
González-Díaz, Humberto; Munteanu, Cristian R; Postelnicu, Lucian; Prado-Prado, Francisco; Gestal, Marcos; Pazos, Alejandro
2012-03-01
Lipid-Binding Proteins (LIBPs) or Fatty Acid-Binding Proteins (FABPs) play an important role in many diseases such as different types of cancer, kidney injury, atherosclerosis, diabetes, intestinal ischemia and parasitic infections. Thus, the computational methods that can predict LIBPs based on 3D structure parameters became a goal of major importance for drug-target discovery, vaccine design and biomarker selection. In addition, the Protein Data Bank (PDB) contains 3000+ protein 3D structures with unknown function. This list, as well as new experimental outcomes in proteomics research, is a very interesting source to discover relevant proteins, including LIBPs. However, to the best of our knowledge, there are no general models to predict new LIBPs based on 3D structures. We developed new Quantitative Structure-Activity Relationship (QSAR) models based on 3D electrostatic parameters of 1801 different proteins, including 801 LIBPs. We calculated these electrostatic parameters with the MARCH-INSIDE software and they correspond to the entire protein or to specific protein regions named core, inner, middle, and surface. We used these parameters as inputs to develop a simple Linear Discriminant Analysis (LDA) classifier to discriminate 3D structure of LIBPs from other proteins. We implemented this predictor in the web server named LIBP-Pred, freely available at , along with other important web servers of the Bio-AIMS portal. The users can carry out an automatic retrieval of protein structures from PDB or upload their custom protein structural models from their disk created with LOMETS server. We demonstrated the PDB mining option performing a predictive study of 2000+ proteins with unknown function. Interesting results regarding the discovery of new Cancer Biomarkers in humans or drug targets in parasites have been discussed here in this sense.
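A two-class Linear Discriminant Analysis classifier of the kind described above can be sketched in a few lines. This is an illustration under stated assumptions, not the LIBP-Pred model: it uses only two hypothetical electrostatic-style features per protein, equal class priors (threshold at the midpoint of the class means), and invented toy numbers that are not data from the study.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _mean(points: List[Point]) -> Point:
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def fit_lda(class_a: List[Point], class_b: List[Point]) -> Tuple[Point, float]:
    """Fisher's linear discriminant for two classes in two dimensions.

    Returns the weight vector w = S_w^-1 (mu_a - mu_b) and a decision
    threshold placed at the projected midpoint (assumes equal priors).
    """
    mu_a, mu_b = _mean(class_a), _mean(class_b)
    s00 = s01 = s11 = 0.0                      # pooled within-class scatter matrix
    for pts, mu in ((class_a, mu_a), (class_b, mu_b)):
        for x in pts:
            d0, d1 = x[0] - mu[0], x[1] - mu[1]
            s00 += d0 * d0
            s01 += d0 * d1
            s11 += d1 * d1
    det = s00 * s11 - s01 * s01
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = (mu_a[0] - mu_b[0], mu_a[1] - mu_b[1])
    w = ((s11 * diff[0] - s01 * diff[1]) / det,
         (s00 * diff[1] - s01 * diff[0]) / det)
    mid = ((mu_a[0] + mu_b[0]) / 2, (mu_a[1] + mu_b[1]) / 2)
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def predict_is_a(x: Point, w: Point, threshold: float) -> bool:
    """True if x falls on the class-A side of the discriminant."""
    return w[0] * x[0] + w[1] * x[1] > threshold


# Hypothetical toy features (not data from the study).
binders = [(2.0, 1.0), (3.0, 1.5), (2.5, 0.5), (3.5, 1.0)]
others = [(0.0, 0.0), (1.0, 0.5), (0.5, -0.5), (-0.5, 0.0)]
w_toy, t_toy = fit_lda(binders, others)
```

With more features the same idea applies, but the scatter matrix then needs a general linear solver rather than the 2 x 2 closed-form inverse used here.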