Dong, Runze; Pan, Shuo; Peng, Zhenling; Zhang, Yang; Yang, Jianyi
2018-05-21
With the rapid increase in the number of protein structures in the Protein Data Bank, it has become urgent to develop algorithms for efficient protein structure comparison. In this article, we present the mTM-align server, which consists of two closely related modules: one for structure database search and the other for multiple structure alignment. The database search is accelerated by a heuristic algorithm and a hierarchical organization of the structures in the database. The multiple structure alignment is performed using the recently developed algorithm mTM-align. Benchmark tests demonstrate that our algorithms outperform peer methods for both modules in terms of speed and accuracy. One of the unique features of the server is the interplay between database search and multiple structure alignment. The server not only performs fast database searches but also builds accurate multiple structure alignments from the structures found by the search. A database search takes about 2-5 min for a medium-sized structure (∼300 residues). A multiple structure alignment takes a few seconds for ∼10 medium-sized structures. The server is freely available at: http://yanglab.nankai.edu.cn/mTM-align/.
Zhang, Mingcheng; Li, Fangfei; Diao, Xinping; Kong, Baohua; Xia, Xiufang
2017-11-01
This study investigated the effects of multiple freeze-thaw (F-T) cycles on water mobility, microstructure damage and protein structure changes in porcine longissimus muscle. The transverse relaxation time T2 increased significantly when muscles were subjected to multiple F-T cycles (P<0.05), which means that immobile water shifted to free water and the free water mobility increased. Multiple F-T cycles caused sarcomere shortening, Z line fractures, and I band weakening and also led to microstructural destruction of muscle tissue. The decreased free amino group content and increased dityrosine in myofibrillar protein (MP) revealed that multiple F-T cycles caused protein cross-linking and oxidation. In addition, the results of size exclusion chromatography, circular dichroism spectra, UV absorption spectra, and intrinsic fluorescence spectroscopy indirectly proved that multiple F-T cycles could cause protein aggregation and degradation, α-helix structure disruption, hydrophobic domain exposure, and conformational changes of MP. Overall, repeated F-T cycles changed the protein structure and water distribution within meat. Copyright © 2017 Elsevier Ltd. All rights reserved.
Matt: local flexibility aids protein multiple structure alignment.
Menke, Matthew; Berger, Bonnie; Cowen, Lenore
2008-01-01
Even when there is agreement on what measure a protein multiple structure alignment should be optimizing, finding the optimal alignment is computationally prohibitive. One approach used by many previous methods is aligned fragment pair chaining, where short structural fragments from all the proteins are aligned against each other optimally, and the final alignment chains these together in geometrically consistent ways. Ye and Godzik have recently suggested that adding geometric flexibility may help better model protein structures in a variety of contexts. We introduce the program Matt (Multiple Alignment with Translations and Twists), an aligned fragment pair chaining algorithm that, in intermediate steps, allows local flexibility between fragments: small translations and rotations are temporarily allowed to bring sets of aligned fragments closer, even if they are physically impossible under rigid body transformations. After a dynamic programming assembly guided by these "bent" alignments, geometric consistency is restored in the final step before the alignment is output. Matt is tested against other recent multiple protein structure alignment programs on the popular Homstrad and SABmark benchmark datasets. Matt's global performance is competitive with the other programs on Homstrad, but outperforms the other programs on SABmark, a benchmark of multiple structure alignments of proteins with more distant homology. On both datasets, Matt demonstrates an ability to better align the ends of alpha-helices and beta-strands, an important characteristic of any structure alignment program intended to help construct a structural template library for threading approaches to the inverse protein-folding problem. The related question of whether Matt alignments can be used to distinguish distantly homologous structure pairs from pairs of proteins that are not homologous is also considered. 
For this purpose, a p-value score based on the length of the common core and average root mean squared deviation (RMSD) of Matt alignments is shown to largely separate decoys from homologous protein structures in the SABmark benchmark dataset. We postulate that Matt's strong performance comes from its ability to model proteins in different conformational states and, perhaps even more important, its ability to model backbone distortions in more distantly related proteins.
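The abstract above scores alignments by common-core length and average RMSD. As a minimal sketch of the RMSD part only, the function below computes the root mean squared deviation over paired atom coordinates. It assumes the two coordinate lists are already superposed and listed in aligned order; the function name `rmsd` and the input layout are illustrative choices, not taken from Matt.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equally long lists of (x, y, z).

    Assumption: the structures are already superposed and position i of one
    list is aligned to position i of the other (the "common core").
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Two aligned positions, each displaced by 1 unit along z.
core_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
core_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(len(core_a), rmsd(core_a, core_b))
```

The p-value described in the abstract would combine this quantity with the core length; how it does so is not stated in the abstract, so that step is left out here.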
Joseph, Agnel Praveen; Srinivasan, Narayanaswamy; de Brevern, Alexandre G
2012-09-01
Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used structural alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation into a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB-based alignments are used to obtain a three-dimensional fit of the structures, followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underline that the alignment quality is better than that of MULTIPROT, MUSTANG and the alignments in HOMSTRAD in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicates that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions.
Najibi, Seyed Morteza; Maadooliat, Mehdi; Zhou, Lan; Huang, Jianhua Z; Gao, Xin
2017-01-01
Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model the large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric splines, an approach that is more efficient than existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective to two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.
New paradigm in ankyrin repeats: Beyond protein-protein interaction module.
Islam, Zeyaul; Nagampalli, Raghavendra Sashi Krishna; Fatima, Munazza Tamkeen; Ashraf, Ghulam Md
2018-04-01
Classically, ankyrin repeat (ANK) proteins are built from tandems of two or more repeats and form curved solenoid structures that are associated with protein-protein interactions. The ANK repeat is a short, widespread structural motif of around 33 amino acids that repeats in tandem, has a canonical helix-loop-helix fold, and is found individually or in combination with other domains. The multiplicity of structural patterns enables these repeats to form assemblies of diverse sizes, which underlie their ability to confer multiple binding and structural roles on proteins. Three-dimensional structures of these repeats determined to date reveal a degree of structural variability that translates into the considerable functional versatility of this protein superfamily. Recent work on ANK proteins has provided novel structural information, especially on protein-lipid, protein-sugar and protein-protein interactions. Self-assembly of these repeats was also shown to prevent the associated protein from forming filaments. In this review, we summarize the latest findings and how the new structural information has increased our understanding of the structural determinants of ANK proteins. We discuss the latest findings on how these proteins participate in various interactions to diversify ANK roles in numerous biological processes, and explore the emerging and evolving field of designer ankyrins and its framework for protein engineering, with an emphasis on biotechnological applications. Copyright © 2017 Elsevier B.V. All rights reserved.
Multiple graph regularized protein domain ranking.
Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin
2012-11-19
Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.
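The idea of regularizing ranking scores with several combined graphs can be sketched as follows. This is not the MultiG-Rank implementation: it assumes fixed, user-supplied graph weights (the abstract says the weights are learned jointly with the scores, which is omitted here), an unnormalized graph Laplacian, and the common objective f^T L f + lam * ||f - y||^2, whose minimizer solves (L + lam I) f = lam y. All function names are hypothetical.

```python
def combine_graphs(graphs, weights):
    """Weighted sum of adjacency matrices (lists of lists) of equal size."""
    n = len(graphs[0])
    return [[sum(w * g[i][j] for g, w in zip(graphs, weights)) for j in range(n)]
            for i in range(n)]


def laplacian(adj):
    """Unnormalized Laplacian L = D - W of a symmetric adjacency matrix."""
    n = len(adj)
    return [[(sum(adj[i]) - adj[i][j]) if i == j else -adj[i][j] for j in range(n)]
            for i in range(n)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rank_scores(graphs, weights, query, lam=1.0):
    """Ranking scores f minimizing f^T L f + lam * ||f - query||^2."""
    lap = laplacian(combine_graphs(graphs, weights))
    n = len(query)
    system = [[lap[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    return solve(system, [lam * q for q in query])


# Toy database of three domains; each graph captures a different similarity.
g1 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
g2 = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
print(rank_scores([g1, g2], [0.5, 0.5], query=[1.0, 0.0, 0.0]))
```

In the toy example, domain 2 is linked to the query domain 0 only through the second graph (via domain 1), so it receives a small but non-zero score, which a ranking based on the first graph alone would not give it.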
Multiple graph regularized protein domain ranking
2012-01-01
Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. PMID:23157331
Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring
2012-01-01
Background: Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there have been indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. Results: The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appears to play an important role in the molecular structure and/or function.
Conclusions: Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family. PMID:22793672
Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring.
Durston, Kirk K; Chiu, David Ky; Wong, Andrew Kc; Li, Gary Cl
2012-07-13
Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there have been indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appears to play an important role in the molecular structure and/or function.
Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family.
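The interdependency between two aligned sites can be illustrated with a normalized mutual information computed from two alignment columns. The sketch below is an assumption-laden illustration, not the authors' code: the abstract does not say how the measure is normalized, so division by the joint entropy is assumed here, and gap characters are treated as ordinary symbols.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of a sequence."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def normalized_mi(col_x, col_y):
    """Mutual information of two alignment columns divided by joint entropy.

    Assumption: normalization by H(X, Y), giving a value between 0 and 1.
    """
    if len(col_x) != len(col_y):
        raise ValueError("columns must come from the same alignment")
    h_xy = entropy(list(zip(col_x, col_y)))
    if h_xy == 0.0:
        return 0.0  # both columns are fully conserved; no covariation to measure
    return (entropy(col_x) + entropy(col_y) - h_xy) / h_xy


# Columns are read down the alignment, one character per sequence.
print(normalized_mi("AAGG", "LLVV"))  # sites that always change together
print(normalized_mi("AAGG", "LVLV"))  # sites that vary independently
```

A k-modes style site clustering would then group columns so that this pairwise measure is high within each group; the clustering loop itself is not shown.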
Zhou, Carol L Ecale
2015-01-01
In order to better define regions of similarity among related protein structures, it is useful to identify the residue-residue correspondences among proteins. Few codes exist for constructing a one-to-many multiple sequence alignment derived from a set of structure or sequence alignments, and a need was evident for creating such a tool for combining pairwise structure alignments that would allow for insertion of gaps in the reference structure. This report describes a new Python code, CombAlign, which takes as input a set of pairwise sequence alignments (which may be structure based) and generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA). The use and utility of CombAlign was demonstrated by generating gapped MSSAs using sets of pairwise structure-based sequence alignments between structure models of the matrix protein (VP40) and pre-small/secreted glycoprotein (sGP) of Reston Ebolavirus and the corresponding proteins of several other filoviruses. The gapped MSSAs revealed structure-based residue-residue correspondences, which enabled identification of structurally similar versus differing regions in the Reston proteins compared to each of the other corresponding proteins. CombAlign is a new Python code that generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA) given a set of pairwise sequence alignments (which may be structure based). CombAlign has utility in assisting the user in distinguishing structurally conserved versus divergent regions on a reference protein structure relative to other closely related proteins. CombAlign was developed in Python 2.6, and the source code is available for download from the GitHub code repository.
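The core operation described above, merging pairwise alignments that share a reference into a one-to-many alignment with gaps inserted into the reference, can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the CombAlign source: each pairwise alignment is given as two equal-length gapped strings with `-` as the gap character, all pairs share the same ungapped reference, and residues inserted relative to the reference are left-justified within their insertion block. The function name `merge_pairwise` is hypothetical.

```python
def merge_pairwise(pairs):
    """Merge (reference_row, other_row) pairwise alignments into one MSSA.

    Returns a list of rows: the gapped reference first, then one row per pair.
    """
    ref_seq = pairs[0][0].replace("-", "")
    n = len(ref_seq)
    parsed = []
    for ref_row, other_row in pairs:
        if len(ref_row) != len(other_row) or ref_row.replace("-", "") != ref_seq:
            raise ValueError("pairs must be equal-length and share one reference")
        inserts = [""] * (n + 1)  # inserts[i]: residues placed before reference position i
        matches = []              # residue (or gap) aligned to each reference position
        pos = 0
        for r, o in zip(ref_row, other_row):
            if r == "-":
                if o != "-":
                    inserts[pos] += o
            else:
                matches.append(o)
                pos += 1
        parsed.append((inserts, matches))
    # Widest insertion seen before each reference position, across all pairs.
    widths = [max(len(ins[i]) for ins, _ in parsed) for i in range(n + 1)]
    rows = ["".join("-" * widths[i] + ref_seq[i] for i in range(n)) + "-" * widths[n]]
    for inserts, matches in parsed:
        body = "".join(inserts[i].ljust(widths[i], "-") + matches[i] for i in range(n))
        rows.append(body + inserts[n].ljust(widths[n], "-"))
    return rows


for row in merge_pairwise([("AC-D", "ACGD"), ("A-CD", "AWC-")]):
    print(row)
```

In the example, the reference `ACD` gains two gap columns, one for each insertion found in the other sequences, so that every row has the same length and reference residues stay in register.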
A cross docking pipeline for improving pose prediction and virtual screening performance
Kumar, Ashutosh; Zhang, Kam Y. J.
2018-01-01
Pose prediction and virtual screening performance of a molecular docking method depend on the choice of protein structures used for docking. Multiple structures for a target protein are often used to take into account the receptor flexibility and problems associated with a single receptor structure. However, the use of multiple receptor structures is computationally expensive when docking a large library of small molecules. Here, we propose a new cross-docking pipeline suitable to dock a large library of molecules while taking advantage of multiple target protein structures. Our method involves the selection of a suitable receptor for each ligand in a screening library utilizing ligand 3D shape similarity with crystallographic ligands. We have prospectively evaluated our method in D3R Grand Challenge 2 and demonstrated that our cross-docking pipeline can achieve similar or better performance than using either single or multiple-receptor structures. Moreover, our method displayed not only decent pose prediction performance but also better virtual screening performance over several other methods.
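The receptor-selection step can be sketched in a few lines. The example assumes that 3D shape similarity scores between each library ligand and the crystallographic ligand of each receptor structure have already been computed by some external tool; the nested dictionary layout, the names, and the scores are hypothetical and not taken from the paper.

```python
def assign_receptors(similarity):
    """Pick, for each ligand, the receptor whose crystallographic ligand is most similar.

    similarity[ligand][receptor] is a precomputed shape similarity score
    (higher means more similar); how it is computed is outside this sketch.
    """
    return {ligand: max(scores, key=scores.get) for ligand, scores in similarity.items()}


# Hypothetical scores for two library ligands against three receptor structures.
scores = {
    "lig_001": {"receptor_A": 0.41, "receptor_B": 0.78, "receptor_C": 0.55},
    "lig_002": {"receptor_A": 0.83, "receptor_B": 0.30, "receptor_C": 0.62},
}
print(assign_receptors(scores))
```

Each ligand is then docked once, against its selected receptor, rather than against every available structure, which is where the saving in computation comes from.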
del Val, Coral; White, Stephen H.
2014-01-01
We combined systematic bioinformatics analyses and molecular dynamics simulations to assess the conservation patterns of Ser and Thr motifs in membrane proteins, and the effect of such motifs on the structure and dynamics of α-helical transmembrane (TM) segments. We find that Ser/Thr motifs are often present in β-barrel TM proteins. At least one Ser/Thr motif is present in almost half of the sequences of α-helical proteins analyzed here. The extensive bioinformatics analyses and inspection of protein structures led to the identification of molecular transporters with noticeable numbers of Ser/Thr motifs within the TM region. Given the energetic penalty for burying multiple Ser/Thr groups in the membrane hydrophobic core, the observation of transporters with multiple membrane-embedded Ser/Thr is intriguing and raises the question of how the presence of multiple Ser/Thr affects protein local structure and dynamics. Molecular dynamics simulations of four different Ser-containing model TM peptides indicate that backbone hydrogen bonding of membrane-buried Ser/Thr hydroxyl groups can significantly change the local structure and dynamics of the helix. Ser groups located close to the membrane interface can hydrogen bond to solvent water instead of protein backbone, leading to an enhanced local solvation of the peptide. PMID:22836667
Pandey, Aditya; Shin, Kyungsoo; Patterson, Robin E; Liu, Xiang-Qin; Rainey, Jan K
2016-12-01
Membrane proteins are still heavily under-represented in the protein data bank (PDB), owing to multiple bottlenecks. The typical low abundance of membrane proteins in their natural hosts makes it necessary to overexpress these proteins either in heterologous systems or through in vitro translation/cell-free expression. Heterologous expression of proteins, in turn, leads to multiple obstacles, owing to the unpredictability of compatibility of the target protein for expression in a given host. The highly hydrophobic and (or) amphipathic nature of membrane proteins also leads to challenges in producing a homogeneous, stable, and pure sample for structural studies. Circumventing these hurdles has become possible through the introduction of novel protein production protocols; efficient protein isolation and sample preparation methods; and, improvement in hardware and software for structural characterization. Combined, these advances have made the past 10-15 years very exciting and eventful for the field of membrane protein structural biology, with an exponential growth in the number of solved membrane protein structures. In this review, we focus on both the advances and diversity of protein production and purification methods that have allowed this growth in structural knowledge of membrane proteins through X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM).
Pandey, Aditya; Shin, Kyungsoo; Patterson, Robin E.; Liu, Xiang-Qin; Rainey, Jan K.
2017-01-01
Membrane proteins are still heavily underrepresented in the protein data bank (PDB) due to multiple bottlenecks. The typical low abundance of membrane proteins in their natural hosts makes it necessary to overexpress these proteins either in heterologous systems or through in vitro translation/cell-free expression. Heterologous expression of proteins, in turn, leads to multiple obstacles due to the unpredictability of compatibility of the target protein for expression in a given host. The highly hydrophobic and/or amphipathic nature of membrane proteins also leads to challenges in producing a homogeneous, stable, and pure sample for structural studies. Circumventing these hurdles has become possible through introduction of novel protein production protocols; efficient protein isolation and sample preparation methods; and, improvement in hardware and software for structural characterization. Combined, these advances have made the past 10–15 years very exciting and eventful for the field of membrane protein structural biology, with an exponential growth in the number of solved membrane protein structures. In this review, we focus on both the advances and diversity of protein production and purification methods that have allowed this growth in structural knowledge of membrane proteins through X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM). PMID:27010607
GeneSilico protein structure prediction meta-server.
Kurowski, Michal A; Bujnicki, Janusz M
2003-07-01
Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta.
GeneSilico protein structure prediction meta-server
Kurowski, Michal A.; Bujnicki, Janusz M.
2003-01-01
Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta. PMID:12824313
A minimalist model protein with multiple folding funnels
Locker, C. Rebecca; Hernandez, Rigoberto
2001-01-01
Kinetic and structural studies of wild-type proteins such as prions and amyloidogenic proteins provide suggestive evidence that proteins may adopt multiple long-lived states in addition to the native state. All of these states differ structurally because they lie far apart in configuration space, but their stability is not necessarily caused by cooperative (nucleation) effects. In this study, a minimalist model protein is designed to exhibit multiple long-lived states to explore the dynamics of the corresponding wild-type proteins. The minimalist protein is modeled as a 27-monomer sequence confined to a cubic lattice with three different monomer types. An order parameter—the winding index—is introduced to characterize the extent of folding. The winding index has several advantages over other commonly used order parameters like the number of native contacts. It can distinguish between enantiomers, its calculation requires less computational time than the number of native contacts, and reduced-dimensional landscapes can be developed when the native state structure is not known a priori. The results for the designed model protein prove by existence that the rugged energy landscape picture of protein folding can be generalized to include protein “misfolding” into long-lived states. PMID:11470921
O'Donoghue, Patrick; Luthey-Schulten, Zaida
2005-02-25
We present a new algorithm, based on the multidimensional QR factorization, to remove redundancy from a multiple structural alignment by choosing representative protein structures that best preserve the phylogenetic tree topology of the homologous group. The classical QR factorization with pivoting, developed as a fast numerical solution to eigenvalue and linear least-squares problems of the form Ax=b, was designed to re-order the columns of A by increasing linear dependence. Removing the most linearly dependent columns from A leads to the formation of a minimal basis set that spans the phase space of the problem at hand well. By recasting the problem of redundancy in multiple structural alignments into this framework, in which the matrix A now describes the multiple alignment, we adapted the QR factorization to produce a minimal basis set of protein structures which best spans the evolutionary (phase) space. The non-redundant and representative profiles obtained from this procedure, termed evolutionary profiles, are shown in initial results to outperform well-tested profiles in homology detection searches over a large sequence database. A measure of structural similarity between homologous proteins, Q(H), is presented. By properly accounting for the effect and presence of gaps, a phylogenetic tree computed using this metric is shown to be congruent with the maximum-likelihood sequence-based phylogeny. The results indicate that evolutionary information is indeed recoverable from the comparative analysis of protein structure alone. Applications of the QR ordering and this structural similarity metric to analyze the evolution of structure among key, universally distributed proteins involved in translation, and to the selection of representatives from an ensemble of NMR structures are also discussed.
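The column re-ordering performed by a pivoted QR factorization can be illustrated on a small matrix. The sketch below covers only the classical two-dimensional case, not the multidimensional variant the abstract introduces, and it uses greedy Gram-Schmidt orthogonalization with column pivoting rather than Householder reflections; in exact arithmetic both select pivots by the same rule (largest remaining residual norm). The function name and the tolerance are illustrative choices.

```python
import math


def pivoted_column_order(columns, tol=1e-12):
    """Order columns from least to most linearly dependent on those chosen before.

    At each step the column with the largest residual norm is selected, and its
    direction is projected out of all remaining columns.
    """
    residual = [list(col) for col in columns]
    remaining = list(range(len(columns)))
    order = []
    while remaining:
        norms = {j: math.sqrt(sum(x * x for x in residual[j])) for j in remaining}
        best = max(remaining, key=lambda j: norms[j])
        order.append(best)
        remaining.remove(best)
        if norms[best] > tol:
            q = [x / norms[best] for x in residual[best]]
            for j in remaining:
                dot = sum(a * b for a, b in zip(q, residual[j]))
                residual[j] = [a - dot * b for a, b in zip(residual[j], q)]
    return order


# Column 1 is nearly parallel to column 0, so it is ordered last.
cols = [(1.0, 0.0, 0.0), (0.9, 0.1, 0.0), (0.0, 0.0, 2.0)]
print(pivoted_column_order(cols))
```

Truncating the returned order after the first few entries gives a reduced set of columns; in the setting of the abstract, the columns would describe structures in a multiple alignment and the retained ones would serve as representatives.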
Ashford, Paul; Moss, David S; Alex, Alexander; Yeap, Siew K; Povia, Alice; Nobeli, Irene; Williams, Mark A
2012-03-14
Protein structures provide a valuable resource for rational drug design. For a protein with no known ligand, computational tools can predict surface pockets that are of suitable size and shape to accommodate a complementary small-molecule drug. However, pocket prediction against single static structures may miss features of pockets that arise from proteins' dynamic behaviour. In particular, ligand-binding conformations can be observed as transiently populated states of the apo protein, so it is possible to gain insight into ligand-bound forms by considering conformational variation in apo proteins. This variation can be explored by considering sets of related structures: computationally generated conformers, solution NMR ensembles, multiple crystal structures, homologues or homology models. It is non-trivial to compare pockets, either from different programs or across sets of structures. For a single structure, difficulties arise in defining a particular pocket's boundaries. For a set of conformationally distinct structures, the challenge is how to make reasonable comparisons between them given that a perfect structural alignment is not possible. We have developed a computational method, Provar, that provides a consistent representation of predicted binding pockets across sets of related protein structures. The outputs are probabilities that each atom or residue of the protein borders a predicted pocket. These probabilities can be readily visualised on a protein using existing molecular graphics software. We show how Provar simplifies comparison of the outputs of different pocket prediction algorithms, of pockets across multiple simulated conformations and between homologous structures.
We demonstrate the benefits of use of multiple structures for protein-ligand and protein-protein interface analysis on a set of complexes and consider three case studies in detail: i) analysis of a kinase superfamily highlights the conserved occurrence of surface pockets at the active and regulatory sites; ii) a simulated ensemble of unliganded Bcl2 structures reveals extensions of a known ligand-binding pocket not apparent in the apo crystal structure; iii) visualisations of interleukin-2 and its homologues highlight conserved pockets at the known receptor interfaces and regions whose conformation is known to change on inhibitor binding. Through post-processing of the output of a variety of pocket prediction software, Provar provides a flexible approach to the analysis and visualization of the persistence or variability of pockets in sets of related protein structures.
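One simple way to read the per-residue probabilities mentioned in the abstract above is as the fraction of structures in a set in which a residue lines a predicted pocket. The sketch below assumes exactly that interpretation; it is not Provar's code, and the function name `pocket_lining_frequency` and the residue labels are hypothetical.

```python
from collections import Counter


def pocket_lining_frequency(pocket_residues_per_structure):
    """Fraction of structures in which each residue borders a predicted pocket.

    `pocket_residues_per_structure` holds one collection of residue labels
    per structure (for example per conformer or per homologue, after the
    residues have been mapped onto a common numbering).
    """
    n_structures = len(pocket_residues_per_structure)
    if n_structures == 0:
        raise ValueError("need at least one structure")
    counts = Counter()
    for residues in pocket_residues_per_structure:
        counts.update(set(residues))  # count each residue once per structure
    return {res: c / n_structures for res, c in counts.items()}


if __name__ == "__main__":
    # Hypothetical pocket-lining residues from four conformers.
    ensemble = [{"A45", "A46"}, {"A45"}, {"A45", "A90"}, {"A46"}]
    for residue, p in sorted(pocket_lining_frequency(ensemble).items()):
        print(residue, p)
```

Values close to 1 would mark residues that persistently line a pocket across the set, and values close to 0 would mark transient ones.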
DNA wrapping and distortion by an oligomeric homeodomain protein.
Williams, Hannah; Jayaraman, Padma-Sheela; Gaston, Kevin
2008-10-31
Many transcription factors alter DNA or chromatin structure. Changes in chromatin structure are often brought about by the recruitment of chromatin-binding proteins, chromatin-modifying proteins, or other transcription co-activator or co-repressor proteins. However, some transcription factors form oligomeric assemblies that may themselves induce changes in DNA conformation and chromatin structure. The proline-rich homeodomain (PRH/Hex) protein is a transcription factor that regulates cell differentiation and cell proliferation, and has multiple roles in embryonic development. Earlier, we showed that PRH can repress transcription by multiple mechanisms, including the recruitment of co-repressor proteins belonging to the TLE family of chromatin-binding proteins. Our in vivo crosslinking studies have shown that PRH forms oligomeric complexes in cells and a variety of biophysical techniques suggest that the protein forms octamers. However, as yet we have little knowledge of the role played by PRH oligomerisation in the regulation of promoter activity or of the architecture of promoters that are regulated directly by PRH in cells. Here, we compare the binding of PRH and the isolated PRH homeodomain to DNA fragments with single and multiple PRH sites, using gel retardation assays and DNase I and chemical footprinting. We show that the PRH oligomer binds to multiple sites within the human Goosecoid promoter with high affinity and that the binding of PRH brings about DNA distortion. We suggest that PRH octamers wrap DNA in order to bring about transcriptional repression.
Jefferson, Emily R.; Walsh, Thomas P.; Roberts, Timothy J.; Barton, Geoffrey J.
2007-01-01
SNAPPI-DB, a high performance database of Structures, iNterfaces and Alignments of Protein–Protein Interactions, and its associated Java Application Programming Interface (API) are described. SNAPPI-DB contains structural data, down to the level of atom co-ordinates, for each structure in the Protein Data Bank (PDB) together with associated data including SCOP, CATH, Pfam, SWISSPROT, InterPro, GO terms, Protein Quaternary Structures (PQS) and secondary structure information. Domain–domain interactions are stored for multiple domain definitions and are classified by their Superfamily/Family pair and interaction interface. Each set of classified domain–domain interactions has an associated multiple structure alignment for each partner. The API facilitates data access via PDB entries, domains and domain–domain interactions. Rapid development, fast database access and the ability to perform advanced queries without the requirement for complex SQL statements are provided via an object oriented database and the Java Data Objects (JDO) API. SNAPPI-DB contains many features which are not available in other databases of structural protein–protein interactions. It has been applied in three studies on the properties of protein–protein interactions and is currently being employed to train a protein–protein interaction predictor and a functional residue predictor. The database, API and manual are available for download. PMID:17202171
Discriminative structural approaches for enzyme active-site prediction.
Kato, Tsuyoshi; Nagano, Nozomi
2011-02-15
Predicting enzyme active-sites in proteins is an important issue not only for protein sciences but also for a variety of practical applications such as drug design. Because enzyme reaction mechanisms are based on the local structures of enzyme active-sites, various template-based methods that compare local structures in proteins have been developed to date. In comparing such local sites, a simple measurement, RMSD, has been used so far. This paper introduces new machine learning algorithms that refine the similarity/deviation for comparison of local structures. The similarity/deviation is applied to two types of applications, single template analysis and multiple template analysis. In the single template analysis, a single template is used as a query to search proteins for active sites, whereas a protein structure is examined as a query to discover the possible active-sites using a set of templates in the multiple template analysis. This paper experimentally illustrates that the machine learning algorithms effectively improve the similarity/deviation measurements for both the analyses.
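For reference, the baseline measurement named in the abstract above, RMSD, can be written in a few lines. The sketch assumes the two local sites contain the same atoms in the same order and have already been superposed; it does not perform the superposition and does not reproduce the learned similarity/deviation proposed in the paper. The function name `rmsd` and the coordinates are illustrative.

```python
import math


def rmsd(site_a, site_b):
    """Root-mean-square deviation between two equally sized coordinate lists.

    Each site is a sequence of (x, y, z) tuples; atom i of `site_a` is
    compared with atom i of `site_b`.
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("sites must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(site_a, site_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(site_a))


if __name__ == "__main__":
    template = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    candidate = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(rmsd(template, candidate))
```

Because RMSD weights every atom equally, a single displaced atom can dominate the score, which is one plausible motivation for refining the measure.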
MoonProt: a database for proteins that are known to moonlight
Mani, Mathew; Chen, Chang; Amblee, Vaishak; Liu, Haipeng; Mathur, Tanu; Zwicke, Grant; Zabad, Shadi; Patel, Bansi; Thakkar, Jagravi; Jeffery, Constance J.
2015-01-01
Moonlighting proteins comprise a class of multifunctional proteins in which a single polypeptide chain performs multiple biochemical functions that are not due to gene fusions, multiple RNA splice variants or pleiotropic effects. The known moonlighting proteins perform a variety of diverse functions in many different cell types and species, and information about their structures and functions is scattered in many publications. We have constructed the manually curated, searchable, internet-based MoonProt Database (http://www.moonlightingproteins.org) with information about the over 200 proteins that have been experimentally verified to be moonlighting proteins. The availability of this organized information provides a more complete picture of what is currently known about moonlighting proteins. The database will also aid researchers in other fields, including determining the functions of genes identified in genome sequencing projects, interpreting data from proteomics projects and annotating protein sequence and structural databases. In addition, information about the structures and functions of moonlighting proteins can be helpful in understanding how novel protein functional sites evolved on an ancient protein scaffold, which can also help in the design of proteins with novel functions. PMID:25324305
Three-Dimensional Structures Reveal Multiple ADP/ATP Binding Modes
DOE Office of Scientific and Technical Information (OSTI.GOV)
C Simmons; C Magee; D Smith
The creation of synthetic enzymes with predefined functions represents a major challenge in future synthetic biology applications. Here, we describe six structures of de novo proteins that have been determined using protein crystallography to address how simple enzymes perform catalysis. Three structures are of a protein, DX, selected for its stability and ability to tightly bind ATP. Despite the addition of ATP to the crystallization conditions, the presence of a bound but distorted ATP was found only under excess ATP conditions, with ADP being present under equimolar conditions or when crystallized for a prolonged period of time. A bound ADP cofactor was evident when Asp was substituted for Val at residue 65, but ATP in a linear configuration is present when Phe was substituted for Tyr at residue 43. These new structures complement previously determined structures of DX and the protein with the Phe 43 to Tyr substitution [Simmons, C. R., et al. (2009) ACS Chem. Biol. 4, 649-658] and together demonstrate the multiple ADP/ATP binding modes from which a model emerges in which the DX protein binds ATP in a configuration that represents a transitional state for the catalysis of ATP to ADP through a slow, metal-free reaction capable of multiple turnovers. This unusual observation suggests that design-free methods can be used to generate novel protein scaffolds that are tailor-made for catalysis.
ERIC Educational Resources Information Center
Terrell, Cassidy R.; Listenberger, Laura L.
2017-01-01
Recognizing that undergraduate students can benefit from analysis of 3D protein structure and function, we have developed a multiweek, inquiry-based molecular visualization project for Biochemistry I students. This project uses a virtual model of cyclooxygenase-1 (COX-1) to guide students through multiple levels of protein structure analysis. The…
The β-Arrestins: Multifunctional Regulators of G Protein-coupled Receptors*
Smith, Jeffrey S.; Rajagopal, Sudarshan
2016-01-01
The β-arrestins (βarrs) are versatile, multifunctional adapter proteins that are best known for their ability to desensitize G protein-coupled receptors (GPCRs), but also regulate a diverse array of cellular functions. To signal in such a complex fashion, βarrs adopt multiple conformations and are regulated at multiple levels to differentially activate downstream pathways. Recent structural studies have demonstrated that βarrs have a conserved structure and activation mechanism, with plasticity of their structural fold, allowing them to adopt a wide array of conformations. Novel roles for βarrs continue to be identified, demonstrating the importance of these dynamic regulators of cellular signaling. PMID:26984408
Prediction of β-turns in proteins from multiple alignment using neural network
Kaur, Harpreet; Raghava, Gajendra Pal Singh
2003-01-01
A neural network-based method has been developed for the prediction of β-turns in proteins by using multiple sequence alignment. Two feed-forward back-propagation networks with a single hidden layer are used where the first-sequence structure network is trained with the multiple sequence alignment in the form of PSI-BLAST–generated position-specific scoring matrices. The initial predictions from the first network and PSIPRED-predicted secondary structure are used as input to the second structure-structure network to refine the predictions obtained from the first net. A significant improvement in prediction accuracy has been achieved by using evolutionary information contained in the multiple sequence alignment. The final network yields an overall prediction accuracy of 75.5% when tested by sevenfold cross-validation on a set of 426 nonhomologous protein chains. The corresponding Qpred, Qobs, and Matthews correlation coefficient values are 49.8%, 72.3%, and 0.43, respectively, and are the best among all the previously published β-turn prediction methods. The Web server BetaTPred2 (http://www.imtech.res.in/raghava/betatpred2/) has been developed based on this approach. PMID:12592033
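The four figures quoted in the abstract above are all derived from a residue-level confusion matrix. The sketch below uses the definitions common in the β-turn prediction literature, which are assumed here rather than taken from the abstract: overall accuracy is the percentage of residues classified correctly, Qpred the percentage of predicted turn residues that are correct, and Qobs the percentage of observed turn residues that are recovered. The function name `turn_metrics` and the counts are illustrative.

```python
import math


def turn_metrics(tp, tn, fp, fn):
    """Accuracy measures from residue-level counts.

    tp: turn residues predicted as turn      fn: turn residues missed
    tn: non-turn residues predicted non-turn fp: non-turn residues called turn
    Percentages are returned on a 0-100 scale, MCC on a -1..1 scale.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no residues to score")
    q_total = 100.0 * (tp + tn) / total
    q_pred = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    q_obs = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Qtotal": q_total, "Qpred": q_pred, "Qobs": q_obs, "MCC": mcc}


if __name__ == "__main__":
    # Made-up counts, not data from the paper.
    print(turn_metrics(tp=40, tn=40, fp=10, fn=10))
```

Reporting Qpred and Qobs alongside overall accuracy matters because turn residues are a minority class: a predictor can score well overall while over- or under-predicting turns.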
Parallel seed-based approach to multiple protein structure similarities detection
Chapuis, Guillaume; Le Boudic-Jamin, Mathilde; Andonov, Rumen; ...
2015-01-01
Finding similarities between protein structures is a crucial task in molecular biology. Most of the existing tools require proteins to be aligned in an order-preserving way and only find single alignments even when multiple similar regions exist. We propose a new seed-based approach that discovers multiple pairs of similar regions. Its computational complexity is polynomial and it comes with a quality guarantee: the returned alignments have both root mean squared deviations (coordinate-based as well as internal-distances based) lower than a given threshold, if such exist. We do not require the alignments to be order preserving (i.e., we consider nonsequential alignments), which makes our algorithm suitable for detecting similar domains when comparing multidomain proteins as well as to detect structural repetitions within a single protein. Because the search space for nonsequential alignments is much larger than for sequential ones, the computational burden is addressed by extensive use of parallel computing techniques: a coarse-grain level parallelism making use of available CPU cores for computation and a fine-grain level parallelism exploiting bit-level concurrency as well as vector instructions.
Interactive comparison and remediation of collections of macromolecular structures.
Moriarty, Nigel W; Liebschner, Dorothee; Klei, Herbert E; Echols, Nathaniel; Afonine, Pavel V; Headd, Jeffrey J; Poon, Billy K; Adams, Paul D
2018-01-01
Often similar structures need to be compared to reveal local differences throughout the entire model or between related copies within the model. Therefore, a program to compare multiple structures and enable correction of any differences not supported by the density map was written within the Phenix framework (Adams et al., Acta Cryst 2010; D66:213-221). This program, called Structure Comparison, can also be used for structures with multiple copies of the same protein chain in the asymmetric unit, that is, as a result of non-crystallographic symmetry (NCS). Structure Comparison was designed to interface with Coot (Emsley et al., Acta Cryst 2010; D66:486-501) and PyMOL (DeLano, PyMOL 0.99; 2002) to facilitate comparison of large numbers of related structures. Structure Comparison analyzes collections of protein structures using several metrics, such as the rotamer conformation of equivalent residues, displays the results in tabular form and allows superimposed protein chains and density maps to be quickly inspected and edited (via the tools in Coot) for consistency, completeness and correctness. © 2017 The Protein Society.
Song, Jiangning; Yuan, Zheng; Tan, Hao; Huber, Thomas; Burrage, Kevin
2007-12-01
Disulfide bonds are primary covalent crosslinks between two cysteine residues in proteins that play critical roles in stabilizing the protein structures and are commonly found in extracytoplasmic or secreted proteins. In protein folding prediction, the localization of disulfide bonds can greatly reduce the search in conformational space. Therefore, there is a great need to develop computational methods capable of accurately predicting disulfide connectivity patterns in proteins that could have potentially important applications. We have developed a novel method to predict disulfide connectivity patterns from protein primary sequence, using a support vector regression (SVR) approach based on multiple sequence feature vectors and predicted secondary structure by the PSIPRED program. The results indicate that our method could achieve a prediction accuracy of 74.4% and 77.9%, respectively, when averaged on proteins with two to five disulfide bridges using 4-fold cross-validation, measured on the protein and cysteine pair on a well-defined non-homologous dataset. We assessed the effects of different sequence encoding schemes on the prediction performance of disulfide connectivity. It has been shown that the sequence encoding scheme based on multiple sequence feature vectors coupled with predicted secondary structure can significantly improve the prediction accuracy, thus enabling our method to outperform most other currently available predictors. Our work provides a complementary approach to the current algorithms that should be useful in computationally assigning disulfide connectivity patterns and helps in the annotation of protein sequences generated by large-scale whole-genome projects. The prediction web server and Supplementary Material are accessible at http://foo.maths.uq.edu.au/~huber/disulfide
Karasawa, N; Mitsutake, A; Takano, H
2017-12-01
Proteins implement their functionalities when folded into specific three-dimensional structures, and their functions are related to the protein structures and dynamics. Previously, we applied a relaxation mode analysis (RMA) method to protein systems; this method approximately estimates the slow relaxation modes and times via simulation and enables investigation of the dynamic properties underlying the protein structural fluctuations. Recently, two-step RMA with multiple evolution times has been proposed and applied to a slightly complex homopolymer system, i.e., a single [n]polycatenane. This method can be applied to more complex heteropolymer systems, i.e., protein systems, to estimate the relaxation modes and times more accurately. In two-step RMA, we first perform RMA and obtain rough estimates of the relaxation modes and times. Then, we apply RMA with multiple evolution times to a small number of the slowest relaxation modes obtained in the previous calculation. Herein, we apply this method to the results of principal component analysis (PCA). First, PCA is applied to a 2-μs molecular dynamics simulation of hen egg-white lysozyme in aqueous solution. Then, the two-step RMA method with multiple evolution times is applied to the obtained principal components. The slow relaxation modes and corresponding relaxation times for the principal components are much improved by the second RMA.
Targeting of a Nuclease to Murine Leukemia Virus Capsids Inhibits Viral Multiplication
NASA Astrophysics Data System (ADS)
Natsoulis, Georges; Seshaiah, Partha; Federspiel, Mark J.; Rein, Alan; Hughes, Stephen H.; Boeke, Jef D.
1995-01-01
Capsid-targeted viral inactivation is an antiviral strategy in which toxic fusion proteins are targeted to virions, where they inhibit viral multiplication by destroying viral components. These fusion proteins consist of a virion structural protein moiety and an enzymatic moiety such as a nuclease. Such fusion proteins can severely inhibit transposition of yeast retrotransposon Ty1, an element whose transposition mechanistically resembles retroviral multiplication. We demonstrate that expression of a murine retrovirus capsid-staphylococcal nuclease fusion protein inhibits multiplication of the corresponding murine leukemia virus by 30- to 100-fold. Staphylococcal nuclease is apparently inactive intracellularly and hence nontoxic to the host cell, but it is active extracellularly because of its requirement for high concentrations of Ca2+ ions. Virions assembled in and shed from cells expressing the fusion protein contain very small amounts of intact viral RNA, as would be predicted for nuclease-mediated inhibition of viral multiplication.
Ko, Junsu; Park, Hahnbeom; Seok, Chaok
2012-08-10
Protein structures can be reliably predicted by template-based modeling (TBM) when experimental structures of homologous proteins are available. However, it is challenging to obtain structures more accurate than the single best templates by either combining information from multiple templates or by modeling regions that vary among templates or are not covered by any templates. We introduce GalaxyTBM, a new TBM method in which the more reliable core region is modeled first from multiple templates and less reliable, variable local regions, such as loops or termini, are then detected and re-modeled by an ab initio method. This TBM method is based on "Seok-server," which was tested in CASP9 and assessed to be amongst the top TBM servers. The accuracy of the initial core modeling is enhanced by focusing on more conserved regions in the multiple-template selection and multiple sequence alignment stages. Additional improvement is achieved by ab initio modeling of up to 3 unreliable local regions in the fixed framework of the core structure. Overall, GalaxyTBM reproduced the performance of Seok-server, with GalaxyTBM and Seok-server resulting in average GDT-TS of 68.1 and 68.4, respectively, when tested on 68 single-domain CASP9 TBM targets. For application to multi-domain proteins, GalaxyTBM must be combined with domain-splitting methods. Application of GalaxyTBM to CASP9 targets demonstrates that accurate protein structure prediction is possible by use of a multiple-template-based approach, and ab initio modeling of variable regions can further enhance the model quality.
NASA Astrophysics Data System (ADS)
Guinn, Emily J.; Jagannathan, Bharat; Marqusee, Susan
2015-04-01
A fundamental question in protein folding is whether proteins fold through one or multiple trajectories. While most experiments indicate a single pathway, simulations suggest proteins can fold through many parallel pathways. Here, we use a combination of chemical denaturant, mechanical force and site-directed mutations to demonstrate the presence of multiple unfolding pathways in a simple, two-state folding protein. We show that these multiple pathways have structurally different transition states, and that seemingly small changes in protein sequence and environment can strongly modulate the flux between the pathways. These results suggest that in vivo, the crowded cellular environment could strongly influence the mechanisms of protein folding and unfolding. Our study resolves the apparent dichotomy between experimental and theoretical studies, and highlights the advantage of using a multipronged approach to reveal the complexities of a protein's free-energy landscape.
PDBStat: a universal restraint converter and restraint analysis software package for protein NMR.
Tejero, Roberto; Snyder, David; Mao, Binchen; Aramini, James M; Montelione, Gaetano T
2013-08-01
The heterogeneous array of software tools used in the process of protein NMR structure determination presents organizational challenges in the structure determination and validation processes, and creates a learning curve that limits the broader use of protein NMR in biology. These challenges, including accurate use of data in different data formats required by software carrying out similar tasks, continue to confound the efforts of novices and experts alike. These important issues need to be addressed robustly in order to standardize protein NMR structure determination and validation. PDBStat is a C/C++ computer program originally developed as a universal coordinate and protein NMR restraint converter. Its primary function is to provide a user-friendly tool for interconverting between protein coordinate and protein NMR restraint data formats. It also provides an integrated set of computational methods for protein NMR restraint analysis and structure quality assessment, relabeling of prochiral atoms with correct IUPAC names, as well as multiple methods for analysis of the consistency of atomic positions indicated by their convergence across a protein NMR ensemble. In this paper we provide a detailed description of the PDBStat software, and highlight some of its valuable computational capabilities. As an example, we demonstrate the use of the PDBStat restraint converter for restrained CS-Rosetta structure generation calculations, and compare the resulting protein NMR structure models with those generated from the same NMR restraint data using more traditional structure determination methods. These results demonstrate the value of a universal restraint converter in allowing the use of multiple structure generation methods with the same restraint data for consensus analysis of protein NMR structures and the underlying restraint data.
PDBStat: A Universal Restraint Converter and Restraint Analysis Software Package for Protein NMR
Tejero, Roberto; Snyder, David; Mao, Binchen; Aramini, James M.; Montelione, Gaetano T
2013-01-01
The heterogeneous array of software tools used in the process of protein NMR structure determination presents organizational challenges in the structure determination and validation processes, and creates a learning curve that limits the broader use of protein NMR in biology. These challenges, including accurate use of data in different data formats required by software carrying out similar tasks, continue to confound the efforts of novices and experts alike. These important issues need to be addressed robustly in order to standardize protein NMR structure determination and validation. PDBStat is a C/C++ computer program originally developed as a universal coordinate and protein NMR restraint converter. Its primary function is to provide a user-friendly tool for interconverting between protein coordinate and protein NMR restraint data formats. It also provides an integrated set of computational methods for protein NMR restraint analysis and structure quality assessment, relabeling of prochiral atoms with correct IUPAC names, as well as multiple methods for analysis of the consistency of atomic positions indicated by their convergence across a protein NMR ensemble. In this paper we provide a detailed description of the PDBStat software, and highlight some of its valuable computational capabilities. As an example, we demonstrate the use of the PDBStat restraint converter for restrained CS-Rosetta structure generation calculations, and compare the resulting protein NMR structure models with those generated from the same NMR restraint data using more traditional structure determination methods. These results demonstrate the value of a universal restraint converter in allowing the use of multiple structure generation methods with the same restraint data for consensus analysis of protein NMR structures and the underlying restraint data. PMID:23897031
Fitting Multimeric Protein Complexes into Electron Microscopy Maps Using 3D Zernike Descriptors
Esquivel-Rodríguez, Juan; Kihara, Daisuke
2012-01-01
A novel computational method for fitting high-resolution structures of multiple proteins into a cryoelectron microscopy map is presented. The method named EMLZerD generates a pool of candidate multiple protein docking conformations of component proteins, which are later compared with a provided electron microscopy (EM) density map to select the ones that fit well into the EM map. The comparison of docking conformations and the EM map is performed using the 3D Zernike descriptor (3DZD), a mathematical series expansion of three-dimensional functions. The 3DZD provides a unified representation of the surface shape of multimeric protein complex models and EM maps, which allows a convenient, fast quantitative comparison of the three dimensional structural data. Out of 19 multimeric complexes tested, near native complex structures with a root mean square deviation of less than 2.5 Å were obtained for 14 cases while medium range resolution structures with correct topology were computed for the additional 5 cases. PMID:22417139
Fitting multimeric protein complexes into electron microscopy maps using 3D Zernike descriptors.
Esquivel-Rodríguez, Juan; Kihara, Daisuke
2012-06-14
A novel computational method for fitting high-resolution structures of multiple proteins into a cryoelectron microscopy map is presented. The method named EMLZerD generates a pool of candidate multiple protein docking conformations of component proteins, which are later compared with a provided electron microscopy (EM) density map to select the ones that fit well into the EM map. The comparison of docking conformations and the EM map is performed using the 3D Zernike descriptor (3DZD), a mathematical series expansion of three-dimensional functions. The 3DZD provides a unified representation of the surface shape of multimeric protein complex models and EM maps, which allows a convenient, fast quantitative comparison of the three-dimensional structural data. Out of 19 multimeric complexes tested, near native complex structures with a root-mean-square deviation of less than 2.5 Å were obtained for 14 cases while medium range resolution structures with correct topology were computed for the additional 5 cases.
The β-Arrestins: Multifunctional Regulators of G Protein-coupled Receptors.
Smith, Jeffrey S; Rajagopal, Sudarshan
2016-04-22
The β-arrestins (βarrs) are versatile, multifunctional adapter proteins that are best known for their ability to desensitize G protein-coupled receptors (GPCRs), but also regulate a diverse array of cellular functions. To signal in such a complex fashion, βarrs adopt multiple conformations and are regulated at multiple levels to differentially activate downstream pathways. Recent structural studies have demonstrated that βarrs have a conserved structure and activation mechanism, with plasticity of their structural fold, allowing them to adopt a wide array of conformations. Novel roles for βarrs continue to be identified, demonstrating the importance of these dynamic regulators of cellular signaling. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
A Method for WD40 Repeat Detection and Secondary Structure Prediction
Wang, Yang; Jiang, Fan; Zhuo, Zhu; Wu, Xian-Hui; Wu, Yun-Dong
2013-01-01
WD40-repeat proteins (WD40s), as one of the largest protein families in eukaryotes, play vital roles in assembling protein-protein/DNA/RNA complexes. WD40s fold into similar β-propeller structures despite diversified sequences. A program WDSP (WD40 repeat protein Structure Predictor) has been developed to accurately identify WD40 repeats and predict their secondary structures. The method is designed specifically for WD40 proteins by incorporating both local residue information and non-local family-specific structural features. It overcomes the problem of highly diversified protein sequences and variable loops. In addition, WDSP achieves a better prediction in identifying multiple WD40-domain proteins by taking the global combination of repeats into consideration. In secondary structure prediction, the average Q3 accuracy of WDSP in a jack-knife test reaches 93.7%. A disease-related protein, LRRK2, was used as a representative example to demonstrate the structure prediction. PMID:23776530
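The Q3 accuracy quoted above is the percentage of residues whose three-state secondary structure (helix H, strand E, coil C) is predicted correctly. A minimal sketch follows; the function name and the two example strings are made up for illustration and are not WDSP output.

```python
def q3_accuracy(predicted, observed):
    """Percentage of positions where the predicted three-state label (H, E or C) matches."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Invented ten-residue example with a single mismatch at the fifth position.
observed = "CEEEECCHHH"
predicted = "CEEECCCHHH"
print(q3_accuracy(predicted, observed))  # 90.0
```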
How protein materials balance strength, robustness, and adaptability
Buehler, Markus J.; Yung, Yu Ching
2010-01-01
Proteins form the basis of a wide range of biological materials such as hair, skin, bone, spider silk, or cells, which play an important role in providing key functions to biological systems. The focus of this article is to discuss how protein materials are capable of balancing multiple, seemingly incompatible properties such as strength, robustness, and adaptability. To illustrate this, we review bottom-up materiomics studies focused on the mechanical behavior of protein materials at multiple scales, from nano to macro. We focus on alpha-helix based intermediate filament proteins as a model system to explain why the utilization of hierarchical structural features is vital to their ability to combine strength, robustness, and adaptability. Experimental studies demonstrating the activation of angiogenesis, the growth of new blood vessels, are presented as an example of how adaptability of structure in biological tissue is achieved through changes in gene expression that result in an altered material structure. We analyze the concepts in light of the universality and diversity of the structural makeup of protein materials and discuss the findings in the context of potential fundamental evolutionary principles that control their nanoscale structure. We conclude with a discussion of multiscale science in biology and de novo materials design. PMID:20676305
Nucleotide-dependent bisANS binding to tubulin.
Chakraborty, S; Sarkar, N; Bhattacharyya, B
1999-07-13
Non-covalent hydrophobic probes such as 5,5'-bis(8-anilino-1-naphthalenesulfonate) (bisANS) have become increasingly popular to gain information about protein structure and conformation. However, there are limitations as bisANS binds non-specifically at multiple sites of many proteins. Successful use of this probe depends upon the development of binding conditions where only specific dye-protein interaction will occur. In this report, we have shown that the binding of bisANS to tubulin occurs instantaneously, specifically at one high affinity site when 1 mM guanosine 5'-triphosphate (GTP) is included in the reaction medium. Substantial portions of protein secondary structure and colchicine binding activity of tubulin are lost upon bisANS binding in the absence of GTP. BisANS binding increases with time and occurs at multiple sites in the absence of GTP. Like GTP, other analogs, guanosine 5'-diphosphate, guanosine 5'-monophosphate and adenosine 5'-triphosphate, also displace bisANS from the lower affinity sites of tubulin. We believe that these multiple binding sites are generated by bisANS-induced structural changes in tubulin and that the presence of GTP and other nucleotides protects against those structural changes.
The Multiple-Minima Problem in Protein Folding
NASA Astrophysics Data System (ADS)
Scheraga, Harold A.
1991-10-01
The conformational energy surface of a polypeptide or protein has many local minima, and conventional energy minimization procedures reach only a local minimum (near the starting point of the optimization algorithm) instead of the global minimum (the multiple-minima problem). Several procedures have been developed to surmount this problem, the most promising of which are: (a) build up procedure, (b) optimization of electrostatics, (c) Monte Carlo-plus-energy minimization, (d) electrostatically-driven Monte Carlo, (e) inclusion of distance restraints, (f) adaptive importance-sampling Monte Carlo, (g) relaxation of dimensionality, (h) pattern-recognition, and (i) diffusion equation method. These procedures have been applied to a variety of polypeptide structural problems, and the results of such computations are presented. These include the computation of the structures of open-chain and cyclic peptides, fibrous proteins and globular proteins. Present efforts are being devoted to scaling up these procedures from small polypeptides to proteins, to try to compute the three-dimensional structure of a protein from its amino acid sequence.
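The multiple-minima problem and the idea behind procedure (c), Monte Carlo plus energy minimization, can be sketched on a toy surface. Everything below is an assumption made for illustration: the one-dimensional energy function, the step sizes, the temperature and the function names are invented and do not reproduce the procedures reviewed in the abstract.

```python
import math
import random

def energy(x):
    # Toy one-dimensional surface with several local minima (illustrative assumption).
    return 0.1 * x * x + math.sin(3.0 * x)

def gradient(x):
    # Analytical derivative of the toy energy.
    return 0.2 * x + 3.0 * math.cos(3.0 * x)

def minimize(x, step=0.01, iterations=2000):
    # Plain gradient descent: it only reaches the local minimum near the starting point.
    for _ in range(iterations):
        x -= step * gradient(x)
    return x

def monte_carlo_minimization(x0, n_steps=200, max_move=2.0, temperature=0.5, seed=0):
    # Alternate a random move with local minimization, then apply a Metropolis test
    # to the minimized energies so the search can hop between basins.
    rng = random.Random(seed)
    current = minimize(x0)
    best = current
    for _ in range(n_steps):
        trial = minimize(current + rng.uniform(-max_move, max_move))
        delta = energy(trial) - energy(current)
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = trial
        if energy(current) < energy(best):
            best = current
    return best

start = 4.0
local_only = minimize(start)
hopped = monte_carlo_minimization(start)
print(energy(local_only), energy(hopped))
```

The minimization-only run stays in the basin nearest the start, while the combined search keeps the lowest minimized energy it visits, so its result is never worse than the purely local one.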
Protein Structure and Function Prediction Using I-TASSER
Yang, Jianyi; Zhang, Yang
2016-01-01
I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons. I-TASSER is freely available as both an on-line server and a stand-alone package. This unit describes how to use the I-TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I-TASSER modeling quality for distant-homologous and multi-domain protein targets. PMID:26678386
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chakraborty, Sandeep; Rao, Basuthkar J.; Baker, Nathan A.
2013-04-01
Phylogenetic analysis of proteins using multiple sequence alignment (MSA) assumes an underlying evolutionary relationship in these proteins which occasionally remains undetected due to considerable sequence divergence. Structural alignment programs have been developed to unravel such fuzzy relationships. However, none of these structure based methods have used electrostatic properties to discriminate between spatially equivalent residues. We present a methodology for MSA of a set of related proteins with known structures using electrostatic properties as an additional discriminator (STEEP). STEEP first extracts a profile, then generates a multiple structural superimposition providing a consolidated spatial framework for comparing residues and finally emits the MSA. Residues that are aligned differently by including or excluding electrostatic properties can be targeted by directed evolution experiments to transform the enzymatic properties of one protein into another. We have compared STEEP results to those obtained from a MSA program (ClustalW) and a structural alignment method (MUSTANG) for chymotrypsin serine proteases. Subsequently, we used PhyML to generate phylogenetic trees for the serine and metallo-β-lactamase superfamilies from the STEEP generated MSA, and corroborated the accepted relationships in these superfamilies. We have observed that STEEP acts as a functional classifier when electrostatic congruence is used as a discriminator, and thus identifies potential targets for directed evolution experiments. In summary, STEEP is unique among phylogenetic methods for its ability to use electrostatic congruence to specify mutations that might be the source of the functional divergence in a protein family. Based on our results, we also hypothesize that the active site and its close vicinity contains enough information to infer the correct phylogeny for related proteins.
Protein disulfide isomerase a multifunctional protein with multiple physiological roles
NASA Astrophysics Data System (ADS)
Ali Khan, Hyder; Mutus, Bulent
2014-08-01
Protein disulfide isomerase (PDI) is a member of the thioredoxin superfamily of redox proteins. PDI has three catalytic activities: thiol-disulfide oxidoreductase, disulfide isomerase and redox-dependent chaperone. Originally, PDI was identified in the lumen of the endoplasmic reticulum and subsequently detected at additional locations, such as cell surfaces and the cytosol. This review will provide an overview of the recent advances in relating the structural features of PDI to its multiple catalytic roles as well as its physiological and pathophysiological functions related to redox regulation and protein folding.
Multiple scales and phases in discrete chains with application to folded proteins
NASA Astrophysics Data System (ADS)
Sinelnikova, A.; Niemi, A. J.; Nilsson, Johan; Ulybyshev, M.
2018-05-01
Chiral heteropolymers such as large globular proteins can simultaneously support multiple length scales. The interplay between the different scales brings about conformational diversity, determines the phase properties of the polymer chain, and governs the structure of the energy landscape. Most importantly, multiple scales produce complex dynamics that enable proteins to sustain live matter. However, at the moment there is incomplete understanding of how to identify and distinguish the various scales that determine the structure and dynamics of a complex protein. Here we address this impending problem. We develop a methodology with the potential to systematically identify different length scales, in the general case of a linear polymer chain. For this we introduce and analyze the properties of an order parameter that can both reveal the presence of different length scales and can also probe the phase structure. We first develop our concepts in the case of chiral homopolymers. We introduce a variant of Kadanoff's block-spin transformation to coarse grain piecewise linear chains, such as the Cα backbone of a protein. We derive analytically, and then verify numerically, a number of properties that the order parameter can display, in the case of a chiral polymer chain. In particular, we propose that in the case of a chiral heteropolymer the order parameter can reveal traits of several different phases, contingent on the length scale at which it is scrutinized. We confirm that this is the case with crystallographic protein structures in the Protein Data Bank. Thus our results suggest relations between the scales, the phases, and the complexity of folding pathways.
Matching multiple rigid domain decompositions of proteins
Flynn, Emily; Streinu, Ileana
2017-01-01
We describe efficient methods for consistently coloring and visualizing collections of rigid cluster decompositions obtained from variations of a protein structure, and lay the foundation for more complex setups that may involve different computational and experimental methods. The focus here is on three biological applications: the conceptually simpler problems of visualizing results of dilution and mutation analyses, and the more complex task of matching decompositions of multiple NMR models of the same protein. Implemented into the KINARI web server application, the improved visualization techniques give useful information about protein folding cores, help examining the effect of mutations on protein flexibility and function, and provide insights into the structural motions of PDB proteins solved with solution NMR. These tools have been developed with the goal of improving and validating rigidity analysis as a credible coarse-grained model capturing essential information about a protein’s slow motions near the native state. PMID:28141528
Defining and predicting structurally conserved regions in protein superfamilies
Huang, Ivan K.; Grishin, Nick V.
2013-01-01
Motivation: The structures of homologous proteins are generally better conserved than their sequences. This phenomenon is demonstrated by the prevalence of structurally conserved regions (SCRs) even in highly divergent protein families. Defining SCRs requires the comparison of two or more homologous structures and is affected by their availability and divergence, and our ability to deduce structurally equivalent positions among them. In the absence of multiple homologous structures, it is necessary to predict SCRs of a protein using information from only a set of homologous sequences and (if available) a single structure. Accurate SCR predictions can benefit homology modelling and sequence alignment. Results: Using pairwise DaliLite alignments among a set of homologous structures, we devised a simple measure of structural conservation, termed structural conservation index (SCI). SCI was used to distinguish SCRs from non-SCRs. A database of SCRs was compiled from 386 SCOP superfamilies containing 6489 protein domains. Artificial neural networks were then trained to predict SCRs with various features deduced from a single structure and homologous sequences. Assessment of the predictions via a 5-fold cross-validation method revealed that predictions based on features derived from a single structure perform similarly to ones based on homologous sequences, while combining sequence and structural features was optimal in terms of accuracy (0.755) and Matthews correlation coefficient (0.476). These results suggest that even without information from multiple structures, it is still possible to effectively predict SCRs for a protein. Finally, inspection of the structures with the worst predictions pinpoints difficulties in SCR definitions. Availability: The SCR database and the prediction server can be found at http://prodata.swmed.edu/SCR. 
Contact: 91huangi@gmail.com or grishin@chop.swmed.edu Supplementary information: Supplementary data are available at Bioinformatics Online PMID:23193223
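The accuracy and Matthews correlation coefficient reported in the abstract above are both derived from a binary confusion matrix (here, SCR versus non-SCR positions). The sketch below shows the standard formulas; the counts are invented for illustration and are not taken from the study.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all positions that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when a marginal total is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator

# Invented counts: 40 true positives, 45 true negatives, 5 false positives, 10 false negatives.
print(accuracy(40, 45, 5, 10))
print(round(matthews_cc(40, 45, 5, 10), 3))
```

Unlike accuracy, the coefficient stays informative when the two classes are unbalanced, which is one reason abstracts such as this one report both.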
Protein structure modeling for CASP10 by multiple layers of global optimization.
Joo, Keehyoung; Lee, Juyong; Sim, Sangjin; Lee, Sun Young; Lee, Kiho; Heo, Seungryong; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung
2014-02-01
In the template-based modeling (TBM) category of CASP10 experiment, we introduced a new protocol called protein modeling system (PMS) to generate accurate protein structures in terms of side-chains as well as backbone trace. In the new protocol, a global optimization algorithm, called conformational space annealing (CSA), is applied to the three layers of TBM procedure: multiple sequence-structure alignment, 3D chain building, and side-chain re-modeling. For 3D chain building, we developed a new energy function which includes new distance restraint terms of Lorentzian type (derived from multiple templates), and new energy terms that combine (physical) energy terms such as dynamic fragment assembly (DFA) energy, DFIRE statistical potential energy, hydrogen bonding term, etc. These physical energy terms are expected to guide the structure modeling especially for loop regions where no template structures are available. In addition, we developed a new quality assessment method based on random forest machine learning algorithm to screen templates, multiple alignments, and final models. For TBM targets of CASP10, we find that, due to the combination of three stages of CSA global optimizations and quality assessment, the modeling accuracy of PMS improves at each additional stage of the protocol. It is especially noteworthy that the side-chains of the final PMS models are far more accurate than the models in the intermediate steps. Copyright © 2013 Wiley Periodicals, Inc.
Structural Biology of Non-Ribosomal Peptide Synthetases
Miller, Bradley R.; Gulick, Andrew M.
2016-01-01
Summary The non-ribosomal peptide synthetases are modular enzymes that catalyze synthesis of important peptide products from a variety of standard and non-proteinogenic amino acid substrates. Within a single module are multiple catalytic domains that are responsible for incorporation of a single residue. After the amino acid is activated and covalently attached to an integrated carrier protein domain, the substrates and intermediates are delivered to neighboring catalytic domains for peptide bond formation or, in some modules, chemical modification. In the final module, the peptide is delivered to a terminal thioesterase domain that catalyzes release of the peptide product. This multi-domain modular architecture raises questions about the structural features that enable this assembly line synthesis in an efficient manner. The structures of the core component domains have been determined and demonstrate insights into the catalytic activity. More recently, multi-domain structures have been determined and are providing clues to the features of these enzyme systems that govern the functional interaction between multiple domains. This chapter describes the structures of NRPS proteins and the strategies that are being used to assist structural studies of these dynamic proteins, including careful consideration of domain boundaries for generation of truncated proteins and the use of mechanism-based inhibitors that trap interactions between the catalytic and carrier protein domains. PMID:26831698
Shi, Xiaohu; Zhang, Jingfen; He, Zhiquan; Shang, Yi; Xu, Dong
2011-09-01
One of the major challenges in protein tertiary structure prediction is structure quality assessment. In many cases, protein structure prediction tools generate good structural models, but fail to select the best models from a huge number of candidates as the final output. In this study, we developed a sampling-based machine-learning method to rank protein structural models by integrating multiple scores and features. First, features such as predicted secondary structure, solvent accessibility and residue-residue contact information are integrated by two Radial Basis Function (RBF) models trained from different datasets. Then, the two RBF scores and five selected scoring functions developed by others, i.e., Opus-CA, Opus-PSP, DFIRE, RAPDF, and Cheng Score, are synthesized by a sampling method. Finally, another integrated RBF model ranks the structural models according to the features of the sampling distribution. We tested the proposed method by using two different datasets, including the CASP server prediction models of all CASP8 targets and a set of models generated by our in-house software MUFOLD. The test result shows that our method outperforms any individual scoring function in both best-model selection and overall correlation between the predicted ranking and the actual ranking of structural quality.
A coarse grain model for protein-surface interactions
NASA Astrophysics Data System (ADS)
Wei, Shuai; Knotts, Thomas A.
2013-09-01
The interaction of proteins with surfaces is important in numerous applications in many fields—such as biotechnology, proteomics, sensors, and medicine—but fundamental understanding of how protein stability and structure are affected by surfaces remains incomplete. Over the last several years, molecular simulation using coarse grain models has yielded significant insights, but the formalisms used to represent the surface interactions have been rudimentary. We present a new model for protein surface interactions that incorporates the chemical specificity of both the surface and the residues comprising the protein in the context of a one-bead-per-residue, coarse grain approach that maintains computational efficiency. The model is parameterized against experimental adsorption energies for multiple model peptides on different types of surfaces. The validity of the model is established by its ability to quantitatively and qualitatively predict the free energy of adsorption and structural changes for multiple biologically-relevant proteins on different surfaces. The validation, done with proteins not used in parameterization, shows that the model produces remarkable agreement between simulation and experiment.
MOCASSIN-prot: A multi-objective clustering approach for protein similarity networks
USDA-ARS's Scientific Manuscript database
Motivation: Proteins often include multiple conserved domains. Various evolutionary events including duplication and loss of domains, domain shuffling, as well as sequence divergence contribute to generating complexities in protein structures, and consequently, in their functions. The evolutionary h...
Neutron protein crystallography: A complementary tool for locating hydrogens in proteins.
O'Dell, William B; Bodenheimer, Annette M; Meilleur, Flora
2016-07-15
Neutron protein crystallography is a powerful tool for investigating protein chemistry because it directly locates hydrogen atom positions in a protein structure. The visibility of hydrogen and deuterium atoms arises from the strong interaction of neutrons with the nuclei of these isotopes. Positions can be unambiguously assigned from diffraction at resolutions typical of protein crystals. Neutrons have the additional benefit to structural biology of not inducing radiation damage in protein crystals. The same crystal could be measured multiple times for parametric studies. Here, we review the basic principles of neutron protein crystallography. The information that can be gained from a neutron structure is presented in balance with practical considerations. Methods to produce isotopically-substituted proteins and to grow large crystals are provided in the context of neutron structures reported in the literature. Available instruments for data collection and software for data processing and structure refinement are described along with technique-specific strategies including joint X-ray/neutron structure refinement. Examples are given to illustrate, ultimately, the unique scientific value of neutron protein crystal structures. Copyright © 2015 Elsevier Inc. All rights reserved.
Yan, Si; Guo, Changmiao; Hou, Guangjin; Zhang, Huilan; Lu, Xingyu; Williams, John Charles; Polenova, Tatyana
2015-11-24
Microtubules and their associated proteins perform a broad array of essential physiological functions, including mitosis, polarization and differentiation, cell migration, and vesicle and organelle transport. As such, they have been extensively studied at multiple levels of resolution (e.g., from structural biology to cell biology). Despite these efforts, there remain significant gaps in our knowledge concerning how microtubule-binding proteins bind to microtubules, how dynamics connect different conformational states, and how these interactions and dynamics affect cellular processes. Structures of microtubule-associated proteins assembled on polymeric microtubules are not known at atomic resolution. Here, we report a structure of the cytoskeleton-associated protein glycine-rich (CAP-Gly) domain of dynactin motor on polymeric microtubules, solved by magic angle spinning NMR spectroscopy. We present the intermolecular interface of CAP-Gly with microtubules, derived by recording direct dipolar contacts between CAP-Gly and tubulin using double rotational echo double resonance (dREDOR)-filtered experiments. Our results indicate that the structure adopted by CAP-Gly varies, particularly around its loop regions, permitting its interaction with multiple binding partners and with the microtubules. To our knowledge, this study reports the first atomic-resolution structure of a microtubule-associated protein on polymeric microtubules. Our approach lays the foundation for atomic-resolution structural analysis of other microtubule-associated motors.
Lin, Changsheng; Ear, Jason; Midde, Krishna; Lopez-Sanchez, Inmaculada; Aznar, Nicolas; Garcia-Marcos, Mikel; Kufareva, Irina; Abagyan, Ruben; Ghosh, Pradipta
2014-01-01
A long-standing issue in the field of signal transduction is to understand the cross-talk between receptor tyrosine kinases (RTKs) and heterotrimeric G proteins, two major and distinct signaling hubs that control eukaryotic cell behavior. Although stimulation of many RTKs leads to activation of trimeric G proteins, the molecular mechanisms behind this phenomenon remain elusive. We discovered a unifying mechanism that allows GIV/Girdin, a bona fide metastasis-related protein and a guanine-nucleotide exchange factor (GEF) for Gαi, to serve as a direct platform for multiple RTKs to activate Gαi proteins. Using a combination of homology modeling, protein–protein interaction, and kinase assays, we demonstrate that a stretch of ∼110 amino acids within GIV C-terminus displays structural plasticity that allows folding into a SH2-like domain in the presence of phosphotyrosine ligands. Using protein–protein interaction assays, we demonstrated that both SH2 and GEF domains of GIV are required for the formation of a ligand-activated ternary complex between GIV, Gαi, and growth factor receptors and for activation of Gαi after growth factor stimulation. Expression of a SH2-deficient GIV mutant (Arg 1745→Leu) that cannot bind RTKs impaired all previously demonstrated functions of GIV—Akt enhancement, actin remodeling, and cell migration. The mechanistic and structural insights gained here shed light on the long-standing questions surrounding RTK/G protein cross-talk, set a novel paradigm, and characterize a unique pharmacological target for uncoupling GIV-dependent signaling downstream of multiple oncogenic RTKs. PMID:25187647
ProTSAV: A protein tertiary structure analysis and validation server.
Singh, Ankita; Kaushik, Rahul; Mishra, Avinash; Shanker, Asheesh; Jayaram, B
2016-01-01
Quality assessment of predicted model structures of proteins is as important as the protein tertiary structure prediction. A highly efficient quality assessment of predicted model structures directs further research on function. Here we present a new server ProTSAV, capable of evaluating predicted model structures based on some popular online servers and standalone tools. ProTSAV furnishes the user with a single quality score in case of individual protein structure along with a graphical representation and ranking in case of multiple protein structure assessment. The server is validated on ~64,446 protein structures including experimental structures from RCSB and predicted model structures for CASP targets and from public decoy sets. ProTSAV succeeds in predicting quality of protein structures with a specificity of 100% and a sensitivity of 98% on experimentally solved structures and achieves a specificity of 88% and a sensitivity of 91% on predicted protein structures of CASP11 targets under 2 Å. The server overcomes the limitations of any single server/method and is seen to be robust in helping in quality assessment. ProTSAV is freely available at http://www.scfbio-iitd.res.in/software/proteomics/protsav.jsp. Copyright © 2015 Elsevier B.V. All rights reserved.
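For readers comparing the specificity and sensitivity figures above, the two quantities are simple ratios over the true and false classifications. The helper names and the counts below are hypothetical; they were chosen only so that the result mirrors the 98% and 100% values and are not data from ProTSAV.

```python
def sensitivity(tp, fn):
    """True-positive rate: the share of good structures that are recognised as good."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True-negative rate: the share of poor structures that are recognised as poor."""
    return tn / (tn + fp)

# Hypothetical counts: 98 of 100 good structures accepted, 50 of 50 poor ones rejected.
print(sensitivity(tp=98, fn=2))   # 0.98
print(specificity(tn=50, fp=0))   # 1.0
```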
CAB-Align: A Flexible Protein Structure Alignment Method Based on the Residue-Residue Contact Area.
Terashi, Genki; Takeda-Shitaka, Mayuko
2015-01-01
Proteins are flexible, and this flexibility has an essential functional role. Flexibility can be observed in loop regions, rearrangements between secondary structure elements, and conformational changes between entire domains. However, most protein structure alignment methods treat protein structures as rigid bodies. Thus, these methods fail to identify the equivalences of residue pairs in regions with flexibility. In this study, we considered that the evolutionary relationship between proteins corresponds directly to the residue-residue physical contacts rather than the three-dimensional (3D) coordinates of proteins. Thus, we developed a new protein structure alignment method, contact area-based alignment (CAB-align), which uses the residue-residue contact area to identify regions of similarity. The main purpose of CAB-align is to identify homologous relationships at the residue level between related protein structures. The CAB-align procedure comprises two main steps: First, a rigid-body alignment method based on local and global 3D structure superposition is employed to generate a sufficient number of initial alignments. Then, iterative dynamic programming is executed to find the optimal alignment. We evaluated the performance and advantages of CAB-align based on four main points: (1) agreement with the gold standard alignment, (2) alignment quality based on an evolutionary relationship without 3D coordinate superposition, (3) consistency of the multiple alignments, and (4) classification agreement with the gold standard classification. 
Comparisons of CAB-align with other state-of-the-art protein structure alignment methods (TM-align, FATCAT, and DaliLite) using our benchmark dataset showed that CAB-align performed robustly in obtaining high-quality alignments and generating consistent multiple alignments with high coverage and accuracy rates, and it performed extremely well when discriminating between homologous and nonhomologous pairs of proteins in both single and multi-domain comparisons. The CAB-align software is freely available to academic users as stand-alone software at http://www.pharm.kitasato-u.ac.jp/bmd/bmd/Publications.html.
Single-site labeling of lysine in proteins through a metal-free multicomponent approach.
Chilamari, Maheshwerreddy; Kalra, Neetu; Shukla, Sanjeev; Rai, Vishal
2018-06-15
We report a chemoselective and site-selective approach that distinguishes one Lys from its multiple copies, N-terminus, and other competitors. The phospha-Mannich protocol works with multiple proteins and installs probes without structural and functional perturbations. It delivers an antibody-drug conjugate with selective anti-proliferative activity towards HER2 expressing SKBR3 breast cancer cells.
Thermostabilisation of membrane proteins for structural studies
Magnani, Francesca; Serrano-Vega, Maria J.; Shibata, Yoko; Abdul-Hussein, Saba; Lebon, Guillaume; Miller-Gallacher, Jennifer; Singhal, Ankita; Strege, Annette; Thomas, Jennifer A.; Tate, Christopher G.
2017-01-01
The thermostability of an integral membrane protein in detergent solution is a key parameter that dictates the likelihood of obtaining well-diffracting crystals suitable for structure determination. However, many mammalian membrane proteins are too unstable for crystallisation. We developed a thermostabilisation strategy based on systematic mutagenesis coupled to a radioligand-binding thermostability assay that can be applied to receptors, ion channels and transporters. It takes approximately 6-12 months to thermostabilise a G protein-coupled receptor (GPCR) containing 300 amino acid residues. The resulting thermostabilised membrane proteins are more easily crystallised and result in high-quality structures. This methodology has facilitated structure-based drug design applied to GPCRs, because it is possible to determine multiple structures of the thermostabilised receptors bound to low affinity ligands. Protocols and advice are given on how to develop thermostability assays for membrane proteins and how to combine mutations to make an optimally stable mutant suitable for structural studies. PMID:27466713
A structural study of F-actin - filamin networks
NASA Astrophysics Data System (ADS)
Ahrens-Braunstein, Ashley; Nguyen, Lam; Hirst, Linda
2010-03-01
The cell's ability to move and contract is attributed to the semi-flexible filamentous protein F-actin, one of the three filaments in the cytoskeleton. Actin bundles can be formed by the cross-linking actin-binding protein (ABP) filamin. By examining filamin's cross-linking abilities at different concentrations and molar ratios, we can study the flexibility, structure and multiple network formations created when cross-linking F-actin with this protein. We have studied the phase diagram of this protein system using fluorescence microscopy, analyzing the network structures observed in the context of a coarse grained molecular dynamics simulation carried out by our group.
Kirschner, Andreas; Frishman, Dmitrij
2008-10-01
Prediction of beta-turns from amino acid sequences has long been recognized as an important problem in structural bioinformatics due to their frequent occurrence as well as their structural and functional significance. Because various structural features of proteins are intercorrelated, secondary structure information has often been employed as an additional input for machine learning algorithms while predicting beta-turns. Here we present a novel bidirectional Elman-type recurrent neural network with multiple output layers (MOLEBRNN) capable of predicting multiple mutually dependent structural motifs and demonstrate its efficiency in recognizing three aspects of protein structure: beta-turns, beta-turn types, and secondary structure. The advantage of our method compared to other predictors is that it does not require any external input except for sequence profiles because interdependencies between different structural features are taken into account implicitly during the learning process. In a sevenfold cross-validation experiment on a standard test dataset our method exhibits a total prediction accuracy of 77.9% and a Matthews correlation coefficient of 0.45, the highest performance reported so far. It also outperforms other known methods in delineating individual turn types. We demonstrate how simultaneous prediction of multiple targets influences prediction performance on single targets. The MOLEBRNN presented here is a generic method applicable in a variety of research fields where multiple mutually dependent target classes need to be predicted. http://webclu.bio.wzw.tum.de/predator-web/.
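The two figures of merit quoted in this abstract, total accuracy and the Matthews correlation coefficient, can both be computed from a binary confusion matrix. The sketch below is a generic illustration, not the authors' evaluation code; the function names and the convention of returning 0.0 for an undefined coefficient are assumptions made here.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of residues whose turn / non-turn label was predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator vanishes (a convention assumed here,
    e.g. when every residue is predicted as the same class).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    counts = dict(tp=120, tn=600, fp=150, fn=130)
    print(accuracy(**counts), matthews_corrcoef(**counts))
```

Because beta-turn residues are a minority class, accuracy alone can look good for a predictor that rarely predicts turns, which is presumably why the coefficient is reported alongside it.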
A series of PDB related databases for everyday needs.
Joosten, Robbie P; te Beek, Tim A H; Krieger, Elmar; Hekkelman, Maarten L; Hooft, Rob W W; Schneider, Reinhard; Sander, Chris; Vriend, Gert
2011-01-01
The Protein Data Bank (PDB) is the world-wide repository of macromolecular structure information. We present a series of databases that run parallel to the PDB. Each database holds one entry, if possible, for each PDB entry. DSSP holds the secondary structure of the proteins. PDBREPORT holds reports on the structure quality and lists errors. HSSP holds a multiple sequence alignment for all proteins. The PDBFINDER holds easy to parse summaries of the PDB file content, augmented with essentials from the other systems. PDB_REDO holds re-refined, and often improved, copies of all structures solved by X-ray. WHY_NOT summarizes why certain files could not be produced. All these systems are updated weekly. The data sets can be used for the analysis of properties of protein structures in areas ranging from structural genomics, to cancer biology and protein design.
Structural mechanisms of chaperone mediated protein disaggregation
Sousa, Rui
2014-01-01
The ClpB/Hsp104 and Hsp70 classes of molecular chaperones use ATP hydrolysis to dissociate protein aggregates and complexes, and to move proteins through membranes. ClpB/Hsp104 are members of the AAA+ family of proteins which form ring-shaped hexamers. Loops lining the pore in the ring engage substrate proteins as extended polypeptides. Interdomain rotations and conformational changes in these loops coupled to ATP hydrolysis unfold and pull proteins through the pore. This provides a mechanism that progressively disrupts local secondary and tertiary structure in substrates, allowing these chaperones to dissociate stable aggregates such as β-sheet rich prions or coiled coil SNARE complexes. While the ClpB/Hsp104 mechanism appears to embody a true power-stroke in which an ATP powered conformational change in one protein is directly coupled to movement or structural change in another, the mechanism of force generation by Hsp70s is distinct and less well understood. Both active power-stroke and purely passive mechanisms in which Hsp70 captures spontaneous fluctuations in a substrate have been proposed, while a third proposed mechanism—entropic pulling—may be able to generate forces larger than seen in ATP-driven molecular motors without the conformational coupling required for a power-stroke. The disaggregase activity of these chaperones is required for thermotolerance, but unrestrained protein complex/aggregate dissociation is potentially detrimental. Disaggregating chaperones are strongly auto-repressed, and are regulated by co-chaperones which recruit them to protein substrates and activate the disaggregases via mechanisms involving either sequential transfer of substrate from one chaperone to another and/or simultaneous interaction of substrate with multiple chaperones. By effectively subjecting substrates to multiple levels of selection by multiple chaperones, this may insure that these potent disaggregases are only activated in the appropriate context. 
PMID:25988153
Structural basis for specific recognition of multiple mRNA targets by a PUF regulatory protein
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Yeming; Opperman, Laura; Wickens, Marvin
2011-11-02
Caenorhabditis elegans fem-3 binding factor (FBF) is a founding member of the PUMILIO/FBF (PUF) family of mRNA regulatory proteins. It regulates multiple mRNAs critical for stem cell maintenance and germline development. Here, we report crystal structures of FBF in complex with 6 different 9-nt RNA sequences, including elements from 4 natural mRNAs. These structures reveal that FBF binds to conserved bases at positions 1-3 and 7-8. The key specificity determinant of FBF vs. other PUF proteins lies in positions 4-6. In FBF/RNA complexes, these bases stack directly with one another and turn away from the RNA-binding surface. A short region of FBF is sufficient to impart its unique specificity and lies directly opposite the flipped bases. We suggest that this region imposes a flattened curvature on the protein; hence, the requirement for the additional nucleotide. The principles of FBF/RNA recognition suggest a general mechanism by which PUF proteins recognize distinct families of RNAs yet exploit very nearly identical atomic contacts in doing so.
Structural basis for specific recognition of multiple mRNA targets by a PUF regulatory protein
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Yeming; Opperman, Laura; Wickens, Marvin
2010-08-19
Caenorhabditis elegans fem-3 binding factor (FBF) is a founding member of the PUMILIO/FBF (PUF) family of mRNA regulatory proteins. It regulates multiple mRNAs critical for stem cell maintenance and germline development. Here, we report crystal structures of FBF in complex with 6 different 9-nt RNA sequences, including elements from 4 natural mRNAs. These structures reveal that FBF binds to conserved bases at positions 1-3 and 7-8. The key specificity determinant of FBF vs. other PUF proteins lies in positions 4-6. In FBF/RNA complexes, these bases stack directly with one another and turn away from the RNA-binding surface. A short region of FBF is sufficient to impart its unique specificity and lies directly opposite the flipped bases. We suggest that this region imposes a flattened curvature on the protein; hence, the requirement for the additional nucleotide. The principles of FBF/RNA recognition suggest a general mechanism by which PUF proteins recognize distinct families of RNAs yet exploit very nearly identical atomic contacts in doing so.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, S.; Park, S.; Makowski, L.
Small angle X-ray scattering (SAXS) is an increasingly powerful technique to characterize the structure of biomolecules in solution. We present a computational method for accurately and efficiently computing the solution scattering curve from a protein with dynamical fluctuations. The method is built upon a coarse-grained (CG) representation of the protein. This CG approach takes advantage of the low-resolution character of solution scattering. It allows rapid determination of the scattering pattern from conformations extracted from CG simulations to obtain scattering characterization of the protein conformational landscapes. Important elements incorporated in the method include an effective residue-based structure factor for each amino acid, an explicit treatment of the hydration layer at the surface of the protein, and an ensemble average of scattering from all accessible conformations to account for macromolecular flexibility. The CG model is calibrated and illustrated to accurately reproduce the experimental scattering curve of hen egg white lysozyme. We then illustrate the computational method by calculating the solution scattering pattern of several representative protein folds and multiple conformational states. The results suggest that solution scattering data, when combined with a reliable computational method, have great potential for a better structural description of multi-domain complexes in different functional states, and for recognizing structural folds when sequence similarity to a protein of known structure is low. Possible applications of the method are discussed.
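To illustrate what a coarse-grained scattering calculation with ensemble averaging involves, here is a minimal sketch using the standard Debye formula, I(q) = sum_i sum_j f_i f_j sin(q r_ij) / (q r_ij), with one bead per residue. It is not the authors' method: the q-independent form factors, the absence of a hydration layer, and the function names are all simplifying assumptions made for this example.

```python
import math


def debye_intensity(coords, form_factors, q):
    """Debye-formula intensity for one conformation of residue-level beads.

    coords: list of (x, y, z) bead positions.
    form_factors: one scattering factor per bead, treated as q-independent
    here (a simplification; real residue factors vary with q).
    """
    total = 0.0
    for ci, fi in zip(coords, form_factors):
        for cj, fj in zip(coords, form_factors):
            x = q * math.dist(ci, cj)
            # sin(x)/x tends to 1 as x tends to 0 (self terms and q = 0).
            sinc = 1.0 if x < 1e-12 else math.sin(x) / x
            total += fi * fj * sinc
    return total


def ensemble_intensity(conformations, form_factors, q):
    """Average the intensity over conformations to mimic flexibility."""
    values = [debye_intensity(c, form_factors, q) for c in conformations]
    return sum(values) / len(values)
```

The double loop costs O(N^2) per q value, which is manageable for residue-level beads and is one reason a coarse-grained representation suits the low-resolution character of solution scattering.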
Structure of Protein Layers in Polyelectrolyte Matrices Studied by Neutron Reflectivity
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kozlovskaya, Veronika; Ankner, John Francis; O'Neill, Hugh Michael
2011-01-01
Polyelectrolyte multilayer films obtained by localized incorporation of Green Fluorescent Protein (GFP) within electrostatically assembled matrices of poly(styrene sulfonate)/poly(allylamine hydrochloride) (PSS/PAH) via spin-assisted layer-by-layer growth were discovered to be highly structured, with closely packed monomolecular layers of the protein within the bio-hybrid films. The structure of the films was evaluated in both vertical and lateral directions with neutron reflectometry, using deuterated GFP as a marker for neutron scattering contrast. Importantly, the GFP preserves its structural stability upon assembly as confirmed by circular dichroism (CD) and in situ attenuated total reflection Fourier Transform Infrared spectroscopy (ATR-FTIR). Atomic force microscopy was complemented with X-ray reflectometry to characterize the external roughness of the biohybrid films. Remarkably, films assembled with a single GFP layer confined at various distances from the substrate exhibit a strong localization of the GFP layer without intermixing into the LbL matrix. However, partial intermixing of the GFP layers with polymeric material is evidenced in multiple-GFP layer films with alternating protein-rich and protein-deficient regions. We hypothesize that the polymer-protein exchange observed in the multiple-GFP layer films suggests the existence of a critical protein concentration which can be accommodated by the multilayer matrix. Our results yield new insights into the mechanism of GFP interaction with a polyelectrolyte matrix and open opportunities for fabrication of bio-hybrid films with well-organized structure and controllable function, a crucial requirement for advanced sensing applications.
Phan, Isabelle Q. H.; Scheib, Holger; Subramanian, Sandhya; Edwards, Thomas E.; Lehman, Stephanie S.; Piitulainen, Hanna; Sayeedur Rahman, M.; Rennoll-Bankert, Kristen E.; Staker, Bart L.; Taira, Suvi; Stacy, Robin; Myler, Peter J.; Azad, Abdu F.
2015-01-01
Prokaryotes use type IV secretion systems (T4SSs) to translocate substrates (e.g., nucleoprotein, DNA, and protein) and/or elaborate surface structures (i.e., pili or adhesins). Bacterial genomes may encode multiple T4SSs, e.g., there are three functionally divergent T4SSs in some Bartonella species (vir, vbh, and trw). In a unique case, most rickettsial species encode a T4SS (rvh) enriched with gene duplication. Within single genomes, the evolutionary and functional implications of cross-system interchangeability of analogous T4SS protein components remain poorly understood. To lend insight into cross-system interchangeability, we analyzed the VirB8 family of T4SS channel proteins. Crystal structures of three VirB8 and two TrwG Bartonella proteins revealed highly conserved C-terminal periplasmic domain folds and dimerization interfaces, despite tremendous sequence divergence. This implies remarkable structural constraints for VirB8 components in the assembly of a functional T4SS. VirB8/TrwG heterodimers, determined via bacterial two-hybrid assays and molecular modeling, indicate that differential expression of trw and vir systems is the likely barrier to VirB8-TrwG interchangeability. We also determined the crystal structure of Rickettsia typhi RvhB8-II and modeled its coexpressed divergent paralog RvhB8-I. Remarkably, while RvhB8-I dimerizes and is structurally similar to other VirB8 proteins, the RvhB8-II dimer interface deviates substantially from other VirB8 structures, potentially preventing RvhB8-I/RvhB8-II heterodimerization. For the rvh T4SS, the evolution of divergent VirB8 paralogs implies a functional diversification that is unknown in other T4SSs. Collectively, our data identify two different constraints (spatiotemporal for Bartonella trw and vir T4SSs and structural for rvh T4SSs) that mediate the functionality of multiple divergent T4SSs within a single bacterium. PMID:26646013
Automatic Prediction of Protein 3D Structures by Probabilistic Multi-template Homology Modeling.
Meier, Armin; Söding, Johannes
2015-10-01
Homology modeling predicts the 3D structure of a query protein based on the sequence alignment with one or more template proteins of known structure. Its great importance for biological research is owed to its speed, simplicity, reliability and wide applicability, covering more than half of the residues in protein sequence space. Although multiple templates have been shown to generally increase model quality over single templates, the information from multiple templates has so far been combined using empirically motivated, heuristic approaches. We present here a rigorous statistical framework for multi-template homology modeling. First, we find that the query proteins' atomic distance restraints can be accurately described by two-component Gaussian mixtures. This insight allowed us to apply the standard laws of probability theory to combine restraints from multiple templates. Second, we derive theoretically optimal weights to correct for the redundancy among related templates. Third, a heuristic template selection strategy is proposed. We improve the average GDT-ha model quality score by 11% over single template modeling and by 6.5% over a conventional multi-template approach on a set of 1000 query proteins. Robustness with respect to wrong constraints is likewise improved. We have integrated our multi-template modeling approach with the popular MODELLER homology modeling software in our free HHpred server http://toolkit.tuebingen.mpg.de/hhpred and also offer open source software for running MODELLER with the new restraints at https://bitbucket.org/soedinglab/hh-suite.
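As a rough illustration of describing a distance restraint by a two-component Gaussian mixture and combining several templates probabilistically, consider the sketch below. The abstract does not give the actual formulas, so everything here is an assumption: the parameterization (a narrow template-derived component plus a broad background component), the weighted sum of negative log densities, and all function names are hypothetical and do not reproduce the HHpred/MODELLER implementation.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Density of a normal distribution at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(d, weight, mean1, sigma1, mean2, sigma2):
    """Two-component Gaussian mixture density for an atomic distance d."""
    return (weight * gaussian_pdf(d, mean1, sigma1)
            + (1.0 - weight) * gaussian_pdf(d, mean2, sigma2))


def combined_restraint(d, templates, template_weights=None):
    """Weighted sum of negative log mixture densities over templates.

    templates: list of (weight, mean1, sigma1, mean2, sigma2) tuples.
    template_weights: optional per-template weights, standing in for the
    redundancy correction mentioned in the abstract (assumed form).
    """
    if template_weights is None:
        template_weights = [1.0] * len(templates)
    score = 0.0
    for w, params in zip(template_weights, templates):
        score -= w * math.log(mixture_pdf(d, *params))
    return score


if __name__ == "__main__":
    # Hypothetical restraints (in angstroms) from two related templates.
    templates = [(0.8, 10.0, 0.5, 10.0, 5.0), (0.7, 10.4, 0.6, 10.0, 5.0)]
    for d in (9.0, 10.2, 14.0):
        print(d, combined_restraint(d, templates, [0.6, 0.6]))
```

The broad second component keeps the penalty finite when a template is wrong about a given distance, which is one plausible reading of the robustness to wrong constraints reported above.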
Protein folding and misfolding: mechanism and principles
Englander, S. Walter; Mayne, Leland; Krishna, Mallela M. G.
2012-01-01
Two fundamentally different views of how proteins fold are now being debated. Do proteins fold through multiple unpredictable routes directed only by the energetically downhill nature of the folding landscape or do they fold through specific intermediates in a defined pathway that systematically puts predetermined pieces of the target native protein into place? It has now become possible to determine the structure of protein folding intermediates, evaluate their equilibrium and kinetic parameters, and establish their pathway relationships. Results obtained for many proteins have serendipitously revealed a new dimension of protein structure. Cooperative structural units of the native protein, called foldons, unfold and refold repeatedly even under native conditions. Much evidence obtained by hydrogen exchange and other methods now indicates that cooperative foldon units and not individual amino acids account for the unit steps in protein folding pathways. The formation of foldons and their ordered pathway assembly systematically puts native-like foldon building blocks into place, guided by a sequential stabilization mechanism in which prior native-like structure templates the formation of incoming foldons with complementary structure. Thus the same propensities and interactions that specify the final native state, encoded in the amino-acid sequence of every protein, determine the pathway for getting there. Experimental observations that have been interpreted differently, in terms of multiple independent pathways, appear to be due to chance misfolding errors that cause different population fractions to block at different pathway points, populate different pathway intermediates, and fold at different rates. This paper summarizes the experimental basis for these three determining principles and their consequences. Cooperative native-like foldon units and the sequential stabilization process together generate predetermined stepwise pathways. 
Optional misfolding errors are responsible for 3-state and heterogeneous kinetic folding. PMID:18405419
NASA Astrophysics Data System (ADS)
Ishmukhametov, Robert R.; Russell, Aidan N.; Berry, Richard M.
2016-10-01
An important goal in synthetic biology is the assembly of biomimetic cell-like structures, which combine multiple biological components in synthetic lipid vesicles. A key limiting assembly step is the incorporation of membrane proteins into the lipid bilayer of the vesicles. Here we present a simple method for delivery of membrane proteins into a lipid bilayer within 5 min. Fusogenic proteoliposomes, containing charged lipids and membrane proteins, fuse with oppositely charged bilayers, with no requirement for detergent or fusion-promoting proteins, and deliver large, fragile membrane protein complexes into the target bilayers. We demonstrate the feasibility of our method by assembling a minimal electron transport chain capable of adenosine triphosphate (ATP) synthesis, combining Escherichia coli F1Fo ATP-synthase and the primary proton pump bo3-oxidase, into synthetic lipid vesicles with sizes ranging from 100 nm to ~10 μm. This provides a platform for the combination of multiple sets of membrane protein complexes into cell-like artificial structures.
The porous borders of the protein world.
Cordes, Matthew H J; Stewart, Katie L
2012-02-08
Fold switching may play a role in the evolution of new protein folds and functions. He et al., in this issue of Structure, use protein design to illustrate that the same drastic change in a protein fold can occur via multiple different mutational pathways. Copyright © 2012 Elsevier Ltd. All rights reserved.
Vanhoutteghem, Amandine; Maciejewski-Duval, Anna; Bouche, Cyril; Delhomme, Brigitte; Hervé, Françoise; Daubigney, Fabrice; Soubigou, Guillaume; Araki, Masatake; Araki, Kimi; Yamamura, Ken-ichi; Djian, Philippe
2009-01-01
Basonuclin 2 is a recently discovered zinc finger protein of unknown function. Its paralog, basonuclin 1, is associated with the ability of keratinocytes to multiply. The basonuclin zinc fingers are closely related to those of the Drosophila proteins disco and discorelated, but the relation between disco proteins and basonuclins has remained elusive because the function of the disco proteins in larval head development seems to have no relation to that of basonuclin 1 and because the amino acid sequence of disco, apart from the zinc fingers, also has no similarity to that of the basonuclins. We have generated mice lacking basonuclin 2. These mice die within 24 h of birth with a cleft palate and abnormalities of craniofacial bones and tongue. In the embryonic head, expression of the basonuclin 2 gene is restricted to mesenchymal cells in the palate, at the periphery of the tongue, and in the mesenchymal sheaths that surround the brain and the osteocartilagineous structures. In late embryos, the rate of multiplication of these mesenchymal cells is greatly diminished. Therefore, basonuclin 2 is essential for the multiplication of craniofacial mesenchymal cells during embryogenesis. Non-Drosophila insect databases available since 2008 reveal that the basonuclins and the disco proteins share much more extensive sequence and gene structure similarity than noted when only Drosophila sequences were examined. We conclude that basonuclin 2 is both structurally and functionally the vertebrate ortholog of the disco proteins. We also note the possibility that some human craniofacial abnormalities are due to a lack of basonuclin 2. PMID:19706529
Understanding the Structural Ensembles of a Highly Extended Disordered Protein
Daughdrill, Gary W.; Kashtanov, Stepan; Stancik, Amber; Hill, Shannon E.; Helms, Gregory; Muschol, Martin
2013-01-01
Developing a comprehensive description of the equilibrium structural ensembles for intrinsically disordered proteins (IDPs) is essential to understanding their function. The p53 transactivation domain (p53TAD) is an IDP that interacts with multiple protein partners and contains numerous phosphorylation sites. Multiple techniques were used to investigate the equilibrium structural ensemble of p53TAD in its native and chemically unfolded states. The results from these experiments show that the native state of p53TAD has dimensions similar to a classical random coil while the chemically unfolded state is more extended. To investigate the molecular properties responsible for this behavior, a novel algorithm that generates diverse and unbiased structural ensembles of IDPs was developed. This algorithm was used to generate a large pool of plausible p53TAD structures that were reweighted to identify a subset of structures with the best fit to small angle X-ray scattering data. High weight structures in the native state ensemble show features that are localized to protein binding sites and regions with high proline content. The features localized to the protein binding sites are mostly eliminated in the chemically unfolded ensemble; while, the regions with high proline content remain relatively unaffected. Data from NMR experiments support these results, showing that residues from the protein binding sites experience larger environmental changes upon unfolding by urea than regions with high proline content. This behavior is consistent with the urea-induced exposure of nonpolar and aromatic side-chains in the protein binding sites that are partially excluded from solvent in the native state ensemble. PMID:21979461
Reed, Benjamin J.; Locke, Melissa N.; Gardner, Richard G.
2015-01-01
In the canonical view of protein function, it is generally accepted that the three-dimensional structure of a protein determines its function. However, the past decade has seen a dramatic growth in the identification of proteins with extensive intrinsically disordered regions (IDRs), which are conformationally plastic and do not appear to adopt single three-dimensional structures. One current paradigm for IDR function is that disorder enables IDRs to adopt multiple conformations, expanding the ability of a protein to interact with a wide variety of disparate proteins. The capacity for many interactions is an important feature of proteins that occupy the hubs of protein networks, in particular protein-modifying enzymes that usually have a broad spectrum of substrates. One such protein modification is ubiquitination, where ubiquitin is attached to proteins through ubiquitin ligases (E3s) and removed through deubiquitinating enzymes. Numerous proteomic studies have found that thousands of proteins are dynamically regulated by cycles of ubiquitination and deubiquitination. Thus, how these enzymes target their wide array of substrates is of considerable importance for understanding the function of the cell's diverse ubiquitination networks. Here, we characterize a yeast deubiquitinating enzyme, Ubp10, that possesses IDRs flanking its catalytic protease domain. We show that Ubp10 possesses multiple, distinct binding modules within its IDRs that are necessary and sufficient for directing protein interactions important for Ubp10's known roles in gene silencing and ribosome biogenesis. The human homolog of Ubp10, USP36, also has IDRs flanking its catalytic domain, and these IDRs similarly contain binding modules important for protein interactions. This work highlights the significant protein interaction scaffolding abilities of IDRs in the regulation of dynamic protein ubiquitination. PMID:26149687
Compton, L A; Johnson, W C
1986-05-15
Inverse circular dichroism (CD) spectra are presented for each of the five major secondary structures of proteins: alpha-helix, antiparallel and parallel beta-sheet, beta-turn, and other (random) structures. The fraction of each secondary structure in a protein is predicted by forming the dot product of the corresponding inverse CD spectrum, expressed as a vector, with the CD spectrum of the protein digitized in the same way. We show how this method is based on the construction of the generalized inverse from the singular value decomposition of a set of CD spectra corresponding to proteins whose secondary structures are known from X-ray crystallography. These inverse spectra compute secondary structure directly from protein CD spectra without resorting to least-squares fitting and standard matrix inversion techniques. In addition, spectra corresponding to the individual secondary structures, analogous to the CD spectra of synthetic polypeptides, are generated from the five most significant CD eigenvectors.
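The linear algebra described here can be sketched with NumPy. Assuming a matrix of reference spectra C (wavelengths by proteins) and a matrix of known fractions F (structure classes by proteins), the inverse spectra are the rows of X = F C+, where C+ is a generalized inverse built from the leading singular components. The matrix layout, function names, and synthetic data below are assumptions for illustration, not the authors' code or data.

```python
import numpy as np


def inverse_cd_spectra(reference_spectra, fractions, n_components=5):
    """Return one inverse CD spectrum per secondary-structure class.

    reference_spectra: array (n_wavelengths, n_proteins) of digitized CD spectra.
    fractions: array (n_classes, n_proteins) of known structure fractions.
    The generalized inverse keeps only the n_components largest singular values.
    """
    u, s, vt = np.linalg.svd(reference_spectra, full_matrices=False)
    k = n_components
    pinv = vt[:k].T @ np.diag(1.0 / s[:k]) @ u[:, :k].T  # (n_proteins, n_wavelengths)
    return fractions @ pinv  # (n_classes, n_wavelengths)


def predict_fractions(inverse_spectra, spectrum):
    """Dot each inverse spectrum with the protein's digitized CD spectrum."""
    return inverse_spectra @ spectrum


if __name__ == "__main__":
    # Synthetic, noise-free data: 5 hypothetical pure-class spectra, 8 proteins.
    rng = np.random.default_rng(0)
    pure = rng.normal(size=(40, 5))
    known = rng.dirichlet(np.ones(5), size=8).T
    inv = inverse_cd_spectra(pure @ known, known)
    query = np.array([0.4, 0.1, 0.1, 0.2, 0.2])
    print(predict_fractions(inv, pure @ query))
```

Truncating to the most significant singular values discards directions dominated by noise, so prediction reduces to one dot product per structure class.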
PreSSAPro: a software for the prediction of secondary structure by amino acid properties.
Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M
2007-10-01
PreSSAPro is a software tool, available to the scientific community as a free web service, designed to predict secondary structures from the amino acid sequence of a given protein. Predictions are based on our recently published work on the amino acid propensities for secondary structures, both in large but non-homogeneous protein data sets and in smaller but homogeneous data sets corresponding to protein structural classes, i.e. all-alpha, all-beta, or alpha-beta proteins. Predictions are improved by using propensities evaluated for the correct protein class. PreSSAPro predicts the secondary structure according to the correct protein class, if known, or gives multiple predictions with reference to the different structural classes. Comparing these predictions offers a novel way to evaluate which sequence regions can assume different secondary structures depending on the structural class assignment, with a view to identifying proteins able to fold into different conformations. The service is available at the URL http://bioinformatica.isa.cnr.it/PRESSAPRO/.
T-RMSD: a web server for automated fine-grained protein structural classification.
Magis, Cedrik; Di Tommaso, Paolo; Notredame, Cedric
2013-07-01
This article introduces the T-RMSD web server (tree-based on root-mean-square deviation), a service allowing the online computation of structure-based protein classification. It has been developed to address the relation between structural and functional similarity in proteins, and it allows a fine-grained structural clustering of a given protein family or group of structurally related proteins using distance RMSD (dRMSD) variations. These distances are computed between all pairs of equivalent residues, as defined by the ungapped columns within a given multiple sequence alignment. Using these generated distance matrices (one per equivalent position), T-RMSD produces a structural tree with support values for each cluster node, reminiscent of bootstrap values. These values, associated with the tree topology, allow a quantitative estimate of structural distances between proteins or group of proteins defined by the tree topology. The clusters thus defined have been shown to be structurally and functionally informative. The T-RMSD web server is a free website open to all users and available at http://tcoffee.crg.cat/apps/tcoffee/do:trmsd.
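The distance RMSD underlying this kind of clustering compares intramolecular distances between two structures, so no superposition is needed. The sketch below shows a global dRMSD and a per-position variant over equivalent residues. It is a generic illustration under stated assumptions (one coordinate per equivalent residue, hypothetical function names), not the T-RMSD server's implementation.

```python
import math
from itertools import combinations


def drmsd(coords_a, coords_b):
    """Distance RMSD over all pairs of equivalent residues.

    coords_a, coords_b: equal-length lists of (x, y, z), one per equivalent
    (ungapped) alignment column, e.g. C-alpha atoms (an assumption here).
    """
    if len(coords_a) != len(coords_b) or len(coords_a) < 2:
        raise ValueError("need two equal-length structures with >= 2 residues")
    pairs = list(combinations(range(len(coords_a)), 2))
    sq = sum(
        (math.dist(coords_a[i], coords_a[j]) - math.dist(coords_b[i], coords_b[j])) ** 2
        for i, j in pairs
    )
    return math.sqrt(sq / len(pairs))


def position_drmsd(coords_a, coords_b, i):
    """dRMSD restricted to distances from position i to every other position."""
    others = [j for j in range(len(coords_a)) if j != i]
    sq = sum(
        (math.dist(coords_a[i], coords_a[j]) - math.dist(coords_b[i], coords_b[j])) ** 2
        for j in others
    )
    return math.sqrt(sq / len(others))
```

Evaluating `position_drmsd` for every pair of structures at a fixed position gives one distance matrix per equivalent position, which matches the kind of input the abstract describes for building the structural tree.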
T-RMSD: a web server for automated fine-grained protein structural classification
Magis, Cedrik; Di Tommaso, Paolo; Notredame, Cedric
2013-01-01
This article introduces the T-RMSD web server (tree-based on root-mean-square deviation), a service allowing the online computation of structure-based protein classification. It has been developed to address the relation between structural and functional similarity in proteins, and it allows a fine-grained structural clustering of a given protein family or group of structurally related proteins using distance RMSD (dRMSD) variations. These distances are computed between all pairs of equivalent residues, as defined by the ungapped columns within a given multiple sequence alignment. Using these generated distance matrices (one per equivalent position), T-RMSD produces a structural tree with support values for each cluster node, reminiscent of bootstrap values. These values, associated with the tree topology, allow a quantitative estimate of structural distances between proteins or group of proteins defined by the tree topology. The clusters thus defined have been shown to be structurally and functionally informative. The T-RMSD web server is a free website open to all users and available at http://tcoffee.crg.cat/apps/tcoffee/do:trmsd. PMID:23716642
Xu, Dong; Zhang, Jian; Roy, Ambrish; Zhang, Yang
2011-01-01
I-TASSER is an automated pipeline for protein tertiary structure prediction using multiple threading alignments and iterative structure assembly simulations. In CASP9 experiments, two new algorithms, QUARK and FG-MD, were added to the I-TASSER pipeline for improving the structural modeling accuracy. QUARK is a de novo structure prediction algorithm used for structure modeling of proteins that lack detectable template structures. For distantly homologous targets, QUARK models are found useful as a reference structure for selecting good threading alignments and guiding the I-TASSER structure assembly simulations. FG-MD is an atomic-level structural refinement program that uses structural fragments collected from the PDB structures to guide molecular dynamics simulation and improve the local structure of predicted model, including hydrogen-bonding networks, torsion angles and steric clashes. Despite considerable progress in both the template-based and template-free structure modeling, significant improvements on protein target classification, domain parsing, model selection, and ab initio folding of beta-proteins are still needed to further improve the I-TASSER pipeline. PMID:22069036
Sehgal, Lalit; Budnar, Srikanth; Bhatt, Khyati; Sansare, Sneha; Mukhopadhaya, Amitabha; Kalraiya, Rajiv D; Dalal, Sorab N
2012-10-01
The study of protein-protein interactions, protein localization, protein organization into higher order structures and organelle dynamics in live cells, has greatly enhanced the understanding of various cellular processes. Live cell imaging experiments employ plasmid or viral vectors to express the protein/proteins of interest fused to a fluorescent protein. Unlike plasmid vectors, lentiviral vectors can be introduced into both dividing and non dividing cells, can be pseudotyped to infect a broad or narrow range of cells, and can be used to generate transgenic animals. However, the currently available lentiviral vectors are limited by the choice of fluorescent protein tag, choice of restriction enzyme sites in the Multiple Cloning Sites (MCS) and promoter choice for gene expression. In this report, HIV-1 based bi-cistronic lentiviral vectors have been generated that drive the expression of multiple fluorescent tags (EGFP, mCherry, ECFP, EYFP and dsRed), using two different promoters. The presence of a unique MCS with multiple restriction sites allows the generation of fusion proteins with the fluorescent tag of choice, allowing analysis of multiple fusion proteins in live cell imaging experiments. These novel lentiviral vectors are improved delivery vehicles for gene transfer applications and are important tools for live cell imaging in vivo.
NASA Astrophysics Data System (ADS)
Cui, P. X.; Lian, F. L.; Wang, Y.; Wen, Yi; Chu, W. S.; Zhao, H. F.; Zhang, S.; Li, J.; Lin, D. H.; Wu, Z. Y.
2014-02-01
Prion-related protein (PrP), a cell-surface copper-binding glycoprotein, is considered to be responsible for a number of transmissible spongiform encephalopathies (TSEs). The structural conversion of PrP from the normal cellular isoform (PrPC) to the post-translationally modified form (PrPSc) is thought to be relevant to Cu2+ binding to histidine residues. Rabbits are one of the few mammalian species that appear to be resistant to TSEs, because of the structural characteristics of the rabbit prion protein (RaPrPC) itself. Here we determined the three-dimensional local structure around the C-terminal high-affinity copper-binding sites using X-ray absorption near-edge structure combined with ab initio calculations in the framework of the multiple-scattering (MS) theory. The results show that two amino acid residues, Gln97 and Met108, and two histidine residues, His95 and His110, are involved in binding this copper(II) ion. This finding may help us understand the roles of copper in prion conformation conversions and the molecular mechanisms of prion-involved diseases.
Okazaki, Kei-ichi; Koga, Nobuyasu; Takada, Shoji; Onuchic, Jose N.; Wolynes, Peter G.
2006-01-01
Biomolecules often undergo large-amplitude motions when they bind or release other molecules. Unlike macroscopic machines, these biomolecular machines can partially disassemble (unfold) and then reassemble (fold) during such transitions. Here we put forward a minimal structure-based model, the “multiple-basin model,” that can directly be used for molecular dynamics simulation of even very large biomolecular systems so long as the endpoints of the conformational change are known. We investigate the model by simulating large-scale motions of four proteins: glutamine-binding protein, S100A6, dihydrofolate reductase, and HIV-1 protease. The mechanisms of conformational transition depend on the protein basin topologies and change with temperature near the folding transition. The conformational transition rate varies linearly with driving force over a fairly large range. This linearity appears to be a consequence of partial unfolding during the conformational transition. PMID:16877541
Structure of a Trypanosoma Brucei Alpha/Beta-Hydrolase Fold Protein With Unknown Function
DOE Office of Scientific and Technical Information (OSTI.GOV)
Merritt, E.A.; Holmes, M.; Buckner, F.S.
2009-05-26
The structure of a structural genomics target protein, Tbru020260AAA from Trypanosoma brucei, has been determined to a resolution of 2.2 Å using multiple-wavelength anomalous diffraction at the Se K edge. This protein belongs to Pfam sequence family PF08538 and is only distantly related to previously studied members of the α/β-hydrolase fold family. Structural superposition onto representative α/β-hydrolase fold proteins of known function indicates that a possible catalytic nucleophile, Ser116 in the T. brucei protein, lies at the expected location. However, the present structure and by extension the other trypanosomatid members of this sequence family have neither sequence nor structural similarity at the location of other active-site residues typical for proteins with this fold. Together with the presence of an additional domain between strands β6 and β7 that is conserved in trypanosomatid genomes, this suggests that the function of these homologs has diverged from other members of the fold family.
Rapid search for tertiary fragments reveals protein sequence–structure relationships
Zhou, Jianfu; Grigoryan, Gevorg
2015-01-01
Finding backbone substructures from the Protein Data Bank that match an arbitrary query structural motif, composed of multiple disjoint segments, is a problem of growing relevance in structure prediction and protein design. Although numerous protein structure search approaches have been proposed, methods that address this specific task without additional restrictions and on practical time scales are generally lacking. Here, we propose a solution, dubbed MASTER, that is both rapid, enabling searches over the Protein Data Bank in a matter of seconds, and provably correct, finding all matches below a user-specified root-mean-square deviation cutoff. We show that despite the potentially exponential time complexity of the problem, running times in practice are modest even for queries with many segments. The ability to explore naturally plausible structural and sequence variations around a given motif has the potential to synthesize its design principles in an automated manner; so we go on to illustrate the utility of MASTER to protein structural biology. We demonstrate its capacity to rapidly establish structure–sequence relationships, uncover the native designability landscapes of tertiary structural motifs, identify structural signatures of binding, and automatically rewire protein topologies. Given the broad utility of protein tertiary fragment searches, we hope that providing MASTER in an open-source format will enable novel advances in understanding, predicting, and designing protein structure. PMID:25420575
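The cutoff criterion described in this abstract (report every match whose root-mean-square deviation from the query falls below a user-specified value) can be illustrated with a short sketch. This is not the MASTER algorithm: it assumes the candidate fragments have already been optimally superimposed on the query, skips the search and pruning steps entirely, and uses hypothetical names (`rmsd`, `matches_below_cutoff`) and toy coordinates.

```python
import math

def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumption: both point sets are already superimposed; no fitting is done here.
    """
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))

def matches_below_cutoff(query, candidates, cutoff):
    """Return (name, rmsd) pairs with RMSD below the cutoff, best match first."""
    scored = ((name, rmsd(query, coords)) for name, coords in candidates.items())
    return sorted((hit for hit in scored if hit[1] < cutoff), key=lambda hit: hit[1])

# Toy data (hypothetical): a two-atom query and two candidate fragments.
query = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
candidates = {
    "fragment_A": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # identical to the query
    "fragment_B": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # shifted by 1 unit along z
}
hits = matches_below_cutoff(query, candidates, cutoff=0.5)
```

An exhaustive filter like this scales linearly with the number of candidates; a tool that searches the whole Protein Data Bank in seconds would need to avoid scoring most candidates at all.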
NewProt - a protein engineering portal.
Schwarte, Andreas; Genz, Maika; Skalden, Lilly; Nobili, Alberto; Vickers, Clare; Melse, Okke; Kuipers, Remko; Joosten, Henk-Jan; Stourac, Jan; Bendl, Jaroslav; Black, Jon; Haase, Peter; Baakman, Coos; Damborsky, Jiri; Bornscheuer, Uwe; Vriend, Gert; Venselaar, Hanka
2017-06-01
The NewProt protein engineering portal is a one-stop-shop for in silico protein engineering. It gives access to a large number of servers that compute a wide variety of protein structure characteristics supporting work on the modification of proteins through the introduction of (multiple) point mutations. The results can be inspected through multiple visualizers. The HOPE software is included to indicate mutations with possible undesired side effects. The Hotspot Wizard software is embedded for the design of mutations that modify a protein's activity, specificity, or stability. The NewProt portal is freely accessible at http://newprot.cmbi.umcn.nl/ and http://newprot.fluidops.net/. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
An ambiguity principle for assigning protein structural domains.
Postic, Guillaume; Ghouzam, Yassine; Chebrek, Romain; Gelly, Jean-Christophe
2017-01-01
Ambiguity is the quality of being open to several interpretations. For an image, it arises when the contained elements can be delimited in two or more distinct ways, which may cause confusion. We postulate that it also applies to the analysis of protein three-dimensional structure, which consists in dividing the molecule into subunits called domains. Because different definitions of what constitutes a domain can be used to partition a given structure, the same protein may have different but equally valid domain annotations. However, knowledge and experience generally displace our ability to accept more than one way to decompose the structure of an object (in this case, a protein). This human bias in structure analysis is particularly harmful because it leads to ignoring potential avenues of research. We present an automated method capable of producing multiple alternative decompositions of protein structure (web server and source code available at www.dsimb.inserm.fr/sword/). Our innovative algorithm assigns structural domains through the hierarchical merging of protein units, which are evolutionarily preserved substructures that describe protein architecture at an intermediate level, between domain and secondary structure. To validate the use of these protein units for decomposing protein structures into domains, we set up an extensive benchmark made of expert annotations of structural domains and including state-of-the-art domain parsing algorithms. The relevance of our "multipartitioning" approach is shown through numerous examples of applications covering protein function, evolution, folding, and structure prediction. Finally, we introduce a measure for the structural ambiguity of protein molecules.
Accurate high-throughput structure mapping and prediction with transition metal ion FRET
Yu, Xiaozhen; Wu, Xiongwu; Bermejo, Guillermo A.; Brooks, Bernard R.; Taraska, Justin W.
2013-01-01
Mapping the landscape of a protein’s conformational space is essential to understanding its functions and regulation. The limitations of many structural methods have made this process challenging for most proteins. Here, we report that transition metal ion FRET (tmFRET) can be used in a rapid, highly parallel screen, to determine distances from multiple locations within a protein at extremely low concentrations. The distances generated through this screen for the protein Maltose Binding Protein (MBP) match distances from the crystal structure to within a few angstroms. Furthermore, energy transfer accurately detects structural changes during ligand binding. Finally, fluorescence-derived distances can be used to guide molecular simulations to find low energy states. Our results open the door to rapid, accurate mapping and prediction of protein structures at low concentrations, in large complex systems, and in living cells. PMID:23273426
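As background to how an energy-transfer measurement becomes a distance, the standard Förster relation gives the transfer efficiency as E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which E = 0.5. The sketch below inverts that relation. The abstract does not state which equation or R0 the authors used, so the function names and the R0 value are illustrative assumptions only.

```python
def fret_efficiency(r, r0):
    """Förster transfer efficiency for donor-acceptor distance r (same units as r0)."""
    if r < 0 or r0 <= 0:
        raise ValueError("r must be non-negative and r0 must be positive")
    return 1.0 / (1.0 + (r / r0) ** 6)

def fret_distance(efficiency, r0):
    """Invert the Förster relation: distance corresponding to a measured efficiency."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)

R0_ANGSTROM = 12.0  # hypothetical Förster distance, chosen only for illustration

# At r = R0 the efficiency is exactly one half, and the inversion recovers r.
half = fret_efficiency(R0_ANGSTROM, R0_ANGSTROM)
recovered = fret_distance(fret_efficiency(15.0, R0_ANGSTROM), R0_ANGSTROM)
```

Because efficiency depends on the sixth power of r/R0, the measurement is most sensitive to distance changes near R0 and loses resolution far above or below it.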
Chéron, Jean-Baptiste; Triki, Dhoha; Senac, Caroline; Flatters, Delphine; Camproux, Anne-Claude
2017-01-01
Protein flexibility is often involved in binding with different partners and is essential for protein function. The growing number of macromolecular structures in the Protein Data Bank entries and their redundancy has become a major source of structural knowledge of the protein universe. The analysis of structural variability through available redundant structures of a target, called multiple target conformations (MTC), obtained using experimental or modeling methods and under different biological conditions or different sources is one way to explore protein flexibility. This analysis is essential to improve the understanding of various mechanisms associated with protein target function and flexibility. In this study, we explored structural variability of three biological targets by analyzing different MTC sets associated with these targets. To facilitate the study of these MTC sets, we have developed an efficient tool, SA-conf, dedicated to capturing and linking the amino acid and local structure variability and analyzing the target structural variability space. The advantage of SA-conf is that it can be applied to diverse sets composed of MTCs available in the PDB obtained using NMR and crystallography or homology models. This tool could also be applied to analyze MTC sets obtained by dynamics approaches. Our results showed that the SA-conf tool is effective in quantifying the structural variability of an MTC set and in localizing the structurally variable positions and regions of the target. By selecting adapted MTC subsets and comparing their variability detected by SA-conf, we highlighted different sources of target flexibility, such as flexibility induced by a binding partner or by mutation, and intrinsic flexibility. Our results support the interest of mining available structures associated with a target using SA-conf to offer valuable insight into target flexibility and interaction mechanisms.
The SA-conf executable script and a set of pre-compiled binaries are available at http://www.mti.univ-paris-diderot.fr/recherche/plateformes/logiciels. PMID:28817602
Regad, Leslie; Chéron, Jean-Baptiste; Triki, Dhoha; Senac, Caroline; Flatters, Delphine; Camproux, Anne-Claude
2017-01-01
Protein flexibility is often involved in binding with different partners and is essential for protein function. The growing number of macromolecular structures in the Protein Data Bank entries and their redundancy has become a major source of structural knowledge of the protein universe. The analysis of structural variability through available redundant structures of a target, called multiple target conformations (MTC), obtained using experimental or modeling methods and under different biological conditions or different sources is one way to explore protein flexibility. This analysis is essential to improve the understanding of various mechanisms associated with protein target function and flexibility. In this study, we explored structural variability of three biological targets by analyzing different MTC sets associated with these targets. To facilitate the study of these MTC sets, we have developed an efficient tool, SA-conf, dedicated to capturing and linking the amino acid and local structure variability and analyzing the target structural variability space. The advantage of SA-conf is that it can be applied to diverse sets composed of MTCs available in the PDB obtained using NMR and crystallography or homology models. This tool could also be applied to analyze MTC sets obtained by dynamics approaches. Our results showed that the SA-conf tool is effective in quantifying the structural variability of an MTC set and in localizing the structurally variable positions and regions of the target. By selecting adapted MTC subsets and comparing their variability detected by SA-conf, we highlighted different sources of target flexibility, such as flexibility induced by a binding partner or by mutation, and intrinsic flexibility. Our results support the interest of mining available structures associated with a target using SA-conf to offer valuable insight into target flexibility and interaction mechanisms.
The SA-conf executable script and a set of pre-compiled binaries are available at http://www.mti.univ-paris-diderot.fr/recherche/plateformes/logiciels.
Evolutionarily Conserved Linkage between Enzyme Fold, Flexibility, and Catalysis
Ramanathan, Arvind; Agarwal, Pratul K.
2011-01-01
Proteins are intrinsically flexible molecules. The role of internal motions in a protein's designated function is widely debated. The role of protein structure in enzyme catalysis is well established, and conservation of structural features provides vital clues to their role in function. Recently, it has been proposed that the protein function may involve multiple conformations: the observed deviations are not random thermodynamic fluctuations; rather, flexibility may be closely linked to protein function, including enzyme catalysis. We hypothesize that the argument of conservation of important structural features can also be extended to identification of protein flexibility in interconnection with enzyme function. Three classes of enzymes (prolyl-peptidyl isomerase, oxidoreductase, and nuclease) that catalyze diverse chemical reactions have been examined using detailed computational modeling. For each class, the identification and characterization of the internal protein motions coupled to the chemical step in enzyme mechanisms in multiple species show identical enzyme conformational fluctuations. In addition to the active-site residues, motions of protein surface loop regions (>10 Å away) are observed to be identical across species, and networks of conserved interactions/residues connect these highly flexible surface regions to the active-site residues that make direct contact with substrates. More interestingly, examination of reaction-coupled motions in non-homologous enzyme systems (with no structural or sequence similarity) that catalyze the same biochemical reaction shows motions that induce remarkably similar changes in the enzyme–substrate interactions during catalysis. The results indicate that the reaction-coupled flexibility is a conserved aspect of the enzyme molecular architecture. 
Protein motions in distal areas of homologous and non-homologous enzyme systems mediate similar changes in the active-site enzyme–substrate interactions, thereby impacting the mechanism of catalyzed chemistry. These results have implications for understanding the mechanism of allostery, and for protein engineering and drug design. PMID:22087074
Evolutionarily conserved linkage between enzyme fold, flexibility, and catalysis.
Ramanathan, Arvind; Agarwal, Pratul K
2011-11-01
Proteins are intrinsically flexible molecules. The role of internal motions in a protein's designated function is widely debated. The role of protein structure in enzyme catalysis is well established, and conservation of structural features provides vital clues to their role in function. Recently, it has been proposed that the protein function may involve multiple conformations: the observed deviations are not random thermodynamic fluctuations; rather, flexibility may be closely linked to protein function, including enzyme catalysis. We hypothesize that the argument of conservation of important structural features can also be extended to identification of protein flexibility in interconnection with enzyme function. Three classes of enzymes (prolyl-peptidyl isomerase, oxidoreductase, and nuclease) that catalyze diverse chemical reactions have been examined using detailed computational modeling. For each class, the identification and characterization of the internal protein motions coupled to the chemical step in enzyme mechanisms in multiple species show identical enzyme conformational fluctuations. In addition to the active-site residues, motions of protein surface loop regions (>10 Å away) are observed to be identical across species, and networks of conserved interactions/residues connect these highly flexible surface regions to the active-site residues that make direct contact with substrates. More interestingly, examination of reaction-coupled motions in non-homologous enzyme systems (with no structural or sequence similarity) that catalyze the same biochemical reaction shows motions that induce remarkably similar changes in the enzyme-substrate interactions during catalysis. The results indicate that the reaction-coupled flexibility is a conserved aspect of the enzyme molecular architecture. 
Protein motions in distal areas of homologous and non-homologous enzyme systems mediate similar changes in the active-site enzyme-substrate interactions, thereby impacting the mechanism of catalyzed chemistry. These results have implications for understanding the mechanism of allostery, and for protein engineering and drug design.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ramanathan, Arvind; Agarwal, Pratul K
Proteins are intrinsically flexible molecules. The role of internal motions in a protein's designated function is widely debated. The role of protein structure in enzyme catalysis is well established, and conservation of structural features provides vital clues to their role in function. Recently, it has been proposed that the protein function may involve multiple conformations: the observed deviations are not random thermodynamic fluctuations; rather, flexibility may be closely linked to protein function, including enzyme catalysis. We hypothesize that the argument of conservation of important structural features can also be extended to identification of protein flexibility in interconnection with enzyme function. Three classes of enzymes (prolyl-peptidyl isomerase, oxidoreductase, and nuclease) that catalyze diverse chemical reactions have been examined using detailed computational modeling. For each class, the identification and characterization of the internal protein motions coupled to the chemical step in enzyme mechanisms in multiple species show identical enzyme conformational fluctuations. In addition to the active-site residues, motions of protein surface loop regions (>10 Å away) are observed to be identical across species, and networks of conserved interactions/residues connect these highly flexible surface regions to the active-site residues that make direct contact with substrates. More interestingly, examination of reaction-coupled motions in non-homologous enzyme systems (with no structural or sequence similarity) that catalyze the same biochemical reaction shows motions that induce remarkably similar changes in the enzyme-substrate interactions during catalysis. The results indicate that the reaction-coupled flexibility is a conserved aspect of the enzyme molecular architecture.
Protein motions in distal areas of homologous and non-homologous enzyme systems mediate similar changes in the active-site enzyme-substrate interactions, thereby impacting the mechanism of catalyzed chemistry. These results have implications for understanding the mechanism of allostery, and for protein engineering and drug design.
SARS-unique fold in the Rousettus bat coronavirus HKU9.
Hammond, Robert G; Tan, Xuan; Johnson, Margaret A
2017-09-01
The coronavirus nonstructural protein 3 (nsp3) is a multifunctional protein that comprises multiple structural domains. This protein assists in viral polyprotein cleavage and host immune interference, and may play other roles in genome replication or transcription. Here, we report the solution NMR structure of a protein from the "SARS-unique region" of the bat coronavirus HKU9. The protein contains a frataxin fold or double-wing motif, which is an α + β fold that is associated with protein/protein interactions, DNA binding, and metal ion binding. High structural similarity to the human severe acute respiratory syndrome (SARS) coronavirus nsp3 is present. A possible functional site that is conserved among some betacoronaviruses has been identified using bioinformatics and biochemical analyses. This structure provides strong experimental support for the recent proposal advanced by us and others that the "SARS-unique" region is not unique to the human SARS virus, but is conserved among several different phylogenetic groups of coronaviruses and provides essential functions. © 2017 The Protein Society.
Islam, Nazrul; Woo, Sun-Hee; Tsujimoto, Hisashi; Kawasaki, Hiroshi; Hirano, Hisashi
2002-09-01
Changes in the protein composition of the wheat endosperm proteome were investigated in 39 ditelocentric chromosome lines of common wheat (Triticum aestivum L.) cv. Chinese Spring. Two-dimensional gel electrophoresis followed by Coomassie Brilliant Blue staining resolved a total of 105 protein spots in a gel. Quantitative image analysis of protein spots was performed by PDQuest. Variations in protein spots between the euploid and the 39 ditelocentric lines were evaluated by spot number, appearance, disappearance and intensity. A specific spot present in all gels was taken as an internal standard, and the intensity of all other spots was calculated as a ratio to the internal standard. Out of the 1755 major spots detected in 39 ditelocentric lines, 1372 (78%) spots were found to vary in different spot parameters: 147 (11%) disappeared, 978 (71%) were up-regulated and 247 (18%) were down-regulated. Correlations in the changes in protein intensity among 24 protein spots across the ditelocentric lines were analyzed. High correlations in changes of protein intensities were observed among the proteins encoded by genes located in the homoeologous arms. Locations of structural genes controlling 26 spots were identified in 10 chromosomal arms. Multiple regulators of the same protein located at various chromosomal arms were also noticed. Identification of structural genes for most of the proteins was found to be difficult due to multiple regulators encoding the same protein. Two novel subunits (1B(Z,) 1BDz), the structures of which are very similar to the high molecular weight glutenin subunit 12, were identified, and the chromosome arm locations of these subunits were assigned.
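The spot counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text and shows that the 78% figure is taken relative to all 1755 major spots, while the 11%, 71% and 18% figures are taken relative to the 1372 variable spots. The variable names and the `percent` helper are illustrative.

```python
# Counts as reported in the abstract.
total_major_spots = 1755
variable_spots = 1372
categories = {"disappeared": 147, "up-regulated": 978, "down-regulated": 247}

def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to the nearest integer."""
    return round(100 * part / whole)

# Share of all major spots that varied in at least one parameter.
variable_share = percent(variable_spots, total_major_spots)

# Breakdown of the variable spots by type of change.
breakdown = {name: percent(count, variable_spots) for name, count in categories.items()}

# The three categories should account for every variable spot.
categories_sum = sum(categories.values())
```

The three categories sum exactly to 1372, so the reported breakdown is internally consistent.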
Milles, Sigrid; Koehler, Christine; Gambin, Yann; Deniz, Ashok A; Lemke, Edward A
2012-10-01
Single molecule observation of fluorescence resonance energy transfer can be used to provide insight into the structure and dynamics of proteins. Using a straightforward triple-colour labelling strategy, we present a measurement and analysis scheme that can simultaneously study multiple regions within single intrinsically disordered proteins.
Milles, Sigrid; Koehler, Christine; Gambin, Yann
2012-01-01
Single molecule observation of fluorescence resonance energy transfer can be used to provide insights into the structure and dynamics of proteins. Using a straightforward triple-colour labelling strategy, we present a measurement and analysis scheme that can simultaneously study multiple regions within single intrinsically disordered proteins. PMID:22739670
Bayesian module identification from multiple noisy networks.
Zamani Dadaneh, Siamak; Qian, Xiaoning
2016-12-01
Module identification has been studied extensively in order to gain deeper understanding of complex systems, such as social networks as well as biological networks. Modules are often defined as groups of vertices in these networks that are topologically cohesive with similar interaction patterns with the rest of the vertices. Most of the existing module identification algorithms assume that the given networks are faithfully measured without errors. However, in many real-world applications, for example, when analyzing protein-protein interaction networks from high-throughput profiling techniques, there is significant noise with both false positive and missing links between vertices. In this paper, we propose a new model for more robust module identification by taking advantage of multiple observed networks with significant noise so that signals in multiple networks can be strengthened and help improve the solution quality by combining information from various sources. We adopt a hierarchical Bayesian model to integrate multiple noisy snapshots that capture the underlying modular structure of the networks under study. By introducing a latent root assignment matrix and its relations to instantaneous module assignments in all the observed networks to capture the underlying modular structure and combine information across multiple networks, an efficient variational Bayes algorithm can be derived to accurately and robustly identify the underlying modules from multiple noisy networks. Experiments on synthetic and protein-protein interaction data sets show that our proposed model enhances both the accuracy and resolution in detecting cohesive modules, and it is less vulnerable to noise in the observed data. In addition, it shows higher power in predicting missing edges compared to individual-network methods.
Li de La Sierra-Gallay, Ines; Collinet, Bruno; Graille, Marc; Quevillon-Cheruel, Sophie; Liger, Dominique; Minard, Philippe; Blondeau, Karine; Henckes, Gilles; Aufrère, Robert; Leulliot, Nicolas; Zhou, Cong-Zhao; Sorel, Isabelle; Ferrer, Jean-Luc; Poupon, Anne; Janin, Joël; van Tilbeurgh, Herman
2004-03-01
The protein product of the YGR205w gene of Saccharomyces cerevisiae was targeted as part of our yeast structural genomics project. YGR205w codes for a small (290 amino acids) protein with unknown structure and function. The only recognizable sequence feature is the presence of a Walker A motif (P loop) indicating a possible nucleotide binding/converting function. We determined the three-dimensional crystal structure of the Se-methionine substituted protein using multiple anomalous diffraction. The structure revealed a well-known mononucleotide fold and a strong resemblance to the structure of small metabolite phosphorylating enzymes such as pantothenate kinase and phosphoribulokinase. Biochemical experiments show that YGR205w specifically binds ATP and, less tightly, ADP. The structure also revealed the presence of two bound sulphate ions, occupying opposite niches in a canyon that corresponds to the active site of the protein. One sulphate is bound to the P-loop in a position that corresponds to the position of beta-phosphate in mononucleotide protein ATP complex, suggesting the protein is indeed a kinase. The nature of the phosphate accepting substrate remains to be determined. Copyright 2004 Wiley-Liss, Inc.
Kwon, Daehong; Lee, Daehwan; Kim, Juyeon; Lee, Jongin; Sim, Mikang; Kim, Jaebum
2018-05-09
Proteins perform biological functions through cascading interactions with each other by forming protein complexes. As a result, interactions among proteins, called protein-protein interactions (PPIs), are not completely free from selective constraint during evolution. Therefore, the identification and analysis of PPI changes during evolution can give us new insight into the evolution of functions. Although many algorithms, databases and websites have been developed to help the study of PPIs, most of them are limited to visualizing the structure and features of PPIs in a single chosen species and offer limited visualization functions. This leads to difficulties in the identification of different patterns of PPIs in different species and their functional consequences. To resolve these issues, we developed a web application, called INTER-Species Protein Interaction Analysis (INTERSPIA). Given a set of proteins of interest to the user, INTERSPIA first discovers additional proteins that are functionally associated with the input proteins and searches for different patterns of PPIs in multiple species through a server-side pipeline, and second visualizes the dynamics of PPIs in multiple species using an easy-to-use web interface. INTERSPIA is freely available at http://bioinfo.konkuk.ac.kr/INTERSPIA/.
Finding the target sites of RNA-binding proteins
Li, Xiao; Kazan, Hilal; Lipshitz, Howard D; Morris, Quaid D
2014-01-01
RNA–protein interactions differ from DNA–protein interactions because of the central role of RNA secondary structure. Some RNA-binding domains (RBDs) recognize their target sites mainly by their shape and geometry and others are sequence-specific but are sensitive to secondary structure context. A number of small- and large-scale experimental approaches have been developed to measure RNAs associated in vitro and in vivo with RNA-binding proteins (RBPs). Generalizing outside of the experimental conditions tested by these assays requires computational motif finding. Often RBP motif finding is done by adapting DNA motif finding methods; but modeling secondary structure context leads to better recovery of RBP-binding preferences. Genome-wide assessment of mRNA secondary structure has recently become possible, but these data must be combined with computational predictions of secondary structure before they add value in predicting in vivo binding. There are two main approaches to incorporating structural information into motif models: supplementing primary sequence motif models with preferred secondary structure contexts (e.g., MEMERIS and RNAcontext) and directly modeling secondary structure recognized by the RBP using stochastic context-free grammars (e.g., CMfinder and RNApromo). The former better reconstruct known binding preferences for sequence-specific RBPs but are not suitable for modeling RBPs that recognize shape and geometry of RNAs. Future work in RBP motif finding should incorporate interactions between multiple RBDs and multiple RBPs in binding to RNA. WIREs RNA 2014, 5:111–130. doi: 10.1002/wrna.1201 PMID:24217996
Guerrero-Muñoz, Marcos J; Castillo-Carranza, Diana L; Kayed, Rakez
2014-04-15
Impaired proteostasis is one of the main features of all amyloid diseases, which are associated with the formation of insoluble aggregates from amyloidogenic proteins. The aggregation process can be caused by overproduction or poor clearance of these proteins. However, numerous reports suggest that amyloid oligomers are the most toxic species, rather than insoluble fibrillar material, in Alzheimer's, Parkinson's, and Prion diseases, among others. Although the exact protein that aggregates varies between amyloid disorders, they all share common structural features that can be used as therapeutic targets. In this review, we focus on therapeutic approaches against shared features of toxic oligomeric structures and future directions. Copyright © 2014 Elsevier Inc. All rights reserved.
Gifford, Lida K; Carter, Lester G; Gabanyi, Margaret J; Berman, Helen M; Adams, Paul D
2012-06-01
The Technology Portal of the Protein Structure Initiative Structural Biology Knowledgebase (PSI SBKB; http://technology.sbkb.org/portal/ ) is a web resource providing information about methods and tools that can be used to relieve bottlenecks in many areas of protein production and structural biology research. Several useful features are available on the web site, including multiple ways to search the database of over 250 technological advances, a link to videos of methods on YouTube, and access to a technology forum where scientists can connect, ask questions, get news, and develop collaborations. The Technology Portal is a component of the PSI SBKB ( http://sbkb.org ), which presents integrated genomic, structural, and functional information for all protein sequence targets selected by the Protein Structure Initiative. Created in collaboration with the Nature Publishing Group, the SBKB offers an array of resources for structural biologists, such as a research library, editorials about new research advances, a featured biological system each month, and a functional sleuth for searching protein structures of unknown function. An overview of the various features and examples of user searches highlight the information, tools, and avenues for scientific interaction available through the Technology Portal.
Multiple intermediates on the energy landscape of a 15-HEAT-repeat protein
Tsytlonok, Maksym; Craig, Patricio O.; Sivertsson, Elin; Serquera, David; Perrett, Sarah; Best, Robert B.; Wolynes, Peter G.; Itzhaki, Laura S.
2014-01-01
Repeat proteins are a special class of modular, non-globular proteins composed of small structural motifs arrayed to form elongated architectures and stabilised solely by short-range contacts. We find a remarkable complexity in the unfolding of the large HEAT repeat protein PR65/A. In contrast to what has been seen for small repeat proteins in which unfolding propagates from one end, the HEAT array of PR65/A ruptures at multiple distant sites, leading to intermediate states with non-contiguous folded subdomains. Kinetic analysis allows us to define a network of intermediates and to delineate the pathways that connect them. There is a dominant sequence of unfolding, reflecting a non-uniform distribution of stability across the repeat array; however the unfolding of certain intermediates is competitive, leading to parallel pathways. Theoretical models accounting for the heterogeneous contact density in the folded structure are able to rationalize the variation in stability across the array. This variation in stability also suggests how folding may direct function in a large repeat protein: The stability distribution enables certain regions to present rigid motifs for molecular recognition while affording others flexibility to broaden the search area as in a fly-casting mechanism. Thus PR65/A uses the two ends of the repeat array to bind diverse partners and thereby coordinate the dephosphorylation of many different substrates and of multiple sites within hyperphosphorylated substrates. PMID:24120762
Exploring the repeat protein universe through computational protein design
Brunette, TJ; Parmeggiani, Fabio; Huang, Po-Ssu; ...
2015-12-16
A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structure unit are widespread in nature and have critical roles in molecular recognition, signalling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. In this paper, we use computational protein design to investigate the space of folded structures that can be generated by tandem repeating a simple helix–loop–helix–loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 are monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models with root mean square deviations ranging from 0.7 to 2.5 Å. Finally, our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering.
Prediction of protein secondary structure content for the twilight zone sequences.
Homaeian, Leila; Kurgan, Lukasz A; Ruan, Jishou; Cios, Krzysztof J; Chen, Ke
2007-11-15
Secondary protein structure carries information about local structural arrangements, which include three major conformations: alpha-helices, beta-strands, and coils. The significant majority of successful methods for secondary structure prediction are based on multiple sequence alignment. However, multiple alignment fails to provide accurate results when a sequence comes from the twilight zone, that is, it is characterized by low (<30%) homology. To this end, we propose a novel method for prediction of secondary structure content through comprehensive sequence representation, called PSSC-core. The method uses a multiple linear regression model and introduces a comprehensive feature-based sequence representation to predict the amount of helices and strands for sequences from the twilight zone. The PSSC-core method was tested and compared with two other state-of-the-art prediction methods on a set of 2187 twilight zone sequences. The results indicate that our method provides better predictions for both helix and strand content. The PSSC-core is shown to provide statistically significantly better results when compared with the competing methods, reducing the prediction error by 5-7% for helix and 7-9% for strand content predictions. The proposed feature-based sequence representation uses a comprehensive set of physicochemical properties that are custom-designed for each of the helix and strand content predictions. It includes composition and composition moment vectors, frequency of tetra-peptides associated with helical and strand conformations, various property-based groups like exchange groups, chemical groups of the side chains and hydrophobic group, auto-correlations based on hydrophobicity, side-chain masses, hydropathy, and conformational patterns for beta-sheets. The PSSC-core method provides an alternative for predicting the secondary structure content that can be used to validate and constrain results of other structure prediction methods. At the same time, it also provides useful insight into the design of successful protein sequence representations that can be used in developing new methods related to prediction of different aspects of the secondary protein structure. (c) 2007 Wiley-Liss, Inc.
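As a rough illustration of the simplest feature named in the abstract above, the composition vector, the following sketch computes the fraction of each of the 20 standard amino acids in a sequence. It is not the PSSC-core implementation: the function name `composition_vector` and the choice to skip non-standard residue codes are assumptions made for this example, and the other features (composition moments, tetra-peptide frequencies, auto-correlations) and the regression model are not shown.

```python
# Illustrative sketch only; not the PSSC-core code.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence):
    """Return {amino acid: fraction} over the 20 standard residues.

    Characters outside the standard alphabet (for example 'X') are
    ignored, which is an assumption of this sketch.
    """
    residues = [aa for aa in sequence.upper() if aa in STANDARD_AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return {aa: residues.count(aa) / total for aa in STANDARD_AMINO_ACIDS}


if __name__ == "__main__":
    features = composition_vector("MKTAYIAKQR")
    # Print only the residues that actually occur in the sequence.
    print({aa: round(f, 2) for aa, f in features.items() if f > 0})
```

A vector like this could serve as one block of inputs to a linear regression on helix or strand content, alongside the richer features the abstract lists.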
Structural Protein 4.1 in the Nucleus of Human Cells: Dynamic Rearrangements during Cell Division
Krauss, Sharon Wald; Larabell, Carolyn A.; Lockett, Stephen; Gascard, Philippe; Penman, Sheldon; Mohandas, Narla; Chasis, Joel Anne
1997-01-01
Structural protein 4.1, first identified as a crucial 80-kD protein in the mature red cell membrane skeleton, is now known to be a diverse family of protein isoforms generated by complex alternative mRNA splicing, variable usage of translation initiation sites, and posttranslational modification. Protein 4.1 epitopes are detected at multiple intracellular sites in nucleated mammalian cells. We report here investigations of protein 4.1 in the nucleus. Reconstructions of optical sections of human diploid fibroblast nuclei using antibodies specific for 80-kD red cell 4.1 and for 4.1 peptides showed 4.1 immunofluorescent signals were intranuclear and distributed throughout the volume of the nucleus. After sequential extractions of cells in situ, 4.1 epitopes were detected in nuclear matrix both by immunofluorescence light microscopy and resinless section immunoelectron microscopy. Western blot analysis of fibroblast nuclear matrix protein fractions, isolated under identical extraction conditions as those for microscopy, revealed several polypeptide bands reactive to multiple 4.1 antibodies against different domains. Epitope-tagged protein 4.1 was detected in fibroblast nuclei after transient transfections using a construct encoding red cell 80-kD 4.1 fused to an epitope tag. Endogenous protein 4.1 epitopes were detected throughout the cell cycle but underwent dynamic spatial rearrangements during cell division. Protein 4.1 was observed in nucleoplasm and centrosomes at interphase, in the mitotic spindle during mitosis, in perichromatin during telophase, as well as in the midbody during cytokinesis. These results suggest that multiple protein 4.1 isoforms may contribute significantly to nuclear architecture and ultimately to nuclear function. PMID:9128242
De novo identification of highly diverged protein repeats by probabilistic consistency.
Biegert, A; Söding, J
2008-03-15
An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID
The SARS coronavirus nucleocapsid protein--forms and functions.
Chang, Chung-ke; Hou, Ming-Hon; Chang, Chi-Fon; Hsiao, Chwan-Deng; Huang, Tai-huang
2014-03-01
The nucleocapsid phosphoprotein of the severe acute respiratory syndrome coronavirus (SARS-CoV N protein) packages the viral genome into a helical ribonucleocapsid (RNP) and plays a fundamental role during viral self-assembly. It is a protein with multifarious activities. In this article we will review our current understanding of the N protein structure and its interaction with nucleic acid. Highlights of the progress include uncovering the modular organization, determining the structures of the structural domains, realizing the roles of protein disorder in protein-protein and protein-nucleic acid interactions, and visualizing the ribonucleoprotein (RNP) structure inside the virions. It was also demonstrated that the N protein binds to nucleic acid at multiple sites in a coupled-allostery manner. We propose a SARS-CoV RNP model that conforms to existing data and bears resemblance to the existing RNP structures of RNA viruses. The model highlights the critical role of modular organization and intrinsic disorder of the N protein in the formation and functions of the dynamic RNP capsid in RNA viruses. This paper forms part of a symposium in Antiviral Research on "From SARS to MERS: 10 years of research on highly pathogenic human coronaviruses." Copyright © 2014 Elsevier B.V. All rights reserved.
An ambiguity principle for assigning protein structural domains
Postic, Guillaume; Ghouzam, Yassine; Chebrek, Romain; Gelly, Jean-Christophe
2017-01-01
Ambiguity is the quality of being open to several interpretations. For an image, it arises when the contained elements can be delimited in two or more distinct ways, which may cause confusion. We postulate that it also applies to the analysis of protein three-dimensional structure, which consists in dividing the molecule into subunits called domains. Because different definitions of what constitutes a domain can be used to partition a given structure, the same protein may have different but equally valid domain annotations. However, knowledge and experience generally displace our ability to accept more than one way to decompose the structure of an object—in this case, a protein. This human bias in structure analysis is particularly harmful because it leads to ignoring potential avenues of research. We present an automated method capable of producing multiple alternative decompositions of protein structure (web server and source code available at www.dsimb.inserm.fr/sword/). Our innovative algorithm assigns structural domains through the hierarchical merging of protein units, which are evolutionarily preserved substructures that describe protein architecture at an intermediate level, between domain and secondary structure. To validate the use of these protein units for decomposing protein structures into domains, we set up an extensive benchmark made of expert annotations of structural domains and including state-of-the-art domain parsing algorithms. The relevance of our “multipartitioning” approach is shown through numerous examples of applications covering protein function, evolution, folding, and structure prediction. Finally, we introduce a measure for the structural ambiguity of protein molecules. PMID:28097215
Cleland, Timothy P.; Schroeter, Elena R.; Zamdborg, Leonid; Zheng, Wenxia; Lee, Ji Eun; Tran, John C.; Bern, Marshall; Duncan, Michael B.; Lebleu, Valerie S.; Ahlf, Dorothy R.; Thomas, Paul M.; Kalluri, Raghu; Kelleher, Neil L.; Schweitzer, Mary H.
2016-01-01
Structures similar to blood vessels in location, morphology, flexibility, and transparency have been recovered after demineralization of multiple dinosaur cortical bone fragments from multiple specimens, some of which are as old as 80 Ma. These structures were hypothesized to be either endogenous to the bone (i.e., of vascular origin) or the result of biofilm colonizing the empty osteonal network after degradation of original organic components. Here, we test the hypothesis that these structures are endogenous and thus retain proteins in common with extant archosaur blood vessels that can be detected with high-resolution mass spectrometry and confirmed by immunofluorescence. Two lines of evidence support this hypothesis. First, peptide sequencing of Brachylophosaurus canadensis blood vessel extracts is consistent with peptides comprising extant archosaurian blood vessels and is not consistent with a bacterial, cellular slime mold, or fungal origin. Second, proteins identified by mass spectrometry can be localized to the tissues using antibodies specific to these proteins, validating their identity. Data are available via ProteomeXchange with identifier PXD001738. PMID:26595531
PredictProtein—an open resource for online prediction of protein structural and functional features
Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard
2014-01-01
PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431
Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren
2016-11-01
Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. 
Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available. Copyright © 2016 Du et al.
Protein Multifunctionality: Principles and Mechanisms
Zaretsky, Joseph Z.; Wreschner, Daniel H.
2008-01-01
In this review, the nature of protein multifunctionality is analyzed. In the first part of the review, the principles of the structural/functional organization of proteins are discussed. In the second part, the main mechanisms involved in the development of multiple functions in a single gene product are analyzed. The last part presents a number of examples showing that multifunctionality is a basic feature of biologically active proteins. PMID:21566747
Chae, Pil Seok; Rasmussen, Søren G F; Rana, Rohini R; Gotfryd, Kamil; Chandra, Richa; Goren, Michael A; Kruse, Andrew C; Nurva, Shailika; Loland, Claus J; Pierre, Yves; Drew, David; Popot, Jean-Luc; Picot, Daniel; Fox, Brian G; Guan, Lan; Gether, Ulrik; Byrne, Bernadette; Kobilka, Brian; Gellman, Samuel H
2010-12-01
The understanding of integral membrane protein (IMP) structure and function is hampered by the difficulty of handling these proteins. Aqueous solubilization, necessary for many types of biophysical analysis, generally requires a detergent to shield the large lipophilic surfaces of native IMPs. Many proteins remain difficult to study owing to a lack of suitable detergents. We introduce a class of amphiphiles, each built around a central quaternary carbon atom derived from neopentyl glycol, with hydrophilic groups derived from maltose. Representatives of this maltose-neopentyl glycol (MNG) amphiphile family show favorable behavior relative to conventional detergents, as manifested in multiple membrane protein systems, leading to enhanced structural stability and successful crystallization. MNG amphiphiles are promising tools for membrane protein science because of the ease with which they may be prepared and the facility with which their structures may be varied.
Surflex-Dock: Docking benchmarks and real-world application
NASA Astrophysics Data System (ADS)
Spitzer, Russell; Jain, Ajay N.
2012-06-01
Benchmarks for molecular docking have historically focused on re-docking the cognate ligand of a well-determined protein-ligand complex to measure geometric pose prediction accuracy, and measurement of virtual screening performance has been focused on increasingly large and diverse sets of target protein structures, cognate ligands, and various types of decoy sets. Here, pose prediction is reported on the Astex Diverse set of 85 protein ligand complexes, and virtual screening performance is reported on the DUD set of 40 protein targets. In both cases, prepared structures of targets and ligands were provided by symposium organizers. The re-prepared data sets yielded results not significantly different from previous reports of Surflex-Dock on the two benchmarks. Minor changes to protein coordinates resulting from complex pre-optimization had large effects on observed performance, highlighting the limitations of cognate ligand re-docking for pose prediction assessment. Docking protocols developed for cross-docking, which address protein flexibility and produce discrete families of predicted poses, produced substantially better performance for pose prediction. Virtual screening performance was shown to benefit from employing and combining multiple screening methods: docking, 2D molecular similarity, and 3D molecular similarity. In addition, use of multiple protein conformations significantly improved screening enrichment.
SiteBinder: an improved approach for comparing multiple protein structural motifs.
Sehnal, David; Vařeková, Radka Svobodová; Huber, Heinrich J; Geidl, Stanislav; Ionescu, Crina-Maria; Wimmerová, Michaela; Koča, Jaroslav
2012-02-27
There is a paramount need to develop new techniques and tools that will extract as much information as possible from the ever growing repository of protein 3D structures. We report here on the development of a software tool for the multiple superimposition of large sets of protein structural motifs. Our superimposition methodology performs a systematic search for the atom pairing that provides the best fit. During this search, the RMSD values for all chemically relevant pairings are calculated by quaternion algebra. The number of evaluated pairings is markedly decreased by using PDB annotations for atoms. This approach guarantees that the best fit will be found and can be applied even when sequence similarity is low or does not exist at all. We have implemented this methodology in the Web application SiteBinder, which is able to process up to thousands of protein structural motifs in a very short time, and which provides an intuitive and user-friendly interface. Our benchmarking analysis has shown the robustness, efficiency, and versatility of our methodology and its implementation by the successful superimposition of 1000 experimentally determined structures for each of 32 eukaryotic linear motifs. We also demonstrate the applicability of SiteBinder using three case studies. We first compared the structures of 61 PA-IIL sugar binding sites containing nine different sugars, and we found that the sugar binding sites of PA-IIL and its mutants have a conserved structure despite their binding different sugars. We then superimposed over 300 zinc finger central motifs and revealed that the molecular structure in the vicinity of the Zn atom is highly conserved. Finally, we superimposed 12 BH3 domains from pro-apoptotic proteins. Our findings come to support the hypothesis that there is a structural basis for the functional segregation of BH3-only proteins into activators and enablers.
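The abstract above states that RMSD values for candidate atom pairings are calculated by quaternion algebra. A minimal sketch of one well-known way to do this for a single, already fixed pairing is given below: Horn's closed-form quaternion method, in which the minimum RMSD follows from the largest eigenvalue of a 4x4 symmetric matrix. Whether SiteBinder uses exactly this formulation is an assumption; the function names, the power-iteration eigenvalue solver, and the iteration count are choices made for this example, and the search over atom pairings is not shown.

```python
import math


def _centered(points):
    """Translate a list of (x, y, z) points so their centroid is the origin."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [[p[k] - c[k] for k in range(3)] for p in points]


def superposition_rmsd(a, b, iterations=500):
    """Minimum RMSD between paired point sets a and b after optimal
    translation and rotation (Horn's quaternion method).

    Sketch assumptions: the pairing a[i] <-> b[i] is given, and the
    largest eigenvalue is found by shifted power iteration, which can
    converge slowly when the two largest eigenvalues are close.
    """
    if len(a) != len(b) or not a:
        raise ValueError("point sets must be non-empty and of equal size")
    a, b = _centered(a), _centered(b)
    n = len(a)
    ga = sum(x * x for p in a for x in p)  # inner product of set a
    gb = sum(x * x for p in b for x in p)  # inner product of set b
    # 3x3 correlation matrix S[i][j] = sum over points of a_i * b_j.
    s = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Horn's symmetric 4x4 matrix; its largest eigenvalue gives the best fit.
    nmat = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    shift = 0.5 * (ga + gb)  # bounds |eigenvalue|, so nmat + shift*I is PSD
    if shift == 0.0:
        return 0.0  # both sets collapse to a single point
    v = [1.0, 0.5, 0.25, 0.125]  # asymmetric start vector
    for _ in range(iterations):
        w = [sum(nmat[i][j] * v[j] for j in range(4)) + shift * v[i]
             for i in range(4)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unshifted matrix approximates its top eigenvalue.
    lam = sum(v[i] * nmat[i][j] * v[j] for i in range(4) for j in range(4))
    return math.sqrt(max(0.0, ga + gb - 2.0 * lam) / n)


if __name__ == "__main__":
    motif = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
    # Same motif rotated 90 degrees about z and translated.
    moved = [(-y + 5, x - 3, z + 2) for x, y, z in motif]
    print(superposition_rmsd(motif, moved))
```

Because only the eigenvalue is needed for the RMSD, the rotation itself never has to be built, which keeps the cost per candidate pairing low when many pairings are scored.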
Wachnowsky, Christine; Wesley, Nathaniel A; Fidai, Insiya; Cowan, J A
2017-03-24
Iron-sulfur (Fe/S)-cluster-containing proteins constitute one of the largest protein classes, with varied functions that include electron transport, regulation of gene expression, substrate binding and activation, and radical generation. Consequently, the biosynthetic machinery for Fe/S clusters is evolutionarily conserved, and mutations in a variety of putative intermediate Fe/S cluster scaffold proteins can cause disease states, including multiple mitochondrial dysfunctions syndrome (MMDS), sideroblastic anemia, and mitochondrial encephalomyopathy. Herein, we have characterized the impact of defects occurring in the MMDS1 disease state that result from a point mutation (Gly208Cys) near the active site of NFU1, an Fe/S scaffold protein, via an in vitro investigation into the structural and functional consequences. Analysis of protein stability and oligomeric state demonstrates that the mutant increases the propensity to dimerize and perturbs the secondary structure composition. These changes appear to underlie the severely decreased ability of mutant NFU1 to accept an Fe/S cluster from physiologically relevant sources. Therefore, the point mutation on NFU1 impairs downstream cluster trafficking and results in the disease phenotype, because there does not appear to be an alternative in vivo reconstitution path, most likely due to greater protein oligomerization from a minor structural change. Copyright © 2017 Elsevier Ltd. All rights reserved.
Dithiol amino acids can structurally shape and enhance the ligand-binding properties of polypeptides
NASA Astrophysics Data System (ADS)
Chen, Shiyu; Gopalakrishnan, Ranganath; Schaer, Tifany; Marger, Fabrice; Hovius, Ruud; Bertrand, Daniel; Pojer, Florence; Heinis, Christian
2014-11-01
The disulfide bonds that form between two cysteine residues are important in defining and rigidifying the structures of proteins and peptides. In polypeptides containing multiple cysteine residues, disulfide isomerization can lead to multiple products with different biological activities. Here, we describe the development of a dithiol amino acid (Dtaa) that can form two disulfide bridges at a single amino acid site. Application of Dtaas to a serine protease inhibitor and a nicotinic acetylcholine receptor inhibitor that contain disulfide constraints enhanced their inhibitory activities 40- and 7.6-fold, respectively. X-ray crystallographic and NMR structure analysis show that the peptide ligands containing Dtaas have retained their native tertiary structures. We furthermore show that replacement of two cysteines by Dtaas can avoid the formation of disulfide bond isomers. With these properties, Dtaas are likely to have broad application in the rational design or directed evolution of peptides and proteins with high activity and stability.
Mixture models for protein structure ensembles.
Hirsch, Michael; Habeck, Michael
2008-10-01
Protein structure ensembles provide important insight into the dynamics and function of a protein and contain information that is not captured with a single static structure. However, it is not clear a priori to what extent the variability within an ensemble is caused by internal structural changes. Additional variability results from overall translations and rotations of the molecule. And most experimental data do not provide information to relate the structures to a common reference frame. To report meaningful values of intrinsic dynamics, structural precision, conformational entropy, etc., it is therefore important to disentangle local from global conformational heterogeneity. We consider the task of disentangling local from global heterogeneity as an inference problem. We use probabilistic methods to infer from the protein ensemble missing information on reference frames and stable conformational sub-states. To this end, we model a protein ensemble as a mixture of Gaussian probability distributions of either entire conformations or structural segments. We learn these models from a protein ensemble using the expectation-maximization algorithm. Our first model can be used to find multiple conformers in a structure ensemble. The second model partitions the protein chain into locally stable structural segments or core elements and less structured regions typically found in loops. Both models are simple to implement and contain only a single free parameter: the number of conformers or structural segments. Our models can be used to analyse experimental ensembles, molecular dynamics trajectories and conformational change in proteins. The Python source code for protein ensemble analysis is available from the authors upon request.
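The abstract above describes learning a mixture of Gaussians from an ensemble with the expectation-maximization algorithm, with the number of components as the only free parameter. The sketch below shows that general technique on a deliberately simplified case: a one-dimensional coordinate (for instance a single inter-atomic distance sampled across ensemble members) rather than whole conformations or structural segments. It is not the authors' code; the function name, the evenly spaced initialization, and the variance floor are assumptions made for this example.

```python
import math


def fit_gaussian_mixture(values, k, iterations=200):
    """Fit a k-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances). Simplified sketch: the
    initialization is deterministic and variances are floored to
    avoid collapse onto a single data point.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    means = [lo + span * (j + 0.5) / k for j in range(k)]
    variances = [(span / k) ** 2] * k
    weights = [1.0 / k] * k
    n = len(values)
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [w * math.exp(-(x - m) ** 2 / (2.0 * v))
                    / math.sqrt(2.0 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component is empty; leave it unchanged
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, values)) / nj
            variances[j] = max(var, 1e-6)
            weights[j] = nj / n
    return weights, means, variances


if __name__ == "__main__":
    # Toy "ensemble": one distance that populates two sub-states.
    distances = [0.9, 1.0, 1.1, 1.0, 4.9, 5.0, 5.1, 5.0]
    weights, means, variances = fit_gaussian_mixture(distances, k=2)
    print(sorted(round(m, 2) for m in means))
```

In the setting the abstract describes, each component would correspond to a conformer or a structural segment, and the responsibilities from the E-step indicate which ensemble members belong to which sub-state.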
General mechanism of two-state protein folding kinetics.
Rollins, Geoffrey C; Dill, Ken A
2014-08-13
We describe here a general model of the kinetic mechanism of protein folding. In the Foldon Funnel Model, proteins fold in units of secondary structures, which form sequentially along the folding pathway, stabilized by tertiary interactions. The model predicts that the free energy landscape has a volcano shape, rather than a simple funnel, that folding is two-state (single-exponential) when secondary structures are intrinsically unstable, and that each structure along the folding path is a transition state for the previous structure. It shows how sequential pathways are consistent with multiple stochastic routes on funnel landscapes, and it gives good agreement with the 9 orders of magnitude dependence of folding rates on protein size for a set of 93 proteins. At the same time, it is consistent with the near independence of the folding equilibrium constant from size. This model gives estimates of folding rates of proteomes, leading to a median folding time in Escherichia coli of about 5 s.
Uchikoga, Nobuyuki; Hirokawa, Takatsugu
2010-05-11
Protein-protein docking for proteins with large conformational changes was analyzed by using interaction fingerprints, one of the scales for measuring similarities among complex structures, utilized especially for searching near-native protein-ligand or protein-protein complex structures. Here, we have proposed a combined method for analyzing protein-protein docking by taking large conformational changes into consideration. This combined method consists of ensemble soft docking with multiple protein structures, refinement of complexes, and cluster analysis using interaction fingerprints and energy profiles. To test the applicability of this combined method, various CaM-ligand complexes were reconstructed from the NMR structures of unbound CaM. For the purpose of reconstruction, we used three known CaM-ligands, namely, the CaM-binding peptides of cyclic nucleotide gateway (CNG), CaM kinase kinase (CaMKK) and the plasma membrane Ca2+ ATPase pump (PMCA), and thirty-one structurally diverse CaM conformations. For each ligand, 62000 CaM-ligand complexes were generated in the docking step, and the relationship between their energy profiles and structural similarities to the native complex was analyzed using interaction fingerprints and RMSD. Near-native clusters were obtained in the case of CNG and CaMKK. The interaction fingerprint method discriminated near-native structures better than the RMSD method in cluster analysis. We showed that a combined method that includes the interaction fingerprint is very useful for protein-protein docking analysis in certain cases.
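As a rough illustration of how docked poses might be compared by interaction fingerprint, the sketch below treats a fingerprint as a set of residue-residue contacts and scores similarity with the Tanimoto coefficient. The abstract does not state the encoding or the similarity measure, so both are assumptions here, and the contact labels are invented for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of contacts."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)

# Hypothetical fingerprints: each element is a (receptor residue, ligand residue) contact
native = {("E11", "W3"), ("M124", "F17"), ("E84", "K5"), ("L105", "L8")}
decoy = {("E11", "W3"), ("M124", "F17"), ("A15", "K5")}

similarity = tanimoto(native, decoy)  # 2 shared contacts out of 5 distinct ones
```

Unlike RMSD, a score of this kind depends only on which contacts are present, which may be one reason fingerprints cope better with large conformational changes.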
Buried and accessible surface area control intrinsic protein flexibility.
Marsh, Joseph A
2013-09-09
Proteins experience a wide variety of conformational dynamics that can be crucial for facilitating their diverse functions. How is the intrinsic flexibility required for these motions encoded in their three-dimensional structures? Here, the overall flexibility of a protein is demonstrated to be tightly coupled to the total amount of surface area buried within its fold. A simple proxy for this, the relative solvent-accessible surface area (Arel), therefore shows excellent agreement with independent measures of global protein flexibility derived from various experimental and computational methods. Application of Arel on a large scale demonstrates its utility by revealing unique sequence and structural properties associated with intrinsic flexibility. In particular, flexibility as measured by Arel shows little correspondence with intrinsic disorder, but instead tends to be associated with multiple domains and increased α-helical structure. Furthermore, the apparent flexibility of monomeric proteins is found to be useful for identifying quaternary-structure errors in published crystal structures. There is also a strong tendency for the crystal structures of more flexible proteins to be solved to lower resolutions. Finally, local solvent accessibility is shown to be a primary determinant of local residue flexibility. Overall, this work provides both fundamental mechanistic insight into the origin of protein flexibility and a simple, practical method for predicting flexibility from protein structures. © 2013 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Li, Yizhou; De Luca, Roberto; Cazzamalli, Samuele; Pretto, Francesca; Bajic, Davor; Scheuermann, Jörg; Neri, Dario
2018-03-01
In nature, specific antibodies can be generated as a result of an adaptive selection and expansion of lymphocytes with suitable protein binding properties. We attempted to mimic antibody-antigen recognition by displaying multiple chemical diversity elements on a defined macrocyclic scaffold. Encoding of the displayed combinations was achieved using distinctive DNA tags, resulting in a library size of 35,393,112. Specific binders could be isolated against a variety of proteins, including carbonic anhydrase IX, horseradish peroxidase, tankyrase 1, human serum albumin, alpha-1 acid glycoprotein, calmodulin, prostate-specific antigen and tumour necrosis factor. Similar to antibodies, the encoded display of multiple chemical elements on a constant scaffold enabled practical applications, such as fluorescence microscopy procedures or the selective in vivo delivery of payloads to tumours. Furthermore, the versatile structure of the scaffold facilitated the generation of protein-specific chemical probes, as illustrated by photo-crosslinking.
Laine, Elodie; Carbone, Alessandra
2015-01-01
Protein-protein interactions (PPIs) are essential to all biological processes and they represent increasingly important therapeutic targets. Here, we present a new method for accurately predicting protein-protein interfaces, understanding their properties, origins and binding to multiple partners. Contrary to machine learning approaches, our method combines in a rational and very straightforward way three sequence- and structure-based descriptors of protein residues: evolutionary conservation, physico-chemical properties and local geometry. The implemented strategy yields very precise predictions for a wide range of protein-protein interfaces and discriminates them from small-molecule binding sites. Beyond its predictive power, the approach makes it possible to dissect interaction surfaces and unravel their complexity. We show how the analysis of the predicted patches can foster new strategies for PPI modulation and interaction surface redesign. The approach is implemented in JET2, an automated tool based on the Joint Evolutionary Trees (JET) method for sequence-based protein interface prediction. JET2 is freely available at www.lcqb.upmc.fr/JET2. PMID:26690684
Yu, Clinton; Huszagh, Alexander; Viner, Rosa; Novitsky, Eric J; Rychnovsky, Scott D; Huang, Lan
2016-10-18
Cross-linking mass spectrometry (XL-MS) represents a recently popularized hybrid methodology for defining protein-protein interactions (PPIs) and analyzing structures of large protein assemblies. In particular, XL-MS strategies have been demonstrated to be effective in elucidating molecular details of PPIs at the peptide resolution, providing a complementary set of structural data that can be utilized to refine existing complex structures or direct de novo modeling of unknown protein structures. To study structural and interaction dynamics of protein complexes, quantitative cross-linking mass spectrometry (QXL-MS) strategies based on isotope-labeled cross-linkers have been developed. Although successful, these approaches are mostly limited to pairwise comparisons. In order to establish a robust workflow enabling comparative analysis of multiple cross-linked samples simultaneously, we have developed a multiplexed QXL-MS strategy, namely, QMIX (Quantitation of Multiplexed, Isobaric-labeled cross (X)-linked peptides) by integrating MS-cleavable cross-linkers with isobaric labeling reagents. This study has established a new analytical platform for quantitative analysis of cross-linked peptides, which can be directly applied for multiplexed comparisons of the conformational dynamics of protein complexes and PPIs at the proteome scale in future studies.
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
2015-11-19
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualizing, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
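The position weight matrices mentioned in the abstract summarize binding preferences per base-pair position. The sketch below shows one common way such a matrix can be built and used; the pseudocount, the uniform background, the function names and the binding sites are all illustrative assumptions and are not taken from the database.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length DNA binding sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [s[i] for s in sites]
        # log2 of (smoothed base frequency / background frequency)
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def score(pwm, seq):
    """Sum of per-position log-odds for a candidate site of matching length."""
    return sum(col[base] for col, base in zip(pwm, seq))

# Hypothetical aligned binding sites for a single protein
sites = ["TGTCA", "TGTCA", "TGACA", "TGTCC"]
pwm = build_pwm(sites)
```

A candidate matching the consensus, such as `score(pwm, "TGTCA")`, scores higher than an unrelated sequence such as `score(pwm, "AAAAA")`. Because each column corresponds to one base pair, scores of this form can in principle be mapped onto base-pair positions in a complex structure.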
Characterization of the infrared spectra of serum from patients with multiple myeloma
DOE Office of Scientific and Technical Information (OSTI.GOV)
Plotnikova, L., E-mail: ljusja@mail.ru; Nosenko, T.; Uspenskaya, M., E-mail: mv-uspenskaya@mail.ru
Multiple myeloma (MM) accounts for about 1% of all types of cancers. MM is characterized by the proliferation of a single clone of plasma cells, which may produce and secrete a homogeneous monoclonal immunoglobulin. The monoclonal immunoglobulin is commonly referred to as an M protein. The M protein acts as a serological “tumor” marker that is useful for diagnosis and disease monitoring. The electrophoretic pattern reveals the M-protein in 80% of MM patients as a single peak or localized band. In our study we applied a combination of high-resolution agarose gel protein electrophoresis (PEL), spectroscopic techniques and thermal analysis to identify the key differences in protein composition, protein structure and their thermal behavior for the samples obtained from the serum of MM patients and healthy donors.
AbDb: antibody structure database—a database of PDB-derived antibody structures
Ferdous, Saba
2018-01-01
In order to analyse structures of proteins of a particular class, these need to be extracted from Protein Data Bank (PDB) files. In the case of antibodies, there are a number of special considerations: (i) identifying antibodies in the PDB is not trivial, (ii) they may be crystallized with or without antigen, (iii) for analysis purposes, one is normally only interested in the Fv region of the antibody, (iv) structural analysis of epitopes, in particular, requires individual antibody–antigen complexes from a PDB file which may contain multiple copies of the same, or different, antibodies and (v) standard numbering schemes should be applied. Consequently, there is a need for a specialist resource containing pre-numbered non-redundant antibody Fv structures with their cognate antigens. We have created an automatically updated resource, AbDb, which collects the Fv regions from antibody structures using information from our SACS database which summarizes antibody structures from the PDB. PDB files containing multiple structures are split and numbered and each antibody structure is associated with its antigen where available. Antibody structures with only light or heavy chains have also been processed and sequences of antibodies are compared to identify multiple structures of the same antibody. The data may be queried on the basis of PDB code, or the name or species of the antibody or antigen, and the complete datasets may be downloaded. Database URL: www.bioinf.org.uk/abs/abdb/ PMID:29718130
Chae, Pil Seok; Rasmussen, Søren G. F.; Rana, Rohini; Gotfryd, Kamil; Chandra, Richa; Goren, Michael A.; Kruse, Andrew C.; Nurva, Shailika; Loland, Claus J.; Pierre, Yves; Drew, David; Popot, Jean-Luc; Picot, Daniel; Fox, Brian G.; Guan, Lan; Gether, Ulrik; Byrne, Bernadette; Kobilka, Brian; Gellman, Samuel H.
2011-01-01
The understanding of integral membrane protein (IMP) structure and function is hampered by the difficulty of handling these proteins. Aqueous solubilization, necessary for many types of biophysical analysis, generally requires a detergent to shield the large lipophilic surfaces displayed by native IMPs. Many proteins remain difficult to study owing to a lack of suitable detergents. We introduce a class of amphiphiles, each of which is built around a central quaternary carbon atom derived from neopentyl glycol, with hydrophilic groups derived from maltose. Representatives of this maltose-neopentyl glycol (MNG) amphiphile family display favorable behavior relative to conventional detergents, as tested on multiple membrane protein systems, leading to enhanced structural stability and successful crystallization. MNG amphiphiles are promising tools for membrane protein science because of the ease with which they may be prepared and the facility with which their structures may be varied. PMID:21037590
Sequence harmony: detecting functional specificity from alignments
Feenstra, K. Anton; Pirovano, Walter; Krab, Klaas; Heringa, Jaap
2007-01-01
Multiple sequence alignments are often used for the identification of key specificity-determining residues within protein families. We present a web server implementation of the Sequence Harmony (SH) method previously introduced. SH accurately detects subfamily specific positions from a multiple alignment by scoring compositional differences between subfamilies, without imposing conservation. The SH web server allows a quick selection of subtype specific sites from a multiple alignment given a subfamily grouping. In addition, it allows the predicted sites to be directly mapped onto a protein structure and displayed. We demonstrate the use of the SH server using the family of plant mitochondrial alternative oxidases (AOX). In addition, we illustrate the usefulness of combining sequence and structural information by showing that the predicted sites are clustered into a few distinct regions in an AOX homology model. The SH web server can be accessed at www.ibi.vu.nl/programs/seqharmwww. PMID:17584793
Structural analysis of a set of proteins resulting from a bacterial genomics project.
Badger, J; Sauder, J M; Adams, J M; Antonysamy, S; Bain, K; Bergseid, M G; Buchanan, S G; Buchanan, M D; Batiyenko, Y; Christopher, J A; Emtage, S; Eroshkina, A; Feil, I; Furlong, E B; Gajiwala, K S; Gao, X; He, D; Hendle, J; Huber, A; Hoda, K; Kearins, P; Kissinger, C; Laubert, B; Lewis, H A; Lin, J; Loomis, K; Lorimer, D; Louie, G; Maletic, M; Marsh, C D; Miller, I; Molinari, J; Muller-Dieckmann, H J; Newman, J M; Noland, B W; Pagarigan, B; Park, F; Peat, T S; Post, K W; Radojicic, S; Ramos, A; Romero, R; Rutter, M E; Sanderson, W E; Schwinn, K D; Tresser, J; Winhoven, J; Wright, T A; Wu, L; Xu, J; Harris, T J R
2005-09-01
The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB. Copyright 2005 Wiley-Liss, Inc.
Ellis, Heidi J C; Nowling, Ronald J; Vyas, Jay; Martyn, Timothy O; Gryk, Michael R
2011-04-11
The CONNecticut Joint University Research (CONNJUR) team is a group of biochemical and software engineering researchers at multiple institutions. The vision of the team is to develop a comprehensive application that integrates a variety of existing analysis tools with workflow and data management to support the process of protein structure determination using Nuclear Magnetic Resonance (NMR). The use of multiple disparate tools and lack of data management, currently the norm in NMR data processing, provides strong motivation for such an integrated environment. This manuscript briefly describes the domain of NMR as used for protein structure determination and explains the formation of the CONNJUR team and its operation in developing the CONNJUR application. The manuscript also describes the evolution of the CONNJUR application through four prototypes and describes the challenges faced while developing the CONNJUR application and how those challenges were met.
Structural studies of G protein-coupled receptors.
Lu, Mengjie; Wu, Beili
2016-11-01
G protein-coupled receptors (GPCRs) comprise the largest membrane protein family. These receptors sense a variety of signaling molecules, activate multiple intracellular signal pathways, and act as the targets of over 40% of marketed drugs. Recent progress on GPCR structural studies provides invaluable insights into the structure-function relationship of the GPCR superfamily, deepening our understanding about the molecular mechanisms of GPCR signal transduction. Here, we review recent breakthroughs on GPCR structure determination and the structural features of GPCRs, and take the structures of chemokine receptor CCR5 and purinergic receptors P2Y1R and P2Y12R as examples to discuss the importance of GPCR structures on functional studies and drug discovery. In addition, we discuss the prospect of GPCR structure-based drug discovery. © 2016 IUBMB Life, 68(11):894-903, 2016. © 2016 International Union of Biochemistry and Molecular Biology.
Comparative analysis of diguanylate cyclase and phosphodiesterase genes in Klebsiella pneumoniae.
Cruz, Diana P; Huertas, Mónica G; Lozano, Marcela; Zárate, Lina; Zambrano, María Mercedes
2012-07-09
Klebsiella pneumoniae can be found in environmental habitats as well as in hospital settings where it is commonly associated with nosocomial infections. One of the factors that contribute to virulence is its capacity to form biofilms on diverse biotic and abiotic surfaces. The second messenger Bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) is a ubiquitous signal in bacteria that controls biofilm formation as well as several other cellular processes. The cellular levels of this messenger are controlled by c-di-GMP synthesis and degradation catalyzed by diguanylate cyclase (DGC) and phosphodiesterase (PDE) enzymes, respectively. Many bacteria contain multiple copies of these proteins with diverse organizational structures that highlight the complex regulatory mechanisms of this signaling network. This work was undertaken to identify DGCs and PDEs and analyze the domain structure of these proteins in K. pneumoniae. A search for conserved GGDEF and EAL domains in three sequenced K. pneumoniae genomes showed that there were multiple copies of GGDEF and EAL containing proteins. Both single domain and hybrid GGDEF proteins were identified: 21 in K. pneumoniae Kp342, 18 in K. pneumoniae MGH 78578 and 17 in K. pneumoniae NTUH-K2044. The majority had only the GGDEF domain, most with the GGEEF motif, and hybrid proteins containing both GGDEF and EAL domains were also found. The I site for allosteric control was identified only in single GGDEF domain proteins and not in hybrid proteins. EAL-only proteins, containing either intact or degenerate domains, were also identified: 15 in Kp342, 15 in MGH 78578 and 10 in NTUH-K2044. Several input sensory domains and transmembrane segments were identified, which together indicate complex regulatory circuits that in many cases can be membrane associated. The comparative analysis of proteins containing GGDEF/EAL domains in K. pneumoniae showed that most copies were shared among the three strains and that some were unique to a particular strain. 
The multiplicity of these proteins and the diversity of structural characteristics suggest that the c-di-GMP network in this enteric bacterium is highly complex and reflects the importance of having diverse mechanisms to control cellular processes in environments as diverse as soils or plants and clinical settings.
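To make the motif terminology concrete, the sketch below scans protein sequences for the GG(D/E)EF signature pentapeptide, which covers both the GGDEF and GGEEF variants named in the abstract. This is a deliberate simplification and an assumption: the abstract describes a search for conserved domains without giving the method, and genuine domain detection would normally rely on profile-based searches rather than a five-residue pattern. The sequences are invented and do not come from any K. pneumoniae genome.

```python
import re

# Assumed simplification: match only the signature pentapeptide GG(D/E)EF
GGDEF_MOTIF = re.compile(r"GG[DE]EF")

def find_motifs(protein_seq):
    """Return (0-based position, matched motif) for each signature hit."""
    return [(m.start(), m.group()) for m in GGDEF_MOTIF.finditer(protein_seq)]

# Made-up sequence fragments used purely for illustration
toy_proteins = {
    "toy_A": "MKLRYGGDEFAVLL",
    "toy_B": "MSTARFGGEEFLIV",
    "toy_C": "MALLKKEALPQRS",
}

hits = {name: find_motifs(seq) for name, seq in toy_proteins.items()}
```

Here `toy_A` and `toy_B` each yield one hit and `toy_C` yields none. Counting hits per genome in this way would only approximate the per-strain tallies reported above, since degenerate motifs and full domain context are ignored.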
Bhagavat, Raghu; Srinivasan, Narayanaswamy; Chandra, Nagasuma
2017-09-01
Nucleoside triphosphate (NTP) ligands are of high biological importance and are essential for all life forms. A pre-requisite for them to participate in diverse biochemical processes is their recognition by diverse proteins. It is thus of great interest to understand the basis for such recognition in different proteins. Towards this, we have used a structural bioinformatics approach and analyzed structures of 4677 NTP complexes available in the Protein Data Bank (PDB). Binding sites were extracted and compared exhaustively using PocketMatch, a sensitive in-house site comparison algorithm, which resulted in grouping the entire dataset into 27 site-types. Each of these site-types represents a structural motif comprised of two or more residue conservations, derived using another in-house tool for superposing binding sites, PocketAlign. The 27 site-types could be grouped further into 9 super-types by considering partial similarities in the sites, which indicated that the individual site-types comprise different combinations of one or more site features. A scan across the PDB using the 27 structural motifs determined the motifs to be specific to NTP binding sites, and a computational alanine mutagenesis indicated that residues identified to be highly conserved in the motifs also contribute most to binding. Alternate orientations of the ligand in several site-types were observed and rationalized, indicating the possibility of some residues serving as anchors for NTP recognition. The presence of multiple site-types and the grouping of multiple folds into each site-type is strongly suggestive of convergent evolution. Knowledge of determinants obtained from this study will be useful for detecting function in unknown proteins. Proteins 2017; 85:1699-1712. © 2017 Wiley Periodicals, Inc.
Structure of Lmaj006129AAA, a hypothetical protein from Leishmania major
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arakaki, Tracy; Le Trong, Isolde; Structural Genomics of Pathogenic Protozoa
2006-03-01
The crystal structure of a conserved hypothetical protein from L. major, Pfam sequence family PF04543, structural genomics target ID Lmaj006129AAA, has been determined at a resolution of 1.6 Å. The gene product of structural genomics target Lmaj006129 from Leishmania major codes for a 164-residue protein of unknown function. When SeMet expression of the full-length gene product failed, several truncation variants were created with the aid of Ginzu, a domain-prediction method. 11 truncations were selected for expression, purification and crystallization based upon secondary-structure elements and disorder. The structure of one of these variants, Lmaj006129AAH, was solved by multiple-wavelength anomalous diffraction (MAD) using ELVES, an automatic protein crystal structure-determination system. This model was then successfully used as a molecular-replacement probe for the parent full-length target, Lmaj006129AAA. The final structure of Lmaj006129AAA was refined to an R value of 0.185 (Rfree = 0.229) at 1.60 Å resolution. Structure and sequence comparisons based on Lmaj006129AAA suggest that proteins belonging to Pfam sequence families PF04543 and PF01878 may share a common ligand-binding motif.
Byssus Structure and Protein Composition in the Highly Invasive Fouling Mussel Limnoperna fortunei
Li, Shiguo; Xia, Zhiqiang; Chen, Yiyong; Gao, Yangchun; Zhan, Aibin
2018-01-01
Biofouling mediated by byssus adhesion in invasive bivalves has become a global environmental problem in aquatic ecosystems, resulting in negative ecological and economic consequences. Previous studies suggested that mechanisms responsible for byssus adhesion largely vary among bivalves, but they are poorly understood in freshwater species. Understanding of byssus structure and protein composition is the prerequisite for revealing these mechanisms. Here, we used multiple methods, including scanning electron microscopy, liquid chromatography–tandem mass spectrometry, transcriptome sequencing, real-time quantitative PCR, and inductively coupled plasma mass spectrometry, to investigate the structure and protein composition of byssus in the highly invasive freshwater mussel Limnoperna fortunei. The results indicated that the structural characteristics of adhesive plaque, proximal and distal threads were conducive to byssus adhesion, contributing to the high biofouling capacity of this species. 3,4-Dihydroxyphenyl-α-alanine (Dopa) is a major post-translational modification in L. fortunei byssus. We identified 16 representative foot proteins with typical repetitive motifs and conserved domains by integrating transcriptomic and proteomic approaches. Among these proteins, Lfbp-1, Lffp-2, and Lfbp-3 were specifically located in foot tissue and highly expressed in the rapid byssus formation period, suggesting the involvement of these foot proteins in byssus production and adhesion. Multiple metal ions, including Ca2+, Mg2+, Zn2+, Al3+, and Fe3+, were abundant in both foot tissue and byssal thread. The heavy metals among these ions may be directly accumulated by L. fortunei from surrounding environments. Nevertheless, some metal ions (e.g., Ca2+) corresponded well with amino acid preferences of L. fortunei foot proteins, suggesting functional roles of these metal ions by interacting with foot proteins in byssus adhesion. 
Overall, this study provides structural and molecular bases of adhesive mechanisms of byssus in L. fortunei, and the findings are expected to inform the development of strategies against biofouling by freshwater organisms. PMID:29713291
Bhasi, Ashwini; Philip, Philge; Manikandan, Vinu; Senapathy, Periannan
2009-01-01
We have developed ExDom, a unique database for the comparative analysis of the exon–intron structures of 96 680 protein domains from seven eukaryotic organisms (Homo sapiens, Mus musculus, Bos taurus, Rattus norvegicus, Danio rerio, Gallus gallus and Arabidopsis thaliana). ExDom provides integrated access to exon-domain data through a sophisticated web interface which has the following analytical capabilities: (i) intergenomic and intragenomic comparative analysis of exon–intron structure of domains; (ii) color-coded graphical display of the domain architecture of proteins correlated with their corresponding exon-intron structures; (iii) graphical analysis of multiple sequence alignments of amino acid and coding nucleotide sequences of homologous protein domains from seven organisms; (iv) comparative graphical display of exon distributions within the tertiary structures of protein domains; and (v) visualization of exon–intron structures of alternative transcripts of a gene correlated to variations in the domain architecture of corresponding protein isoforms. These novel analytical features are highly suited for detailed investigations on the exon–intron structure of domains and make ExDom a powerful tool for exploring several key questions concerning the function, origin and evolution of genes and proteins. ExDom database is freely accessible at: http://66.170.16.154/ExDom/. PMID:18984624
Designing and benchmarking the MULTICOM protein structure prediction system
2013-01-01
Background Predicting protein structure from sequence is one of the most significant and challenging problems in bioinformatics. Numerous bioinformatics techniques and tools have been developed to tackle almost every aspect of protein structure prediction ranging from structural feature prediction, template identification and query-template alignment to structure sampling, model quality assessment, and model refinement. How to synergistically select, integrate and improve the strengths of the complementary techniques at each prediction stage and build a high-performance system is becoming a critical issue for constructing a successful, competitive protein structure predictor. Results Over the past several years, we have constructed a standalone protein structure prediction system MULTICOM that combines multiple sources of information and complementary methods at all five stages of the protein structure prediction process including template identification, template combination, model generation, model assessment, and model refinement. The system was blindly tested during the ninth Critical Assessment of Techniques for Protein Structure Prediction (CASP9) in 2010 and yielded very good performance. In addition to studying the overall performance on the CASP9 benchmark, we thoroughly investigated the performance and contributions of each component at each stage of prediction. Conclusions Our comprehensive and comparative study not only provides useful and practical insights about how to select, improve, and integrate complementary methods to build a cutting-edge protein structure prediction system but also identifies a few new sources of information that may help improve the design of a protein structure prediction system. Several components used in the MULTICOM system are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/. PMID:23442819
NASA Astrophysics Data System (ADS)
Prakash, Priyanka; Sayyed-Ahmad, Abdallah; Cho, Kwang-Jin; Dolino, Drew M.; Chen, Wei; Li, Hongyang; Grant, Barry J.; Hancock, John F.; Gorfe, Alemayehu A.
2017-01-01
Recent studies found that membrane-bound K-Ras dimers are important for biological function. However, the structure and thermodynamic stability of these complexes remained unknown because they are hard to probe by conventional approaches. Combining data from a wide range of computational and experimental approaches, here we describe the structure, dynamics, energetics and mechanism of assembly of multiple K-Ras dimers. Utilizing a range of techniques for the detection of reactive surfaces, protein-protein docking and molecular simulations, we found that two largely polar and partially overlapping surfaces underlie the formation of multiple K-Ras dimers. For validation we used mutagenesis, electron microscopy and biochemical assays under non-denaturing conditions. We show that partial disruption of a predicted interface through charge reversal mutation of apposed residues reduces oligomerization while introduction of cysteines at these positions enhanced dimerization likely through the formation of an intermolecular disulfide bond. Free energy calculations indicated that K-Ras dimerization involves direct but weak protein-protein interactions in solution, consistent with the notion that dimerization is facilitated by membrane binding. Taken together, our atomically detailed analyses provide unique mechanistic insights into K-Ras dimer formation and membrane organization as well as the conformational fluctuations and equilibrium thermodynamics underlying these processes.
Garcia, J A; Harrich, D; Soultanakis, E; Wu, F; Mitsuyasu, R; Gaynor, R B
1989-01-01
The human immunodeficiency virus (HIV) type 1 LTR is regulated at the transcriptional level by both cellular and viral proteins. Using HeLa cell extracts, multiple regions of the HIV LTR were found to serve as binding sites for cellular proteins. An untranslated region binding protein UBP-1 has been purified and fractions containing this protein bind to both the TAR and TATA regions. To investigate the role of cellular proteins binding to both the TATA and TAR regions and their potential interaction with other HIV DNA binding proteins, oligonucleotide-directed mutagenesis of both these regions was performed followed by DNase I footprinting and transient expression assays. In the TATA region, two direct repeats TC/AAGC/AT/AGCTGC surround the TATA sequence. Mutagenesis of both of these direct repeats or of the TATA sequence interrupted binding over the TATA region on the coding strand, but only a mutation of the TATA sequence affected in vivo assays for tat-activation. In addition to TAR serving as the site of binding of cellular proteins, RNA transcribed from TAR is capable of forming a stable stem-loop structure. To determine the relative importance of DNA binding proteins as compared to secondary structure, oligonucleotide-directed mutations in the TAR region were studied. Local mutations that disrupted either the stem or loop structure were defective in gene expression. However, compensatory mutations which restored base pairing in the stem resulted in complete tat-activation. This indicated a significant role for the stem-loop structure in HIV gene expression. To determine the role of TAR binding proteins, mutations were constructed which extensively changed the primary structure of the TAR region, yet left stem base pairing, stem energy and the loop sequence intact. These mutations resulted in decreased protein binding to TAR DNA and defects in tat-activation, and revealed factor binding specifically to the loop DNA sequence. 
Further mutagenesis which inverted this stem and loop mutation relative to the HIV LTR mRNA start site resulted in even larger decreases in tat-activation. This suggests that multiple determinants, including protein binding, the loop sequence, and RNA or DNA secondary structure, are important in tat-activation and suggests that tat may interact with cellular proteins binding to DNA to increase HIV gene expression. PMID:2721501
Tunable and reversible drug control of protein production via a self-excising degron.
Chung, Hokyung K; Jacobs, Conor L; Huo, Yunwen; Yang, Jin; Krumm, Stefanie A; Plemper, Richard K; Tsien, Roger Y; Lin, Michael Z
2015-09-01
An effective method for direct chemical control over the production of specific proteins would be widely useful. We describe small molecule-assisted shutoff (SMASh), a technique in which proteins are fused to a degron that removes itself in the absence of drug, resulting in the production of an untagged protein. Clinically tested HCV protease inhibitors can then block degron removal, inducing rapid degradation of subsequently synthesized copies of the protein. SMASh allows reversible and dose-dependent shutoff of various proteins in multiple mammalian cell types and in yeast. We also used SMASh to confer drug responsiveness onto an RNA virus for which no licensed inhibitors exist. As SMASh does not require the permanent fusion of a large domain, it should be useful when control over protein production with minimal structural modification is desired. Furthermore, as SMASh involves only a single genetic modification and does not rely on modulating protein-protein interactions, it should be easy to generalize to multiple biological contexts.
da Fonseca, Néli José; Lima Afonso, Marcelo Querino; Pedersolli, Natan Gonçalves; de Oliveira, Lucas Carrijo; Andrade, Dhiego Souto; Bleicher, Lucas
2017-10-28
Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and zika fever. Their genomes encode a polyprotein which, after cleavage, results in three structural and seven non-structural proteins. Homologous proteins can be studied by conservation and coevolution analysis as detected in multiple sequence alignments, usually reporting positions which are strictly necessary for the structure and/or function of all members in a protein family or which are involved in a specific sub-class feature requiring the coevolution of residue sets. This study provides a complete conservation and coevolution analysis on all flaviviruses non-structural proteins, with results mapped on all well-annotated available sequences. A literature review on the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. Also, we provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as a supplementary material, so that particularities in different viruses can be easily analyzed. Copyright © 2017 Elsevier Inc. All rights reserved.
Parton, Robert G; Tillu, Vikas A; Collins, Brett M
2018-04-23
Caveolae are one of the most abundant and striking features of the plasma membrane of many mammalian cell types. These surface pits have fascinated biologists since their discovery by the pioneers of electron microscopy in the middle of the last century, but we are only just starting to understand their multiple functions. Molecular understanding of caveolar formation is advancing rapidly and we now know that sculpting the membrane to generate the characteristic bulb-shaped caveolar pit involves the coordinated action of integral membrane proteins and peripheral membrane coat proteins in a process dependent on their multiple interactions with membrane lipids. The resulting structure is further stabilised by protein complexes at the caveolar neck. Caveolae can bud to generate an endocytic carrier but can also be disassembled in response to specific stimuli to function as a mechanoprotective device. These structures have also been linked to numerous signalling pathways. Here, we will briefly summarise the current molecular and structural understanding of caveolar formation and dynamics, discuss how the crucial structural components of caveolae work together to generate a dynamic sensing domain, and discuss the implications of recent studies on the diverse roles proposed for caveolae in different cells and tissues. Copyright © 2018 Elsevier Ltd. All rights reserved.
Transmembrane Helices Tilt, Bend, Slide, Torque, and Unwind between Functional States of Rhodopsin
Ren, Zhong; Ren, Peter X.; Balusu, Rohith; Yang, Xiaojing
2016-01-01
The seven-helical bundle of rhodopsin and other G-protein coupled receptors undergoes structural rearrangements as the transmembrane receptor protein is activated. These structural changes are known to involve tilting and bending of various transmembrane helices. However, the cause and effect relationship among structural events leading to a cytoplasmic crevasse for G-protein binding is less well defined. Here we present a mathematical model of the protein helix and a simple procedure to determine multiple parameters that offer precise depiction of a helical conformation. A comprehensive survey of bovine rhodopsin structures shows that the helical rearrangements during the activation of rhodopsin involve a variety of angular and linear motions such as torsion, unwinding, and sliding in addition to the previously reported tilting and bending. These hitherto undefined motion components unify the results obtained from different experimental approaches, and demonstrate conformational similarity between the active opsin structure and the photoactivated structures in crystallo near the retinal anchor despite their marked differences. PMID:27658480
Solid state NMR: The essential technology for helical membrane protein structural characterization
Cross, Timothy A.; Ekanayake, Vindana; Paulino, Joana; Wright, Anna
2014-01-01
NMR spectroscopy of helical membrane proteins has been very challenging on multiple fronts. The expression and purification of these proteins while maintaining functionality has consumed countless graduate student hours. Sample preparations have depended on whether solution or solid-state NMR spectroscopy was to be performed – neither have been easy. In recent years it has become increasingly apparent that membrane mimic environments influence the structural result. Indeed, in these recent years we have rediscovered that Nobel laureate, Christian Anfinsen, did not say that protein structure was exclusively dictated by the amino acid sequence, but rather by the sequence in a given environment (Anfinsen, 1973) [106]. The environment matters, molecular interactions with the membrane environment are significant and many examples of distorted, non-native membrane protein structures have recently been documented in the literature. However, solid-state NMR structures of helical membrane proteins in proteoliposomes and bilayers are proving to be native structures that permit a high resolution characterization of their functional states. Indeed, solid-state NMR is uniquely able to characterize helical membrane protein structures in lipid environments without detergents. Recent progress in expression, purification, reconstitution, sample preparation and in the solid-state NMR spectroscopy of both oriented samples and magic angle spinning samples has demonstrated that helical membrane protein structures can be achieved in a timely fashion. Indeed, this is a spectacular opportunity for the NMR community to have a major impact on biomedical research through the solid-state NMR spectroscopy of these proteins. PMID:24412099
Solid state NMR: The essential technology for helical membrane protein structural characterization
NASA Astrophysics Data System (ADS)
Cross, Timothy A.; Ekanayake, Vindana; Paulino, Joana; Wright, Anna
2014-02-01
NMR spectroscopy of helical membrane proteins has been very challenging on multiple fronts. The expression and purification of these proteins while maintaining functionality has consumed countless graduate student hours. Sample preparations have depended on whether solution or solid-state NMR spectroscopy was to be performed - neither have been easy. In recent years it has become increasingly apparent that membrane mimic environments influence the structural result. Indeed, in these recent years we have rediscovered that Nobel laureate, Christian Anfinsen, did not say that protein structure was exclusively dictated by the amino acid sequence, but rather by the sequence in a given environment (Anfinsen, 1973) [106]. The environment matters, molecular interactions with the membrane environment are significant and many examples of distorted, non-native membrane protein structures have recently been documented in the literature. However, solid-state NMR structures of helical membrane proteins in proteoliposomes and bilayers are proving to be native structures that permit a high resolution characterization of their functional states. Indeed, solid-state NMR is uniquely able to characterize helical membrane protein structures in lipid environments without detergents. Recent progress in expression, purification, reconstitution, sample preparation and in the solid-state NMR spectroscopy of both oriented samples and magic angle spinning samples has demonstrated that helical membrane protein structures can be achieved in a timely fashion. Indeed, this is a spectacular opportunity for the NMR community to have a major impact on biomedical research through the solid-state NMR spectroscopy of these proteins.
Shen, Rong; Han, Wei; Fiorin, Giacomo; Islam, Shahidul M; Schulten, Klaus; Roux, Benoît
2015-10-01
The knowledge of multiple conformational states is a prerequisite to understand the function of membrane transport proteins. Unfortunately, the determination of detailed atomic structures for all these functionally important conformational states with conventional high-resolution approaches is often difficult and unsuccessful. In some cases, biophysical and biochemical approaches can provide important complementary structural information that can be exploited with the help of advanced computational methods to derive structural models of specific conformational states. In particular, functional and spectroscopic measurements in combination with site-directed mutations constitute one important source of information to obtain these mixed-resolution structural models. A very common problem with this strategy, however, is the difficulty to simultaneously integrate all the information from multiple independent experiments involving different mutations or chemical labels to derive a unique structural model consistent with the data. To resolve this issue, a novel restrained molecular dynamics structural refinement method is developed to simultaneously incorporate multiple experimentally determined constraints (e.g., engineered metal bridges or spin-labels), each treated as an individual molecular fragment with all atomic details. The internal structure of each of the molecular fragments is treated realistically, while there is no interaction between different molecular fragments to avoid unphysical steric clashes. The information from all the molecular fragments is exploited simultaneously to constrain the backbone to refine a three-dimensional model of the conformational state of the protein. The method is illustrated by refining the structure of the voltage-sensing domain (VSD) of the Kv1.2 potassium channel in the resting state and by exploring the distance histograms between spin-labels attached to T4 lysozyme. 
The resulting VSD structures are in good agreement with the consensus model of the resting state VSD and the spin-spin distance histograms from ESR/DEER experiments on T4 lysozyme are accurately reproduced.
Tarnowski, Krzysztof; Fituch, Kinga; Szczepanowski, Roman H; Dadlez, Michal; Kaus-Drobek, Magdalena
2014-01-01
RACK1 is a member of the WD repeat family of proteins and is involved in multiple fundamental cellular processes. An intriguing feature of RACK1 is its ability to interact with at least 80 different protein partners. Thus, the structural features enabling such interactomic flexibility are of great interest. Several previous studies of the crystal structures of RACK1 orthologs described its detailed architecture and confirmed predictions that RACK1 adopts a seven-bladed β-propeller fold. However, this did not explain its ability to bind to multiple partners. We performed hydrogen-deuterium (H-D) exchange mass spectrometry on three orthologs of RACK1 (human, yeast, and plant) to obtain insights into the dynamic properties of RACK1 in solution. All three variants retained similar patterns of deuterium uptake, with some pronounced differences that can be attributed to RACK1's divergent biological functions. In all cases, the most rigid structural elements were confined to B-C turns and, to some extent, strands B and C, while the remaining regions retained much flexibility. We also compared the average rate constants for H-D exchange in different regions of RACK1 and found that amide protons in some regions exchanged at least 1000-fold faster than in others. We conclude that its evolutionarily retained structural architecture might have allowed RACK1 to accommodate multiple molecular partners. This was exemplified by our additional analysis of yeast RACK1 dimer, which showed stabilization, as well as destabilization, of several interface regions upon dimer formation. PMID:24591271
Khvostichenko, Daria S.; Schieferstein, Jeremy M.; Pawate, Ashtamurthy S.; ...
2014-08-21
Crystallization from lipidic mesophase matrices is a promising route to diffraction-quality crystals and structures of membrane proteins. The microfluidic approach reported here eliminates two bottlenecks of the standard mesophase-based crystallization protocols: (i) manual preparation of viscous mesophases and (ii) manual harvesting of often small and fragile protein crystals. In the approach reported here, protein-loaded mesophases are formulated in an X-ray transparent microfluidic chip using only 60 nL of the protein solution per crystallization trial. The X-ray transparency of the chip enables diffraction data collection from multiple crystals residing in microfluidic wells, eliminating the normally required manual harvesting and mounting of individual crystals. In addition, we validated our approach by on-chip crystallization of photosynthetic reaction center, a membrane protein from Rhodobacter sphaeroides, followed by solving its structure to a resolution of 2.5 Å using X-ray diffraction data collected on-chip under ambient conditions. A moderate conformational change in hydrophilic chains of the protein was observed when comparing the on-chip, room temperature structure with known structures for which data were acquired under cryogenic conditions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Khvostichenko, Daria S.; Schieferstein, Jeremy M.; Pawate, Ashtamurthy S.
2014-10-01
Crystallization from lipidic mesophase matrices is a promising route to diffraction-quality crystals and structures of membrane proteins. The microfluidic approach reported here eliminates two bottlenecks of the standard mesophase-based crystallization protocols: (i) manual preparation of viscous mesophases and (ii) manual harvesting of often small and fragile protein crystals. In the approach reported here, protein-loaded mesophases are formulated in an X-ray transparent microfluidic chip using only 60 nL of the protein solution per crystallization trial. The X-ray transparency of the chip enables diffraction data collection from multiple crystals residing in microfluidic wells, eliminating the normally required manual harvesting and mounting of individual crystals. We validated our approach by on-chip crystallization of photosynthetic reaction center, a membrane protein from Rhodobacter sphaeroides, followed by solving its structure to a resolution of 2.5 Å using X-ray diffraction data collected on-chip under ambient conditions. A moderate conformational change in hydrophilic chains of the protein was observed when comparing the on-chip, room temperature structure with known structures for which data were acquired under cryogenic conditions.
A tool for calculating binding-site residues on proteins from PDB structures.
Hu, Jing; Yan, Changhui
2009-08-03
In research on protein functional sites, researchers often need to identify binding-site residues on a protein. A commonly used strategy is to find a complex structure from the Protein Data Bank (PDB) that consists of the protein of interest and its interacting partner(s) and calculate binding-site residues based on the complex structure. However, since a protein may participate in multiple interactions, the binding-site residues calculated based on one complex structure usually do not reveal all binding sites on a protein. Researchers therefore need to find all PDB complexes that contain the protein of interest and combine the binding-site information gleaned from them. This process is very time-consuming. In particular, combining binding-site information obtained from different PDB structures requires tedious work to align protein sequences. The process becomes overwhelmingly difficult when researchers have a large set of proteins to analyze, which is usually the case in practice. In this study, we have developed a tool for calculating binding-site residues on proteins, TCBRP (http://yanbioinformatics.cs.usu.edu:8080/ppbindingsubmit). For an input protein, TCBRP can quickly find all binding-site residues on the protein by automatically combining the information obtained from all PDB structures that contain the protein of interest. Additionally, TCBRP presents the binding-site residues in different categories according to the interaction type. TCBRP also allows researchers to set the definition of binding-site residues. The tool is useful for research on protein binding-site analysis and prediction.
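The combining step described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TCBRP's actual implementation: the function names, the 5.0 Å distance cutoff, and the toy coordinates are all hypothetical, and residue numbers are assumed to be already mapped onto a common sequence numbering (the alignment step the abstract calls tedious is not shown).

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def binding_site_residues(protein_atoms, partner_atoms, cutoff=5.0):
    """Residue ids of the protein having any atom within `cutoff` of a partner atom.

    protein_atoms: list of (residue_id, (x, y, z)) tuples for the protein of interest.
    partner_atoms: list of (x, y, z) tuples for the interacting partner.
    The cutoff value is an assumed, user-adjustable definition of "binding site".
    """
    site = set()
    for res_id, xyz in protein_atoms:
        if res_id in site:
            continue  # residue already flagged by another of its atoms
        if any(dist(xyz, p) <= cutoff for p in partner_atoms):
            site.add(res_id)
    return site


def combine_sites(complexes, cutoff=5.0):
    """Union of binding-site residues over several complexes of the same protein."""
    combined = set()
    for protein_atoms, partner_atoms in complexes:
        combined |= binding_site_residues(protein_atoms, partner_atoms, cutoff)
    return sorted(combined)


# Toy data: the same two-residue protein seen in two hypothetical complexes.
protein = [(10, (0.0, 0.0, 0.0)), (11, (20.0, 0.0, 0.0))]
complex_a = (protein, [(3.0, 0.0, 0.0)])   # partner atom near residue 10
complex_b = (protein, [(20.0, 4.0, 0.0)])  # partner atom near residue 11
all_sites = combine_sites([complex_a, complex_b])
```

Each single complex reveals only one of the two residues; the union over both complexes recovers the full set, which is the point the abstract makes about proteins that take part in multiple interactions.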
MODBASE, a database of annotated comparative protein structure models
Pieper, Ursula; Eswar, Narayanan; Stuart, Ashley C.; Ilyin, Valentin A.; Sali, Andrej
2002-01-01
MODBASE (http://guitar.rockefeller.edu/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on PSI-BLAST, IMPALA and MODELLER. MODBASE uses the MySQL relational database management system for flexible and efficient querying, and the MODVIEW Netscape plugin for viewing and manipulating multiple sequences and structures. It is updated regularly to reflect the growth of the protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different datasets. The largest dataset contains models for domains in 304 517 out of 539 171 unique protein sequences in the complete TrEMBL database (23 March 2001); only models based on significant alignments (PSI-BLAST E-value < 10^-4) and models assessed to have the correct fold are included. Other datasets include models for target selection and structure-based annotation by the New York Structural Genomics Research Consortium, models for prediction of genes in the Drosophila melanogaster genome, models for structure determination of several ribosomal particles and models calculated by the MODWEB comparative modeling web server. PMID:11752309
Liu, Fuxiao; Wu, Xiaodong; Li, Lin; Liu, Zengshan; Wang, Zhiliang
2013-08-01
The baculovirus expression system (BES) has been one of the versatile platforms for the production of recombinant proteins requiring multiple post-translational modifications, such as folding, oligomerization, phosphorylation, glycosylation, acylation, disulfide bond formation and proteolytic cleavage. Advances in recombinant DNA technology have facilitated application of the BES, and made it possible to express multiple proteins simultaneously in a single infection and to produce multimeric proteins sharing functional similarity with their natural analogs. Therefore, the BES has been used for the production of recombinant proteins and the construction of virus-like particles (VLPs), as well as for the development of subunit vaccines, including VLP-based vaccines. The VLP, which consists of one or more structural proteins but no viral genome, resembles the authentic virion but cannot replicate in cells. The high-quality recombinant protein expression and post-translational modifications obtained with the BES, along with its capacity to produce multiple proteins, imply that it is ideally suited to VLP production. In this article, we critically review the pros and cons of using the BES as a platform to produce both enveloped and non-enveloped VLPs. Copyright © 2013 Elsevier Inc. All rights reserved.
Automatic prediction of protein domains from sequence information using a hybrid learning system.
Nagarajan, Niranjan; Yona, Golan
2004-06-12
We describe a novel method for detecting the domain structure of a protein from sequence information alone. The method is based on analyzing multiple sequence alignments that are derived from a database search. Multiple measures are defined to quantify the domain information content of each position along the sequence and are combined into a single predictor using a neural network. The output is further smoothed and post-processed using a probabilistic model to predict the most likely transition positions between domains. The method was assessed using the domain definitions in SCOP and CATH for proteins of known structure and was compared with several other existing methods. Our method performs well both in terms of accuracy and sensitivity. It improves significantly over the best methods available, even some of the semi-manual ones, while being fully automatic. Our method can also be used to suggest and verify domain partitions based on structural data. A few examples of predicted domain definitions and alternative partitions, as suggested by our method, are also discussed. An online domain-prediction server is available at http://biozon.org/tools/domains/
Ray, Sumanta; Maulik, Ujjwal
2016-12-20
Detecting perturbations in modular structure during HIV-1 disease progression is an important step toward understanding the stage-specific infection pattern of HIV-1 in human cells. In this article, we propose a novel methodology that integrates multiple sources of biological information to identify such disruptions in human gene modules during different stages of HIV-1 infection. We integrate three types of biological information (gene expression, protein-protein interaction (PPI), and gene ontology (GO)) into single gene meta-modules through non-negative matrix factorization (NMF). Because the identified meta-modules inherit this information, detecting their perturbation reflects changes in expression pattern, in PPI structure, and in the functional similarity of genes as the infection progresses. NMF-based clustering is used to integrate modules from the different data sources into strong meta-modules. Perturbation in meta-modular structure is identified by investigating topological and intramodular properties and by ranking the meta-modules with a rank aggregation algorithm. We have also analyzed the preservation structure of significant GO terms in which the human proteins of the meta-modules participate. Moreover, we have performed an analysis to show the change in the coregulation pattern of identified transcription factors (TFs) over the HIV progression stages.
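As a rough illustration of NMF-based clustering over several data sources, the sketch below factorizes a gene-by-feature matrix with the standard Lee-Seung multiplicative updates and assigns each gene to its dominant component. It is a minimal sketch, not the authors' formulation: joining the three sources by column-wise concatenation, the function names, the rank, and the toy matrices are all assumptions made for the example.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor non-negative V (genes x features) as W (genes x rank) times H (rank x features)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        # Multiplicative update for H keeps every entry non-negative.
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        # Same update rule applied to W, using the refreshed H.
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)] for i in range(n)]
    return w, h


def assign_modules(w):
    """Label each gene (row of W) with the index of its largest component."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]


# Toy inputs: four genes described by three hypothetical non-negative sources.
expr = [[1, 1], [1, 1], [0, 0], [0, 0]]  # expression-derived features
ppi = [[1, 0], [1, 0], [0, 1], [0, 1]]   # interaction-derived features
go = [[1, 0], [1, 0], [0, 1], [0, 1]]    # GO-similarity-derived features
v = [e + p + g for e, p, g in zip(expr, ppi, go)]  # concatenate columns per gene

w, h = nmf(v, rank=2)
labels = assign_modules(w)
```

Sharing one W across the concatenated features is one simple way to make every source influence the same module assignment; weighting the sources differently, or using a joint factorization with separate H matrices, would be equally plausible design choices.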
Self-Chaperoning of the Type III Secretion System needle tip proteins IpaD and BipD
Johnson, Steven; Roversi, Pietro; Espina, Marianela; Olive, Andrew; Deane, Janet E.; Birket, Susan; Field, Terry; Picking, William D.; Blocker, Ariel; Galyov, Edouard E.; Picking, Wendy L.; Lea, Susan M.
2007-01-01
Bacteria expressing type III secretion systems (T3SS) have been responsible for the deaths of millions worldwide, acting as key virulence elements in diseases ranging from plague to typhoid fever. The T3SS is composed of a basal body, which traverses both bacterial membranes, and an external needle through which effector proteins are secreted. We report multiple crystal structures of two proteins that sit at the tip of the needle and are essential for virulence: IpaD from Shigella flexneri and BipD from Burkholderia pseudomallei. The structures reveal that the N-terminal domains of the molecules are intra-molecular chaperones that prevent premature oligomerization, as well as sharing structural homology with proteins involved in eukaryotic actin rearrangement. Crystal packing has allowed us to construct a model for the tip complex that is supported by mutations designed using the structure. PMID:17077085
Self-chaperoning of the type III secretion system needle tip proteins IpaD and BipD.
Johnson, Steven; Roversi, Pietro; Espina, Marianela; Olive, Andrew; Deane, Janet E; Birket, Susan; Field, Terry; Picking, William D; Blocker, Ariel J; Galyov, Edouard E; Picking, Wendy L; Lea, Susan M
2007-02-09
Bacteria expressing type III secretion systems (T3SS) have been responsible for the deaths of millions worldwide, acting as key virulence elements in diseases ranging from plague to typhoid fever. The T3SS is composed of a basal body, which traverses both bacterial membranes, and an external needle through which effector proteins are secreted. We report multiple crystal structures of two proteins that sit at the tip of the needle and are essential for virulence: IpaD from Shigella flexneri and BipD from Burkholderia pseudomallei. The structures reveal that the N-terminal domains of the molecules are intramolecular chaperones that prevent premature oligomerization, as well as sharing structural homology with proteins involved in eukaryotic actin rearrangement. Crystal packing has allowed us to construct a model for the tip complex that is supported by mutations designed using the structure.
Time, space, and disorder in the expanding proteome universe.
Minde, David-Paul; Dunker, A Keith; Lilley, Kathryn S
2017-04-01
Proteins are highly dynamic entities. Their myriad functions require specific structures, but proteins' dynamic nature ranges all the way from the local mobility of their amino acid constituents to mobility within and well beyond single cells. A truly comprehensive view of the dynamic structural proteome includes: (i) alternative sequences, (ii) alternative conformations, (iii) alternative interactions with a range of biomolecules, (iv) cellular localizations, (v) alternative behaviors in different cell types. While these aspects have traditionally been explored one protein at a time, we highlight recently emerging global approaches that accelerate comprehensive insights into these facets of the dynamic nature of protein structure. Computational tools that integrate and expand on multiple orthogonal data types promise to enable the transition from a disjointed list of static snapshots to a structurally explicit understanding of the dynamics of cellular mechanisms. © 2017 The Authors. Proteomics Published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Multiple solvent crystal structures of ribonuclease A: An assessment of the method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dechene, Michelle; Wink, Glenna; Smith, Mychal
2010-11-12
The multiple solvent crystal structures (MSCS) method uses organic solvents to map the surfaces of proteins. It identifies binding sites and allows for a more thorough examination of protein plasticity and hydration than could be achieved by a single structure. The crystal structures of bovine pancreatic ribonuclease A (RNAse A) soaked in the following organic solvents are presented: 50% dioxane, 50% dimethylformamide, 70% dimethylsulfoxide, 70% 1,6-hexanediol, 70% isopropanol, 50% R,S,R-bisfuran alcohol, 70% t-butanol, 50% trifluoroethanol, or 1.0M trimethylamine-N-oxide. This set of structures is compared with four sets of crystal structures of RNAse A from the protein data bank (PDB) and with the solution NMR structure to assess the validity of previously untested assumptions associated with MSCS analysis. Plasticity from MSCS is the same as from PDB structures obtained in the same crystal form and deviates only at crystal contacts when compared to structures from a diverse set of crystal environments. Furthermore, there is a good correlation between plasticity as observed by MSCS and the dynamic regions seen by NMR. Conserved water binding sites are identified by MSCS to be those that are conserved in the sets of structures taken from the PDB. Comparison of the MSCS structures with inhibitor-bound crystal structures of RNAse A reveals that the organic solvent molecules identify key interactions made by inhibitor molecules, highlighting ligand binding hot-spots in the active site. The present work firmly establishes the relevance of information obtained by MSCS.
Can small hydrophobic gold nanoparticles inhibit β2-microglobulin fibrillation?
NASA Astrophysics Data System (ADS)
Brancolini, Giorgia; Toroz, Dimitrios; Corni, Stefano
2014-06-01
Inorganic nanoparticles stabilized by a shell of organic ligands can enhance or suppress the natural propensity of proteins to form fibrils. Functionalization facilitates targeted delivery of the nanoparticles to various cell types, bioimaging, drug delivery and other therapeutic and diagnostic applications. In this study, we provide a computational model of the effect of a prototypical thiol-protected gold nanoparticle, Au25L18- (L = S(CH2)2Ph) on the β2-microglobulin natural fibrillation propensity. To reveal the molecular basis of the protein-nanoparticle association process, we performed various simulations at multiple levels (Classical Molecular Dynamics and Brownian Dynamics) that cover multiple length- and timescales. The results provide a model of the ensemble of structures constituting the protein-gold nanoparticle complexes, and insights into the driving forces for the binding of β2-microglobulin to hydrophobic small size gold nanoparticles. We have found that the small nanoparticles can bind the protein to form persistent complexes. This binding of nanoparticles is able to block the active sites of domains from binding to another protein, thus leading to potential inhibition of the fibrillation activity. A comparison with the binding patches identified for the interaction of the protein with a known inhibitor of fibrillation, supports our conclusion. Electronic supplementary information (ESI) available: Details on the molecular dynamics simulation results. Table S1 reports results of the MD trajectories with a single NP at different initial velocities (d1, d2, d3, and d4) (three-dimensional structures and contact residues). Table S2 reports results of the MD trajectories with a couple of NPs at different initial velocities (initial orientations, three-dimensional structures, contact residues and root-mean-square deviations). Table S3 reports root-mean-square fluctuations and divergence of the protein structure with respect to the NMR model. Table S4 describes the average energy of the final complexes. See DOI: 10.1039/c4nr01514b
Recent developments in the theory of protein folding: searching for the global energy minimum.
Scheraga, H A
1996-04-16
Statistical mechanical theories and computer simulation are being used to gain an understanding of the fundamental features of protein folding. A major obstacle in the computation of protein structures is the multiple-minima problem arising from the existence of many local minima in the multidimensional energy landscape of the protein. This problem has been surmounted for small open-chain and cyclic peptides, and for regular-repeating sequences of models of fibrous proteins. Progress is being made in resolving this problem for globular proteins.
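The multiple-minima problem mentioned in this abstract can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions: the one-dimensional "energy" function, the step size and the starting points are all hypothetical and are not taken from the paper. It shows that plain downhill minimization ends in whichever local minimum is nearest to the starting point, so finding the global minimum requires sampling many starts.

```python
def energy(x):
    """Toy double-well 'energy landscape' (hypothetical, for illustration only)."""
    return x**4 - 3 * x**2 + x

def gradient(x):
    """Analytical derivative of the toy energy function."""
    return 4 * x**3 - 6 * x + 1

def descend(x, step=0.01, iters=5000):
    """Plain gradient descent: follows the slope downhill from the start point."""
    for _ in range(iters):
        x -= step * gradient(x)
    return x

# Two different starting points end in two different minima.
minima = [descend(start) for start in (-2.0, 2.0)]
for x_min in minima:
    print(f"minimum at x = {x_min:.3f}, energy = {energy(x_min):.3f}")

# Only by comparing the results of several starts is the lower one identified.
best = min(minima, key=energy)
print(f"lowest minimum found: x = {best:.3f}")
```

In a real protein the landscape has many dimensions and a vast number of local minima, so exhaustive restarts of this kind are not feasible; this is the obstacle the abstract describes.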
Looking at the Disordered Proteins through the Computational Microscope.
Das, Payel; Matysiak, Silvina; Mittal, Jeetain
2018-05-23
Intrinsically disordered proteins (IDPs) have attracted wide interest over the past decade due to their surprising prevalence in the proteome and versatile roles in cell physiology and pathology. A large selection of IDPs has been identified as potential targets for therapeutic intervention. Characterizing the structure-function relationship of disordered proteins is therefore an essential but daunting task, as these proteins can adapt transient structure, necessitating a new paradigm for connecting structural disorder to function. Molecular simulation has emerged as a natural complement to experiments for atomic-level characterizations and mechanistic investigations of this intriguing class of proteins. The diverse range of length and time scales involved in IDP function requires performing simulations at multiple levels of resolution. In this Outlook, we focus on summarizing available simulation methods, along with a few interesting example applications. We also provide an outlook on how these simulation methods can be further improved in order to provide a more accurate description of IDP structure, binding, and assembly.
General Mechanism of Two-State Protein Folding Kinetics
Rollins, Geoffrey C.; Dill, Ken A.
2016-01-01
We describe here a general model of the kinetic mechanism of protein folding. In the Foldon Funnel Model, proteins fold in units of secondary structures, which form sequentially along the folding pathway, stabilized by tertiary interactions. The model predicts that the free energy landscape has a volcano shape, rather than a simple funnel, that folding is two-state (single-exponential) when secondary structures are intrinsically unstable, and that each structure along the folding path is a transition state for the previous structure. It shows how sequential pathways are consistent with multiple stochastic routes on funnel landscapes, and it gives good agreement with the 9 order of magnitude dependence of folding rates on protein size for a set of 93 proteins, at the same time it is consistent with the near independence of folding equilibrium constant on size. This model gives estimates of folding rates of proteomes, leading to a median folding time in Escherichia coli of about 5 s. PMID:25056406
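To make the phrase "two-state (single-exponential)" concrete, the following sketch evaluates the folded fraction over time for a generic two-state scheme U <-> N. It is a textbook kinetic expression, not the Foldon Funnel Model itself; the function name and the rate constants are assumptions chosen for illustration and are not values from the paper.

```python
import math

def folded_fraction(t, k_fold, k_unfold, f0=0.0):
    """Fraction folded at time t for a two-state U <-> N scheme.

    The relaxation is a single exponential whose observed rate is
    k_fold + k_unfold; f0 is the folded fraction at t = 0.
    """
    k_obs = k_fold + k_unfold
    f_eq = k_fold / k_obs  # equilibrium folded fraction
    return f_eq + (f0 - f_eq) * math.exp(-k_obs * t)

# Assumed rate constants in 1/s (illustrative only).
k_f, k_u = 0.2, 0.002
for t in (0.0, 5.0, 50.0):
    print(f"t = {t:5.1f} s  folded fraction = {folded_fraction(t, k_f, k_u):.3f}")
```

A single observed rate and a single amplitude are what distinguish two-state folding from kinetics with populated intermediates, which would require a sum of several exponentials.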
Integrated Structural Biology for α-Helical Membrane Protein Structure Determination.
Xia, Yan; Fischer, Axel W; Teixeira, Pedro; Weiner, Brian; Meiler, Jens
2018-04-03
While great progress has been made, only 10% of the nearly 1,000 integral, α-helical, multi-span membrane protein families are represented by at least one experimentally determined structure in the PDB. Previously, we developed the algorithm BCL::MP-Fold, which samples the large conformational space of membrane proteins de novo by assembling predicted secondary structure elements guided by knowledge-based potentials. Here, we present a case study of rhodopsin fold determination by integrating sparse and/or low-resolution restraints from multiple experimental techniques including electron microscopy, electron paramagnetic resonance spectroscopy, and nuclear magnetic resonance spectroscopy. Simultaneous incorporation of orthogonal experimental restraints not only significantly improved the sampling accuracy but also allowed identification of the correct fold, which is demonstrated by a protein size-normalized transmembrane root-mean-square deviation as low as 1.2 Å. The protocol developed in this case study can be used for the determination of unknown membrane protein folds when limited experimental restraints are available. Copyright © 2018 Elsevier Ltd. All rights reserved.
Template-based protein structure modeling using the RaptorX web server.
Källberg, Morten; Wang, Haipeng; Wang, Sheng; Peng, Jian; Wang, Zhiyong; Lu, Hui; Xu, Jinbo
2012-07-19
A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ~35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ~6,000 sequences submitted by ~1,600 users from around the world.
Template-based protein structure modeling using the RaptorX web server
Källberg, Morten; Wang, Haipeng; Wang, Sheng; Peng, Jian; Wang, Zhiyong; Lu, Hui; Xu, Jinbo
2016-01-01
A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ~35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ~6,000 sequences submitted by ~1,600 users from around the world. PMID:22814390
Human Autoantibodies Reveal Titin as a Chromosomal Protein
Machado, Cristina; Sunkel, Claudio E.; Andrew, Deborah J.
1998-01-01
Assembly of the higher-order structure of mitotic chromosomes is a prerequisite for proper chromosome condensation, segregation and integrity. Understanding the details of this process has been limited because very few proteins involved in the assembly of chromosome structure have been discovered. Using a human autoimmune scleroderma serum that identifies a chromosomal protein in human cells and Drosophila embryos, we cloned the corresponding Drosophila gene that encodes the homologue of vertebrate titin based on protein size, sequence similarity, developmental expression and subcellular localization. Titin is a giant sarcomeric protein responsible for the elasticity of striated muscle that may also function as a molecular scaffold for myofibrillar assembly. Molecular analysis and immunostaining with antibodies to multiple titin epitopes indicates that the chromosomal and muscle forms of titin may vary in their NH2 termini. The identification of titin as a chromosomal component provides a molecular basis for chromosome structure and elasticity. PMID:9548712
Structure of a group II intron in complex with its reverse transcriptase.
Qu, Guosheng; Kaushal, Prem Singh; Wang, Jia; Shigematsu, Hideki; Piazza, Carol Lyn; Agrawal, Rajendra Kumar; Belfort, Marlene; Wang, Hong-Wei
2016-06-01
Bacterial group II introns are large catalytic RNAs related to nuclear spliceosomal introns and eukaryotic retrotransposons. They self-splice, yielding mature RNA, and integrate into DNA as retroelements. A fully active group II intron forms a ribonucleoprotein complex comprising the intron ribozyme and an intron-encoded protein that performs multiple activities including reverse transcription, in which intron RNA is copied into the DNA target. Here we report cryo-EM structures of an endogenously spliced Lactococcus lactis group IIA intron in its ribonucleoprotein complex form at 3.8-Å resolution and in its protein-depleted form at 4.5-Å resolution, revealing functional coordination of the intron RNA with the protein. Remarkably, the protein structure reveals a close relationship between the reverse transcriptase catalytic domain and telomerase, whereas the active splicing center resembles the spliceosomal Prp8 protein. These extraordinary similarities hint at intricate ancestral relationships and provide new insights into splicing and retromobility.
Functional dynamics of cell surface membrane proteins
NASA Astrophysics Data System (ADS)
Nishida, Noritaka; Osawa, Masanori; Takeuchi, Koh; Imai, Shunsuke; Stampoulis, Pavlos; Kofuku, Yutaka; Ueda, Takumi; Shimada, Ichio
2014-04-01
Cell surface receptors are integral membrane proteins that receive external stimuli, and transmit signals across plasma membranes. In the conventional view of receptor activation, ligand binding to the extracellular side of the receptor induces conformational changes, which convert the structure of the receptor into an active conformation. However, recent NMR studies of cell surface membrane proteins have revealed that their structures are more dynamic than previously envisioned, and they fluctuate between multiple conformations in an equilibrium on various timescales. In addition, NMR analyses, along with biochemical and cell biological experiments indicated that such dynamical properties are critical for the proper functions of the receptors. In this review, we will describe several NMR studies that revealed direct linkage between the structural dynamics and the functions of the cell surface membrane proteins, such as G-protein coupled receptors (GPCRs), ion channels, membrane transporters, and cell adhesion molecules.
Functional dynamics of cell surface membrane proteins.
Nishida, Noritaka; Osawa, Masanori; Takeuchi, Koh; Imai, Shunsuke; Stampoulis, Pavlos; Kofuku, Yutaka; Ueda, Takumi; Shimada, Ichio
2014-04-01
Cell surface receptors are integral membrane proteins that receive external stimuli, and transmit signals across plasma membranes. In the conventional view of receptor activation, ligand binding to the extracellular side of the receptor induces conformational changes, which convert the structure of the receptor into an active conformation. However, recent NMR studies of cell surface membrane proteins have revealed that their structures are more dynamic than previously envisioned, and they fluctuate between multiple conformations in an equilibrium on various timescales. In addition, NMR analyses, along with biochemical and cell biological experiments indicated that such dynamical properties are critical for the proper functions of the receptors. In this review, we will describe several NMR studies that revealed direct linkage between the structural dynamics and the functions of the cell surface membrane proteins, such as G-protein coupled receptors (GPCRs), ion channels, membrane transporters, and cell adhesion molecules. Copyright © 2013 Elsevier Inc. All rights reserved.
The rearrangement of motif F in the flavivirus RNA-directed RNA polymerase.
Potapova, Ulyana; Feranchuk, Sergey; Leonova, Galina; Belikov, Sergei
2018-03-01
In the flavivirus genus, the non-structural protein NS5 plays a central role in RNA viral replication and constitutes a major target for drug discovery. One of the prime challenges in the study of NS5 protein is to investigate the interplay between the two protein domains, namely, the RNA-dependent RNA polymerase (RdRp) domain and the methyltransferase (MTase) domain. These investigations could clarify the multiple roles of NS5 protein in the virus life cycle. Here we present the results of sequence analyses and structural bioinformatics studies of NS5 protein, which suggest that the conserved motif F in the NS5 protein could act as a lock which controls the rearrangement of the domains and as a switch in the protein enzymatic activity. Copyright © 2017 Elsevier B.V. All rights reserved.
Atomic-level characterization of the structural dynamics of proteins.
Shaw, David E; Maragakis, Paul; Lindorff-Larsen, Kresten; Piana, Stefano; Dror, Ron O; Eastwood, Michael P; Bank, Joseph A; Jumper, John M; Salmon, John K; Shan, Yibing; Wriggers, Willy
2010-10-15
Molecular dynamics (MD) simulations are widely used to study protein motions at an atomic level of detail, but they have been limited to time scales shorter than those of many biologically critical conformational changes. We examined two fundamental processes in protein dynamics--protein folding and conformational change within the folded state--by means of extremely long all-atom MD simulations conducted on a special-purpose machine. Equilibrium simulations of a WW protein domain captured multiple folding and unfolding events that consistently follow a well-defined folding pathway; separate simulations of the protein's constituent substructures shed light on possible determinants of this pathway. A 1-millisecond simulation of the folded protein BPTI reveals a small number of structurally distinct conformational states whose reversible interconversion is slower than local relaxations within those states by a factor of more than 1000.
Ab initio folding of proteins using all-atom discrete molecular dynamics
Ding, Feng; Tsao, Douglas; Nie, Huifen; Dokholyan, Nikolay V.
2008-01-01
Discrete molecular dynamics (DMD) is a rapid sampling method used in protein folding and aggregation studies. Until now, DMD was used to perform simulations of simplified protein models in conjunction with structure-based force fields. Here, we develop an all-atom protein model and a transferable force field featuring packing, solvation, and environment-dependent hydrogen bond interactions. Using the replica exchange method, we perform folding simulations of six small proteins (20–60 residues) with distinct native structures. In all cases, native or near-native states are reached in simulations. For three small proteins, multiple folding transitions are observed and the computationally-characterized thermodynamics are in quantitative agreement with experiments. The predictive power of all-atom DMD highlights the importance of environment-dependent hydrogen bond interactions in modeling protein folding. The developed approach can be used for accurate and rapid sampling of conformational spaces of proteins and protein-protein complexes, and applied to protein engineering and design of protein-protein interactions. PMID:18611374
Improved data visualization techniques for analyzing macromolecule structural changes.
Kim, Jae Hyun; Iyer, Vidyashankara; Joshi, Sangeeta B; Volkin, David B; Middaugh, C Russell
2012-10-01
The empirical phase diagram (EPD) is a colored representation of overall structural integrity and conformational stability of macromolecules in response to various environmental perturbations. Numerous proteins and macromolecular complexes have been analyzed by EPDs to summarize results from large data sets from multiple biophysical techniques. The current EPD method suffers from a number of deficiencies including lack of a meaningful relationship between color and actual molecular features, difficulties in identifying contributions from individual techniques, and a limited ability to be interpreted by color-blind individuals. In this work, three improved data visualization approaches are proposed as techniques complementary to the EPD. The secondary, tertiary, and quaternary structural changes of multiple proteins as a function of environmental stress were first measured using circular dichroism, intrinsic fluorescence spectroscopy, and static light scattering, respectively. Data sets were then visualized as (1) RGB colors using three-index EPDs, (2) equiangular polygons using radar charts, and (3) human facial features using Chernoff face diagrams. Data as a function of temperature and pH for bovine serum albumin, aldolase, and chymotrypsin as well as candidate protein vaccine antigens including a serine threonine kinase protein (SP1732) and surface antigen A (SP1650) from S. pneumoniae and hemagglutinin from an H1N1 influenza virus are used to illustrate the advantages and disadvantages of each type of data visualization technique. Copyright © 2012 The Protein Society.
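The idea of a three-index EPD, in which three measured signals are shown as one RGB color per condition, can be sketched as follows. This is a minimal illustration under assumptions: the abstract does not state how signals are normalized or which technique is assigned to which color channel, so the min-max scaling, the channel assignment, the function names and the sample readings below are all hypothetical.

```python
def min_max(values):
    """Scale a list of numbers to the range 0..1 (a constant input maps to 0)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def three_index_colors(cd, fluorescence, sls):
    """Map three signals measured under the same conditions to RGB triples.

    Assumed channel assignment: CD -> red, fluorescence -> green,
    static light scattering -> blue.
    """
    channels = [min_max(cd), min_max(fluorescence), min_max(sls)]
    return [tuple(round(255 * c[i]) for c in channels) for i in range(len(cd))]

# Hypothetical readings at three temperatures (not data from the paper).
cd = [-12.0, -10.0, -4.0]    # secondary structure signal
fl = [330.0, 332.0, 340.0]   # tertiary structure signal
sls = [1.0, 2.0, 6.0]        # quaternary structure / aggregation signal
for temp, rgb in zip((20, 50, 80), three_index_colors(cd, fl, sls)):
    print(f"{temp} C -> RGB {rgb}")
```

Because each channel corresponds to one technique, a change in color can be traced back to a specific structural level, which addresses one of the deficiencies of the original EPD listed in the abstract.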
Staufen1 senses overall transcript secondary structure to regulate translation
Ricci, Emiliano P; Kucukural, Alper; Cenik, Can; Mercier, Blandine C; Singh, Guramrit; Heyer, Erin E; Ashar-Patel, Ami; Peng, Lingtao; Moore, Melissa J
2015-01-01
Human Staufen1 (Stau1) is a double-stranded RNA (dsRNA)-binding protein implicated in multiple post-transcriptional gene-regulatory processes. Here we combined RNA immunoprecipitation in tandem (RIPiT) with RNase footprinting, formaldehyde cross-linking, sonication-mediated RNA fragmentation and deep sequencing to map Staufen1-binding sites transcriptome wide. We find that Stau1 binds complex secondary structures containing multiple short helices, many of which are formed by inverted Alu elements in annotated 3′ untranslated regions (UTRs) or in ‘strongly distal’ 3′ UTRs. Stau1 also interacts with actively translating ribosomes and with mRNA coding sequences (CDSs) and 3′ UTRs in proportion to their GC content and propensity to form internal secondary structure. On mRNAs with high CDS GC content, higher Stau1 levels lead to greater ribosome densities, thus suggesting a general role for Stau1 in modulating translation elongation through structured CDS regions. Our results also indicate that Stau1 regulates translation of transcription-regulatory proteins. PMID:24336223
Multi-PAS domain-mediated protein oligomerization of PpsR from Rhodobacter sphaeroides
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heintz, Udo; Meinhart, Anton; Winkler, Andreas, E-mail: andreas.winkler@mpimf-heidelberg.mpg.de
2014-03-01
Crystal structures of two truncated variants of the transcription factor PpsR from R. sphaeroides are presented that enabled the phasing of a triple PAS domain construct. Together, these structures reveal the importance of α-helical PAS extensions for multi-PAS domain-mediated protein oligomerization and function. Per–ARNT–Sim (PAS) domains are essential modules of many multi-domain signalling proteins that mediate protein interaction and/or sense environmental stimuli. Frequently, multiple PAS domains are present within single polypeptide chains, where their interplay is required for protein function. Although many isolated PAS domain structures have been reported over the last decades, only a few structures of multi-PAS proteins are known. Therefore, the molecular mechanism of multi-PAS domain-mediated protein oligomerization and function is poorly understood. The transcription factor PpsR from Rhodobacter sphaeroides is such a multi-PAS domain protein that, in addition to its three PAS domains, contains a glutamine-rich linker and a C-terminal helix–turn–helix DNA-binding motif. Here, crystal structures of two N-terminally and C-terminally truncated PpsR variants that comprise a single (PpsR_Q-PAS1) and two (PpsR_N-Q-PAS1) PAS domains, respectively, are presented and the multi-step strategy required for the phasing of a triple PAS domain construct (PpsR_ΔHTH) is illustrated. While parts of the biologically relevant dimerization interface can already be observed in the two shorter constructs, the PpsR_ΔHTH structure reveals how three PAS domains enable the formation of multiple oligomeric states (dimer, tetramer and octamer), highlighting that not only the PAS cores but also their α-helical extensions are essential for protein oligomerization. The results demonstrate that the long helical glutamine-rich linker of PpsR results from a direct fusion of the N-cap of the PAS1 domain with the C-terminal extension of the N-domain that plays an important role in signal transduction.
Resin embedded multicycle imaging (REMI): a tool to evaluate protein domains.
Busse, B L; Bezrukov, L; Blank, P S; Zimmerberg, J
2016-08-08
Protein complexes associated with cellular processes comprise a significant fraction of all biology, but our understanding of their heterogeneous organization remains inadequate, particularly for physiological densities of multiple protein species. Towards resolving this limitation, we here present a new technique based on resin-embedded multicycle imaging (REMI) of proteins in-situ. By stabilizing protein structure and antigenicity in acrylic resins, affinity labels were repeatedly applied, imaged, removed, and replaced. In principle, an arbitrarily large number of proteins of interest may be imaged on the same specimen with subsequent digital overlay. A series of novel preparative methods were developed to address the problem of imaging multiple protein species in areas of the plasma membrane or volumes of cytoplasm of individual cells. For multiplexed examination of antibody staining we used straightforward computational techniques to align sequential images, and super-resolution microscopy was used to further define membrane protein colocalization. We give one example of a fibroblast membrane with eight multiplexed proteins. A simple statistical analysis of this limited membrane proteomic dataset is sufficient to demonstrate the analytical power contributed by additional imaged proteins when studying membrane protein domains.
Resin embedded multicycle imaging (REMI): a tool to evaluate protein domains
Busse, B. L.; Bezrukov, L.; Blank, P. S.; Zimmerberg, J.
2016-01-01
Protein complexes associated with cellular processes comprise a significant fraction of all biology, but our understanding of their heterogeneous organization remains inadequate, particularly for physiological densities of multiple protein species. Towards resolving this limitation, we here present a new technique based on resin-embedded multicycle imaging (REMI) of proteins in-situ. By stabilizing protein structure and antigenicity in acrylic resins, affinity labels were repeatedly applied, imaged, removed, and replaced. In principle, an arbitrarily large number of proteins of interest may be imaged on the same specimen with subsequent digital overlay. A series of novel preparative methods were developed to address the problem of imaging multiple protein species in areas of the plasma membrane or volumes of cytoplasm of individual cells. For multiplexed examination of antibody staining we used straightforward computational techniques to align sequential images, and super-resolution microscopy was used to further define membrane protein colocalization. We give one example of a fibroblast membrane with eight multiplexed proteins. A simple statistical analysis of this limited membrane proteomic dataset is sufficient to demonstrate the analytical power contributed by additional imaged proteins when studying membrane protein domains. PMID:27499335
Rich, R L; Deivanayagam, C C; Owens, R T; Carson, M; Höök, A; Moore, D; Symersky, J; Yang, V W; Narayana, S V; Höök, M
1999-08-27
Most mammalian cells and some pathogenic bacteria are capable of adhering to collagenous substrates in processes mediated by specific cell surface adherence molecules. Crystal structures of collagen-binding regions of the human integrin alpha(2)beta(1) and a Staphylococcus aureus adhesin reveal a "trench" on the surface of both of these proteins. This trench can accommodate a collagen triple-helical structure and presumably represents the ligand-binding site (Emsley, J., King, S. L., Bergelson, J. M., and Liddington, R. C. (1997) J. Biol. Chem. 272, 28512-28517; Symersky, J., Patti, J. M., Carson, M., House-Pompeo, K., Teale, M., Moore, D., Jin, L., Schneider, A., DeLucas, L. J., Höök, M., and Narayana, S. V. L. (1997) Nat. Struct. Biol. 4, 833-838). We report here the crystal structure of the alpha subunit I domain from the alpha(1)beta(1) integrin. This collagen-binding protein also contains a trench on one face in which the collagen triple helix may be docked. Furthermore, we compare the collagen-binding mechanisms of the human alpha(1) integrin I domain and the A domain from the S. aureus collagen adhesin, Cna. Although the S. aureus and human proteins have unrelated amino acid sequences, secondary structure composition, and cation requirements for effective ligand binding, both proteins bind at multiple sites within one collagen molecule, with the sites in collagen varying in their affinity for the adherence molecule. We propose that (i) these evolutionarily dissimilar adherence proteins recognize collagen via similar mechanisms, (ii) the multisite, multiclass protein/ligand interactions observed in these two systems result from a binding-site trench, and (iii) this unusual binding mechanism may be thematic for proteins binding extended, rigid ligands that contain repeating structural motifs.
Trujillo, Uldaeliz; Vázquez-Rosa, Edwin; Oyola-Robles, Delise; Stagg, Loren J; Vassallo, David A; Vega, Irving E; Arold, Stefan T; Baerga-Ortiz, Abel
2013-01-01
The polyunsaturated fatty acid (PUFA) synthases from deep-sea bacteria invariably contain multiple acyl carrier protein (ACP) domains in tandem. This conserved tandem arrangement has been implicated in both amplification of fatty acid production (additive effect) and in structural stabilization of the multidomain protein (synergistic effect). While the more accepted model is one in which domains act independently, recent reports suggest that ACP domains may form higher oligomers. Elucidating the three-dimensional structure of tandem arrangements may therefore give important insights into the functional relevance of these structures, and hence guide bioengineering strategies. In an effort to elucidate the three-dimensional structure of tandem repeats from deep-sea anaerobic bacteria, we have expressed and purified a fragment consisting of five tandem ACP domains from the PUFA synthase from Photobacterium profundum. Analysis of the tandem ACP fragment by analytical gel filtration chromatography showed a retention time suggestive of a multimeric protein. However, small angle X-ray scattering (SAXS) revealed that the multi-ACP fragment is an elongated monomer which does not form a globular unit. Stokes radii calculated from atomic monomeric SAXS models were comparable to those measured by analytical gel filtration chromatography, showing that in the gel filtration experiment, the molecular weight was overestimated due to the elongated protein shape. Thermal denaturation monitored by circular dichroism showed that unfolding of the tandem construct was not cooperative, and that the tandem arrangement did not stabilize the protein. Taken together, these data are consistent with an elongated beads-on-a-string arrangement of the tandem ACP domains in PUFA synthases, and speak against synergistic biocatalytic effects promoted by quaternary structuring. 
Thus, it is possible to envision bioengineering strategies which simply involve the artificial linking of multiple ACP domains for increasing the yield of fatty acids in bacterial cultures.
Trujillo, Uldaeliz; Vázquez-Rosa, Edwin; Oyola-Robles, Delise; Stagg, Loren J.; Vassallo, David A.; Vega, Irving E.; Arold, Stefan T.; Baerga-Ortiz, Abel
2013-01-01
The polyunsaturated fatty acid (PUFA) synthases from deep-sea bacteria invariably contain multiple acyl carrier protein (ACP) domains in tandem. This conserved tandem arrangement has been implicated in both amplification of fatty acid production (additive effect) and in structural stabilization of the multidomain protein (synergistic effect). While the more accepted model is one in which domains act independently, recent reports suggest that ACP domains may form higher oligomers. Elucidating the three-dimensional structure of tandem arrangements may therefore give important insights into the functional relevance of these structures, and hence guide bioengineering strategies. In an effort to elucidate the three-dimensional structure of tandem repeats from deep-sea anaerobic bacteria, we have expressed and purified a fragment consisting of five tandem ACP domains from the PUFA synthase from Photobacterium profundum. Analysis of the tandem ACP fragment by analytical gel filtration chromatography showed a retention time suggestive of a multimeric protein. However, small angle X-ray scattering (SAXS) revealed that the multi-ACP fragment is an elongated monomer which does not form a globular unit. Stokes radii calculated from atomic monomeric SAXS models were comparable to those measured by analytical gel filtration chromatography, showing that in the gel filtration experiment, the molecular weight was overestimated due to the elongated protein shape. Thermal denaturation monitored by circular dichroism showed that unfolding of the tandem construct was not cooperative, and that the tandem arrangement did not stabilize the protein. Taken together, these data are consistent with an elongated beads-on-a-string arrangement of the tandem ACP domains in PUFA synthases, and speak against synergistic biocatalytic effects promoted by quaternary structuring. 
Thus, it is possible to envision bioengineering strategies which simply involve the artificial linking of multiple ACP domains for increasing the yield of fatty acids in bacterial cultures. PMID:23469090
GCView: the genomic context viewer for protein homology searches
Grin, Iwan; Linke, Dirk
2011-01-01
Genomic neighborhood can provide important insights into evolution and function of a protein or gene. When looking at operons, changes in operon structure and composition can only be revealed by looking at the operon as a whole. To facilitate the analysis of the genomic context of a query in multiple organisms we have developed Genomic Context Viewer (GCView). GCView accepts results from one or multiple protein homology searches such as BLASTp as input. For each hit, the neighboring protein-coding genes are extracted, the regions of homology are labeled for each input and the results are presented as a clear, interactive graphical output. It is also possible to add more searches to iteratively refine the output. GCView groups outputs by the hits for different proteins. This allows for easy comparison of different operon compositions and structures. The tool is embedded in the framework of the Bioinformatics Toolkit of the Max-Planck Institute for Developmental Biology (MPI Toolkit). Job results from the homology search tools inside the MPI Toolkit can be forwarded to GCView and results can be subsequently analyzed by sequence analysis tools. Results are stored online, allowing for later reinspection. GCView is freely available at http://toolkit.tuebingen.mpg.de/gcview. PMID:21609955
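The neighbor-extraction step described in the GCView abstract can be illustrated with a minimal sketch. It assumes a genome is given as a list of protein-coding genes with start coordinates; the `Gene` type, the `neighbors` function and the `flank` parameter are hypothetical names chosen for illustration and are not taken from GCView itself.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    """Hypothetical minimal record for a protein-coding gene."""
    name: str
    start: int
    end: int


def neighbors(genes: List[Gene], hit_name: str, flank: int = 2) -> List[str]:
    """Return the hit plus up to `flank` genes on each side, in genomic order."""
    ordered = sorted(genes, key=lambda g: g.start)
    names = [g.name for g in ordered]
    i = names.index(hit_name)          # raises ValueError if the hit is absent
    lo = max(0, i - flank)             # clamp at the start of the contig
    return names[lo:i + flank + 1]     # slicing clamps at the end automatically


genome = [
    Gene("a", 0, 10), Gene("b", 20, 30), Gene("c", 40, 50),
    Gene("d", 60, 70), Gene("e", 80, 90),
]
context = neighbors(genome, "c", flank=1)
```

Running the same function for the homologous hit in each organism yields the per-genome contexts that a viewer could then line up side by side.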
Neshich, Goran; Rocchia, Walter; Mancini, Adauto L.; Yamagishi, Michel E. B.; Kuser, Paula R.; Fileto, Renato; Baudet, Christian; Pinto, Ivan P.; Montagner, Arnaldo J.; Palandrani, Juliana F.; Krauchenco, Joao N.; Torres, Renato C.; Souza, Savio; Togawa, Roberto C.; Higa, Roberto H.
2004-01-01
JavaProtein Dossier (JPD) is a new concept, database and visualization tool providing one of the largest collections of the physicochemical parameters describing proteins' structure, stability, function and interaction with other macromolecules. By collecting as many descriptors/parameters as possible within a single database, we can achieve a better use of the available data and information. Furthermore, data grouping allows us to generate different parameters with the potential to provide new insights into the sequence–structure–function relationship. In JPD, residue selection can be performed according to multiple criteria. JPD can simultaneously display and analyze all the physicochemical parameters of any pair of structures, using precalculated structural alignments, allowing direct parameter comparison at corresponding amino acid positions among homologous structures. In order to focus on the physicochemical (and consequently pharmacological) profile of proteins, visualization tools (showing the structure and structural parameters) also had to be optimized. Our response to this challenge was the use of Java technology with its exceptional level of interactivity. JPD is freely accessible (within the Gold Sting Suite) at http://sms.cbi.cnptia.embrapa.br, http://mirrors.rcsb.org/SMS, http://trantor.bioc.columbia.edu/SMS and http://www.es.embnet.org/SMS/ (Option: JavaProtein Dossier). PMID:15215458
Functional and genomic analyses of alpha-solenoid proteins.
Fournier, David; Palidwor, Gareth A; Shcherbinin, Sergey; Szengel, Angelika; Schaefer, Martin H; Perez-Iratxeta, Carol; Andrade-Navarro, Miguel A
2013-01-01
Alpha-solenoids are flexible protein structural domains formed by ensembles of alpha-helical repeats (Armadillo and HEAT repeats among others). While homology can be used to detect many of these repeats, some alpha-solenoids have very little sequence homology to proteins of known structure and we expect that many remain undetected. We previously developed a method for detection of alpha-helical repeats based on a neural network trained on a dataset of protein structures. Here we improved the detection algorithm and updated the training dataset using recently solved structures of alpha-solenoids. Unexpectedly, we identified occurrences of alpha-solenoids in solved protein structures that escaped attention, for example within the core of the catalytic subunit of PI3KC. Our results expand the current set of known alpha-solenoids. Application of our tool to the protein universe allowed us to detect their significant enrichment in proteins interacting with many proteins, confirming that alpha-solenoids are generally involved in protein-protein interactions. We then studied the taxonomic distribution of alpha-solenoids to discuss an evolutionary scenario for the emergence of this type of domain, speculating that alpha-solenoids have emerged in multiple taxa in independent events by convergent evolution. We observe a higher rate of alpha-solenoids in eukaryotic genomes and in some prokaryotic families, such as Cyanobacteria and Planctomycetes, which could be associated to increased cellular complexity. The method is available at http://cbdm.mdc-berlin.de/~ard2/.
Protein cage assembly across multiple length scales.
Aumiller, William M; Uchida, Masaki; Douglas, Trevor
2018-05-21
Within the materials science community, proteins with cage-like architectures are being developed as versatile nanoscale platforms for use in protein nanotechnology. Much effort has been focused on the functionalization of protein cages with biological and non-biological moieties to bring about new properties of not only individual protein cages, but collective bulk-scale assemblies of protein cages. In this review, we report on the current understanding of protein cage assembly, both of the cages themselves from individual subunits, and the assembly of the individual protein cages into higher order structures. We start by discussing the key properties of natural protein cages (for example: size, shape and structure) followed by a review of some of the mechanisms of protein cage assembly and the factors that influence it. We then explore the current approaches for functionalizing protein cages, on the interior or exterior surfaces of the capsids. Lastly, we explore the emerging area of higher order assemblies created from individual protein cages and their potential for new and exciting collective properties.
McHaourab, Hassane S; Steed, P Ryan; Kazmier, Kelli
2011-11-09
Trapping membrane proteins in the confines of a crystal lattice obscures dynamic modes essential for interconversion between multiple conformations in the functional cycle. Moreover, lattice forces could conspire with detergent solubilization to stabilize a minor conformer in an ensemble thus confounding mechanistic interpretation. Spin labeling in conjunction with electron paramagnetic resonance (EPR) spectroscopy offers an exquisite window into membrane protein dynamics in the native-like environment of a lipid bilayer. Systematic application of spin labeling and EPR identifies sequence-specific secondary structures, defines their topology and their packing in the tertiary fold. Long range distance measurements (60 Å-80 Å) between pairs of spin labels enable quantitative analysis of equilibrium dynamics and triggered conformational changes. This review highlights the contribution of spin labeling to bridging structure and mechanism. Efforts to develop methods for determining structures from EPR restraints and to increase sensitivity and throughput promise to expand spin labeling applications in membrane protein structural biology. Copyright © 2011 Elsevier Ltd. All rights reserved.
Cao, Han; Ng, Marcus C K; Jusoh, Siti Azma; Tai, Hio Kuan; Siu, Shirley W I
2017-09-01
α-Helical transmembrane proteins are the most important drug targets in rational drug development. However, solving the experimental structures of these proteins remains difficult, therefore computational methods to accurately and efficiently predict the structures are in great demand. We present an improved structure prediction method TMDIM based on Park et al. (Proteins 57:577-585, 2004) for predicting bitopic transmembrane protein dimers. Three major algorithmic improvements are introduction of the packing type classification, the multiple-condition decoy filtering, and the cluster-based candidate selection. In a test of predicting nine known bitopic dimers, approximately 78% of our predictions achieved a successful fit (RMSD <2.0 Å) and 78% of the cases are better predicted than the two other methods compared. Our method provides an alternative for modeling TM bitopic dimers of unknown structures for further computational studies. TMDIM is freely available on the web at https://cbbio.cis.umac.mo/TMDIM. Website is implemented in PHP, MySQL and Apache, with all major browsers supported.
TMDIM: an improved algorithm for the structure prediction of transmembrane domains of bitopic dimers
NASA Astrophysics Data System (ADS)
Cao, Han; Ng, Marcus C. K.; Jusoh, Siti Azma; Tai, Hio Kuan; Siu, Shirley W. I.
2017-09-01
α-Helical transmembrane proteins are the most important drug targets in rational drug development. However, solving the experimental structures of these proteins remains difficult, therefore computational methods to accurately and efficiently predict the structures are in great demand. We present an improved structure prediction method TMDIM based on Park et al. (Proteins 57:577-585, 2004) for predicting bitopic transmembrane protein dimers. Three major algorithmic improvements are introduction of the packing type classification, the multiple-condition decoy filtering, and the cluster-based candidate selection. In a test of predicting nine known bitopic dimers, approximately 78% of our predictions achieved a successful fit (RMSD <2.0 Å) and 78% of the cases are better predicted than the two other methods compared. Our method provides an alternative for modeling TM bitopic dimers of unknown structures for further computational studies. TMDIM is freely available on the web at https://cbbio.cis.umac.mo/TMDIM. Website is implemented in PHP, MySQL and Apache, with all major browsers supported.
Material Structure of a Graded Refractive Index Lens in Decapod Squid
NASA Astrophysics Data System (ADS)
Cai, Jing; Heiney, Paul; Sweeney, Alison
2013-03-01
Underwater vision with a camera-type eye that is simultaneously acute and sensitive requires a spherical lens with a graded distribution of refractive index. Squids have this type of lens, and our previous work has shown that its optical properties are likely achieved with radially variable densities of a single protein with multiple isoforms. Here we measure the spatial organization of this novel protein material in concentric layers of the lens and use these data to suggest possible mechanisms of self-assembly of the proteins into a graded refractive index structure. First, we performed small angle x-ray scattering (SAXS) to study how the protein is spatially organized. Then, molecular dynamics simulation allowed us to correlate structure to the possible dynamics of the system in different regions of the lens. The combination of simulation and SAXS data in this system revealed the likely protein-protein interactions, the resulting material structure, and its relationship to the observed and variable optical properties of this graded index system. We believe insights into the material properties of the squid lens system will inform the invention of self-assembling graded index devices.
Schoborg, Todd; Rickels, Ryan; Barrios, Josh
2013-01-01
Chromatin insulators assist in the formation of higher-order chromatin structures by mediating long-range contacts between distant genomic sites. It has been suggested that insulators accomplish this task by forming dense nuclear foci termed insulator bodies that result from the coalescence of multiple protein-bound insulators. However, these structures remain poorly understood, particularly the mechanisms triggering body formation and their role in nuclear function. In this paper, we show that insulator proteins undergo a dramatic and dynamic spatial reorganization into insulator bodies during osmostress and cell death in a high osmolarity glycerol–p38 mitogen-activated protein kinase–independent manner, leading to a large reduction in DNA-bound insulator proteins that rapidly repopulate chromatin as the bodies disassemble upon return to isotonicity. These bodies occupy distinct nuclear territories and contain a defined structural arrangement of insulator proteins. Our findings suggest insulator bodies are novel nuclear stress foci that can be used as a proxy to monitor the chromatin-bound state of insulator proteins and provide new insights into the effects of osmostress on nuclear and genome organization. PMID:23878275
Cryo-EM structure of a herpesvirus capsid at 3.1 Å.
Yuan, Shuai; Wang, Jialing; Zhu, Dongjie; Wang, Nan; Gao, Qiang; Chen, Wenyuan; Tang, Hao; Wang, Junzhi; Zhang, Xinzheng; Liu, Hongrong; Rao, Zihe; Wang, Xiangxi
2018-04-06
Structurally and genetically, human herpesviruses are among the largest and most complex of viruses. Using cryo-electron microscopy (cryo-EM) with an optimized image reconstruction strategy, we report the herpes simplex virus type 2 (HSV-2) capsid structure at 3.1 angstroms, which is built up of about 3000 proteins organized into three types of hexons (central, peripentonal, and edge), pentons, and triplexes. Both hexons and pentons contain the major capsid protein, VP5; hexons also contain a small capsid protein, VP26; and triplexes comprise VP23 and VP19C. Acting as core organizers, VP5 proteins form extensive intermolecular networks, involving multiple disulfide bonds (about 1500 in total) and noncovalent interactions, with VP26 proteins and triplexes that underpin capsid stability and assembly. Conformational adaptations of these proteins induced by their microenvironments lead to 46 different conformers that assemble into a massive quasisymmetric shell, exemplifying the structural and functional complexity of HSV. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
Crystal Structure of Menin Reveals Binding Site for Mixed Lineage Leukemia (MLL) Protein
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murai, Marcelo J.; Chruszcz, Maksymilian; Reddy, Gireesh
2014-10-02
Menin is a tumor suppressor protein that is encoded by the MEN1 (multiple endocrine neoplasia 1) gene and controls cell growth in endocrine tissues. Importantly, menin also serves as a critical oncogenic cofactor of MLL (mixed lineage leukemia) fusion proteins in acute leukemias. Direct association of menin with MLL fusion proteins is required for MLL fusion protein-mediated leukemogenesis in vivo, and this interaction has been validated as a new potential therapeutic target for development of novel anti-leukemia agents. Here, we report the first crystal structure of menin homolog from Nematostella vectensis. Due to a very high sequence similarity, the Nematostella menin is a close homolog of human menin, and these two proteins likely have very similar structures. Menin is predominantly an α-helical protein with the protein core comprising three tetratricopeptide motifs that are flanked by two α-helical bundles and covered by a β-sheet motif. A very interesting feature of menin structure is the presence of a large central cavity that is highly conserved between Nematostella and human menin. By employing site-directed mutagenesis, we have demonstrated that this cavity constitutes the binding site for MLL. Our data provide a structural basis for understanding the role of menin as a tumor suppressor protein and as an oncogenic co-factor of MLL fusion proteins. It also provides essential structural information for development of inhibitors targeting the menin-MLL interaction as a novel therapeutic strategy in MLL-related leukemias.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system
DOE Office of Scientific and Technical Information (OSTI.GOV)
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis.
Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
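The abstract mentions position weight matrices as the quantitative summary of binding affinities. As a minimal sketch of that general technique (not the database's own pipeline), a log-odds matrix can be built from aligned binding sites. The function names, the pseudocount of 1.0 and the uniform background of 0.25 are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List

BASES = "ACGT"


def position_weight_matrix(
    sites: List[str], pseudocount: float = 1.0, background: float = 0.25
) -> List[Dict[str, float]]:
    """Build a log2-odds PWM from equal-length aligned DNA binding sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + len(BASES) * pseudocount
    pwm = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        # Smoothed frequency of each base, relative to the assumed background.
        pwm.append({
            b: math.log2(((counts[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def score(pwm: List[Dict[str, float]], seq: str) -> float:
    """Sum the per-position log-odds for a candidate site of matching length."""
    return sum(column[base] for column, base in zip(pwm, seq))


example_sites = ["ACGT", "ACGT", "ACGA"]   # toy data, not from the database
example_pwm = position_weight_matrix(example_sites)
```

A higher score indicates a sequence closer to the observed binding sites under these assumptions; it is a relative measure, not a calibrated binding energy.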
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
2015-11-19
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis.
Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
Xu, Dong; Zhang, Yang
2012-07-01
Ab initio protein folding is one of the major unsolved problems in computational biology owing to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1-20 residues where multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced and the particular contributions to enhancing the efficiency of both force field and search engine are analyzed in detail. QUARK prediction procedure is depicted and tested on the structure modeling of 145 nonhomologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in one-third cases of short proteins up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction experiment, QUARK server outperformed the second and third best servers by 18 and 47% based on the cumulative Z-score of global distance test-total scores in the FM category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress toward the solution of the most important problem in the field. Copyright © 2012 Wiley Periodicals, Inc.
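The replica-exchange Monte Carlo search named in the QUARK abstract relies on periodically swapping conformations between replicas run at different temperatures. The sketch below shows only the standard Metropolis swap criterion for two replicas; it is a generic illustration, not QUARK's implementation, and the function names and the use of inverse temperatures (`beta`) are assumptions.

```python
import math
import random
from typing import Callable


def swap_probability(
    beta_i: float, energy_i: float, beta_j: float, energy_j: float
) -> float:
    """Standard replica-exchange acceptance: min(1, exp((b_i - b_j)(E_i - E_j)))."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(
    beta_i: float,
    energy_i: float,
    beta_j: float,
    energy_j: float,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Return True if the two replicas should exchange conformations."""
    return rng() < swap_probability(beta_i, energy_i, beta_j, energy_j)


# A colder replica (larger beta) holding the higher-energy conformation
# always swaps; the reverse case is accepted with probability exp(-1.5).
p_favorable = swap_probability(1.0, 5.0, 0.5, 2.0)
p_unfavorable = swap_probability(1.0, 2.0, 0.5, 5.0)
```

Swaps of this kind let low-energy conformations migrate toward the cold replicas while the hot replicas keep crossing energy barriers.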
Damberger, F. F.; Pelton, J. G.; Harrison, C. J.; Nelson, H. C.; Wemmer, D. E.
1994-01-01
The solution structure of the 92-residue DNA-binding domain of the heat shock transcription factor from Kluyveromyces lactis has been determined using multidimensional NMR methods. Three-dimensional (3D) triple resonance, 1H-13C-13C-1H total correlation spectroscopy, and 15N-separated total correlation spectroscopy-heteronuclear multiple quantum correlation experiments were used along with various 2D spectra to make nearly complete assignments for the backbone and side-chain 1H, 15N, and 13C resonances. Five-hundred eighty-three NOE constraints identified in 3D 13C- and 15N-separated NOE spectroscopy (NOESY)-heteronuclear multiple quantum correlation spectra and a 4-dimensional 13C/13C-edited NOESY spectrum, along with 35 phi, 9 chi 1, and 30 hydrogen bond constraints, were used to calculate 30 structures by a hybrid distance geometry/simulated annealing protocol, of which 24 were used for structural comparison. The calculations revealed that a 3-helix bundle packs against a small 4-stranded antiparallel beta-sheet. The backbone RMS deviation (RMSD) for the family of structures was 1.03 ± 0.19 Å with respect to the average structure. The topology is analogous to that of the C-terminal domain of the catabolite gene activator protein and appears to be in the helix-turn-helix family of DNA-binding proteins. The overall fold determined by the NMR data is consistent with recent crystallographic work on this domain (Harrison CJ, Bohm AA, Nelson HCM, 1994, Science 263:224) as evidenced by RMSD between backbone atoms in the NMR and X-ray structures of 1.77 ± 0.20 Å. Several differences were identified, some of which may be due to protein-protein interactions in the crystal. PMID:7849597
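The backbone RMSD values quoted above follow the usual definition: the square root of the mean squared distance between matched atoms. A minimal sketch is given here, assuming the two coordinate sets are already superimposed and listed in the same atom order; the optimal superposition step (for example, a Kabsch fit) is deliberately omitted, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """RMSD between two pre-superimposed, equally ordered coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: one of two atoms is displaced by 2 Å along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
toy_rmsd = rmsd(model, reference)
```

For an NMR ensemble, the reported mean and spread would come from applying such a function to each member against the average structure after superposition.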
Ferritins: dynamic management of biological iron and oxygen chemistry.
Liu, Xiaofeng; Theil, Elizabeth C
2005-03-01
Ferritins are spherical, cage-like proteins with nanocavities formed by multiple polypeptide subunits (four-helix bundles) that manage iron/oxygen chemistry. Catalytic coupling yields diferric oxo/hydroxo complexes at ferroxidase sites in maxi-ferritin subunits (24 subunits, 480 kDa; plants, animals, microorganisms). Oxidation occurs at the cavity surface of mini-ferritins/Dps proteins (12 subunits, 240 kDa; bacteria). Oxidation products are concentrated as minerals in the nanocavity for iron-protein cofactor synthesis (maxi-ferritins) or DNA protection (mini-ferritins). The protein cage and nanocavity characterize all ferritins, although amino acid sequences diverge, especially in bacteria. Catalytic oxidation/di-iron coupling in the protein cage (maxi-ferritins, 480 kDa; plants, bacteria and animal cell-specific isoforms) or on the cavity surface (mini-ferritins/Dps proteins, 240 kDa; bacteria) initiates mineralization. Gated pores (eight or four), symmetrically arranged, control iron flow. The multiple ferritin functions combine pore, channel, and catalytic functions in compact protein structures required for life and disease response.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Metrick, Claire M.; Heldwein, Ekaterina E.; Sandri-Goldin, R. M.
Proteins forming the tegument layers of herpesviral virions mediate many essential processes in the viral replication cycle, yet few have been characterized in detail. UL21 is one such multifunctional tegument protein and is conserved among alphaherpesviruses. While UL21 has been implicated in many processes in viral replication, ranging from nuclear egress to virion morphogenesis to cell-cell spread, its precise roles remain unclear. Here we report the 2.7-Å crystal structure of the C-terminal domain of herpes simplex virus 1 (HSV-1) UL21 (UL21C), which has a unique α-helical fold resembling a dragonfly. Analysis of evolutionary conservation patterns and surface electrostatics pinpointed four regions of potential functional importance on the surface of UL21C to be pursued by mutagenesis. In combination with the previously determined structure of the N-terminal domain of UL21, the structure of UL21C provides a 3-dimensional framework for targeted exploration of the multiple roles of UL21 in the replication and pathogenesis of alphaherpesviruses. Additionally, we describe an unanticipated ability of UL21 to bind RNA, which may hint at a yet unexplored function. IMPORTANCE: Due to the limited genomic coding capacity of viruses, viral proteins are often multifunctional, which makes them attractive antiviral targets. Such multifunctionality, however, complicates their study, which often involves constructing and characterizing null mutant viruses. Systematic exploration of these multifunctional proteins requires detailed road maps in the form of 3-dimensional structures. In this work, we determined the crystal structure of the C-terminal domain of UL21, a multifunctional tegument protein that is conserved among alphaherpesviruses. Structural analysis pinpointed surface areas of potential functional importance that provide a starting point for mutagenesis. In addition, the unexpected RNA-binding ability of UL21 may expand its functional repertoire.
The structure of UL21C and the observation of its RNA-binding ability are the latest additions to the navigational chart that can guide the exploration of the multiple functions of UL21.
Modular protein domains: an engineering approach toward functional biomaterials.
Lin, Charng-Yu; Liu, Julie C
2016-08-01
Protein domains and peptide sequences are a powerful tool for conferring specific functions to engineered biomaterials. Protein sequences with a wide variety of functionalities, including structure, bioactivity, protein-protein interactions, and stimuli responsiveness, have been identified, and advances in molecular biology continue to pinpoint new sequences. Protein domains can be combined to make recombinant proteins with multiple functionalities. The high fidelity of the protein translation machinery results in exquisite control over the sequence of recombinant proteins and the resulting properties of protein-based materials. In this review, we discuss protein domains and peptide sequences in the context of functional protein-based materials, composite materials, and their biological applications. Copyright © 2016 Elsevier Ltd. All rights reserved.
Kohda, Daisuke
2018-04-01
Promiscuous recognition of ligands by proteins is as important as strict recognition in numerous biological processes. In living cells, many short, linear amino acid motifs function as targeting signals in proteins to specify the final destination of the protein transport. In general, the target signal is defined by a consensus sequence containing wild-characters, and hence represented by diverse amino acid sequences. The classical lock-and-key or induced-fit/conformational selection mechanism may not cover all aspects of the promiscuous recognition. On the basis of our crystallographic and NMR studies on the mitochondrial Tom20 protein-presequence interaction, we proposed a new hypothetical mechanism based on "a rapid equilibrium of multiple states with partial recognitions". This dynamic, multiple recognition mode enables the Tom20 receptor to recognize diverse mitochondrial presequences with nearly equal affinities. The plant Tom20 is evolutionally unrelated to the animal Tom20 in our study, but is a functional homolog of the animal/fungal Tom20. NMR studies by another research group revealed that the presequence binding by the plant Tom20 was not fully explained by simple interaction modes, suggesting the presence of a similar dynamic, multiple recognition mode. Circumstantial evidence also suggested that similar dynamic mechanisms may be applicable to other promiscuous recognitions of signal peptides by the SRP54/Ffh and SecA proteins.
2013-10-01
IDPs have flexibility, thereby providing the plasticity to enable interactions with multiple partners where high-specificity and low-affinity...block protein-protein interactions is a rapidly evolving field, as the importance of these proteins in disease becomes established. The plasticity of...closest to the structure of EPI-002 — did not bind an abundance of other cellular proteins (Figure 3D, top). Only 3 bands between 200 and 75 kDa were
Structural basis for the antifolding activity of a molecular chaperone
NASA Astrophysics Data System (ADS)
Huang, Chengdong; Rossi, Paolo; Saio, Tomohide; Kalodimos, Charalampos G.
2016-09-01
Molecular chaperones act on non-native proteins in the cell to prevent their aggregation, premature folding or misfolding. Different chaperones often exert distinct effects, such as acceleration or delay of folding, on client proteins via mechanisms that are poorly understood. Here we report the solution structure of SecB, a chaperone that exhibits strong antifolding activity, in complex with alkaline phosphatase and maltose-binding protein captured in their unfolded states. SecB uses long hydrophobic grooves that run around its disk-like shape to recognize and bind to multiple hydrophobic segments across the length of non-native proteins. The multivalent binding mode results in proteins wrapping around SecB. This unique complex architecture alters the kinetics of protein binding to SecB and confers strong antifolding activity on the chaperone. The data show how the different architectures of chaperones result in distinct binding modes with non-native proteins that ultimately define the activity of the chaperone.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Judd, R.C.; Caldwell, H.D.
1985-01-01
The objective of this study was to determine whether in-gel chloramine-T radioiodination adequately labels outer membrane (OM) proteins to allow accurate and precise structural comparison of these molecules. To this end, intrinsically ¹⁴C-amino acid-labeled proteins and ¹²⁵I-labeled proteins were cleaved with two endopeptidic reagents, and the peptide fragments were separated by HPLC. A comparison of the retention times of the fragments, as determined by differential radiation counting, indicated whether ¹²⁵I labeling identified all of the peptide peaks seen in the ¹⁴C-labeled proteins. Results demonstrated that radioiodination yields complete and accurate information about the primary structure of outer membrane proteins. In addition, it permits the use of extremely small amounts of protein, allowing for method optimization and multiple separations to ensure reproducibility.
Xu, Dong; Jaroszewski, Lukasz; Li, Zhanwen; Godzik, Adam
2015-01-01
Motivation: Most proteins consist of multiple domains, independent structural and evolutionary units that are often reshuffled in genomic rearrangements to form new protein architectures. Template-based modeling methods can often detect homologous templates for individual domains, but templates that could be used to model the entire query protein are often not available. Results: We have developed a fast docking algorithm ab initio domain assembly (AIDA) for assembling multi-domain protein structures, guided by the ab initio folding potential. This approach can be extended to discontinuous domains (i.e. domains with ‘inserted’ domains). When tested on experimentally solved structures of multi-domain proteins, the relative domain positions were accurately found among top 5000 models in 86% of cases. AIDA server can use domain assignments provided by the user or predict them from the provided sequence. The latter approach is particularly useful for automated protein structure prediction servers. The blind test consisting of 95 CASP10 targets shows that domain boundaries could be successfully determined for 97% of targets. Availability and implementation: The AIDA package as well as the benchmark sets used here are available for download at http://ffas.burnham.org/AIDA/. Contact: adam@sanfordburnham.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25701568
Jaspard, Emmanuel; Macherel, David; Hunault, Gilles
2012-01-01
Late Embryogenesis Abundant Proteins (LEAPs) are ubiquitous proteins expected to play major roles in desiccation tolerance. Little is known about their structure - function relationships because of the scarcity of 3-D structures for LEAPs. The previous building of LEAPdb, a database dedicated to LEAPs from plants and other organisms, led to the classification of 710 LEAPs into 12 non-overlapping classes with distinct properties. Using this resource, numerous physico-chemical properties of LEAPs and amino acid usage by LEAPs have been computed and statistically analyzed, revealing distinctive features for each class. This unprecedented analysis allowed a rigorous characterization of the 12 LEAP classes, which differed also in multiple structural and physico-chemical features. Although most LEAPs can be predicted as intrinsically disordered proteins, the analysis indicates that LEAP class 7 (PF03168) and probably LEAP class 11 (PF04927) are natively folded proteins. This study thus provides a detailed description of the structural properties of this protein family opening the path toward further LEAP structure - function analysis. Finally, since each LEAP class can be clearly characterized by a unique set of physico-chemical properties, this will allow development of software to predict proteins as LEAPs. PMID:22615859
MUFOLD-SS: New deep inception-inside-inception networks for protein secondary structure prediction.
Fang, Chao; Shang, Yi; Xu, Dong
2018-05-01
Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acid, as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physio-chemical properties of amino acids, PSI-BLAST profile, and HHBlits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structures. The architecture of MUFOLD-SS enables effective processing of local and global interactions between amino acids in making accurate prediction. In extensive experiments on multiple datasets, MUFOLD-SS outperformed the best existing methods and other deep neural networks significantly. MUFold-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html. © 2018 Wiley Periodicals, Inc.
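The relationship between the eight-state and three-state outputs mentioned above can be illustrated with a small sketch. It assumes the commonly used reduction of DSSP codes (H, G, I to helix; E, B to strand; everything else to coil). Reduction conventions vary between papers, and the function name and mapping below are illustrative assumptions, not taken from MUFOLD-SS.

```python
# Assumed (commonly used) reduction of DSSP 8-state codes to 3 states.
# Other conventions exist; this mapping is an illustration only.
Q8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",  # helices
    "E": "E", "B": "E",            # strands and bridges
}


def reduce_q8_to_q3(q8_string):
    """Map each 8-state label to H, E, or C (coil is the default)."""
    return "".join(Q8_TO_Q3.get(label, "C") for label in q8_string)


# Hypothetical 8-state assignment for a short peptide.
example_q3 = reduce_q8_to_q3("HGIEBTS-")
```

A three-state accuracy (Q3) can then be computed by comparing reduced prediction and reduced reference strings position by position.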
Wang, Nanyi; Wang, Lirong; Xie, Xiang-Qun
2017-11-27
Molecular docking is widely applied to computer-aided drug design and has become relatively mature in the recent decades. Application of docking in modeling varies from single lead compound optimization to large-scale virtual screening. The performance of molecular docking is highly dependent on the protein structures selected. It is especially challenging for large-scale target prediction research when multiple structures are available for a single target. Therefore, we have established ProSelection, a docking preferred-protein selection algorithm, in order to generate the proper structure subset(s). By the ProSelection algorithm, protein structures of "weak selectors" are filtered out whereas structures of "strong selectors" are kept. Specifically, the structure which has a good statistical performance of distinguishing active ligands from inactive ligands is defined as a strong selector. In this study, 249 protein structures of 14 autophagy-related targets are investigated. Surflex-dock was used as the docking engine to distinguish active and inactive compounds against these protein structures. Both t test and Mann-Whitney U test were used to distinguish the strong from the weak selectors based on the normality of the docking score distribution. The suggested docking score threshold for active ligands (SDA) was generated for each strong selector structure according to the receiver operating characteristic (ROC) curve. The performance of ProSelection was further validated by predicting the potential off-targets of 43 U.S. Federal Drug Administration approved small molecule antineoplastic drugs. Overall, ProSelection will accelerate the computational work in protein structure selection and could be a useful tool for molecular docking, target prediction, and protein-chemical database establishment research.
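As a rough illustration of the statistical step described above, the sketch below computes the Mann-Whitney U statistic for docking scores of active versus inactive ligands, together with the equivalent area under the ROC curve (AUC = U / (n_active × n_inactive)). The function names and toy scores are hypothetical, and the assumption that a higher score means a better-ranked ligand is ours; none of this is taken from the ProSelection implementation.

```python
def mann_whitney_u(actives, inactives):
    """Count (active, inactive) pairs in which the active scores higher.

    Tied pairs contribute 0.5, as in the standard U statistic.
    """
    u = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                u += 1.0
            elif a == i:
                u += 0.5
    return u


def roc_auc(actives, inactives):
    """AUC equals U divided by the number of (active, inactive) pairs."""
    return mann_whitney_u(actives, inactives) / (len(actives) * len(inactives))


# Toy docking scores for one protein structure (hypothetical values).
active_scores = [7.2, 6.8, 8.1, 5.9]
inactive_scores = [4.1, 5.9, 3.3, 6.0]
auc = roc_auc(active_scores, inactive_scores)
```

Under these assumptions, a structure whose AUC is close to 0.5 would separate actives from inactives no better than chance, which is the intuition behind a "weak selector".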
Cool and Safe: Multiplicity in Safe Innovation at Unilever
ERIC Educational Resources Information Center
Penders, Bart
2011-01-01
This article presents the making of a safe innovation: the application of ice structuring protein (ISP) in edible ices. It argues that safety is not the absence of risk but is an active accomplishment; innovations are not "made safe afterward" but "safe innovations are made". Furthermore, there are multiple safeties to be accomplished in the…
Maitip, Jakkrawut; Trueman, Holly E; Kaehler, Benjamin D; Huttley, Gavin A; Chantawannakul, Panuwan; Sutherland, Tara D
2015-04-01
Multiple gene duplication events in the precursor of the Aculeata (bees, ants, hornets) gave rise to four silk genes. Whilst these homologs encode proteins with similar amino acid composition and coiled coil structure, the retention of all four homologs implies they each are important. In this study we identified, produced and characterized the four silk proteins from Apis dorsata, the giant Asian honeybee. The proteins were readily purified, allowing us to investigate the folding behavior of solutions of individual proteins in comparison to mixtures of all four proteins at concentrations where they assemble into their native coiled coil structure. In contrast to solutions of any one protein type, solutions of a mixture of the four proteins formed coiled coils that were stable against dilution and detergent denaturation. The results are consistent with the formation of a heteromeric coiled coil protein complex. The mechanism of silk protein coiled coil formation and evolution is discussed in light of these results. Copyright © 2015 Elsevier Ltd. All rights reserved.
Gorgel, Manuela; Bøggild, Andreas; Ulstrup, Jakob Jensen; Weiss, Manfred S; Müller, Uwe; Nissen, Poul; Boesen, Thomas
2015-05-01
Exploiting the anomalous signal of the intrinsic S atoms to phase a protein structure is advantageous, as ideally only a single well diffracting native crystal is required. However, sulfur is a weak anomalous scatterer at the typical wavelengths used for X-ray diffraction experiments, and therefore sulfur SAD data sets need to be recorded with a high multiplicity. In this study, the structure of a small pilin protein was determined by sulfur SAD despite several obstacles such as a low anomalous signal (a theoretical Bijvoet ratio of 0.9% at a wavelength of 1.8 Å), radiation damage-induced reduction of the cysteines and a multiplicity of only 5.5. The anomalous signal was improved by merging three data sets from different volumes of a single crystal, yielding a multiplicity of 17.5, and a sodium ion was added to the substructure of anomalous scatterers. In general, all data sets were balanced around the threshold values for a successful phasing strategy. In addition, a collection of statistics on structures from the PDB that were solved by sulfur SAD is presented and compared with the data. Looking at the quality indicator R(anom)/R(p.i.m.), an inconsistency in the documentation of the anomalous R factor is noted and reported.
Structural basis for host membrane remodeling induced by protein 2B of hepatitis A virus.
Vives-Adrián, Laia; Garriga, Damià; Buxaderas, Mònica; Fraga, Joana; Pereira, Pedro José Barbosa; Macedo-Ribeiro, Sandra; Verdaguer, Núria
2015-04-01
The complexity of viral RNA synthesis and the numerous participating factors require a mechanism to topologically coordinate and concentrate these multiple viral and cellular components, ensuring a concerted function. Similarly to all other positive-strand RNA viruses, picornaviruses induce rearrangements of host intracellular membranes to create structures that act as functional scaffolds for genome replication. The membrane-targeting proteins 2B and 2C, their precursor 2BC, and protein 3A appear to be primarily involved in membrane remodeling. Little is known about the structure of these proteins and the mechanisms by which they induce massive membrane remodeling. Here we report the crystal structure of the soluble region of hepatitis A virus (HAV) protein 2B, consisting of two domains: a C-terminal helical bundle preceded by an N-terminally curved five-stranded antiparallel β-sheet that displays striking structural similarity to the β-barrel domain of enteroviral 2A proteins. Moreover, the helicoidal arrangement of the protein molecules in the crystal provides a model for 2B-induced host membrane remodeling during HAV infection. No structural information is currently available for the 2B protein of any picornavirus despite it being involved in a critical process in viral factory formation: the rearrangement of host intracellular membranes. Here we present the structure of the soluble domain of the 2B protein of hepatitis A virus (HAV). Its arrangement, both in crystals and in solution under physiological conditions, can help to understand its function and sheds some light on the membrane rearrangement process, a putative target of future antiviral drugs. Moreover, this first structure of a picornaviral 2B protein also unveils a closer evolutionary relationship between the hepatovirus and enterovirus genera within the Picornaviridae family. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Structural Basis for Host Membrane Remodeling Induced by Protein 2B of Hepatitis A Virus
Vives-Adrián, Laia; Garriga, Damià; Buxaderas, Mònica; Fraga, Joana; Pereira, Pedro José Barbosa
2015-01-01
ABSTRACT The complexity of viral RNA synthesis and the numerous participating factors require a mechanism to topologically coordinate and concentrate these multiple viral and cellular components, ensuring a concerted function. Similarly to all other positive-strand RNA viruses, picornaviruses induce rearrangements of host intracellular membranes to create structures that act as functional scaffolds for genome replication. The membrane-targeting proteins 2B and 2C, their precursor 2BC, and protein 3A appear to be primarily involved in membrane remodeling. Little is known about the structure of these proteins and the mechanisms by which they induce massive membrane remodeling. Here we report the crystal structure of the soluble region of hepatitis A virus (HAV) protein 2B, consisting of two domains: a C-terminal helical bundle preceded by an N-terminally curved five-stranded antiparallel β-sheet that displays striking structural similarity to the β-barrel domain of enteroviral 2A proteins. Moreover, the helicoidal arrangement of the protein molecules in the crystal provides a model for 2B-induced host membrane remodeling during HAV infection. IMPORTANCE No structural information is currently available for the 2B protein of any picornavirus despite it being involved in a critical process in viral factory formation: the rearrangement of host intracellular membranes. Here we present the structure of the soluble domain of the 2B protein of hepatitis A virus (HAV). Its arrangement, both in crystals and in solution under physiological conditions, can help to understand its function and sheds some light on the membrane rearrangement process, a putative target of future antiviral drugs. Moreover, this first structure of a picornaviral 2B protein also unveils a closer evolutionary relationship between the hepatovirus and enterovirus genera within the Picornaviridae family. PMID:25589659
Thermodynamic study of the native and phosphorylated regulatory domain of the CFTR
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marasini, Carlotta, E-mail: marasini@ge.ibf.cnr.it; Galeno, Lauretta; Moran, Oscar
2012-07-06
Highlights: ► CFTR mutations produce cystic fibrosis. ► Chloride transport depends on the regulatory domain phosphorylation. ► Regulatory domain is intrinsically disordered. ► Secondary structure and protein stability change upon phosphorylation. Abstract: The regulatory domain (RD) of the cystic fibrosis transmembrane conductance regulator (CFTR), the defective protein in cystic fibrosis, is the region of the channel that regulates the CFTR activity with multiple phosphorylation sites. This domain is an intrinsically disordered protein, characterized by lack of stable or unique tertiary structure. The disordered character of a protein is directly correlated with its function. The flexibility of RD may be important for its regulatory role: the continuous conformational change may be necessary for the progressive phosphorylation, and thus activation, of the channel. However, the lack of a defined and stable structure results in a considerable limitation when trying to build a unique molecular model for the RD. Moreover, several lines of evidence indicate significant structural differences between the native, non-phosphorylated state, and the multiple phosphorylated state of the protein. The aim of our work is to provide data to describe the conformations and the thermodynamic properties in these two functional states of RD. We recorded circular dichroism (CD) spectra of samples with different degrees of phosphorylation, from the non-phosphorylated state to a bona fide completely phosphorylated state. Analysis of CD spectra showed that the random coil and β-sheet secondary structure decreased with the polypeptide phosphorylation, at the expense of an increase of α-helix. This observation led us to interpret phosphorylation as a mechanism favoring a more structured state.
We also studied the thermal denaturation curves of the protein in the two conditions, monitoring the changes of the mean residue ellipticity measured at 222 nm as a function of temperature, between 20 and 95 °C. The thermodynamic analysis of the denaturation curves shows that phosphorylation of the protein induces a state of lower stability of the R domain, characterized by a lower transition temperature and by a smaller Gibbs free energy difference between the native and the unfolded states.
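One common way to extract a Gibbs free energy from a denaturation curve of this kind is a two-state analysis, sketched below. The abstract does not state which model the authors used, so the two-state assumption, the temperature-independent baselines, the function names, and the toy ellipticity values are all assumptions made for illustration.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(theta, theta_native, theta_unfolded):
    """Fraction unfolded from the ellipticity at 222 nm (flat baselines assumed)."""
    return (theta - theta_native) / (theta_unfolded - theta_native)


def delta_g_unfolding(theta, theta_native, theta_unfolded, temp_kelvin):
    """Two-state unfolding free energy in kJ/mol: dG = -R T ln(f_u / (1 - f_u))."""
    f_u = fraction_unfolded(theta, theta_native, theta_unfolded)
    if not 0.0 < f_u < 1.0:
        raise ValueError("fraction unfolded must lie strictly between 0 and 1")
    equilibrium_constant = f_u / (1.0 - f_u)
    return -R_KJ * temp_kelvin * math.log(equilibrium_constant)


# Toy values (hypothetical): baselines of -12000 and -4000 deg cm^2 dmol^-1.
dg_at_midpoint = delta_g_unfolding(-8000.0, -12000.0, -4000.0, 330.0)
dg_below_midpoint = delta_g_unfolding(-10000.0, -12000.0, -4000.0, 310.0)
```

In this picture the transition temperature is the point where the fraction unfolded is 0.5 and the free energy difference is zero; below it, the value is positive and the native state is favored.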
3D structural fluctuation of IgG1 antibody revealed by individual particle electron tomography
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Xing; Zhang, Lei; Tong, Huimin
2015-05-05
Commonly used methods for determining protein structure, including X-ray crystallography and single-particle reconstruction, often provide a single and unique three-dimensional (3D) structure. However, in these methods, the protein dynamics and flexibility/fluctuation remain mostly unknown. Here, we utilized advances in electron tomography (ET) to study the antibody flexibility and fluctuation through structural determination of individual antibody particles rather than averaging multiple antibody particles together. Through individual-particle electron tomography (IPET) 3D reconstruction from negatively-stained ET images, we obtained 120 ab-initio 3D density maps at an intermediate resolution (~1–3 nm) from 120 individual IgG1 antibody particles. Using these maps as a constraint, we derived 120 conformations of the antibody via structural flexible docking of the crystal structure to these maps by targeted molecular dynamics simulations. Statistical analysis of the various conformations disclosed the antibody 3D conformational flexibility through the distribution of its domain distances and orientations. This blueprint approach, if extended to other flexible proteins, may serve as a useful methodology towards understanding protein dynamics and functions.
GeneBee-net: Internet-based server for analyzing biopolymers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brodsky, L.I.; Ivanov, V.V.; Nikolaev, V.K.
This work describes a network server for searching databanks of biopolymer structures and performing other biocomputing procedures; it is available via direct Internet connection. Basic server procedures are dedicated to homology (similarity) search of sequence and 3D structure of proteins. The homologies found could be used to build multiple alignments, predict protein and RNA secondary structure, and construct phylogenetic trees. In addition to traditional methods of sequence similarity search, the authors propose "non-matrix" (correlational) search. An analogous approach is used to identify regions of similar tertiary structure of proteins. Algorithm concepts and usage examples are presented for new methods. Service logic is based upon interaction of a client program and server procedures. The client program allows the compilation of queries and the processing of results of an analysis.
Evol and ProDy for bridging protein sequence evolution and structural dynamics
Mao, Wenzhi; Liu, Ying; Chennubhotla, Chakra; Lezon, Timothy R.; Bahar, Ivet
2014-01-01
Correlations between sequence evolution and structural dynamics are of utmost importance in understanding the molecular mechanisms of function and their evolution. We have integrated Evol, a new package for fast and efficient comparative analysis of evolutionary patterns and conformational dynamics, into ProDy, a computational toolbox designed for inferring protein dynamics from experimental and theoretical data. Using information-theoretic approaches, Evol coanalyzes conservation and coevolution profiles extracted from multiple sequence alignments of protein families with their inferred dynamics. Availability and implementation: ProDy and Evol are open-source and freely available under MIT License from http://prody.csb.pitt.edu/. Contact: bahar@pitt.edu PMID:24849577
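The conservation and coevolution profiles mentioned above are typically built from Shannon entropy and mutual information over alignment columns. The sketch below shows these two quantities in plain Python on a toy alignment. It does not use the ProDy or Evol API; the function names and the toy sequences are hypothetical, and gap handling and finite-sample corrections are deliberately omitted.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in nats) of the symbols in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def mutual_information(col_i, col_j):
    """MI(i, j) = H(i) + H(j) - H(i, j), with H(i, j) taken over symbol pairs."""
    joint = list(zip(col_i, col_j))
    return column_entropy(col_i) + column_entropy(col_j) - column_entropy(joint)


# Toy multiple sequence alignment (hypothetical): one sequence per row.
msa = ["ACDA", "ACEC", "AGDA", "AGEC"]
columns = list(zip(*msa))

conservation = [column_entropy(col) for col in columns]
coupled = mutual_information(columns[2], columns[3])      # columns vary together
independent = mutual_information(columns[1], columns[2])  # columns vary independently
```

Low entropy marks a conserved position, and high mutual information marks a pair of positions that change together across the family.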
2013-01-01
Background Elucidating the native structure of a protein molecule from its sequence of amino acids, a problem known as de novo structure prediction, is a long standing challenge in computational structural biology. Difficulties in silico arise due to the high dimensionality of the protein conformational space and the ruggedness of the associated energy surface. The issue of multiple minima is a particularly troublesome hallmark of energy surfaces probed with current energy functions. In contrast to the true energy surface, these surfaces are weakly-funneled and rich in comparably deep minima populated by non-native structures. For this reason, many algorithms seek to be inclusive and obtain a broad view of the low-energy regions through an ensemble of low-energy (decoy) conformations. Conformational diversity in this ensemble is key to increasing the likelihood that the native structure has been captured. Methods We propose an evolutionary search approach to address the multiple-minima problem in decoy sampling for de novo structure prediction. Two population-based evolutionary search algorithms are presented that follow the basic approach of treating conformations as individuals in an evolving population. Coarse graining and molecular fragment replacement are used to efficiently obtain protein-like child conformations from parents. Potential energy is used both to bias parent selection and determine which subset of parents and children will be retained in the evolving population. The effect on the decoy ensemble of sampling minima directly is measured by additionally mapping a conformation to its nearest local minimum before considering it for retainment. The resulting memetic algorithm thus evolves not just a population of conformations but a population of local minima. Results and conclusions Results show that both algorithms are effective in terms of sampling conformations in proximity of the known native structure. 
The additional minimization is shown to be key to enhancing sampling capability and obtaining a diverse ensemble of decoy conformations, circumventing premature convergence to sub-optimal regions in the conformational space, and approaching the native structure with proximity that is comparable to state-of-the-art decoy sampling methods. The results are shown to be robust and valid when using two representative state-of-the-art coarse-grained energy functions. PMID:24565020
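The memetic idea described above (evolve a population, but map each child to its nearest local minimum before selection) can be sketched on a toy one-dimensional energy surface. Everything below is an illustrative assumption: the energy function, the Gaussian mutation standing in for fragment replacement, the gradient-descent minimizer, and the truncation selection are not the authors' algorithms or energy functions.

```python
import math
import random


def energy(x):
    """Toy rugged energy surface with several local minima (hypothetical)."""
    return 0.1 * x * x + math.sin(3.0 * x)


def gradient(x):
    return 0.2 * x + 3.0 * math.cos(3.0 * x)


def local_minimize(x, step=0.01, iterations=200):
    """Fixed-step gradient descent toward the nearest local minimum."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


def memetic_search(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    initial_best = min(energy(x) for x in population)
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            # Energy-biased parent selection: binary tournament.
            a, b = rng.sample(population, 2)
            parent = a if energy(a) < energy(b) else b
            # Mutation stands in for molecular fragment replacement.
            child = parent + rng.gauss(0.0, 1.0)
            # Memetic step: replace the child by its local minimum.
            children.append(local_minimize(child))
        # Keep the lowest-energy individuals among parents and children.
        population = sorted(population + children, key=energy)[:pop_size]
    return initial_best, population[0]


initial_best, best = memetic_search()
```

Because parents and children compete together, the best energy never gets worse from one generation to the next. This toy version makes no attempt to preserve the conformational diversity that the abstract identifies as important.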
Structural perturbations of azurin deposited on solid matrices as revealed by trp phosphorescence.
Gabellieri, E; Strambini, G B
2001-01-01
The phosphorescence emission of Cd-azurin from Pseudomonas aeruginosa was used as a probe of possible perturbations in the dynamical structure of the protein core that may be induced by protein-sorbent and protein-protein interactions occurring when the macromolecule is deposited into amorphous, thin solid films. Relative to the protein in aqueous solution, the spectrum is unrelaxed and the phosphorescence decay becomes highly heterogeneous, the average lifetime increasing sharply with film thickness and upon its dehydration. According to the lifetime parameter, adsorption of the protein to the substrate is found to produce a multiplicity of partially unfolded structures, an influence that propagates for several protein layers from the surface. Among the substrates used for film deposition, hydrophilic silica, dextran, DEAE-dextran, dextran sulfate, and hydrophobic octodecylamine, the perturbation is smallest with dextran sulfate and largest with octodecylamine. The destabilizing effect of protein-protein interactions, as monitored on 50-layer-thick films, is most evident at a relative humidity of 75%. Stabilizing agents were incorporated to attenuate the deleterious effects of protein aggregation. Among them, the most effective in preserving a more native-like structure are the disaccharides sucrose and trehalose in dry films and the polymer dextran in wet films. Interestingly, the polymer was found to achieve maximum efficacy at sensibly lower additive/protein ratios than the sugars. PMID:11325742
WEBnm@ v2.0: Web server and services for comparing protein flexibility.
Tiwari, Sandhya P; Fuglebakk, Edvin; Hollup, Siv M; Skjærven, Lars; Cragnolini, Tristan; Grindhaug, Svenn H; Tekle, Kidane M; Reuter, Nathalie
2014-12-30
Normal mode analysis (NMA) using elastic network models is a reliable and cost-effective computational method to characterise protein flexibility and, by extension, their dynamics. Further insight into the dynamics-function relationship can be gained by comparing protein motions between protein homologs and functional classifications. This can be achieved by comparing normal modes obtained from sets of evolutionarily related proteins. We have developed an automated tool for comparative NMA of a set of pre-aligned protein structures. The user can submit a sequence alignment in the FASTA format and the corresponding coordinate files in the Protein Data Bank (PDB) format. The computed normalised squared atomic fluctuations and atomic deformation energies of the submitted structures can be easily compared on graphs provided by the web user interface. The web server provides pairwise comparison of the dynamics of all proteins included in the submitted set using two measures: the Root Mean Squared Inner Product and the Bhattacharyya Coefficient. The Comparative Analysis has been implemented on our web server for NMA, WEBnm@, which also provides recently upgraded functionality for NMA of single protein structures. This includes new visualisations of protein motion, visualisation of inter-residue correlations and the analysis of conformational change using the overlap analysis. In addition, programmatic access to WEBnm@ is now available through a SOAP-based web service. WEBnm@ is available at http://apps.cbu.uib.no/webnma. WEBnm@ v2.0 is an online tool offering unique capability for comparative NMA on multiple protein structures. Along with a convenient web interface, powerful computing resources, and several methods for mode analyses, WEBnm@ facilitates the assessment of protein flexibility within protein families and superfamilies. These analyses can give a good view of how the structures move and how the flexibility is conserved over the different structures.
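Of the two comparison measures named above, the Root Mean Squared Inner Product (RMSIP) is simple enough to sketch directly. For two sets of n orthonormal mode vectors it is the square root of the summed squared dot products divided by n. The code below is a plain-Python illustration with hypothetical function names and toy vectors; it is not the WEBnm@ implementation and does not cover the Bhattacharyya Coefficient.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def rmsip(modes_a, modes_b):
    """RMSIP between two equally sized sets of orthonormal mode vectors.

    Returns 1.0 for identical subspaces and 0.0 for orthogonal ones.
    """
    if len(modes_a) != len(modes_b):
        raise ValueError("both mode sets must contain the same number of modes")
    n = len(modes_a)
    total = sum(dot(u, v) ** 2 for u in modes_a for v in modes_b)
    return math.sqrt(total / n)


# Toy orthonormal modes in a 4-dimensional space (hypothetical).
set_1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
set_2 = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

same_subspace = rmsip(set_1, set_1)
orthogonal_subspaces = rmsip(set_1, set_2)
```

In a comparative analysis the mode vectors would first be restricted to the aligned residues and re-orthonormalized, a step this sketch leaves out.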
Characterizing protein domain associations by Small-molecule ligand binding
Li, Qingliang; Cheng, Tiejun; Wang, Yanli; Bryant, Stephen H.
2012-01-01
Background Protein domains are evolutionarily conserved building blocks for protein structure and function, which are conventionally identified based on protein sequence or structure similarity. Small molecule binding domains are of great importance for the recognition of small molecules in biological systems and drug development. Many small molecules, including drugs, have been increasingly identified to bind to multiple targets, leading to promiscuous interactions with protein domains. Thus, a large scale characterization of the protein domains and their associations with respect to small-molecule binding is of particular interest to system biology research, drug target identification, as well as drug repurposing. Methods We compiled a collection of 13,822 physical interactions of small molecules and protein domains derived from the Protein Data Bank (PDB) structures. Based on the chemical similarity of these small molecules, we characterized pairwise associations of the protein domains and further investigated their global associations from a network point of view. Results We found that protein domains, despite lack of similarity in sequence and structure, were comprehensively associated through binding the same or similar small-molecule ligands. Moreover, we identified modules in the domain network that consisted of closely related protein domains by sharing similar biochemical mechanisms, being involved in relevant biological pathways, or being regulated by the same cognate cofactors. Conclusions A novel protein domain relationship was identified in the context of small-molecule binding, which is complementary to those identified by traditional sequence-based or structure-based approaches. The protein domain network constructed in the present study provides a novel perspective for chemogenomic study and network pharmacology, as well as target identification for drug repurposing. PMID:23745168
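A minimal sketch of how two domains might be linked through the chemical similarity of their ligands is given below. It assumes ligands are represented as sets of fingerprint bits compared with the Tanimoto coefficient, and it scores a domain pair by the most similar ligand pair. The abstract does not specify the similarity measure or the scoring rule, so both choices, along with the function names and toy fingerprints, are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def domain_association(ligands_a, ligands_b):
    """Assumed score: the highest Tanimoto similarity over all ligand pairs."""
    return max(tanimoto(x, y) for x in ligands_a for y in ligands_b)


# Toy fingerprints (hypothetical): each ligand is a set of on-bit indices.
domain_1_ligands = [{1, 2, 3}, {7, 8}]
domain_2_ligands = [{2, 3, 4}, {9}]

score = domain_association(domain_1_ligands, domain_2_ligands)
```

Thresholding such scores would give the edges of a domain network, on which modules could then be sought with standard graph clustering.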
Huang, Sheng Yu; Chen, Sung Fang; Chen, Chun Hao; Huang, Hsuan Wei; Wu, Wen Guey; Sung, Wang Chou
2014-09-02
Snake venom consists of toxin proteins with multiple disulfide linkages to generate unique structures and biological functions. Determination of these cysteine connections usually requires the purification of each protein followed by structural analysis. In this study, dimethyl labeling coupled with LC-MS/MS and RADAR algorithm was developed to identify the disulfide bonds in crude snake venom. Without any protein separation, the disulfide linkages of several cytotoxins and PLA2 could be solved, including more than 20 disulfide bonds. The results show that this method is capable of analyzing protein mixture. In addition, the approach was also used to compare native cytotoxin 3 (CTX III) and its scrambled isomer, another category of protein mixture, for unknown disulfide bonds. Two disulfide-linked peptides were observed in the native CTX III, and 10 in its scrambled form, X-CTX III. This is the first study that reports a platform for the global cysteine connection analysis on a protein mixture. The proposed method is simple and automatic, offering an efficient tool for structural and functional studies of venom proteins.
My 65 years in protein chemistry.
Scheraga, Harold A
2015-05-01
This is a tour of a physical chemist through 65 years of protein chemistry from the time when emphasis was placed on the determination of the size and shape of the protein molecule as a colloidal particle, with an early breakthrough by James Sumner, followed by Linus Pauling and Fred Sanger, that a protein was a real molecule, albeit a macromolecule. It deals with the recognition of the nature and importance of hydrogen bonds and hydrophobic interactions in determining the structure, properties, and biological function of proteins until the present acquisition of an understanding of the structure, thermodynamics, and folding pathways from a linear array of amino acids to a biological entity. Along the way, with a combination of experiment and theoretical interpretation, a mechanism was elucidated for the thrombin-induced conversion of fibrinogen to a fibrin blood clot and for the oxidative-folding pathways of ribonuclease A. Before the atomic structure of a protein molecule was determined by x-ray diffraction or nuclear magnetic resonance spectroscopy, experimental studies of the fundamental interactions underlying protein structure led to several distance constraints which motivated the theoretical approach to determine protein structure, and culminated in the Empirical Conformational Energy Program for Peptides (ECEPP), an all-atom force field, with which the structures of fibrous collagen-like proteins and the 46-residue globular staphylococcal protein A were determined. To undertake the study of larger globular proteins, a physics-based coarse-grained UNited-RESidue (UNRES) force field was developed, and applied to the protein-folding problem in terms of structure, thermodynamics, dynamics, and folding pathways. Initially, single-chain and, ultimately, multiple-chain proteins were examined, and the methodology was extended to protein-protein interactions and to nucleic acids and to protein-nucleic acid interactions. 
The ultimate results led to an understanding of a variety of biological processes underlying natural and disease phenomena.
Gan, Rui; Perez, Jessica G; Carlson, Erik D; Ntai, Ioanna; Isaacs, Farren J; Kelleher, Neil L; Jewett, Michael C
2017-05-01
The ability to site-specifically incorporate non-canonical amino acids (ncAAs) into proteins has made possible the study of protein structure and function in fundamentally new ways, as well as the biosynthesis of unnatural polymers. However, the task of site-specifically incorporating multiple ncAAs into proteins with high purity and yield continues to present a challenge. At the heart of this challenge lies the lower efficiency of engineered orthogonal translation system components compared to their natural counterparts (e.g., translation elements that specifically use an ncAA and do not interact with the cell's natural translation apparatus). Here, we show that evolving and tuning expression levels of multiple components of an engineered translation system together as a whole enhances ncAA incorporation efficiency. Specifically, we increase protein yield when incorporating multiple p-azido-phenylalanine (pAzF) residues into proteins by (i) evolving the Methanocaldococcus jannaschii p-azido-phenylalanyl-tRNA synthetase anti-codon binding domain, (ii) evolving the elongation factor Tu amino acid-binding pocket, and (iii) tuning the expression of evolved translation machinery components in a single vector. Use of the evolved translation machinery in a genomically recoded organism lacking release factor one enabled enhanced multi-site ncAA incorporation into proteins. We anticipate that our approach to orthogonal translation system development will accelerate and expand our ability to site-specifically incorporate multiple ncAAs into proteins and biopolymers, advancing new horizons for synthetic and chemical biotechnology. Biotechnol. Bioeng. 2017;114: 1074-1086. © 2016 Wiley Periodicals, Inc.
Presynaptic Filament Dynamics in Homologous Recombination and DNA Repair
Liu, Jie; Ehmsen, Kirk T.; Heyer, Wolf-Dietrich; Morrical, Scott W.
2014-01-01
Homologous Recombination (HR) is an essential genome stability mechanism used for high-fidelity repair of DNA double-strand breaks and for the recovery of stalled or collapsed DNA replication forks. The crucial homology search and DNA strand exchange steps of HR are catalyzed by presynaptic filaments—helical filaments of a recombinase enzyme bound to single-stranded DNA. Presynaptic filaments are fundamentally dynamic structures, the assembly, catalytic turnover, and disassembly of which must be closely coordinated with other elements of the DNA recombination, repair, and replication machinery in order for genome maintenance functions to be effective. Here, we review the major dynamic elements controlling the assembly, activity, and disassembly of presynaptic filaments: some intrinsic such as recombinase ATP binding and hydrolytic activities, others extrinsic such as ssDNA-binding proteins, mediator proteins, and DNA motor proteins. We examine dynamic behavior on multiple levels, including atomic- and filament-level structural changes associated with ATP binding and hydrolysis as evidenced in crystal structures, as well as subunit binding and dissociation events driven by intrinsic and extrinsic factors. We examine the biochemical properties of recombination proteins from four model systems (T4 phage, E. coli, S. cerevisiae, and H. sapiens), demonstrating how their properties are tailored for the context-specific requirements in these diverse species. We propose that the presynaptic filament has evolved to rely on multiple external factors for increased multi-level regulation of HR processes in genomes with greater structural and sequence complexity. PMID:21599536
Significance of structural changes in proteins: expected errors in refined protein structures.
Stroud, R. M.; Fauman, E. B.
1995-01-01
A quantitative expression key to evaluating significant structural differences or induced shifts between any two protein structures is derived. Because crystallography leads to reports of a single (or sometimes dual) position for each atom, the significance of any structural change based on comparison of two structures depends critically on knowing the expected precision of each median atomic position reported, and on extracting it for each atom, from the information provided in the Protein Data Bank and in the publication. The differences between structures of protein molecules that should be identical, and that are normally distributed, indicating that they are not affected by crystal contacts, were analyzed with respect to many potential indicators of structure precision, so as to extract, essentially by "machine learning" principles, a generally applicable expression involving the highest correlates. Eighteen refined crystal structures from the Protein Data Bank, in which there are multiple molecules in the crystallographic asymmetric unit, were selected and compared. The thermal B factor, the connectivity of the atom, and the ratio of the number of reflections to the number of atoms used in refinement correlate best with the magnitude of the positional differences between regions of the structures that otherwise would be expected to be the same. These results are embodied in a six-parameter equation that can be applied to any crystallographically refined structure to estimate the expected uncertainty in position of each atom. Structure change in a macromolecule can thus be referenced to the expected uncertainty in atomic position as reflected in the variance between otherwise identical structures with the observed values of correlated parameters. PMID:8563637
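The comparison described in this abstract, referencing an observed atomic shift to the expected positional uncertainty, can be sketched as follows. This is a minimal illustration only: the six-parameter equation itself is not given in the abstract, so the per-atom uncertainties `sigma_a` and `sigma_b` are taken as inputs, and combining them in quadrature is an assumption made here (independent errors), not necessarily the authors' procedure. The function name is hypothetical.

```python
import math

def shift_significance(pos_a, pos_b, sigma_a, sigma_b):
    """Compare an observed atomic displacement with its expected uncertainty.

    pos_a, pos_b     -- (x, y, z) coordinates of the same atom in two structures (angstroms)
    sigma_a, sigma_b -- expected positional uncertainty of that atom in each structure
                        (assumed to come from some external estimate)
    Returns (displacement, ratio), where ratio = displacement / combined uncertainty.
    """
    displacement = math.dist(pos_a, pos_b)
    # Assumption: the two errors are independent, so they add in quadrature.
    combined_sigma = math.hypot(sigma_a, sigma_b)
    return displacement, displacement / combined_sigma

d, ratio = shift_significance((0.0, 0.0, 0.0), (0.3, 0.4, 0.0), 0.15, 0.20)
print(f"displacement = {d:.2f} A, ratio to expected uncertainty = {ratio:.1f}")
```

A ratio well above 1 would suggest a shift larger than the expected noise; where to place the threshold is a judgment the abstract does not specify.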
Using droplet-on-demand based printing to guide self-assembly in a peptide-protein based bioink
NASA Astrophysics Data System (ADS)
Hedegaard, Clara; Collin, Estelle; Redondo-Gomez, Carlos; Nguyen, Luong T. H.; Ng, Kee Woei; Castrejon-Pita, Alfonso A.; Castrejon-Pita, J. Rafael; Mata, Alvaro
2017-11-01
Tissue engineering aims to capture details of the extracellular matrix (ECM) that stimulate tissue regeneration. Advanced biofabrication techniques have enabled structural complexity; however, they are restricted in the choice of material by stringent printing requirements, leading to a lack of nanoscale control and molecular versatility. In this project, we exploit the dynamics of droplet fluid interactions combined with the co-assembly of peptide amphiphiles (PAs) with biomolecules/proteins to develop a new approach to droplet-based biofabrication. A custom-made droplet generator was developed and used to controllably dispense droplets of PA into a protein solution, resulting in gel formation within milliseconds. Taking advantage of the interfacial and inertial forces during the droplet/liquid interaction, it is possible to control the co-assembly kinetics to give rise to aligned or disordered nanofibers, hydrogel structures of different geometries and sizes, surface topographies, and higher-ordered structures made from multiple hydrogels. The process allows multiple cell types to be spatially distributed on the outside or embedded within the ECM mimetic scaffolds, whilst exhibiting high cell viability (>88%). ERC Starting Grant (STROFUNSCAFF), FP7-PEOPLE-2013-CIG Biomorph and the Royal Society.
Tsai, Chi-Lin; Tainer, John A
2018-01-01
[Fe-S] clusters are essential cofactors in all domains of life. They play many biological roles due to their unique abilities for electron transfer and conformational control. Yet, producing and analyzing Fe-S proteins can be difficult and even misleading if not done anaerobically. Due to unique redox properties of [Fe-S] clusters and their oxygen sensitivity, they pose multiple challenges and can lose enzymatic activity or cause their component proteins to be structurally disordered due to [Fe-S] cluster oxidation and loss in air. Here we highlight tested protocols and strategies enabling efficient and stable [Fe-S] protein production, purification, crystallization, X-ray diffraction data collection, and structure determination. From multiple high-resolution anaerobic crystal structures, we furthermore analyze exemplary data defining [Fe-S] clusters, substrate entry, and product exit for the functional oxidation states of type II molybdo-bis(molybdopterin guanine dinucleotide) (Mo-bisMGD) enzymes. Notably, these enzymes perform electron shuttling between quinone pools and specific substrates to catalyze respiratory metabolism. The identified structure-activity relationships for this enzyme class have broad implications germane to perchlorate environments on Earth and Mars extending to an alternative mechanism underlying metabolic origins for the evolution of the oxygen atmosphere. Integrated structural analyses of type II Mo-bisMGD enzymes unveil novel distinctive shared molecular mechanisms for dynamic control of substrate entry and product release gated by hydrophobic residues. Collective findings support a prototypic model for type II Mo-bisMGD enzymes including insights for a fundamental molecular mechanistic understanding of selectivity and regulation by a conformationally gated channel with general implications for [Fe-S] cluster respiratory enzymes. © 2018 Elsevier Inc. All rights reserved.
Functional and Genomic Analyses of Alpha-Solenoid Proteins
Fournier, David; Palidwor, Gareth A.; Shcherbinin, Sergey; Szengel, Angelika; Schaefer, Martin H.; Perez-Iratxeta, Carol; Andrade-Navarro, Miguel A.
2013-01-01
Alpha-solenoids are flexible protein structural domains formed by ensembles of alpha-helical repeats (Armadillo and HEAT repeats among others). While homology can be used to detect many of these repeats, some alpha-solenoids have very little sequence homology to proteins of known structure and we expect that many remain undetected. We previously developed a method for detection of alpha-helical repeats based on a neural network trained on a dataset of protein structures. Here we improved the detection algorithm and updated the training dataset using recently solved structures of alpha-solenoids. Unexpectedly, we identified occurrences of alpha-solenoids in solved protein structures that escaped attention, for example within the core of the catalytic subunit of PI3KC. Our results expand the current set of known alpha-solenoids. Application of our tool to the protein universe allowed us to detect their significant enrichment in proteins interacting with many proteins, confirming that alpha-solenoids are generally involved in protein-protein interactions. We then studied the taxonomic distribution of alpha-solenoids to discuss an evolutionary scenario for the emergence of this type of domain, speculating that alpha-solenoids have emerged in multiple taxa in independent events by convergent evolution. We observe a higher rate of alpha-solenoids in eukaryotic genomes and in some prokaryotic families, such as Cyanobacteria and Planctomycetes, which could be associated to increased cellular complexity. The method is available at http://cbdm.mdc-berlin.de/~ard2/. PMID:24278209
Designed Proteins Induce the Formation of Nanocage-containing Extracellular Vesicles
Votteler, Jörg; Ogohara, Cassandra; Yi, Sue; Hsia, Yang; Nattermann, Una; Belnap, David M.; King, Neil P.; Sundquist, Wesley I.
2017-01-01
Complex biological processes are often performed by self-organizing nanostructures comprising multiple classes of macromolecules, such as ribosomes (proteins and RNA) or enveloped viruses (proteins, nucleic acids, and lipids). Approaches have been developed for designing self-assembling structures consisting of either nucleic acids1,2 or proteins3–5, but strategies for engineering hybrid biological materials are only beginning to emerge6,7. Here, we describe the design of self-assembling protein nanocages that direct their own release from human cells inside small vesicles in a manner that resembles some viruses. We refer to these hybrid biomaterials as Enveloped Protein Nanocages (EPNs). Robust EPN biogenesis required protein sequence elements that encode three distinct functions: membrane binding, self-assembly, and recruitment of the Endosomal Sorting Complexes Required for Transport (ESCRT) machinery8. A variety of synthetic proteins with these functional elements induced EPN biogenesis, highlighting the modularity and generality of the design strategy. Biochemical and electron cryomicroscopic (cryo-EM) analyses revealed that one design, EPN-01, comprised small (~100 nm) vesicles containing multiple protein nanocages that closely matched the structure of the designed 60-subunit self-assembling scaffold9. EPNs that incorporated the vesicular stomatitis viral glycoprotein (VSV-G) could fuse with target cells and deliver their contents, thereby transferring cargoes from one cell to another. These studies show how proteins can be programmed to direct the formation of hybrid biological materials that perform complex tasks, and establish EPNs as a novel class of designed, modular, genetically-encoded nanomaterials that can transfer molecules between cells. PMID:27919066
Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L
2011-06-02
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.
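The residue-frequency step described above can be illustrated with a short sketch. It assumes the local structure alignments have already been reduced to (reference position, aligned residue) pairs; the input format and the function name `residue_frequencies` are hypothetical and are not taken from StralSV itself.

```python
from collections import Counter, defaultdict

def residue_frequencies(pairs):
    """Tabulate residue frequencies per reference position.

    pairs -- iterable of (reference_position, residue) tuples, one per
             residue-residue correspondence from a local structure alignment
             (assumed input format).
    Returns {position: {residue: fraction_of_aligned_fragments}}.
    """
    counts = defaultdict(Counter)
    for ref_pos, residue in pairs:
        counts[ref_pos][residue] += 1

    freqs = {}
    for pos, counter in counts.items():
        total = sum(counter.values())
        freqs[pos] = {res: n / total for res, n in counter.items()}
    return freqs

# Toy example: three fragments align at position 10, one at position 11.
example = [(10, "G"), (10, "G"), (10, "A"), (11, "D")]
for pos, table in sorted(residue_frequencies(example).items()):
    print(pos, table)
```

Positions dominated by a single residue would be candidates for conserved sites, while a reference residue with a low frequency would stand out as a deviation from the consensus.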
Kabashi, Edor; Agar, Jeffrey N; Hong, Yu; Taylor, David M; Minotti, Sandra; Figlewicz, Denise A; Durham, Heather D
2008-06-01
In amyotrophic lateral sclerosis caused by mutations in Cu/Zn-superoxide dismutase (SOD1), altered solubility and aggregation of the mutant protein implicates failure of pathways for detecting and catabolizing misfolded proteins. Our previous studies demonstrated early reduction of proteasome-mediated proteolytic activity in lumbar spinal cord of SOD1(G93A) transgenic mice, tissue particularly vulnerable to disease. The purpose of this study was to identify any underlying abnormalities in proteasomal structure. In lumbar spinal cord of pre-symptomatic mice [postnatal day 45 (P45) and P75], normal levels of structural 20S alpha subunits were incorporated into 20S/26S proteasomes; however, proteasomal complexes separated by native gel electrophoresis showed decreased immunoreactivity with antibodies to beta3, a structural subunit of the 20S proteasome core, and beta5, the subunit with chymotrypsin-like activity. This occurred prior to increase in beta5i immunoproteasomal subunit. mRNA levels were maintained and no association of mutant SOD1 with proteasomes was identified, implicating post-transcriptional mechanisms. mRNAs also were maintained in laser captured motor neurons at a later stage of disease (P100) in which multiple 20S proteins are reduced relative to the surrounding neuropil. Increase in detergent-insoluble, ubiquitinated proteins at P75 provided further evidence of stress on mechanisms of protein quality control in multiple cell types prior to significant motor neuron death.
The same pocket in menin binds both MLL and JUND but has opposite effects on transcription
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huang, Jing; Gurung, Buddha; Wan, Bingbing
2013-04-08
Menin is a tumour suppressor protein whose loss or inactivation causes multiple endocrine neoplasia 1 (MEN1), a hereditary autosomal dominant tumour syndrome that is characterized by tumorigenesis in multiple endocrine organs. Menin interacts with many proteins and is involved in a variety of cellular processes. Menin binds the JUN family transcription factor JUND and inhibits its transcriptional activity. Several MEN1 missense mutations disrupt the menin-JUND interaction, suggesting a correlation between the tumour-suppressor function of menin and its suppression of JUND-activated transcription. Menin also interacts with mixed lineage leukaemia protein 1 (MLL1), a histone H3 lysine 4 methyltransferase, and functions as an oncogenic cofactor to upregulate gene transcription and promote MLL1-fusion-protein-induced leukaemogenesis. A recent report on the tethering of MLL1 to chromatin binding factor lens epithelium-derived growth factor (LEDGF) by menin indicates that menin is a molecular adaptor coordinating the functions of multiple proteins. Despite its importance, how menin interacts with many distinct partners and regulates their functions remains poorly understood. Here we present the crystal structures of human menin in its free form and in complexes with MLL1 or with JUND, or with an MLL1-LEDGF heterodimer. These structures show that menin contains a deep pocket that binds short peptides of MLL1 or JUND in the same manner, but that it can have opposite effects on transcription. The menin-JUND interaction blocks JUN N-terminal kinase (JNK)-mediated JUND phosphorylation and suppresses JUND-induced transcription. In contrast, menin promotes gene transcription by binding the transcription activator MLL1 through the peptide pocket while still interacting with the chromatin-anchoring protein LEDGF at a distinct surface formed by both menin and MLL1.
Jacquin, Hugo; Gilson, Amy; Shakhnovich, Eugene; Cocco, Simona; Monasson, Rémi
2016-05-01
Inverse statistical approaches to determine protein structure and function from Multiple Sequence Alignments (MSA) are emerging as powerful tools in computational biology. However, the underlying assumptions about the relationship between the inferred effective Potts Hamiltonian and real protein structure and energetics remain untested so far. Here we use a lattice protein (LP) model to benchmark those inverse statistical approaches. We build MSA of highly stable sequences in target LP structures, and infer the effective pairwise Potts Hamiltonians from those MSA. We find that inferred Potts Hamiltonians reproduce many important aspects of 'true' LP structures and energetics. Careful analysis reveals that effective pairwise couplings in inferred Potts Hamiltonians depend not only on the energetics of the native structure but also on competing folds; in particular, the coupling values reflect both positive design (stabilization of native conformation) and negative design (destabilization of competing folds). In addition to providing detailed structural information, the inferred Potts models, when used as protein Hamiltonians for the design of new sequences, are able to generate with high probability completely new sequences with the desired folds, which is not possible using independent-site models. Those are remarkable results, as the effective LP Hamiltonians used to generate MSA are not simple pairwise models due to the competition between the folds. Our findings elucidate the reasons for the success of inverse approaches to the modelling of proteins from sequence data, and their limitations.
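For readers unfamiliar with pairwise Potts models, the sketch below evaluates the energy of one sequence under such a model. It assumes the common convention H(s) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j); the sign convention, the dictionary-based parameter layout, and the function name are illustrative assumptions and do not reproduce the inference procedure used in the paper.

```python
def potts_energy(seq, fields, couplings):
    """Energy of a sequence under a pairwise Potts model.

    seq       -- string of residue letters
    fields    -- list of dicts, fields[i][a] = h_i(a)            (assumed layout)
    couplings -- dict {(i, j): {(a, b): J_ij(a, b)}} with i < j  (assumed layout)
    Missing entries are treated as zero.
    """
    energy = 0.0
    # Single-site terms: -sum_i h_i(s_i)
    for i, residue in enumerate(seq):
        energy -= fields[i].get(residue, 0.0)
    # Pairwise terms: -sum_{i<j} J_ij(s_i, s_j)
    for (i, j), table in couplings.items():
        energy -= table.get((seq[i], seq[j]), 0.0)
    return energy

fields = [{"A": 1.0}, {"C": 0.5}]
couplings = {(0, 1): {("A", "C"): 2.0}}
print(potts_energy("AC", fields, couplings))  # favourable field and coupling terms
print(potts_energy("AA", fields, couplings))  # only the first field contributes
```

An independent-site model corresponds to leaving `couplings` empty, which makes clear why it cannot capture correlations between positions.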
Hochrein, James M.; Lerner, Edwina C.; Schiavone, Anthony P.; Smithgall, Thomas E.; Engen, John R.
2006-01-01
The ability of proteins to regulate their own enzymatic activity can be facilitated by changes in structure or protein dynamics in response to external regulators. Because many proteins contain SH2 and SH3 domains, transmission of information between the domains is a potential method of allosteric regulation. To determine if ligand binding to one modular domain may alter structural dynamics in an adjacent domain, allowing potential transmission of information through the protein, we used hydrogen exchange and mass spectrometry to measure changes in protein dynamics in the SH3 and SH2 domains of hematopoietic cell kinase (Hck). Ligand binding to either domain had little or no effect on hydrogen exchange in the adjacent domain, suggesting that changes in protein structure or dynamics are not a means of SH2/SH3 crosstalk. Furthermore, ligands of varying affinity covalently attached to SH3/SH2 altered dynamics only in the domain to which they bind. Such results demonstrate that ligand binding may not structurally alter adjacent SH3/SH2 domains and implies that other aspects of protein architecture contribute to the multiple levels of regulation in proteins containing SH3 and SH2 domains. PMID:16322569
May, Eric R; Armen, Roger S; Mannan, Aristotle M; Brooks, Charles L
2010-08-01
The arenavirus genome encodes for a Z-protein, which contains a RING domain that coordinates two zinc ions, and has been identified as having several functional roles at various stages of the virus life cycle. Z-protein binds to multiple host proteins and has been directly implicated in the promotion of viral budding, repression of mRNA translation, and apoptosis of infected cells. Using homology models of the Z-protein from Lassa strain arenavirus, replica exchange molecular dynamics (MD) was used to refine the structures, which were then subsequently clustered. Population-weighted ensembles of low-energy cluster representatives were predicted based upon optimal agreement of the chemical shifts computed with the SPARTA program with the experimental NMR chemical shifts. A member of the refined ensemble was identified to be a potential binder of budding factor Tsg101 based on its correspondence to the structure of the HIV-1 Gag late domain when bound to Tsg101. Members of these ensembles were docked against the crystal structure of human eIF4E translation initiation factor. Two plausible binding modes emerged based upon their agreement with experimental observation, favorable interaction energies and stability during MD trajectories. Mutations to Z are proposed that would either inhibit both binding mechanisms or selectively inhibit only one mode. The C-terminal domain conformation of the most populated member of the representative ensemble shielded protein-binding recognition motifs for Tsg101 and eIF4E and represents the most populated state free in solution. We propose that C-terminal flexibility is key for mediating the different functional states of the Z-protein. (c) 2010 Wiley-Liss, Inc.
May, Eric R.; Armen, Roger S.; Mannan, Aristotle M.; Brooks, Charles L.
2010-01-01
The arenavirus genome encodes for a Z-protein, which contains a RING domain that coordinates two zinc ions, and has been identified as having several functional roles at various stages of the virus life cycle. Z-protein binds to multiple host proteins and has been directly implicated in the promotion of viral budding, repression of mRNA translation and apoptosis of infected cells. Using homology models of the Z-protein from Lassa strain arenavirus, replica exchange molecular dynamics were employed to refine the structures, which were then subsequently clustered. Population weighted ensembles of low energy cluster representatives were predicted based upon optimal agreement of the chemical shifts computed with the SPARTA program with the experimental NMR chemical shifts. A member of the refined ensemble was identified to be a potential binder of budding factor Tsg101 based on its correspondence to the structure of the HIV-1 Gag late domain when bound to Tsg101. Members of these ensembles were docked against the crystal structure of human eIF4E translation initiation factor. Two plausible binding modes emerged based upon their agreement with experimental observation, favorable interaction energies and stability during molecular dynamics trajectories. Mutations to Z are proposed that would either inhibit both binding mechanisms or selectively inhibit only one mode. The C-terminal domain conformation of the most populated member of the representative ensemble shielded protein binding recognition motifs for Tsg101 and eIF4E, and represents the most populated state free in solution. We propose that C-terminal flexibility is key for mediating the different functional states of the Z-protein. PMID:20544962
Materiomics: biological protein materials, from nano to macro.
Cranford, Steven; Buehler, Markus J
2010-11-12
Materiomics is an emerging field of science that provides a basis for multiscale material system characterization, inspired in part by natural, for example, protein-based materials. Here we outline the scope and explain the motivation of the field of materiomics, as well as demonstrate the benefits of a materiomic approach in the understanding of biological and natural materials as well as in the design of de novo materials. We discuss recent studies that exemplify the impact of materiomics - discovering Nature's complexity through a materials science approach that merges concepts of material and structure throughout all scales and incorporates feedback loops that facilitate sensing and resulting structural changes at multiple scales. The development and application of materiomics is illustrated for the specific case of protein-based materials, which constitute the building blocks of a variety of biological systems such as tendon, bone, skin, spider silk, cells, and tissue, as well as natural composite material systems (a combination of protein-based and inorganic constituents) such as nacre and mollusk shells, and other natural multiscale systems such as cellulose-based plant and wood materials. An important trait of these materials is that they display distinctive hierarchical structures across multiple scales, where molecular details are exhibited in macroscale mechanical responses. Protein materials are intriguing examples of materials that balance multiple tasks, representing some of the most sustainable material solutions that integrate structure and function despite severe limitations in the quality and quantity of material building blocks. However, up until now, our attempts to analyze and replicate Nature's materials have been hindered by our lack of fundamental understanding of these materials' intricate hierarchical structures, scale-bridging mechanisms, and complex material components that bestow protein-based materials their unique properties. 
Recent advances in analytical tools and experimental methods allow a holistic view of such a hierarchical biological material system. The integration of these approaches and amalgamation of material properties at all scale levels to develop a complete description of a material system falls within the emerging field of materiomics. Materiomics is the result of the convergence of engineering and materials science with experimental and computational biology in the context of natural and synthetic materials. Through materiomics, fundamental advances in our understanding of structure-property-process relations of biological systems contribute to the mechanistic understanding of certain diseases and facilitate the development of novel biological, biologically inspired, and completely synthetic materials for applications in medicine (biomaterials), nanotechnology, and engineering.
A decade and a half of protein intrinsic disorder: Biology still waits for physics
Uversky, Vladimir N
2013-01-01
The abundant existence of proteins and regions that possess specific functions without being uniquely folded into unique 3D structures has become accepted by a significant number of protein scientists. Sequences of these intrinsically disordered proteins (IDPs) and IDP regions (IDPRs) are characterized by a number of specific features, such as low overall hydrophobicity and high net charge which makes these proteins predictable. IDPs/IDPRs possess large hydrodynamic volumes, low contents of ordered secondary structure, and are characterized by high structural heterogeneity. They are very flexible, but some may undergo disorder to order transitions in the presence of natural ligands. The degree of these structural rearrangements varies over a very wide range. IDPs/IDPRs are tightly controlled under the normal conditions and have numerous specific functions that complement functions of ordered proteins and domains. When lacking proper control, they have multiple roles in pathogenesis of various human diseases. Gaining structural and functional information about these proteins is a challenge, since they do not typically “freeze” while their “pictures are taken.” However, despite or perhaps because of the experimental challenges, these fuzzy objects with fuzzy structures and fuzzy functions are among the most interesting targets for modern protein research. This review briefly summarizes some of the recent advances in this exciting field and considers some of the basic lessons learned from the analysis of physics, chemistry, and biology of IDPs. PMID:23553817
Xu, Dong; Zhang, Yang
2012-01-01
Ab initio protein folding is one of the major unsolved problems in computational biology due to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1–20 residues, where multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced, and their particular contributions to enhancing the efficiency of both the force field and the search engine are analyzed in detail. The QUARK prediction procedure is described and tested on the structure modeling of 145 non-homologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score (TM-score) >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in one-third of the cases for short proteins of up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction (CASP9) experiment, the QUARK server outperformed the second and third best servers by 18% and 47% based on the cumulative Z-score of global distance test-total (GDT-TS) scores in the free modeling (FM) category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress towards the solution of the most important problem in the field. PMID:22411565
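The replica-exchange Monte Carlo scheme mentioned in this abstract relies on occasionally swapping conformations between replicas run at different temperatures. The sketch below shows the standard Metropolis swap criterion, min(1, exp[(beta_i - beta_j)(E_i - E_j)]); it is a generic textbook form, not QUARK's actual implementation, and the function names, the unit Boltzmann constant, and the example energies are assumptions for illustration.

```python
import math
import random

def swap_probability(energy_i, energy_j, temp_i, temp_j, k_b=1.0):
    """Metropolis acceptance probability for exchanging two replicas.

    energy_i, energy_j -- current energies of replicas i and j
    temp_i, temp_j     -- their temperatures (k_b = 1 means reduced units, an assumption)
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # Swaps that move the lower-energy state to the colder replica are always accepted.
    return 1.0 if delta >= 0 else math.exp(delta)

def attempt_swap(energy_i, energy_j, temp_i, temp_j, rng=random):
    """Return True if the swap is accepted for this random draw."""
    return rng.random() < swap_probability(energy_i, energy_j, temp_i, temp_j)

# Cold replica (T=1) holds the higher energy: the swap is always accepted.
print(swap_probability(-10.0, -12.0, 1.0, 2.0))
# Cold replica already holds the lower energy: the swap is accepted only sometimes.
print(swap_probability(-12.0, -10.0, 1.0, 2.0))
```

The swaps let conformations trapped at low temperature escape via the high-temperature replicas, which is the usual motivation for using replica exchange in conformational search.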
Sinha, Abhinav; Jones Brunette, Amber M; Fay, Jonathan F; Schafer, Christopher T; Farrens, David L
2014-05-27
Various studies have implicated the concave surface of arrestin in the binding of the cytosolic surface of rhodopsin. However, specific sites of contact between the two proteins have not previously been defined in detail. Here, we report that arrestin shares part of the same binding site on rhodopsin as does the transducin Gα subunit C-terminal tail, suggesting binding of both proteins to rhodopsin may share some similar underlying mechanisms. We also identify two areas of contact between the proteins near this region. Both sites lie in the arrestin N-domain, one in the so-called "finger" loop (residues 67-79) and the other in the 160 loop (residues 155-165). We mapped these sites using a novel tryptophan-induced quenching method, in which we introduced Trp residues into arrestin and measured their ability to quench the fluorescence of bimane probes attached to cysteine residues on TM6 of rhodopsin (T242C and T243C). The involvement of finger loop binding to rhodopsin was expected, but the evidence of the arrestin 160 loop contacting rhodopsin was not. Remarkably, our data indicate one site on rhodopsin can interact with multiple structurally separate sites on arrestin that are almost 30 Å apart. Although this observation at first seems paradoxical, in fact, it provides strong support for recent hypotheses that structural plasticity and conformational changes are involved in the arrestin-rhodopsin binding interface and that the two proteins may be able to interact through multiple docking modes, with arrestin binding to both monomeric and dimeric rhodopsin. PMID:24724832
Extant fold-switching proteins are widespread.
Porter, Lauren L; Looger, Loren L
2018-06-05
A central tenet of biology is that globular proteins have a unique 3D structure under physiological conditions. Recent work has challenged this notion by demonstrating that some proteins switch folds, a process that involves remodeling of secondary structure in response to a few mutations (evolved fold switchers) or cellular stimuli (extant fold switchers). To date, extant fold switchers have been viewed as rare byproducts of evolution, but their frequency has been neither quantified nor estimated. By systematically and exhaustively searching the Protein Data Bank (PDB), we found ∼100 extant fold-switching proteins. Furthermore, we gathered multiple lines of evidence suggesting that these proteins are widespread in nature. Based on these lines of evidence, we hypothesized that the frequency of extant fold-switching proteins may be underrepresented by the structures in the PDB. Thus, we sought to identify other putative extant fold switchers with only one solved conformation. To do this, we identified two characteristic features of our ∼100 extant fold-switching proteins, incorrect secondary structure predictions and likely independent folding cooperativity, and searched the PDB for other proteins with similar features. Reassuringly, this method identified dozens of other proteins in the literature with indication of a structural change but only one solved conformation in the PDB. Thus, we used it to estimate that 0.5-4% of PDB proteins switch folds. These results demonstrate that extant fold-switching proteins are likely more common than the PDB reflects, which has implications for cell biology, genomics, and human health. Copyright © 2018 the Author(s). Published by PNAS.
Fischer, Axel W.; Bordignon, Enrica; Bleicken, Stephanie; García-Sáez, Ana J.; Jeschke, Gunnar; Meiler, Jens
2016-01-01
Structure determination remains a challenge for many biologically important proteins. In particular, proteins that adopt multiple conformations often evade crystallization in all biologically relevant states. Although computational de novo protein folding approaches often sample biologically relevant conformations, the selection of the most accurate model for different functional states remains a formidable challenge, in particular, for proteins with more than about 150 residues. Electron paramagnetic resonance (EPR) spectroscopy can obtain limited structural information for proteins in well-defined biological states and thereby assist in selecting biologically relevant conformations. The present study demonstrates that de novo folding methods are able to accurately sample the folds of 192-residue long soluble monomeric Bcl-2-associated X protein (BAX). The tertiary structures of the monomeric and homodimeric forms of BAX were predicted using the primary structure as well as 25 and 11 EPR distance restraints, respectively. The predicted models were subsequently compared to respective NMR/X-ray structures of BAX. EPR restraints improve the protein-size normalized root-mean-square-deviation (RMSD100) of the most accurate models with respect to the NMR/crystal structure from 5.9 Å to 3.9 Å and from 5.7 Å to 3.3 Å, respectively. Additionally, the model discrimination is improved, which is demonstrated by an improvement of the enrichment from 5% to 15% and from 13% to 21%, respectively. PMID:27129417
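The abstract above reports RMSD100 values without defining the normalization. A minimal sketch follows, assuming the commonly used definition RMSD100 = RMSD / (1 + ln sqrt(N/100)), where N is the number of aligned residues; whether the study used exactly this form is an assumption, and the function name is illustrative.

```python
import math

def rmsd100(rmsd, n_residues):
    """Rescale an RMSD (in angstroms) to the value expected for a 100-residue protein.

    Assumed formula: RMSD100 = RMSD / (1 + ln(sqrt(N / 100))).
    The denominator is only positive for N > 100 / e**2 (about 14 residues),
    so shorter chains are rejected.
    """
    denominator = 1.0 + math.log(math.sqrt(n_residues / 100.0))
    if denominator <= 0:
        raise ValueError("normalization is undefined for very short chains")
    return rmsd / denominator

# For a 192-residue protein such as BAX, the normalized value is smaller
# than the raw RMSD because the chain is longer than the 100-residue reference.
normalized = rmsd100(5.0, 192)
```

The rescaling lets models of proteins with different lengths be compared on one scale: a given raw RMSD counts as a better result for a longer chain.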
Parente, Daniel J; Ray, J Christian J; Swint-Kruse, Liskin
2015-12-01
As proteins evolve, amino acid positions key to protein structure or function are subject to mutational constraints. These positions can be detected by analyzing sequence families for amino acid conservation or for coevolution between pairs of positions. Coevolutionary scores are usually rank-ordered and thresholded to reveal the top pairwise scores, but they also can be treated as weighted networks. Here, we used network analyses to bypass a major complication of coevolution studies: For a given sequence alignment, alternative algorithms usually identify different, top pairwise scores. We reconciled results from five commonly used, mathematically divergent algorithms (ELSC, McBASC, OMES, SCA, and ZNMI), using the LacI/GalR and 1,6-bisphosphate aldolase protein families as models. Calculations used unthresholded coevolution scores from which column-specific properties such as sequence entropy and random noise were subtracted; "central" positions were identified by calculating various network centrality scores. When compared among algorithms, network centrality methods, particularly eigenvector centrality, showed markedly better agreement than comparisons of the top pairwise scores. Positions with large centrality scores occurred at key structural locations and/or were functionally sensitive to mutations. Further, the top central positions often differed from those with top pairwise coevolution scores: instead of a few strong scores, central positions often had multiple, moderate scores. We conclude that eigenvector centrality calculations reveal a robust evolutionary pattern of constraints, detectable by divergent algorithms, that occur at key protein locations. Finally, we discuss the fact that multiple patterns coexist in evolutionary data that, together, give rise to emergent protein functions. © 2015 Wiley Periodicals, Inc.
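To make the contrast between top pairwise scores and central positions concrete, here is a small sketch of eigenvector centrality on a weighted network, computed by power iteration. The five-position score matrix is invented toy data, and the sketch omits the entropy and noise corrections described in the abstract; it illustrates the general technique, not the authors' pipeline.

```python
import math

def eigenvector_centrality(weights, max_iter=1000, tol=1e-10):
    """Leading eigenvector of a symmetric, non-negative score matrix.

    Power iteration is applied to (W + I) rather than W; the shift keeps the
    same eigenvectors but prevents oscillation on bipartite-like networks.
    """
    n = len(weights)
    scores = [1.0 / math.sqrt(n)] * n
    for _ in range(max_iter):
        updated = [
            scores[i] + sum(weights[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        norm = math.sqrt(sum(v * v for v in updated))
        updated = [v / norm for v in updated]
        converged = max(abs(a - b) for a, b in zip(updated, scores)) < tol
        scores = updated
        if converged:
            break
    return scores

# Hypothetical coevolution scores: positions 0-2 share moderate scores (0.6),
# while positions 3 and 4 share the single strongest pairwise score (0.9).
toy_scores = [
    [0.0, 0.6, 0.6, 0.0, 0.0],
    [0.6, 0.0, 0.6, 0.0, 0.0],
    [0.6, 0.6, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.9, 0.0],
]
centrality = eigenvector_centrality(toy_scores)
```

In this toy network the pair (3, 4) would top a rank-ordered list of pairwise scores, yet positions 0 to 2, each with several moderate scores, receive the largest centrality values.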
SARNAclust: Semi-automatic detection of RNA protein binding motifs from immunoprecipitation data
Dotu, Ivan; Adamson, Scott I.; Coleman, Benjamin; Fournier, Cyril; Ricart-Altimiras, Emma; Eyras, Eduardo
2018-01-01
RNA-protein binding is critical to gene regulation, controlling fundamental processes including splicing, translation, localization and stability, and aberrant RNA-protein interactions are known to play a role in a wide variety of diseases. However, molecular understanding of RNA-protein interactions remains limited; in particular, identification of RNA motifs that bind proteins has long been challenging, especially when such motifs depend on both sequence and structure. Moreover, although RNA binding proteins (RBPs) often contain more than one binding domain, algorithms capable of identifying more than one binding motif simultaneously have not been developed. In this paper we present a novel pipeline to determine binding peaks in crosslinking immunoprecipitation (CLIP) data, to discover multiple possible RNA sequence/structure motifs among them, and to experimentally validate such motifs. At the core is a new semi-automatic algorithm SARNAclust, the first unsupervised method to identify and deconvolve multiple sequence/structure motifs simultaneously. SARNAclust computes similarity between sequence/structure objects using a graph kernel, providing the ability to isolate the impact of specific features through the bulge graph formalism. Application of SARNAclust to synthetic data shows its capability of clustering 5 motifs at once with a V-measure value of over 0.95, while GraphClust achieves only a V-measure of 0.083 and RNAcontext cannot detect any of the motifs. When applied to existing eCLIP sets, SARNAclust finds known motifs for SLBP and HNRNPC and novel motifs for several other RBPs such as AGGF1, AKAP8L and ILF3. We demonstrate an experimental validation protocol, a targeted Bind-n-Seq-like high-throughput sequencing approach that relies on RNA inverse folding for oligo pool design, that can validate the components within the SLBP motif. Finally, we use this protocol to experimentally interrogate the SARNAclust motif predictions for protein ILF3. Our results support a newly identified partially double-stranded UUUUUGAGA motif similar to that known for the splicing factor HNRNPC. PMID:29596423
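The clustering comparison above is reported as a V-measure, which the abstract does not define. As a rough sketch, assuming the usual definition (the harmonic mean of homogeneity and completeness, both derived from conditional entropies), it can be computed as follows; the labels in the usage lines are invented, and the helper names are illustrative.

```python
import math
from collections import Counter

def _entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def _conditional_entropy(target, given):
    """H(target | given) for two equally long label lists."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (g, _t), c in joint.items()
    )

def v_measure(true_labels, cluster_labels):
    """Harmonic mean of homogeneity and completeness (1.0 is a perfect match)."""
    h_true = _entropy(true_labels)
    h_cluster = _entropy(cluster_labels)
    homogeneity = (
        1.0 if h_true == 0
        else 1.0 - _conditional_entropy(true_labels, cluster_labels) / h_true
    )
    completeness = (
        1.0 if h_cluster == 0
        else 1.0 - _conditional_entropy(cluster_labels, true_labels) / h_cluster
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2.0 * homogeneity * completeness / (homogeneity + completeness)

# Two motifs recovered exactly (cluster names are arbitrary) versus
# everything lumped into a single cluster.
perfect = v_measure(["m1", "m1", "m2", "m2"], [1, 1, 0, 0])
lumped = v_measure(["m1", "m1", "m2", "m2"], [0, 0, 0, 0])
```

Because the score depends only on how the two partitions overlap, it is unaffected by how clusters are named, which suits unsupervised motif discovery.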
Muth, Thilo; García-Martín, Juan A; Rausell, Antonio; Juan, David; Valencia, Alfonso; Pazos, Florencio
2012-02-15
We have implemented in a single package all the features required for extracting, visualizing and manipulating fully conserved positions as well as those with a family-dependent conservation pattern in multiple sequence alignments. The program allows, among other things, to run different methods for extracting these positions, combine the results and visualize them in protein 3D structures and sequence spaces. JDet is a multiplatform application written in Java. It is freely available, including the source code, at http://csbg.cnb.csic.es/JDet. The package includes two of our recently developed programs for detecting functional positions in protein alignments (Xdet and S3Det), and support for other methods can be added as plug-ins. A help file and a guided tutorial for JDet are also available.
Exploring protein kinase conformation using swarm-enhanced sampling molecular dynamics.
Atzori, Alessio; Bruce, Neil J; Burusco, Kepa K; Wroblowski, Berthold; Bonnet, Pascal; Bryce, Richard A
2014-10-27
Protein plasticity, while often linked to biological function, also provides opportunities for rational design of selective and potent inhibitors of their function. The application of computational methods to the prediction of concealed protein concavities is challenging, as the motions involved can be significant and occur over long time scales. Here we introduce the swarm-enhanced sampling molecular dynamics (sesMD) method as a tool to improve sampling of conformational landscapes. In this approach, a swarm of replica simulations interact cooperatively via a set of pairwise potentials incorporating attractive and repulsive components. We apply the sesMD approach to explore the conformations of the DFG motif in the protein p38α mitogen-activated protein kinase. In contrast to multiple MD simulations, sesMD trajectories sample a range of DFG conformations, some of which map onto existing crystal structures. Simulated structures intermediate between the DFG-in and DFG-out conformations are predicted to have druggable pockets of interest for structure-based ligand design.
The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures
Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir
2009-01-01
ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/ PMID:18971256
Looking at the Disordered Proteins through the Computational Microscope
2018-01-01
Intrinsically disordered proteins (IDPs) have attracted wide interest over the past decade due to their surprising prevalence in the proteome and versatile roles in cell physiology and pathology. A large selection of IDPs has been identified as potential targets for therapeutic intervention. Characterizing the structure–function relationship of disordered proteins is therefore an essential but daunting task, as these proteins can adopt transient structure, necessitating a new paradigm for connecting structural disorder to function. Molecular simulation has emerged as a natural complement to experiments for atomic-level characterizations and mechanistic investigations of this intriguing class of proteins. The diverse range of length and time scales involved in IDP function requires performing simulations at multiple levels of resolution. In this Outlook, we focus on summarizing available simulation methods, along with a few interesting example applications. We also provide an outlook on how these simulation methods can be further improved in order to provide a more accurate description of IDP structure, binding, and assembly.
Biophysics of α-synuclein membrane interactions.
Pfefferkorn, Candace M; Jiang, Zhiping; Lee, Jennifer C
2012-02-01
Membrane proteins participate in nearly all cellular processes; however, because of experimental limitations, their characterization lags far behind that of soluble proteins. Peripheral membrane proteins are particularly challenging to study because of their inherent propensity to adopt multiple and/or transient conformations in solution and upon membrane association. In this review, we summarize useful biophysical techniques for the study of peripheral membrane proteins and their application in the characterization of the membrane interactions of the natively unfolded and Parkinson's disease (PD) related protein, α-synuclein (α-syn). We give particular focus to studies that have led to the current understanding of membrane-bound α-syn structure and the elucidation of specific membrane properties that affect α-syn-membrane binding. Finally, we discuss biophysical evidence supporting a key role for membranes and α-syn in PD pathogenesis. This article is part of a Special Issue entitled: Membrane protein structure and function. Copyright © 2011. Published by Elsevier B.V.
Mori, Mirko; Kateb, Fatiha; Bodenhausen, Geoffrey; Piccioli, Mario; Abergel, Daniel
2010-03-17
Multiple quantum relaxation in proteins reveals unexpected relationships between correlated or anti-correlated conformational backbone dynamics in alpha-helices or beta-sheets. The contributions of conformational exchange to the relaxation rates of C'N coherences (i.e., double- and zero-quantum coherences involving backbone carbonyl ¹³C' and neighboring amide ¹⁵N nuclei) depend on the kinetics of slow exchange processes, as well as on the populations of the conformations and chemical shift differences of ¹³C' and ¹⁵N nuclei. The relaxation rates of C'N coherences, which reflect concerted fluctuations due to slow chemical shift modulations (CSMs), were determined by direct ¹³C detection in diamagnetic and paramagnetic proteins. In well-folded proteins such as lanthanide-substituted calbindin (CaLnCb), copper,zinc superoxide dismutase (Cu,Zn SOD), and matrix metalloproteinase (MMP12), slow conformational exchange occurs along the entire backbone. Our observations demonstrate that relaxation rates of C'N coherences arising from slow backbone dynamics have positive signs (characteristic of correlated fluctuations) in beta-sheets and negative signs (characteristic of anti-correlated fluctuations) in alpha-helices. This extends the prospects of structure-dynamics relationships to slow time scales that are relevant for protein function and enzymatic activity.
The complex folding pathways of protein A suggest a multiple-funnelled energy landscape
NASA Astrophysics Data System (ADS)
St-Pierre, Jean-Francois; Mousseau, Normand; Derreumaux, Philippe
2008-01-01
Folding proteins into their native states requires the formation of both secondary and tertiary structures. Many questions remain, however, as to whether these form into a precise order, and various pictures have been proposed that place the emphasis on the first or the second level of structure in describing folding. One of the favorite test models for studying this question is the B domain of protein A, which has been characterized by numerous experiments and simulations. Using the activation-relaxation technique coupled with a generic energy model (optimized potential for efficient peptide structure prediction), we generate more than 50 folding trajectories for this 60-residue protein. While the folding pathways to the native state are fully consistent with the funnel-like description of the free energy landscape, we find a wide range of mechanisms in which secondary and tertiary structures form in various orders. Our nonbiased simulations also reveal the presence of a significant number of non-native β and α conformations both on and off pathway, including the visit, for a non-negligible fraction of trajectories, of fully ordered structures resembling the native state of nonhomologous proteins.
Insights into the Shc Family of Adaptor Proteins
Prigent, Sally A.
2017-01-01
The Shc family of adaptor proteins is a group of proteins that lacks intrinsic enzymatic activity. Instead, Shc proteins possess various domains that allow them to recruit different signalling molecules. Shc proteins help to transduce an extracellular signal into an intracellular signal, which is then translated into a biological response. The Shc family of adaptor proteins shares the same structural topography, CH2-PTB-CH1-SH2, which is more than an isoform of Shc family proteins; this structure, which includes multiple domains, allows for the posttranslational modification of Shc proteins and increases the functional diversity of Shc proteins. The deregulation of Shc proteins has been linked to different disease conditions, including cancer and Alzheimer’s disease, which indicates their key roles in cellular functions. Accordingly, a question might arise as to whether Shc proteins could be targeted therapeutically to correct their disturbance. To answer this question, thorough knowledge must be acquired; herein, we aim to shed light on the Shc family of adaptor proteins to understand their intracellular role in normal and disease states, which later might be applied to suggest mechanisms to reverse the disease state.
Estrada-Ortiz, Natalia; Neochoritis, Constantinos G; Dömling, Alexander
2016-04-19
A recent therapeutic strategy in oncology is based on blocking the protein-protein interaction between the murine double minute (MDM) homologues MDM2/X and the tumor-suppressor protein p53. Inhibiting the binding between wild-type (WT) p53 and its negative regulators MDM2 and/or MDMX has become an important target in oncology to restore the antitumor activity of p53, the so-called guardian of our genome. Interestingly, based on the multiple disclosed compound classes and structural analysis of small-molecule-MDM2 adducts, the p53-MDM2 complex is perhaps the best studied and most targeted protein-protein interaction. Several classes of small molecules have been identified as potent, selective, and efficient inhibitors of the p53-MDM2/X interaction, and many co-crystal structures with the protein are available. Herein we review the properties as well as preclinical and clinical studies of these small molecules and peptides, categorized by scaffold type. A particular emphasis is made on crystallographic structures and the observed binding modes of these compounds, including conserved water molecules present. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Astrophysics Data System (ADS)
Bhakat, Soumendranath; Åberg, Emil; Söderhjelm, Pär
2018-01-01
Advanced molecular docking methods often aim at capturing the flexibility of the protein upon binding to the ligand. In this study, we investigate whether instead a simple rigid docking method can be applied, if combined with multiple target structures to model the backbone flexibility and molecular dynamics simulations to model the sidechain and ligand flexibility. The methods are tested for the binding of 35 ligands to FXR as part of the first stage of the Drug Design Data Resource (D3R) Grand Challenge 2 blind challenge. The results show that the multiple-target docking protocol performs surprisingly well, with correct poses found for 21 of the ligands. MD simulations started on the docked structures are remarkably stable, but show almost no tendency of refining the structure closer to the experimentally found binding pose. Reconnaissance metadynamics enhances the exploration of new binding poses, but additional collective variables involving the protein are needed to exploit the full potential of the method.
My 65 years in protein chemistry
Scheraga, Harold A.
2015-01-01
This is a tour of a physical chemist through 65 years of protein chemistry from the time when emphasis was placed on the determination of the size and shape of the protein molecule as a colloidal particle, with an early breakthrough by James Sumner, followed by Linus Pauling and Fred Sanger, that a protein was a real molecule, albeit a macromolecule. It deals with the recognition of the nature and importance of hydrogen bonds and hydrophobic interactions in determining the structure, properties, and biological function of proteins until the present acquisition of an understanding of the structure, thermodynamics, and folding pathways from a linear array of amino acids to a biological entity. Along the way, with a combination of experiment and theoretical interpretation, a mechanism was elucidated for the thrombin-induced conversion of fibrinogen to a fibrin blood clot and for the oxidative-folding pathways of ribonuclease A. Before the atomic structure of a protein molecule was determined by x-ray diffraction or nuclear magnetic resonance spectroscopy, experimental studies of the fundamental interactions underlying protein structure led to several distance constraints which motivated the theoretical approach to determine protein structure, and culminated in the Empirical Conformational Energy Program for Peptides (ECEPP), an all-atom force field, with which the structures of fibrous collagen-like proteins and the 46-residue globular staphylococcal protein A were determined. To undertake the study of larger globular proteins, a physics-based coarse-grained UNited-RESidue (UNRES) force field was developed, and applied to the protein-folding problem in terms of structure, thermodynamics, dynamics, and folding pathways. Initially, single-chain and, ultimately, multiple-chain proteins were examined, and the methodology was extended to protein–protein interactions and to nucleic acids and to protein–nucleic acid interactions. The ultimate results led to an understanding of a variety of biological processes underlying natural and disease phenomena. PMID:25850343
Antunes, Deborah; Jorge, Natasha A. N.; Caffarena, Ernesto R.; Passetti, Fabio
2018-01-01
RNA molecules are essential players in many fundamental biological processes. Prokaryotes and eukaryotes have distinct RNA classes with specific structural features and functional roles. Computational prediction of protein structures is a research field in which high-confidence three-dimensional protein models can be proposed based on the sequence alignment between target and templates. However, to date, only a few approaches have been developed for the computational prediction of RNA structures. Similar to proteins, RNA structures may be altered due to the interaction with various ligands, including proteins, other RNAs, and metabolites. A riboswitch is a molecular mechanism, found in the three kingdoms of life, in which the RNA structure is modified by the binding of a metabolite. It can regulate multiple gene expression mechanisms, such as transcription, translation initiation, and mRNA splicing and processing. Due to their nature, these entities also act on the regulation of gene expression and detection of small metabolites and have the potential to help in the discovery of new classes of antimicrobial agents. In this review, we describe software and web servers currently available for riboswitch aptamer identification and secondary and tertiary structure prediction, including applications. PMID:29403526
A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3
Dietmann, Sabine; Park, Jong; Notredame, Cedric; Heger, Andreas; Lappe, Michael; Holm, Liisa
2001-01-01
The Dali Domain Dictionary (http://www.ebi.ac.uk/dali/domain) is a numerical taxonomy of all known structures in the Protein Data Bank (PDB). The taxonomy is derived fully automatically from measurements of structural, functional and sequence similarities. Here, we report the extension of the classification to match the traditional four hierarchical levels corresponding to: (i) supersecondary structural motifs (attractors in fold space), (ii) the topology of globular domains (fold types), (iii) remote homologues (functional families) and (iv) homologues with sequence identity above 25% (sequence families). The computational definitions of attractors and functional families are new. In September 2000, the Dali classification contained 10 531 PDB entries comprising 17 101 chains, which were partitioned into five attractor regions, 1375 fold types, 2582 functional families and 3724 domain sequence families. Sequence families were further associated with 99 582 unique homologous sequences in the HSSP database, which increases the number of effectively known structures several-fold. The resulting database contains the description of protein domain architecture, the definition of structural neighbours around each known structure, the definition of structurally conserved cores and a comprehensive library of explicit multiple alignments of distantly related protein families. PMID:11125048
Customizing model membranes and samples for NMR spectroscopic studies of complex membrane proteins.
Sanders, C R; Oxenoid, K
2000-11-23
Both solution and solid state nuclear magnetic resonance (NMR) techniques for structural determination are advancing rapidly such that it is possible to contemplate bringing these techniques to bear upon integral membrane proteins having multiple transmembrane segments. This review outlines existing and emerging options for model membrane media for use in such studies and surveys the special considerations which must be taken into account when preparing larger membrane proteins for NMR spectroscopic studies.
Improved data visualization techniques for analyzing macromolecule structural changes
Kim, Jae Hyun; Iyer, Vidyashankara; Joshi, Sangeeta B; Volkin, David B; Middaugh, C Russell
2012-01-01
The empirical phase diagram (EPD) is a colored representation of overall structural integrity and conformational stability of macromolecules in response to various environmental perturbations. Numerous proteins and macromolecular complexes have been analyzed by EPDs to summarize results from large data sets from multiple biophysical techniques. The current EPD method suffers from a number of deficiencies including lack of a meaningful relationship between color and actual molecular features, difficulties in identifying contributions from individual techniques, and a limited ability to be interpreted by color-blind individuals. In this work, three improved data visualization approaches are proposed as techniques complementary to the EPD. The secondary, tertiary, and quaternary structural changes of multiple proteins as a function of environmental stress were first measured using circular dichroism, intrinsic fluorescence spectroscopy, and static light scattering, respectively. Data sets were then visualized as (1) RGB colors using three-index EPDs, (2) equiangular polygons using radar charts, and (3) human facial features using Chernoff face diagrams. Data as a function of temperature and pH for bovine serum albumin, aldolase, and chymotrypsin as well as candidate protein vaccine antigens including a serine threonine kinase protein (SP1732) and surface antigen A (SP1650) from S. pneumoniae and hemagglutinin from an H1N1 influenza virus are used to illustrate the advantages and disadvantages of each type of data visualization technique. PMID:22898970
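The three-index EPD described above maps three structural indices onto RGB colors. The sketch below shows one way such a mapping could work, assuming simple min-max scaling of each index across conditions. The channel assignment (secondary structure to red, tertiary to green, quaternary to blue), the function names, and the data are illustrative assumptions, not details taken from the paper.

```python
def min_max(values):
    """Scale a list of measurements to the range 0..1 (all zeros if constant)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def three_index_colors(secondary, tertiary, quaternary):
    """Return one (R, G, B) tuple per condition from three index series.

    Each series holds one value per environmental condition, for example
    a circular dichroism signal, a fluorescence peak position, and a
    light scattering intensity measured at the same temperatures.
    """
    if not (len(secondary) == len(tertiary) == len(quaternary)):
        raise ValueError("all three series need one value per condition")
    channels = [min_max(series) for series in (secondary, tertiary, quaternary)]
    return [tuple(round(255 * c) for c in rgb) for rgb in zip(*channels)]

# Two hypothetical conditions: the secondary-structure index rises, the
# tertiary index is unchanged, and the quaternary index falls.
colors = three_index_colors([0.0, 10.0], [3.0, 3.0], [8.0, 4.0])
```

With one channel per index, a color shift can be traced back to the technique that caused it, which is the link between color and molecular feature that the abstract says the original EPD lacks.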
Illuminating structural proteins in viral "dark matter" with metaproteomics
Brum, Jennifer R.; Ignacio-Espinoza, J. Cesar; Kim, Eun-Hae; ...
2016-02-16
Viruses are ecologically important, yet environmental virology is limited by dominance of unannotated genomic sequences representing taxonomic and functional "viral dark matter." Although recent analytical advances are rapidly improving taxonomic annotations, identifying functional dark matter remains problematic. Here, we apply paired metaproteomics and dsDNA-targeted metagenomics to identify 1,875 virion-associated proteins from the ocean. Over one-half of these proteins were newly functionally annotated and represent abundant and widespread viral metagenome-derived protein clusters (PCs). One primarily unannotated PC dominated the dataset, but structural modeling and genomic context identified this PC as a previously unidentified capsid protein from multiple uncultivated tailed virus families. Furthermore, four of the five most abundant PCs in the metaproteome represent capsid proteins containing the HK97-like protein fold previously found in many viruses that infect all three domains of life. The dominance of these proteins within our dataset, as well as their global distribution throughout the world's oceans and seas, supports prior hypotheses that this HK97-like protein fold is the most abundant biological structure on Earth. Altogether, these culture-independent analyses improve virion-associated protein annotations, facilitate the investigation of proteins within natural viral communities, and offer a high-throughput means of illuminating functional viral dark matter.
Illuminating structural proteins in viral "dark matter" with metaproteomics.
Brum, Jennifer R; Ignacio-Espinoza, J Cesar; Kim, Eun-Hae; Trubl, Gareth; Jones, Robert M; Roux, Simon; VerBerkmoes, Nathan C; Rich, Virginia I; Sullivan, Matthew B
2016-03-01
Viruses are ecologically important, yet environmental virology is limited by dominance of unannotated genomic sequences representing taxonomic and functional "viral dark matter." Although recent analytical advances are rapidly improving taxonomic annotations, identifying functional dark matter remains problematic. Here, we apply paired metaproteomics and dsDNA-targeted metagenomics to identify 1,875 virion-associated proteins from the ocean. Over one-half of these proteins were newly functionally annotated and represent abundant and widespread viral metagenome-derived protein clusters (PCs). One primarily unannotated PC dominated the dataset, but structural modeling and genomic context identified this PC as a previously unidentified capsid protein from multiple uncultivated tailed virus families. Furthermore, four of the five most abundant PCs in the metaproteome represent capsid proteins containing the HK97-like protein fold previously found in many viruses that infect all three domains of life. The dominance of these proteins within our dataset, as well as their global distribution throughout the world's oceans and seas, supports prior hypotheses that this HK97-like protein fold is the most abundant biological structure on Earth. Together, these culture-independent analyses improve virion-associated protein annotations, facilitate the investigation of proteins within natural viral communities, and offer a high-throughput means of illuminating functional viral dark matter.
Illuminating structural proteins in viral “dark matter” with metaproteomics
Brum, Jennifer R.; Ignacio-Espinoza, J. Cesar; Kim, Eun-Hae; Trubl, Gareth; Jones, Robert M.; Roux, Simon; VerBerkmoes, Nathan C.; Rich, Virginia I.; Sullivan, Matthew B.
2016-01-01
Viruses are ecologically important, yet environmental virology is limited by dominance of unannotated genomic sequences representing taxonomic and functional “viral dark matter.” Although recent analytical advances are rapidly improving taxonomic annotations, identifying functional dark matter remains problematic. Here, we apply paired metaproteomics and dsDNA-targeted metagenomics to identify 1,875 virion-associated proteins from the ocean. Over one-half of these proteins were newly functionally annotated and represent abundant and widespread viral metagenome-derived protein clusters (PCs). One primarily unannotated PC dominated the dataset, but structural modeling and genomic context identified this PC as a previously unidentified capsid protein from multiple uncultivated tailed virus families. Furthermore, four of the five most abundant PCs in the metaproteome represent capsid proteins containing the HK97-like protein fold previously found in many viruses that infect all three domains of life. The dominance of these proteins within our dataset, as well as their global distribution throughout the world’s oceans and seas, supports prior hypotheses that this HK97-like protein fold is the most abundant biological structure on Earth. Together, these culture-independent analyses improve virion-associated protein annotations, facilitate the investigation of proteins within natural viral communities, and offer a high-throughput means of illuminating functional viral dark matter. PMID:26884177
Ponomareva, Eugenia P; Ternovoi, Vladimir A; Mikryukova, Tamara P; Protopopova, Elena V; Gladysheva, Anastasia V; Shvalov, Alexander N; Konovalova, Svetlana N; Chausov, Eugene V; Loktev, Valery B
2017-10-01
The C11-13 strain from the Siberian subtype of tick-borne encephalitis virus (TBEV) was isolated from human brain using pig embryo kidney (PEK), 293, and Neuro-2a cells. Analysis of the complete viral genome of the C11-13 variants during six passages in these cells revealed that the cell-adapted C11-13 variants had multiple amino acid substitutions as compared to TBEV from human brain. Seven out of eight amino acid substitutions in the high-replicating C11-13(PEK) variant mapped to non-structural proteins; 13 out of 14 substitutions in the well-replicating C11-13(293) variant, and all four substitutions in the low-replicating C11-13(Neuro-2a) variant were also localized in non-structural proteins, predominantly in the NS2a (2), NS3 (6) and NS5 (3) proteins. The substitutions NS2a 1067 (Asn → Asp), NS2a 1168 (Leu → Val) in the N-terminus of NS2a and NS3 1745 (His → Gln) in the helicase domain of NS3 were found in all selected variants. We postulate that multiple substitutions in the NS2a, NS3 and NS5 genes play a key role in adaptation of TBEV to different cells.
The JCSG high-throughput structural biology pipeline.
Elsliger, Marc André; Deacon, Ashley M; Godzik, Adam; Lesley, Scott A; Wooley, John; Wüthrich, Kurt; Wilson, Ian A
2010-10-01
The Joint Center for Structural Genomics high-throughput structural biology pipeline has delivered more than 1000 structures to the community over the past ten years. The JCSG has made a significant contribution to the overall goal of the NIH Protein Structure Initiative (PSI) of expanding structural coverage of the protein universe, as well as making substantial inroads into structural coverage of an entire organism. Targets are processed through an extensive combination of bioinformatics and biophysical analyses to efficiently characterize and optimize each target prior to selection for structure determination. The pipeline uses parallel processing methods at almost every step in the process and can adapt to a wide range of protein targets from bacterial to human. The construction, expansion and optimization of the JCSG gene-to-structure pipeline over the years have resulted in many technological and methodological advances and developments. The vast number of targets and the enormous amounts of associated data processed through the multiple stages of the experimental pipeline required the development of a variety of valuable resources that, wherever feasible, have been converted to free-access web-based tools and applications.
A Structure of a Collagen VI VWA Domain Displays N and C Termini at Opposite Sides of the Protein
Becker, Ann-Kathrin A.; Mikolajek, Halina; Paulsson, Mats; Wagener, Raimund; Werner, Jörn M.
2014-01-01
Summary Von Willebrand factor A (VWA) domains are versatile protein interaction domains with N and C termini in close proximity placing spatial constraints on overall protein structure. The 1.2 Å crystal structures of a collagen VI VWA domain and a disease-causing point mutant show C-terminal extensions that place the N and C termini at opposite ends. This allows a “beads-on-a-string” arrangement of multiple VWA domains as observed for ten N-terminal domains of the collagen VI α3 chain. The extension is linked to the core domain by a salt bridge and two hydrophobic patches. Comparison of the wild-type and a muscular dystrophy-associated mutant structure identifies a potential perturbation of a protein interaction interface and indeed, the secretion of mutant collagen VI tetramers is affected. Homology modeling is used to locate a number of disease-associated mutations and analyze their structural impact, which will allow mechanistic analysis of collagen-VI-associated muscular dystrophy phenotypes. PMID:24332716
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lokareddy, Ravi K.; Sankhala, Rajeshwer S.; Roy, Ankoor
Tailed bacteriophages and herpesviruses assemble infectious particles via an empty precursor capsid (or ‘procapsid’) built by multiple copies of coat and scaffolding protein and by one dodecameric portal protein. Genome packaging triggers rearrangement of the coat protein and release of scaffolding protein, resulting in dramatic procapsid lattice expansion. Here, we provide structural evidence that the portal protein of the bacteriophage P22 exists in two distinct dodecameric conformations: an asymmetric assembly in the procapsid (PC-portal) that is competent for high affinity binding to the large terminase packaging protein, and a symmetric ring in the mature virion (MV-portal) that has negligible affinity for the packaging motor. Modelling studies indicate the structure of PC-portal is incompatible with DNA coaxially spooled around the portal vertex, suggesting that newly packaged DNA triggers the switch from PC- to MV-conformation. Thus, we propose the signal for termination of ‘Headful Packaging’ is a DNA-dependent symmetrization of portal protein.
Unraveling secrets of telomeres: one molecule at a time
Lin, Jiangguo; Kaur, Parminder; Countryman, Preston; Opresko, Patricia L.; Wang, Hong
2016-01-01
Telomeres play important roles in maintaining the stability of linear chromosomes. Telomere maintenance involves dynamic actions of multiple proteins interacting with long repetitive sequences and complex dynamic DNA structures, such as G-quadruplexes, T-loops and t-circles. Given the heterogeneity and complexity of telomeres, single-molecule approaches are essential to fully understand the structure-function relationships that govern telomere maintenance. In this review, we present a brief overview of the principles of single-molecule imaging and manipulation techniques. We then highlight results obtained from applying these single-molecule techniques for studying structure, dynamics and functions of G-quadruplexes, telomerase, and shelterin proteins. PMID:24569170
MACSIMS : multiple alignment of complete sequences information management system
Thompson, Julie D; Muller, Arnaud; Waterhouse, Andrew; Procter, Jim; Barton, Geoffrey J; Plewniak, Frédéric; Poch, Olivier
2006-01-01
Background: In the post-genomic era, systems-level studies are being performed that seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics or transcriptomics. New information management systems are now needed for the collection, validation and analysis of the vast amount of heterogeneous data available. Multiple alignments of complete sequences provide an ideal environment for the integration of this information in the context of the protein family. Results: MACSIMS is a multiple alignment-based information management program that combines the advantages of both knowledge-based and ab initio sequence analysis methods. Structural and functional information is retrieved automatically from the public databases. In the multiple alignment, homologous regions are identified and the retrieved data is evaluated and propagated from known to unknown sequences with these reliable regions. In a large-scale evaluation, the specificity of the propagated sequence features is estimated to be >99%, i.e. very few false positive predictions are made. MACSIMS is then used to characterise mutations in a test set of 100 proteins that are known to be involved in human genetic diseases. The number of sequence features associated with these proteins was increased by 60%, compared to the features available in the public databases. An XML format output file allows automatic parsing of the MACSIM results, while a graphical display using the JalView program allows manual analysis. Conclusion: MACSIMS is a new information management system that incorporates detailed analyses of protein families at the structural, functional and evolutionary levels. MACSIMS thus provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist. A web server and the source code are available at . PMID:16792820
Ripoche, Hugues; Laine, Elodie; Ceres, Nicoletta; Carbone, Alessandra
2017-01-04
The database JET2 Viewer, openly accessible at http://www.jet2viewer.upmc.fr/, reports putative protein binding sites for all three-dimensional (3D) structures available in the Protein Data Bank (PDB). This knowledge base was generated by applying the computational method JET 2 at large scale on more than 20 000 chains. The JET 2 strategy yields very precise predictions of interacting surfaces and unravels their evolutionary process and complexity. JET2 Viewer provides an online intelligent display, including interactive 3D visualization of the binding sites mapped onto PDB structures and suitable files recording JET 2 analyses. Predictions were evaluated on more than 15 000 experimentally characterized protein interfaces. This is, to our knowledge, the largest evaluation of a protein binding site prediction method. The overall performance of JET 2 on all interfaces is: Sen = 52.52, PPV = 51.24, Spe = 80.05, Acc = 75.89. The data can be used to foster new strategies for protein-protein interactions modulation and interaction surface redesign. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
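For readers unfamiliar with the four scores quoted in the abstract above, the following minimal sketch shows how sensitivity (Sen), positive predictive value (PPV), specificity (Spe) and accuracy (Acc) are conventionally derived from a confusion matrix of predicted versus experimental interface residues. The function name and the residue counts are hypothetical and are not taken from the JET 2 evaluation; the abstract does not state how the authors aggregated scores across interfaces.

```python
def binding_site_metrics(tp, fp, tn, fn):
    """Return Sen, PPV, Spe and Acc as percentages from confusion-matrix counts.

    tp: residues predicted as interface that are interface
    fp: residues predicted as interface that are not
    tn: residues predicted as non-interface that are non-interface
    fn: interface residues that the prediction missed
    """
    sen = 100.0 * tp / (tp + fn)                   # fraction of true interface recovered
    ppv = 100.0 * tp / (tp + fp)                   # fraction of predictions that are correct
    spe = 100.0 * tn / (tn + fp)                   # fraction of non-interface left unpredicted
    acc = 100.0 * (tp + tn) / (tp + fp + tn + fn)  # overall fraction classified correctly
    return {"Sen": sen, "PPV": ppv, "Spe": spe, "Acc": acc}


# Hypothetical counts for one protein surface (illustration only).
scores = binding_site_metrics(tp=50, fp=50, tn=200, fn=50)
print(scores)
```

Because interface residues are usually a minority of the surface, accuracy and specificity tend to sit well above sensitivity and PPV, which is the pattern seen in the reported numbers.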
Colloids in food: ingredients, structure, and stability.
Dickinson, Eric
2015-01-01
This article reviews progress in the field of food colloids with particular emphasis on advances in novel functional ingredients and nanoscale structuring. Specific aspects of ingredient development described here are the stabilization of bubbles and foams by the protein hydrophobin, the emulsifying characteristics of Maillard-type protein-polysaccharide conjugates, the structural and functional properties of protein fibrils, and the Pickering stabilization of dispersed droplets by food-grade nanoparticles and microparticles. Building on advances in the nanoscience of biological materials, the application of structural design principles to the fabrication of edible colloids is leading to progress in the fabrication of functional dispersed systems-multilayer interfaces, multiple emulsions, and gel-like emulsions. The associated physicochemical insight is contributing to our mechanistic understanding of oral processing and textural perception of food systems and to the development of colloid-based strategies to control delivery of nutrients during food digestion within the human gastrointestinal tract.
Proton-coupled sugar transport in the prototypical major facilitator superfamily protein XylE
Wisedchaisri, Goragot; Park, Min-Sun; Iadanza, Matthew G.; Zheng, Hongjin; Gonen, Tamir
2014-01-01
The major facilitator superfamily (MFS) is the largest collection of structurally related membrane proteins that transport a wide array of substrates. The proton-coupled sugar transporter XylE is the first member of the MFS that has been structurally characterized in multiple transporting conformations, including both the outward and inward-facing states. Here we report the crystal structure of XylE in a new inward-facing open conformation, allowing us to visualize the rocker-switch movement of the N-domain against the C-domain during the transport cycle. Using molecular dynamics simulation, and functional transport assays, we describe the movement of XylE that facilitates sugar translocation across a lipid membrane and identify the likely candidate proton-coupling residues as the conserved Asp27 and Arg133. This study addresses the structural basis for proton-coupled substrate transport and release mechanism for the sugar porter family of proteins. PMID:25088546
Multiple Targets of Salicylic Acid and Its Derivatives in Plants and Animals
Klessig, Daniel F.; Tian, Miaoying; Choi, Hyong Woo
2016-01-01
Salicylic acid (SA) is a critical plant hormone that is involved in many processes, including seed germination, root initiation, stomatal closure, floral induction, thermogenesis, and response to abiotic and biotic stresses. Its central role in plant immunity, although extensively studied, is still only partially understood. Classical biochemical approaches and, more recently, genome-wide high-throughput screens have identified more than two dozen plant SA-binding proteins (SABPs), as well as multiple candidates that have yet to be characterized. Some of these proteins bind SA with high affinity, while the affinity of others is low. Given that SA levels vary greatly even within a particular plant species depending on subcellular location, tissue type, developmental stage, and with respect to both time and location after an environmental stimulus such as infection, the presence of SABPs exhibiting a wide range of affinities for SA may provide great flexibility and multiple mechanisms through which SA can act. SA and its derivatives, both natural and synthetic, also have multiple targets in animals/humans. Interestingly, many of these proteins, like their plant counterparts, are associated with immunity or disease development. Two recently identified SABPs, high mobility group box protein and glyceraldehyde 3-phosphate dehydrogenase, are critical proteins that not only serve key structural or metabolic functions but also play prominent roles in disease responses in both kingdoms. PMID:27303403
Conlan, Andrea R.; Paddock, Mark L.; Axelrod, Herbert L.; Cohen, Aina E.; Abresch, Edward C.; Wiley, Sandra; Roy, Melinda; Nechushtai, Rachel; Jennings, Patricia A.
2009-01-01
A primary role for mitochondrial dysfunction is indicated in the pathogenesis of insulin resistance. A widely used drug for the treatment of type 2 diabetes is pioglitazone, a member of the thiazolidinedione class of molecules. MitoNEET, a 2Fe–2S outer mitochondrial membrane protein, binds pioglitazone [Colca et al. (2004), Am. J. Physiol. Endocrinol. Metab. 286, E252–E260]. The soluble domain of the human mitoNEET protein has been expressed C-terminal to the superfolder green fluorescent protein and the mitoNEET protein has been isolated. Comparison of the crystal structure of mitoNEET isolated from cleavage of the fusion protein (1.4 Å resolution, R factor = 20.2%) with other solved structures shows that the CDGSH domains are superimposable, indicating proper assembly of mitoNEET. Furthermore, there is considerable flexibility in the position of the cytoplasmic tethering arms, resulting in two different conformations in the crystal structure. This flexibility affords multiple orientations on the outer mitochondrial membrane. PMID:19574633
Ito, Yoko; Uemura, Tomohiro; Shoda, Keiko; Fujimoto, Masaru; Ueda, Takashi; Nakano, Akihiko
2012-01-01
The Golgi apparatus forms stacks of cisternae in many eukaryotic cells. However, little is known about how such a stacked structure is formed and maintained. To address this question, plant cells provide a system suitable for live-imaging approaches because individual Golgi stacks are well separated in the cytoplasm. We established tobacco BY-2 cell lines expressing multiple Golgi markers tagged by different fluorescent proteins and observed their responses to brefeldin A (BFA) treatment and BFA removal. BFA treatment disrupted cis, medial, and trans cisternae but caused distinct relocalization patterns depending on the proteins examined. Medial- and trans-Golgi proteins, as well as one cis-Golgi protein, were absorbed into the endoplasmic reticulum (ER), but two other cis-Golgi proteins formed small punctate structures. After BFA removal, these puncta coalesced first, and then the Golgi stacks regenerated from them in the cis-to-trans order. We suggest that these structures have a property similar to the ER-Golgi intermediate compartment and function as the scaffold of Golgi regeneration. PMID:22740633
Ito, Yoko; Uemura, Tomohiro; Shoda, Keiko; Fujimoto, Masaru; Ueda, Takashi; Nakano, Akihiko
2012-08-01
The Golgi apparatus forms stacks of cisternae in many eukaryotic cells. However, little is known about how such a stacked structure is formed and maintained. To address this question, plant cells provide a system suitable for live-imaging approaches because individual Golgi stacks are well separated in the cytoplasm. We established tobacco BY-2 cell lines expressing multiple Golgi markers tagged by different fluorescent proteins and observed their responses to brefeldin A (BFA) treatment and BFA removal. BFA treatment disrupted cis, medial, and trans cisternae but caused distinct relocalization patterns depending on the proteins examined. Medial- and trans-Golgi proteins, as well as one cis-Golgi protein, were absorbed into the endoplasmic reticulum (ER), but two other cis-Golgi proteins formed small punctate structures. After BFA removal, these puncta coalesced first, and then the Golgi stacks regenerated from them in the cis-to-trans order. We suggest that these structures have a property similar to the ER-Golgi intermediate compartment and function as the scaffold of Golgi regeneration.
Diab, Ahmed; Foca, Adrien; Zoulim, Fabien; Durantel, David; Andrisani, Ourania
2018-01-01
Virally encoded proteins have evolved to perform multiple functions, and the core protein (HBc) of the hepatitis B virus (HBV) is a perfect example. While HBc is the structural component of the viral nucleocapsid, additional novel functions for the nucleus-localized HBc have recently been described. These results extend for HBc, beyond its structural role, a regulatory function in the viral life cycle and potentially a role in pathogenesis. In this article, we review the diverse roles of HBc in HBV replication and pathogenesis, emphasizing how the unique structure of this protein is key to its various functions. We focus in particular on recent advances in understanding the significance of HBc phosphorylations, its interaction with host proteins and the role of HBc in regulating the transcription of host genes. We also briefly allude to the emerging niche for new direct-acting antivirals targeting HBc, known as Core (protein) Allosteric Modulators (CAMs). Copyright © 2017 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wong, Jaslyn E. M. M.; Midtgaard, Søren Roi; Gysel, Kira
The crystal and solution structures of the T. thermophilus NlpC/P60 d,l-endopeptidase as well as the co-crystal structure of its N-terminal LysM domains bound to chitohexaose allow a proposal to be made regarding how the enzyme recognizes peptidoglycan. LysM domains, which are frequently present as repetitive entities in both bacterial and plant proteins, are known to interact with carbohydrates containing N-acetylglucosamine (GlcNAc) moieties, such as chitin and peptidoglycan. In bacteria, the functional significance of the involvement of multiple LysM domains in substrate binding has so far lacked support from high-resolution structures of ligand-bound complexes. Here, a structural study of the Thermus thermophilus NlpC/P60 endopeptidase containing two LysM domains is presented. The crystal structure and small-angle X-ray scattering solution studies of this endopeptidase revealed the presence of a homodimer. The structure of the two LysM domains co-crystallized with N-acetyl-chitohexaose revealed a new intermolecular binding mode that may explain the differential interaction between LysM domains and short or long chitin oligomers. By combining the structural information with the three-dimensional model of peptidoglycan, a model suggesting how protein dimerization enhances the recognition of peptidoglycan is proposed.
Protein 3D Structure Computed from Evolutionary Sequence Variation
Sheridan, Robert; Hopf, Thomas A.; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris
2011-01-01
The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7–4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org).
This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein structures, new strategies in protein and drug design, and the identification of functional genetic variants in normal and disease genomes. PMID:22163331
Mitchell, Carter A; Shi, Ce; Aldrich, Courtney C; Gulick, Andrew M
2012-04-17
Many bacteria use large modular enzymes for the synthesis of polyketide and peptide natural products. These multidomain enzymes contain integrated carrier domains that deliver bound substrates to multiple catalytic domains, requiring coordination of these chemical steps. Nonribosomal peptide synthetases (NRPSs) load amino acids onto carrier domains through the activity of an upstream adenylation domain. Our lab recently determined the structure of an engineered two-domain NRPS containing fused adenylation and carrier domains. This structure adopted a domain-swapped dimer that illustrated the interface between these two domains. To continue our investigation, we now examine PA1221, a natural two-domain protein from Pseudomonas aeruginosa. We have determined the amino acid specificity of this new enzyme and used domain specific mutations to demonstrate that loading the downstream carrier domain within a single protein molecule occurs more quickly than loading of a nonfused carrier domain intermolecularly. Finally, we have determined crystal structures of both apo- and holo-PA1221 proteins, the latter using a valine-adenosine vinylsulfonamide inhibitor that traps the adenylation domain-carrier domain interaction. The protein adopts an interface similar to that seen with the prior adenylation domain-carrier protein construct. A comparison of these structures with previous structures of multidomain NRPSs suggests that a large conformational change within the NRPS adenylation domains guides the carrier domain into the active site for thioester formation.
Structure-guided Protein Transition Modeling with a Probabilistic Roadmap Algorithm.
Maximova, Tatiana; Plaku, Erion; Shehu, Amarda
2016-07-07
Proteins are macromolecules in perpetual motion, switching between structural states to modulate their function. A detailed characterization of the precise yet complex relationship between protein structure, dynamics, and function requires elucidating transitions between functionally-relevant states. Doing so challenges both wet and dry laboratories, as protein dynamics involves disparate temporal scales. In this paper we present a novel, sampling-based algorithm to compute transition paths. The algorithm exploits two main ideas. First, it leverages known structures to initialize its search and define a reduced conformation space for rapid sampling. This is key to addressing the insufficient sampling issue suffered by sampling-based algorithms. Second, the algorithm embeds samples in a nearest-neighbor graph where transition paths can be efficiently computed via queries. The algorithm adapts the probabilistic roadmap framework that is popular in robot motion planning. In addition to efficiently computing lowest-cost paths between any given structures, the algorithm allows investigating hypotheses regarding the order of experimentally-known structures in a transition event. This novel contribution is likely to open up new avenues of research. Detailed analysis is presented on multiple-basin proteins of relevance to human disease. Multiscaling and the AMBER ff14SB force field are used to obtain energetically-credible paths at atomistic detail.
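The two ideas named in the abstract above (embedding samples in a nearest-neighbor graph, then answering lowest-cost path queries on it) can be illustrated with a generic probabilistic-roadmap sketch. This is not the authors' implementation: the samples here are plain 2D points rather than reduced protein conformations, the edge cost is Euclidean distance rather than an energy-based cost, and the function names `build_roadmap` and `lowest_cost_path` are hypothetical.

```python
import heapq
import math
import random


def build_roadmap(samples, k, cost):
    """Connect every sample to its k nearest neighbors (undirected weighted graph)."""
    graph = {i: {} for i in range(len(samples))}
    for i, p in enumerate(samples):
        others = (j for j in range(len(samples)) if j != i)
        nearest = sorted(others, key=lambda j: math.dist(p, samples[j]))[:k]
        for j in nearest:
            w = cost(p, samples[j])
            graph[i][j] = w
            graph[j][i] = w
    return graph


def lowest_cost_path(graph, start, goal):
    """Dijkstra query on the roadmap; returns (total cost, list of node indices)."""
    best = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == goal:
            path = [u]
            while path[-1] != start:
                path.append(parent[path[-1]])
            return d, path[::-1]
        if d > best.get(u, math.inf):
            continue  # stale queue entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                parent[v] = u
                heapq.heappush(queue, (nd, v))
    return math.inf, []  # goal not reachable in this roadmap


# Toy usage: two "known structures" at the corners plus random samples between them.
random.seed(0)
points = [(0.0, 0.0), (1.0, 1.0)] + [(random.random(), random.random()) for _ in range(200)]
roadmap = build_roadmap(points, k=5, cost=math.dist)
total, path = lowest_cost_path(roadmap, 0, 1)
print(total, len(path))
```

In a protein setting, the samples would be conformations in the reduced space and the cost function would penalize energetically unfavorable moves; the graph construction and the query step keep the same shape.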
Fujimoto, Akihiro; Okada, Yukinori; Boroevich, Keith A; Tsunoda, Tatsuhiko; Taniguchi, Hiroaki; Nakagawa, Hidewaki
2016-05-26
Protein tertiary structure determines molecular function, interaction, and stability of the protein; therefore, the distribution of mutations in the tertiary structure can facilitate the identification of new driver genes in cancer. To analyze mutation distribution in protein tertiary structures, we applied a novel three-dimensional permutation test to the mutation positions. We analyzed somatic mutation datasets of 21 types of cancers obtained from exome sequencing conducted by the TCGA project. Of the 3,622 genes that had ≥3 mutations in the regions with tertiary structure data, 106 genes showed significant skew in mutation distribution. Known tumor suppressors and oncogenes were significantly enriched in these identified cancer gene sets. Physical distances between mutations in known oncogenes were significantly smaller than those of tumor suppressors. Twenty-three genes were detected in multiple cancers. Candidate genes with significant skew of the 3D mutation distribution included kinases (MAPK1, EPHA5, ERBB3, and ERBB4), an apoptosis related gene (APP), an RNA splicing factor (SF1), a miRNA processing factor (DICER1), an E3 ubiquitin ligase (CUL1) and transcription factors (KLF5 and EEF1B2). Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes.
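As an illustration of the kind of three-dimensional permutation test the abstract above describes, the sketch below asks whether mutated residues sit closer together in space than randomly chosen residues of the same structure. The test statistic (mean pairwise distance), the one-sided p-value, and the names `mean_pairwise_distance` and `clustering_p_value` are assumptions made for this example; the abstract does not specify the authors' exact statistic.

```python
import itertools
import math
import random


def mean_pairwise_distance(coords):
    """Average Euclidean distance over all pairs of 3D coordinates (needs >= 2 points)."""
    pairs = list(itertools.combinations(coords, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def clustering_p_value(residue_coords, mutated, n_perm=2000, seed=0):
    """One-sided permutation p-value that the mutated residues are spatially clustered.

    residue_coords: list of (x, y, z) tuples, one per residue with structure data
    mutated: indices into residue_coords of the mutated residues
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance([residue_coords[i] for i in mutated])
    hits = 0
    for _ in range(n_perm):
        # Draw the same number of residues at random from the same structure.
        shuffled = rng.sample(range(len(residue_coords)), len(mutated))
        permuted = mean_pairwise_distance([residue_coords[i] for i in shuffled])
        if permuted <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)


# Toy structure: 50 residues on a straight line, 3.8 Å apart (illustration only).
toy_coords = [(3.8 * i, 0.0, 0.0) for i in range(50)]
p_clustered = clustering_p_value(toy_coords, mutated=[10, 11, 12])
p_spread = clustering_p_value(toy_coords, mutated=[0, 25, 49])
print(p_clustered, p_spread)
```

Three adjacent mutations give a small p-value, while mutations spread across the whole chain do not, which mirrors the abstract's observation that mutations in oncogenes lie physically closer together than those in tumor suppressors.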
Fujimoto, Akihiro; Okada, Yukinori; Boroevich, Keith A.; Tsunoda, Tatsuhiko; Taniguchi, Hiroaki; Nakagawa, Hidewaki
2016-01-01
Protein tertiary structure determines molecular function, interaction, and stability of the protein, therefore distribution of mutation in the tertiary structure can facilitate the identification of new driver genes in cancer. To analyze mutation distribution in protein tertiary structures, we applied a novel three dimensional permutation test to the mutation positions. We analyzed somatic mutation datasets of 21 types of cancers obtained from exome sequencing conducted by the TCGA project. Of the 3,622 genes that had ≥3 mutations in the regions with tertiary structure data, 106 genes showed significant skew in mutation distribution. Known tumor suppressors and oncogenes were significantly enriched in these identified cancer gene sets. Physical distances between mutations in known oncogenes were significantly smaller than those of tumor suppressors. Twenty-three genes were detected in multiple cancers. Candidate genes with significant skew of the 3D mutation distribution included kinases (MAPK1, EPHA5, ERBB3, and ERBB4), an apoptosis related gene (APP), an RNA splicing factor (SF1), a miRNA processing factor (DICER1), an E3 ubiquitin ligase (CUL1) and transcription factors (KLF5 and EEF1B2). Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes. PMID:27225414
3Drefine: an interactive web server for efficient protein structure refinement
Bhattacharya, Debswapna; Nowotny, Jackson; Cao, Renzhi; Cheng, Jianlin
2016-01-01
3Drefine is an interactive web server for consistent and computationally efficient protein structure refinement with the capability to perform web-based statistical and visual analysis. The 3Drefine refinement protocol utilizes iterative optimization of hydrogen bonding network combined with atomic-level energy minimization on the optimized model using a composite physics and knowledge-based force fields for efficient protein structure refinement. The method has been extensively evaluated on blind CASP experiments as well as on large-scale and diverse benchmark datasets and exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine web server allows for convenient protein structure refinement through a text or file input submission, email notification, provided example submission and is freely available without any registration requirement. The server also provides comprehensive analysis of submissions through various energy and statistical feedback and interactive visualization of multiple refined models through the JSmol applet that is equipped with numerous protein model analysis tools. The web server has been extensively tested and used by many users. As a result, the 3Drefine web server conveniently provides a useful tool easily accessible to the community. The 3Drefine web server has been made publicly available at the URL: http://sysbio.rnet.missouri.edu/3Drefine/. PMID:27131371
Lessons in molecular recognition. 2. Assessing and improving cross-docking accuracy.
Sutherland, Jeffrey J; Nandigam, Ravi K; Erickson, Jon A; Vieth, Michal
2007-01-01
Docking methods are used to predict the manner in which a ligand binds to a protein receptor. Many studies have assessed the success rate of programs in self-docking tests, whereby a ligand is docked into the protein structure from which it was extracted. Cross-docking, or using a protein structure from a complex containing a different ligand, provides a more realistic assessment of a docking program's ability to reproduce X-ray results. In this work, cross-docking was performed with CDocker, Fred, and Rocs using multiple X-ray structures for eight proteins (two kinases, one nuclear hormone receptor, one serine protease, two metalloproteases, and two phosphodiesterases). While average cross-docking accuracy is not encouraging, it is shown that using the protein structure from the complex that contains the bound ligand most similar to the docked ligand increases docking accuracy for all methods ("similarity selection"). Identifying the most successful protein conformer ("best selection") and similarity selection substantially reduce the difference between self-docking and average cross-docking accuracy. We identify universal predictors of docking accuracy (i.e., showing consistent behavior across most protein-method combinations), and show that models for predicting docking accuracy built using these parameters can be used to select the most appropriate docking method.
Genome Pool Strategy for Structural Coverage of Protein Families
Jaroszewski, Lukasz; Slabinski, Lukasz; Wooley, John; Deacon, Ashley M.; Lesley, Scott A.; Wilson, Ian A.; Godzik, Adam
2010-01-01
As noticed by generations of structural biologists, closely homologous proteins may have substantially different crystallization properties and propensities. These observations can be used to systematically introduce additional dimensionality into crystallization trials by targeting homologous proteins from multiple genomes in a “genome pool” strategy. Through extensive use of our recently introduced “crystallization feasibility score” (Slabinski et al., 2007a), we can explain that the genome pool strategy works well because the crystallization feasibility scores are surprisingly broad within families of homologous proteins, with most families containing a range of optimal to very difficult targets. We also show that some families can be regarded as relatively “easy”, where a significant number of proteins are predicted to have optimal crystallization features, and others are “very difficult”, where almost none are predicted to result in a crystal structure. Thus, such variable distributions of “crystallizability” preferences lead to uneven structural coverage of known families, with “easier” or “optimal” families having several times more solved structures than “very difficult” ones. Nevertheless, this latter category can be successfully targeted by increasing the number of genomes that are used to select targets from a given family. On average, adding 10 new genomes to the “genome pool” provides more promising targets for 7 “very difficult” families. In contrast, our crystallization feasibility score does not indicate that any specific microbial genomes can be readily classified as “easier” or “very difficult” with respect to providing suitable candidates for crystallization and structure determination.
Finally, our analyses show that specific physicochemical properties of the protein sequence favor successful outcomes for structure determination and, hence, the group of proteins with known 3D structures is systematically different from the general pool of known proteins. We, therefore, assess the structural consequences of these differences in protein sequence and protein biophysical properties. PMID:19000818
Meng, Fanchi; Na, Insung; Kurgan, Lukasz; Uversky, Vladimir N.
2015-01-01
The cell nucleus contains a number of membrane-less organelles or intra-nuclear compartments. These compartments are dynamic structures representing liquid-droplet phases which are only slightly denser than the bulk intra-nuclear fluid. They possess different functions, have diverse morphologies, and are typically composed of RNA (or, in some cases, DNA) and proteins. We analyzed 3005 mouse proteins localized in specific intra-nuclear organelles, such as nucleolus, chromatin, Cajal bodies, nuclear speckles, promyelocytic leukemia (PML) nuclear bodies, nuclear lamina, nuclear pores, and perinuclear compartment and compared them with ~29,863 non-nuclear proteins from mouse proteome. Our analysis revealed that intrinsic disorder is enriched in the majority of intra-nuclear compartments, except for the nuclear pore and lamina. These compartments are depleted in proteins that lack disordered domains and enriched in proteins that have multiple disordered domains. Moonlighting proteins found in multiple intra-nuclear compartments are more likely to have multiple disordered domains. Protein-protein interaction networks in the intra-nuclear compartments are denser and include more hubs compared to the non-nuclear proteins. Hubs in the intra-nuclear compartments (except for the nuclear pore) are enriched in disorder compared with non-nuclear hubs and non-nuclear proteins. Therefore, our work provides support to the idea of the functional importance of intrinsic disorder in the cell nucleus and shows that many proteins associated with sub-nuclear organelles in nuclei of mouse cells are enriched in disorder. This high level of disorder in the mouse nuclear proteins defines their ability to serve as very promiscuous binders, possessing both large quantities of potential disorder-based interaction sites and the ability of a single such site to be involved in a large number of interactions. PMID:26712748
Shrivastava, Dipty; Nain, Vikrant; Sahi, Shakti; Verma, Anju; Sharma, Priyanka; Sharma, Prakash Chand; Kumar, Polumetla Ananda
2011-01-22
Resistance (R) proteins recognize molecular signatures of pathogen infection and activate downstream hypersensitive response signalling in plants. R proteins work as molecular switches for pathogen defence signalling and represent one of the largest plant gene families. Hence, understanding the molecular structure and function of R proteins has been of paramount importance for plant biologists. The present study is aimed at predicting the structure of R protein signalling domains (CC-NBS) by creating a homology model, refining and optimising the model by molecular dynamics simulation, and comparing ADP and ATP binding. Based on sequence similarity with proteins of known structure, CC-NBS domains were initially modelled using CED-4 (cell death abnormality protein) and APAF-1 (apoptotic protease activating factor) as multiple templates. The final CC-NBS structural model was built and optimized by molecular dynamics simulation for 5 nanoseconds (ns). Docking of ADP and ATP at the active site shows that both ligands bind specifically to the same residues, with a minor difference (1 kcal/mol) in binding energy. The sharing of a binding site by ADP and ATP and the low difference in their binding site make CC-NBS suitable for working as a molecular switch. Furthermore, structural superimposition shows that the CC-NBS and the CARD (caspase recruitment domain) domain of CED-4 have a low RMSD value of 0.9 Å. Availability of a 3D structural model for both CC and NBS domains will help in gaining deeper insight into these pathogen defence genes.
Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso
2016-12-27
Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.
Shishkov, Alexander; Bogacheva, Elena; Fedorova, Natalia; Ksenofontov, Alexander; Badun, Gennadii; Radyukhin, Victor; Lukashina, Elena; Serebryakova, Marina; Dolgov, Alexey; Chulichkov, Alexey; Dobrov, Evgeny; Baratova, Lyudmila
2011-12-01
The structure of the C-terminal domain of the influenza virus A matrix M1 protein, for which X-ray diffraction data were still missing, was studied in acidic solution. Matrix M1 protein was bombarded with thermally-activated tritium atoms, and the resulting intramolecular distribution of the tritium label was analyzed to assess the steric accessibility of the amino acid residues in this protein. This technique revealed that interdomain loops and the C-terminal domain of the protein are the most accessible to labeling with tritium atoms. A model of the spatial arrangement of the C-terminal domain of matrix M1 protein was generated using Rosetta software adjusted to the data obtained by tritium planigraphy experiments. This model suggests that the C-terminal domain is an almost flat layer with a three-α-helical structure. To explain the high level of tritium label incorporation into the C-terminal domain of the M1 protein in an acidic solution, we also used independent experimental approaches (CD spectroscopy, limited proteolysis and MALDI-TOF MS analysis of the proteolysis products, dynamic light scattering and analytical ultracentrifugation), as well as multiple computational algorithms, to analyse the intrinsic protein disorder. Taken together, the results obtained in the present study indicate that the C-terminal domain is weakly structured. We hypothesize that the specific 3D structural peculiarities of the M1 protein revealed in acidic pH solution allow the protein greater structural flexibility and enable it to interact effectively with the components of the host cell. © 2011 The Authors Journal compilation © 2011 FEBS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zemla, A; Lang, D; Kostova, T
2010-11-29
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory - still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that share structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.
Identification of Host Cell Factors Associated with Astrovirus Replication in Caco-2 Cells.
Murillo, Andrea; Vera-Estrella, Rosario; Barkla, Bronwyn J; Méndez, Ernesto; Arias, Carlos F
2015-10-01
Astroviruses are small, nonenveloped viruses with a single-stranded positive-sense RNA genome causing acute gastroenteritis in children and immunocompromised patients. Since positive-sense RNA viruses have frequently been found to replicate in association with membranous structures, in this work we characterized the replication of the human astrovirus serotype 8 strain Yuc8 in Caco-2 cells, using density gradient centrifugation and free-flow zonal electrophoresis (FFZE) to fractionate cellular membranes. Structural and nonstructural viral proteins, positive- and negative-sense viral RNA, and infectious virus particles were found to be associated with a distinct population of membranes separated by FFZE. The cellular proteins associated with this membrane population in infected and mock-infected cells were identified by tandem mass spectrometry. The results indicated that membranes derived from multiple cell organelles were present in the population. Gene ontology and protein-protein interaction network analysis showed that groups of proteins with roles in fatty acid synthesis and ATP biosynthesis were highly enriched in the fractions of this population in infected cells. Based on this information, we investigated by RNA interference the role that some of the identified proteins might have in the replication cycle of the virus. Silencing of the expression of genes involved in cholesterol (DHCR7, CYP51A1) and fatty acid (FASN) synthesis, phosphatidylinositol (PI4KIIIβ) and inositol phosphate (ITPR3) metabolism, and RNA helicase activity (DDX23) significantly decreased the amounts of Yuc8 genomic and antigenomic RNA, synthesis of the structural protein VP90, and virus yield. These results strongly suggest that astrovirus RNA replication and particle assembly take place in association with modified membranes potentially derived from multiple cell organelles. Astroviruses are common etiological agents of acute gastroenteritis in children and immunocompromised patients. 
More recently, they have been associated with neurological diseases in mammals, including humans, and are also responsible for different pathologies in birds. In this work, we provide evidence that astrovirus RNA replication and virus assembly occur in contact with cell membranes potentially derived from multiple cell organelles and show that membrane-associated cellular proteins involved in lipid metabolism are required for efficient viral replication. Our findings provide information to enhance our knowledge of astrovirus biology and provide information that might be useful for the development of therapeutic interventions to prevent virus replication. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Identification of Host Cell Factors Associated with Astrovirus Replication in Caco-2 Cells
Murillo, Andrea; Vera-Estrella, Rosario; Barkla, Bronwyn J.; Méndez, Ernesto
2015-01-01
ABSTRACT Astroviruses are small, nonenveloped viruses with a single-stranded positive-sense RNA genome causing acute gastroenteritis in children and immunocompromised patients. Since positive-sense RNA viruses have frequently been found to replicate in association with membranous structures, in this work we characterized the replication of the human astrovirus serotype 8 strain Yuc8 in Caco-2 cells, using density gradient centrifugation and free-flow zonal electrophoresis (FFZE) to fractionate cellular membranes. Structural and nonstructural viral proteins, positive- and negative-sense viral RNA, and infectious virus particles were found to be associated with a distinct population of membranes separated by FFZE. The cellular proteins associated with this membrane population in infected and mock-infected cells were identified by tandem mass spectrometry. The results indicated that membranes derived from multiple cell organelles were present in the population. Gene ontology and protein-protein interaction network analysis showed that groups of proteins with roles in fatty acid synthesis and ATP biosynthesis were highly enriched in the fractions of this population in infected cells. Based on this information, we investigated by RNA interference the role that some of the identified proteins might have in the replication cycle of the virus. Silencing of the expression of genes involved in cholesterol (DHCR7, CYP51A1) and fatty acid (FASN) synthesis, phosphatidylinositol (PI4KIIIβ) and inositol phosphate (ITPR3) metabolism, and RNA helicase activity (DDX23) significantly decreased the amounts of Yuc8 genomic and antigenomic RNA, synthesis of the structural protein VP90, and virus yield. These results strongly suggest that astrovirus RNA replication and particle assembly take place in association with modified membranes potentially derived from multiple cell organelles. 
IMPORTANCE Astroviruses are common etiological agents of acute gastroenteritis in children and immunocompromised patients. More recently, they have been associated with neurological diseases in mammals, including humans, and are also responsible for different pathologies in birds. In this work, we provide evidence that astrovirus RNA replication and virus assembly occur in contact with cell membranes potentially derived from multiple cell organelles and show that membrane-associated cellular proteins involved in lipid metabolism are required for efficient viral replication. Our findings provide information to enhance our knowledge of astrovirus biology and provide information that might be useful for the development of therapeutic interventions to prevent virus replication. PMID:26246569
Multiple structure-intrinsic disorder interactions regulate and coordinate Hox protein function
NASA Astrophysics Data System (ADS)
Bondos, Sarah
During animal development, Hox transcription factors determine the fate of developing tissues to generate diverse organs and appendages. Hox proteins are famous for their bizarre mutant phenotypes, such as replacing antennae with legs. Clearly, the functions of individual Hox proteins must be distinct and reliable in vivo, or the organism risks malformation or death. However, within the Hox protein family, the DNA-binding homeodomains are highly conserved and the amino acids that contact DNA are nearly invariant. These observations raise the question: How do different Hox proteins correctly identify their distinct target genes using a common DNA binding domain? One possible means to modulate DNA binding is through the influence of the non-homeodomain protein regions, which differ significantly among Hox proteins. However, genetic approaches never detected intra-protein interactions, and early biochemical attempts were hindered because the special features of "intrinsically disordered" sequences were not appreciated. We propose the first-ever structural model of a Hox protein to explain how specific contacts between distant, intrinsically disordered regions of the protein and the homeodomain regulate DNA binding and coordinate this activity with other Hox molecular functions.
Torres, Jaume; Surya, Wahyu; Li, Yan; Liu, Ding Xiang
2015-01-01
Viroporins are members of a rapidly growing family of channel-forming small polypeptides found in viruses. The present review will be focused on recent structural and protein-protein interaction information involving two viroporins found in enveloped viruses that target the respiratory tract; (i) the envelope protein in coronaviruses and (ii) the small hydrophobic protein in paramyxoviruses. Deletion of these two viroporins leads to viral attenuation in vivo, whereas data from cell culture shows involvement in the regulation of stress and inflammation. The channel activity and structure of some representative members of these viroporins have been recently characterized in some detail. In addition, searches for protein-protein interactions using yeast-two hybrid techniques have shed light on possible functional roles for their exposed cytoplasmic domains. A deeper analysis of these interactions should not only provide a more complete overview of the multiple functions of these viroporins, but also suggest novel strategies that target protein-protein interactions as much needed antivirals. These should complement current efforts to block viroporin channel activity. PMID:26053927
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guan, Jian; Bywaters, Stephanie M.; Brendle, Sarah A.
2015-09-15
Cryo-electron microscopy (cryo-EM) was used to solve the structures of human papillomavirus type 16 (HPV16) complexed with fragments of antibody (Fab) from three different neutralizing monoclonals (mAbs): H16.1A, H16.14J, and H263.A2. The structure-function analysis revealed predominantly monovalent binding of each Fab with capsid interactions that involved multiple loops from symmetry related copies of the major capsid protein. The residues identified in each Fab-virus interface map to a conformational groove on the surface of the capsomer. In addition to the known involvement of the FG and HI loops, the DE loop was also found to constitute the core of each epitope. Surprisingly, the epitope mapping also identified minor contributions by EF and BC loops. Complementary immunological assays included mAb and Fab neutralization. The specific binding characteristics of mAbs correlated with different neutralizing behaviors in pre- and post-attachment neutralization assays. Highlights: • We present HPV16-Fab complexes from neutralizing mAbs: H16.1A, H16.14J, and H263.A2. • The structure-function analysis revealed predominantly monovalent binding of each mAb. • Capsid–Fab interactions involved multiple loops from symmetry related L1 proteins. • Besides the known FG and HI loops, epitope mapping also identified DE, EF, and BC loops. • Neutralizing assays complement the structures to show multiple neutralization mechanisms.
Evolutionary profiles from the QR factorization of multiple sequence alignments
Sethi, Anurag; O'Donoghue, Patrick; Luthey-Schulten, Zaida
2005-01-01
We present an algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of the homologous group. The method, based on the multidimensional QR factorization of numerically encoded multiple sequence alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. We observe a general trend that these smaller, more evolutionarily balanced profiles have comparable and, in many cases, better performance in database searches than conventional profiles containing hundreds of sequences, constructed in an iterative and computationally intensive procedure. For more diverse families or superfamilies, with sequence identity <30%, structural alignments, based purely on the geometry of the protein structures, provide better alignments than pure sequence-based methods. Merging the structure and sequence information allows the construction of accurate profiles for distantly related groups. These structure-based profiles outperformed other sequence-based methods for finding distant homologs and were used to identify a putative class II cysteinyl-tRNA synthetase (CysRS) in several archaea that eluded previous annotation studies. Phylogenetic analysis showed the putative class II CysRSs to be a monophyletic group and homology modeling revealed a constellation of active site residues similar to that in the known class I CysRS. PMID:15741270
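The ordering step described in this abstract, ranking sequences by increasing linear dependence, can be sketched with a greedy pivoted Gram-Schmidt procedure, which selects columns in the same order as a QR factorization with column pivoting. This is only a simplified sketch: the one-hot encoding, the `ALPHABET` string, the `tol` cutoff, and the function names are assumptions made for illustration, and the multidimensional QR of the original method is not reproduced.

```python
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus gap (assumed encoding)


def one_hot(seq):
    """Encode an aligned sequence as a flat 0/1 vector, one block per column."""
    vec = []
    for ch in seq:
        vec.extend(1.0 if ch == a else 0.0 for a in ALPHABET)
    return vec


def pivoted_order(seqs, tol=1e-9):
    """Return sequence indices in pivot order (most independent first).

    Sequences whose residual norm falls below tol are linearly dependent
    on those already chosen and are left out as redundant.
    """
    residuals = [one_hot(s) for s in seqs]
    remaining = list(range(len(seqs)))
    order = []
    while remaining:
        norms = {i: sum(x * x for x in residuals[i]) for i in remaining}
        pivot = max(remaining, key=lambda i: norms[i])
        if norms[pivot] <= tol:
            break
        order.append(pivot)
        remaining.remove(pivot)
        scale = norms[pivot] ** 0.5
        q = [x / scale for x in residuals[pivot]]
        # Remove the pivot direction from every sequence still in play.
        for i in remaining:
            proj = sum(a * b for a, b in zip(residuals[i], q))
            residuals[i] = [a - proj * b for a, b in zip(residuals[i], q)]
    return order
```

Truncating the returned order at a chosen length gives a small, non-redundant subset of sequences, in the spirit of the minimal basis set the abstract mentions.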
Multiple polymer architectures of human Polyhomeotic homolog 3 (PHC3) SAM
Nanyes, David R.; Junco, Sarah E.; Taylor, Alexander B.; Robinson, Angela K.; Patterson, Nicolle L.; Shivarajpur, Ambika; Halloran, Jonathan; Hale, Seth M.; Kaur, Yogeet; Hart, P. John; Kim, Chongwoo A.
2014-01-01
The self-association of sterile alpha motifs (SAMs) into a helical polymer architecture is a critical functional component of a diverse array of proteins. For the Drosophila Polycomb group (PcG) protein Polyhomeotic (Ph), its SAM polymerization serves as the structural foundation to cluster multiple PcG complexes, helping to maintain a silenced chromatin state. Ph SAM shares 64% sequence identity with its human ortholog, PHC3 SAM, and both SAMs polymerize. However, in the context of their larger protein regions, PHC3 SAM forms longer polymers compared to Ph SAM. Motivated to establish the precise structural basis for the differences, if any, between Ph and PHC3 SAM, we determined the crystal structure of the PHC3 SAM polymer. PHC3 SAM utilizes the same SAM-SAM interaction as the Ph SAM six-fold repeat polymer. Yet, PHC3 SAM polymerizes utilizing just five SAMs per turn of the helical polymer rather than the typical six per turn observed for all SAM polymers reported to date. Structural analysis suggested that malleability of the PHC3 SAM would allow formation of not just the five-fold repeat structure but possibly others. Indeed, a second PHC3 SAM polymer in a different crystal form forms a six-fold repeat polymer. These results suggest that the polymers formed by PHC3 SAM, and likely others, are quite dynamic. The functional consequence of the variable PHC3 SAM polymers may be to create different chromatin architectures. PMID:25044168
Hao, Ge-Fei; Xu, Wei-Fang; Yang, Sheng-Gang; Yang, Guang-Fu
2015-01-01
Protein and peptide structure predictions are of paramount importance for understanding their functions, as well as the interactions with other molecules. However, the use of molecular simulation techniques to directly predict the peptide structure from the primary amino acid sequence is always hindered by the rough topology of the conformational space and the limited simulation time scale. We developed here a new strategy, named Multiple Simulated Annealing-Molecular Dynamics (MSA-MD) to identify the native states of a peptide and miniprotein. A cluster of near native structures could be obtained by using the MSA-MD method, which turned out to be significantly more efficient in reaching the native structure compared to continuous MD and conventional SA-MD simulation. PMID:26492886
NASA Astrophysics Data System (ADS)
Ben-Nissan, Gili; Chotiner, Almog; Tarnavsky, Mark; Sharon, Michal
2016-06-01
Missense mutations that lead to the expression of mutant proteins carrying single amino acid substitutions are the cause of numerous diseases. Unlike gene lesions, insertions, deletions, nonsense mutations, or modified RNA splicing, which affect the length of a polypeptide, or determine whether a polypeptide is translated at all, missense mutations exert more subtle effects on protein structure, which are often difficult to evaluate. Here, we took advantage of the spectral resolution afforded by the EMR Orbitrap platform, to generate a mass spectrometry-based approach relying on simultaneous measurements of the wild-type protein and the missense variants. This approach not only considerably shortens the analysis time due to the concurrent acquisition but, more importantly, enables direct comparisons between the wild-type protein and the variants, allowing identification of even subtle structural changes. We demonstrate our approach using the Parkinson's-associated protein, DJ-1. Together with the wild-type protein, we examined two missense mutants, DJ-1A104T and DJ-1D149A, which lead to early-onset familial Parkinson's disease. Gas-phase, thermal, and chemical stability assays indicate clear alterations in the conformational stability of the two mutants: the structural stability of DJ-1D149A is reduced, whereas that of DJ-1A104T is enhanced. Overall, we anticipate that the methodology presented here will be applicable to numerous other missense mutants, promoting the structural investigations of multiple variants of the same protein.
Co-transcriptional nuclear actin dynamics
Percipalle, Piergiorgio
2013-01-01
Actin is a key player for nuclear structure and function regulating both chromosome organization and gene activity. In the cell nucleus actin interacts with many different proteins. Among these proteins several studies have identified classical nuclear factors involved in chromatin structure and function, transcription and RNA processing as well as proteins that are normally involved in controlling the actin cytoskeleton. These discoveries have raised the possibility that nuclear actin performs its multi task activities through tight interactions with different sets of proteins. This high degree of promiscuity in the spectrum of protein-to-protein interactions correlates well with the conformational plasticity of actin and the ability to undergo regulated changes in its polymerization states. Several of the factors involved in controlling head-to-tail actin polymerization have been shown to be in the nucleus where they seem to regulate gene activity. By focusing on the multiple tasks performed by actin and actin-binding proteins, possible models of how actin dynamics controls the different phases of the RNA polymerase II transcription cycle are being identified. PMID:23138849
Analysis of Structural Features Contributing to Weak Affinities of Ubiquitin/Protein Interactions.
Cohen, Ariel; Rosenthal, Eran; Shifman, Julia M
2017-11-10
Ubiquitin is a small protein that enables one of the most common post-translational modifications, where the whole ubiquitin molecule is attached to various target proteins, forming mono- or polyubiquitin conjugations. As a prototypical multispecific protein, ubiquitin interacts non-covalently with a variety of proteins in the cell, including ubiquitin-modifying enzymes and ubiquitin receptors that recognize signals from ubiquitin-conjugated substrates. To enable recognition of multiple targets and to support fast dissociation from the ubiquitin modifying enzymes, ubiquitin/protein interactions are characterized with low affinities, frequently in the higher μM and lower mM range. To determine how structure encodes low binding affinity of ubiquitin/protein complexes, we analyzed structures of more than a hundred such complexes compiled in the Ubiquitin Structural Relational Database. We calculated various structure-based features of ubiquitin/protein binding interfaces and compared them to the same features of general protein-protein interactions (PPIs) with various functions and generally higher affinities. Our analysis shows that ubiquitin/protein binding interfaces on average do not differ in size and shape complementarity from interfaces of higher-affinity PPIs. However, they contain fewer favorable hydrogen bonds and more unfavorable hydrophobic/charge interactions. We further analyzed how binding interfaces change upon affinity maturation of ubiquitin toward its target proteins. We demonstrate that while different features are improved in different experiments, the majority of the evolved complexes exhibit better shape complementarity and hydrogen bond pattern compared to wild-type complexes. Our analysis helps to understand how low-affinity PPIs have evolved and how they could be converted into high-affinity PPIs. Copyright © 2017 Elsevier Ltd. All rights reserved.
2016-01-01
ProXL is a Web application and accompanying database designed for sharing, visualizing, and analyzing bottom-up protein cross-linking mass spectrometry data with an emphasis on structural analysis and quality control. ProXL is designed to be independent of any particular software pipeline. The import process is simplified by the use of the ProXL XML data format, which shields developers of data importers from the relative complexity of the relational database schema. The database and Web interfaces function equally well for any software pipeline and allow data from disparate pipelines to be merged and contrasted. ProXL includes robust public and private data sharing capabilities, including a project-based interface designed to ensure security and facilitate collaboration among multiple researchers. ProXL provides multiple interactive and highly dynamic data visualizations that facilitate structural-based analysis of the observed cross-links as well as quality control. ProXL is open-source, well-documented, and freely available at https://github.com/yeastrc/proxl-web-app. PMID:27302480
Riffle, Michael; Jaschob, Daniel; Zelter, Alex; Davis, Trisha N
2016-08-05
ProXL is a Web application and accompanying database designed for sharing, visualizing, and analyzing bottom-up protein cross-linking mass spectrometry data with an emphasis on structural analysis and quality control. ProXL is designed to be independent of any particular software pipeline. The import process is simplified by the use of the ProXL XML data format, which shields developers of data importers from the relative complexity of the relational database schema. The database and Web interfaces function equally well for any software pipeline and allow data from disparate pipelines to be merged and contrasted. ProXL includes robust public and private data sharing capabilities, including a project-based interface designed to ensure security and facilitate collaboration among multiple researchers. ProXL provides multiple interactive and highly dynamic data visualizations that facilitate structural-based analysis of the observed cross-links as well as quality control. ProXL is open-source, well-documented, and freely available at https://github.com/yeastrc/proxl-web-app .
NASA Astrophysics Data System (ADS)
Hong, Mei
1999-08-01
We describe an approach to efficiently determine the backbone conformation of solid proteins that utilizes selective and extensive 13C labeling in conjunction with two-dimensional magic-angle-spinning NMR. The selective 13C labeling approach aims to reduce line broadening and other multispin complications encountered in solid-state NMR of uniformly labeled proteins while still enhancing the sensitivity of NMR spectra. It is achieved by using specifically labeled glucose or glycerol as the sole carbon source in the protein expression medium. For amino acids synthesized in the linear part of the biosynthetic pathways, [1-13C]glucose preferentially labels the ends of the side chains, while [2-13C]glycerol labels the Cα of these residues. Amino acids produced from the citric-acid cycle are labeled in a more complex manner. Information on the secondary structure of such a labeled protein was obtained by measuring multiple backbone torsion angles φ simultaneously, using an isotropic-anisotropic 2D correlation technique, the HNCH experiment. Initial experiments for resonance assignment of a selectively 13C labeled protein were performed using 15N-13C 2D correlation spectroscopy. From the time dependence of the 15N-13C dipolar coherence transfer, both intraresidue and interresidue connectivities can be observed, thus yielding partial sequential assignment. We demonstrate the selective 13C labeling and these 2D NMR experiments on an 8.5-kDa model protein, ubiquitin. This isotope-edited NMR approach is expected to facilitate the structure determination of proteins in the solid state.
Núñez-Vivanco, Gabriel; Valdés-Jiménez, Alejandro; Besoaín, Felipe; Reyes-Parada, Miguel
2016-01-01
Since the structure of proteins is more conserved than the sequence, the identification of conserved three-dimensional (3D) patterns among a set of proteins, can be important for protein function prediction, protein clustering, drug discovery and the establishment of evolutionary relationships. Thus, several computational applications to identify, describe and compare 3D patterns (or motifs) have been developed. Often, these tools consider a 3D pattern as that described by the residues surrounding co-crystallized/docked ligands available from X-ray crystal structures or homology models. Nevertheless, many of the protein structures stored in public databases do not provide information about the location and characteristics of ligand binding sites and/or other important 3D patterns such as allosteric sites, enzyme-cofactor interaction motifs, etc. This makes necessary the development of new ligand-independent methods to search and compare 3D patterns in all available protein structures. Here we introduce Geomfinder, an intuitive, flexible, alignment-free and ligand-independent web server for detailed estimation of similarities between all pairs of 3D patterns detected in any two given protein structures. We used around 1100 protein structures to form pairs of proteins which were assessed with Geomfinder. In these analyses each protein was considered in only one pair (e.g. in a subset of 100 different proteins, 50 pairs of proteins can be defined). 
Thus: (a) Geomfinder detected identical pairs of 3D patterns in a series of monoamine oxidase-B structures, which corresponded to the effectively similar ligand binding sites in these proteins; (b) we identified structural similarities among pairs of protein structures which are targets of compounds such as acarbose, benzamidine, adenosine triphosphate and pyridoxal phosphate; these similar 3D patterns are not detected using sequence-based methods; (c) the detailed evaluation of three specific cases showed the versatility of Geomfinder, which was able to discriminate between similar and different 3D patterns related to binding sites of common substrates in a range of diverse proteins. Geomfinder allows the detection of similar 3D patterns between any pair of protein structures, regardless of the divergence among their amino acid sequences. Although the software is not intended for simultaneous multiple comparisons in a large number of proteins, it can be particularly useful in cases such as the structure-based design of multitarget drugs, where a detailed analysis of 3D pattern similarities between a few selected protein targets is essential.
Future Directions of Structural Mass Spectrometry using Hydroxyl Radical Footprinting
Kiselar, Janna G.; Chance, Mark R.
2010-01-01
Hydroxyl radical protein footprinting coupled to mass spectrometry has been developed over the last decade and has matured into a powerful method for analyzing protein structure and dynamics. It has been successfully applied in the analysis of protein structure, protein folding, protein dynamics, and protein-protein and protein-DNA interactions. Using synchrotron radiolysis, exposures of proteins to a “white” x-ray beam for milliseconds provide sufficient oxidative modifications to surface amino acid side chains that can be easily detected and quantified by mass spectrometry. Thus, conformational changes in proteins or protein complexes can be examined using a time-resolved approach, which would be a valuable method for the study of macromolecular dynamics. In this review, we describe a new application of hydroxyl radical protein footprinting to probe the time evolution of the calcium-dependent conformational changes of gelsolin on the millisecond timescale. The data suggest a cooperative transition as multiple sites in different molecular sub-domains have similar rates of conformational change. These findings demonstrate that time-resolved protein footprinting is suitable for studies of protein dynamics that occur over periods ranging from milliseconds to seconds. In this review we also show how the structural resolution and sensitivity of the technology can be improved. The hydroxyl radical varies in its reactivity to different side chains by over two orders of magnitude; thus, oxidation of amino acid side chains of lower reactivity is more rarely observed in such experiments. Here we demonstrate that a selected reaction monitoring (SRM)-based method can be utilized for quantification of oxidized species, improving the signal-to-noise ratio. This expansion of the set of oxidized residues of lower reactivity will improve the overall structural resolution of the technique.
This approach is also suggested as a basis for developing hypothesis driven structural mass spectrometry experiments. PMID:20812376
Microbial biotin protein ligases aid in understanding holocarboxylase synthetase deficiency.
Pendini, Nicole R; Bailey, Lisa M; Booker, Grant W; Wilce, Matthew C; Wallace, John C; Polyak, Steven W
2008-01-01
The attachment of biotin onto the biotin-dependent enzymes is catalysed by biotin protein ligase (BPL), also known as holocarboxylase synthetase (HCS) in mammals. Mammals contain five biotin-enzymes that participate in a number of important metabolic pathways such as fatty acid biogenesis, gluconeogenesis and amino acid catabolism. All mammalian biotin-enzymes are post-translationally biotinylated, and therefore activated, through the action of a single HCS. Substrate recognition by BPLs occurs through conserved structural cues that govern the specificity of biotinylation. Defects in biotin metabolism, including HCS, give rise to multiple carboxylase deficiency (MCD). Here we review the literature on this important enzyme. In particular, we focus on the new information that has been learned about BPLs from a number of recently published protein structures. Through molecular modelling studies, insights into the structural basis of HCS deficiency in MCD are discussed.
Self-generated covalent cross-links in the cell-surface adhesins of Gram-positive bacteria.
Baker, Edward N; Squire, Christopher J; Young, Paul G
2015-10-01
The ability of bacteria to adhere to other cells or to surfaces depends on long, thin adhesive structures that are anchored to their cell walls. These structures include extended protein oligomers known as pili and single, multi-domain polypeptides, mostly based on multiple tandem Ig-like domains. Recent structural studies have revealed the widespread presence of covalent cross-links, not previously seen within proteins, which stabilize these domains. The cross-links discovered so far are either isopeptide bonds that link lysine side chains to the side chains of asparagine or aspartic acid residues or ester bonds between threonine and glutamine side chains. These bonds appear to be formed by spontaneous intramolecular reactions as the proteins fold and are strategically placed so as to impart considerable mechanical strength. © 2015 Authors; published by Portland Press Limited.
Posttranslational Modifications Regulate the Postsynaptic Localization of PSD-95.
Vallejo, Daniela; Codocedo, Juan F; Inestrosa, Nibaldo C
2017-04-01
The postsynaptic density (PSD) consists of a lattice-like array of interacting proteins that organizes and stabilizes synaptic receptors, ion channels, structural proteins, and signaling molecules required for normal synaptic transmission and synaptic function. The scaffolding and hub protein postsynaptic density protein-95 (PSD-95) is a major element of central chemical synapses and interacts with glutamate receptors, cell adhesion molecules, and cytoskeletal elements. In fact, PSD-95 can regulate basal synaptic stability as well as the activity-dependent structural plasticity of the PSD and, therefore, of the excitatory chemical synapse. Several studies have shown that PSD-95 is highly enriched at excitatory synapses and have identified multiple protein structural domains and protein-protein interactions that mediate PSD-95 function and trafficking to the postsynaptic region. PSD-95 is also a target of several signaling pathways that induce posttranslational modifications, including palmitoylation, phosphorylation, ubiquitination, nitrosylation, and neddylation; these modifications determine the synaptic stability and function of PSD-95 and thus regulate the fates of individual dendritic spines in the nervous system. In the present work, we review the posttranslational modifications that regulate the synaptic localization of PSD-95 and describe their functional consequences. We also explore the signaling pathways that induce such changes.
The modular architecture of protein-protein binding interfaces.
Reichmann, D; Rahat, O; Albeck, S; Meged, R; Dym, O; Schreiber, G
2005-01-04
Protein-protein interactions are essential for life. Yet, our understanding of the general principles governing binding is not complete. In the present study, we show that the interface between proteins is built in a modular fashion; each module is comprised of a number of closely interacting residues, with few interactions between the modules. The boundaries between modules are defined by clustering the contact map of the interface. We show that mutations in one module do not affect residues located in a neighboring module. As a result, the structural and energetic consequences of the deletion of entire modules are surprisingly small. To the contrary, within their module, mutations cause complex energetic and structural consequences. Experimentally, this phenomenon is shown on the interaction between TEM1-beta-lactamase and beta-lactamase inhibitor protein (BLIP) by using multiple-mutant analysis and x-ray crystallography. Replacing an entire module of five interface residues with Ala created a large cavity in the interface, with no effect on the detailed structure of the remaining interface. The modular architecture of binding sites, which resembles human engineering design, greatly simplifies the design of new protein interactions and provides a feasible view of how these interactions evolved.
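The idea of deriving modules from an interface contact map can be illustrated with a minimal sketch. It assumes the contact map is given as a list of residue pairs and uses connected components (via union-find) as a simple stand-in for the clustering step; the abstract does not specify the clustering algorithm used, and the function name and residue labels below are hypothetical.

```python
def interface_modules(contacts):
    """Group interface residues into modules.

    contacts: iterable of (residue_a, residue_b) pairs that are in contact.
    Returns a list of modules, each a sorted list of residue labels.
    Assumption: a module is a connected component of the contact graph.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in contacts:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    groups = {}
    for residue in list(parent):
        groups.setdefault(find(residue), set()).add(residue)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical contacts between chain A and chain B residues.
example_contacts = [("A:10", "B:5"), ("B:5", "A:12"), ("A:40", "B:70")]
modules = interface_modules(example_contacts)
```

In this toy input, residues A:10, A:12 and B:5 form one module and A:40 and B:70 form another, with no contacts between the two groups.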
Materiomics: biological protein materials, from nano to macro
Cranford, Steven; Buehler, Markus J
2010-01-01
Materiomics is an emerging field of science that provides a basis for multiscale material system characterization, inspired in part by natural, for example, protein-based materials. Here we outline the scope and explain the motivation of the field of materiomics, as well as demonstrate the benefits of a materiomic approach in the understanding of biological and natural materials as well as in the design of de novo materials. We discuss recent studies that exemplify the impact of materiomics – discovering Nature’s complexity through a materials science approach that merges concepts of material and structure throughout all scales and incorporates feedback loops that facilitate sensing and resulting structural changes at multiple scales. The development and application of materiomics is illustrated for the specific case of protein-based materials, which constitute the building blocks of a variety of biological systems such as tendon, bone, skin, spider silk, cells, and tissue, as well as natural composite material systems (a combination of protein-based and inorganic constituents) such as nacre and mollusk shells, and other natural multiscale systems such as cellulose-based plant and wood materials. An important trait of these materials is that they display distinctive hierarchical structures across multiple scales, where molecular details are exhibited in macroscale mechanical responses. Protein materials are intriguing examples of materials that balance multiple tasks, representing some of the most sustainable material solutions that integrate structure and function despite severe limitations in the quality and quantity of material building blocks. However, up until now, our attempts to analyze and replicate Nature’s materials have been hindered by our lack of fundamental understanding of these materials’ intricate hierarchical structures, scale-bridging mechanisms, and complex material components that bestow protein-based materials their unique properties. 
Recent advances in analytical tools and experimental methods allow a holistic view of such a hierarchical biological material system. The integration of these approaches and amalgamation of material properties at all scale levels to develop a complete description of a material system falls within the emerging field of materiomics. Materiomics is the result of the convergence of engineering and materials science with experimental and computational biology in the context of natural and synthetic materials. Through materiomics, fundamental advances in our understanding of structure–property–process relations of biological systems contribute to the mechanistic understanding of certain diseases and facilitate the development of novel biological, biologically inspired, and completely synthetic materials for applications in medicine (biomaterials), nanotechnology, and engineering. PMID:24198478
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mikhailov, Victor S.; N. K. Koltzov Institute of Developmental Biology, Russian Academy of Sciences, Moscow 117808; Vanarsdall, Adam L.
2008-01-20
DNA-binding protein (DBP) of Autographa californica multiple nucleopolyhedrovirus (AcMNPV) was expressed as an N-terminal His6-tag fusion using a recombinant baculovirus and purified to near homogeneity. Purified DBP formed oligomers that were crosslinked by redox reagents resulting in predominantly protein dimers and tetramers. In gel retardation assays, DBP showed a high affinity for single-stranded oligonucleotides and was able to compete with another baculovirus SSB protein, LEF-3, for binding sites. DBP binding protected ssDNA against hydrolysis by a baculovirus alkaline nuclease AN/LEF-3 complex. Partial proteolysis by trypsin revealed a domain structure of DBP that is required for interaction with DNA and that can be disrupted by thermal treatment. Binding to ssDNA, but not to dsDNA, changed the pattern of proteolytic fragments of DBP indicating adjustments in protein structure upon interaction with ssDNA. DBP was capable of unwinding short DNA duplexes and also promoted the renaturation of long complementary strands of ssDNA into duplexes. The unwinding and renaturation activities of DBP, as well as the DNA binding activity, were sensitive to sulfhydryl reagents and were inhibited by oxidation of thiol groups with diamide or by alkylation with N-ethylmaleimide. A high affinity of DBP for ssDNA and its unwinding and renaturation activities confirmed identification of DBP as a member of the SSB/recombinase family. These activities and a tight association with subnuclear structures suggest that DBP is a component of the virogenic stroma that is involved in the processing of replicative intermediates.
SBION: A Program for Analyses of Salt-Bridges from Multiple Structure Files.
Gupta, Parth Sarthi Sen; Mondal, Sudipta; Mondal, Buddhadev; Islam, Rifat Nawaz Ul; Banerjee, Shyamashree; Bandyopadhyay, Amal K
2014-01-01
Salt-bridges and network salt-bridges are specific electrostatic interactions that contribute to the overall stability of proteins. In the hierarchical protein folding model, these interactions play a crucial role in the nucleation process. The advent and growth of protein structure databases and their availability in the public domain have created an urgent need for context-dependent, rapid analysis of salt-bridges. While such analyses on a single protein are cumbersome and time-consuming, batch analyses need efficient software for rapid topological scanning of a large number of proteins to extract details on (i) the fraction of salt-bridge residues (acidic and basic), (ii) chain-specific intra-molecular salt-bridges, (iii) inter-molecular salt-bridges (protein-protein interactions) in all possible binary combinations, (iv) network salt-bridges and (v) the secondary structure distribution of salt-bridge residues. To the best of our knowledge, such efficient software is not available in the public domain. We have therefore developed a program, SBION, which can perform all of the above-mentioned computations for any number of proteins with any number of chains at any given ion-pair distance. It is highly efficient, fast, error-free and user-friendly. SBION has potential for applications in structural and comparative bioinformatics studies. SBION is freely available for non-commercial/academic institutions on formal request to the corresponding author (akbanerjee@biotech.buruniv.ac.in).
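The kind of distance-based scan described above can be sketched in a few lines. This is not SBION's code or interface: the function name, the input layout, the choice of charged side-chain atoms, and the 4.0 Å default cutoff are all assumptions made for illustration (the abstract only says the ion-pair distance is user-selectable).

```python
import math

# Charged side-chain atoms per residue type (PDB atom names; an assumed convention).
ACIDIC = {"ASP": ("OD1", "OD2"), "GLU": ("OE1", "OE2")}
BASIC = {"LYS": ("NZ",), "ARG": ("NE", "NH1", "NH2"), "HIS": ("ND1", "NE2")}


def find_salt_bridges(atoms, cutoff=4.0):
    """Return residue-level salt bridges within `cutoff` angstroms.

    atoms: list of (chain, resname, resnum, atomname, (x, y, z)) tuples.
    Each result is (acidic_residue, basic_residue, "intra" or "inter"),
    where "intra" means both residues are on the same chain.
    """
    acid = [a for a in atoms if a[3] in ACIDIC.get(a[1], ())]
    base = [a for a in atoms if a[3] in BASIC.get(a[1], ())]
    bridges = set()
    for a in acid:
        for b in base:
            if math.dist(a[4], b[4]) <= cutoff:
                kind = "intra" if a[0] == b[0] else "inter"
                bridges.add(((a[0], a[1], a[2]), (b[0], b[1], b[2]), kind))
    return sorted(bridges)


# Hypothetical coordinates, for illustration only.
example_atoms = [
    ("A", "ASP", 10, "OD1", (0.0, 0.0, 0.0)),
    ("A", "LYS", 20, "NZ", (3.0, 0.0, 0.0)),
    ("B", "ARG", 5, "NH1", (0.0, 3.5, 0.0)),
    ("A", "GLU", 30, "OE1", (50.0, 0.0, 0.0)),
]
bridges = find_salt_bridges(example_atoms)
```

Here ASP 10 pairs with LYS 20 on the same chain and with ARG 5 on chain B, while GLU 30 is too far from any basic atom. A residue that appears in more than one bridge, like ASP 10, is a starting point for identifying network salt-bridges.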
The poly(C)-binding proteins: a multiplicity of functions and a search for mechanisms.
Makeyev, Aleksandr V; Liebhaber, Stephen A
2002-01-01
The poly(C) binding proteins (PCBPs) are encoded at five dispersed loci in the mouse and human genomes. These proteins, which can be divided into two groups, hnRNPs K/J and the alphaCPs (alphaCP1-4), are linked by a common evolutionary history, a shared triple KH domain configuration, and by their poly(C) binding specificity. Given these conserved characteristics it is remarkable to find a substantial diversity in PCBP functions. The roles of these proteins in mRNA stabilization, translational activation, and translational silencing suggest a complex and diverse set of post-transcriptional control pathways. Their additional putative functions in transcriptional control and as structural components of important DNA-protein complexes further support their remarkable structural and functional versatility. Clearly the identification of additional binding targets and delineation of corresponding control mechanisms and effector pathways will establish highly informative models for further exploration. PMID:12003487
The poly(C)-binding proteins: a multiplicity of functions and a search for mechanisms.
Makeyev, Aleksandr V; Liebhaber, Stephen A
2002-03-01
The poly(C) binding proteins (PCBPs) are encoded at five dispersed loci in the mouse and human genomes. These proteins, which can be divided into two groups, hnRNPs K/J and the alphaCPs (alphaCP1-4), are linked by a common evolutionary history, a shared triple KH domain configuration, and by their poly(C) binding specificity. Given these conserved characteristics it is remarkable to find a substantial diversity in PCBP functions. The roles of these proteins in mRNA stabilization, translational activation, and translational silencing suggest a complex and diverse set of post-transcriptional control pathways. Their additional putative functions in transcriptional control and as structural components of important DNA-protein complexes further support their remarkable structural and functional versatility. Clearly the identification of additional binding targets and delineation of corresponding control mechanisms and effector pathways will establish highly informative models for further exploration.
Uehara, Shota; Tanaka, Shigenori
2017-04-24
Protein flexibility is a major hurdle in current structure-based virtual screening (VS). In spite of the recent advances in high-performance computing, protein-ligand docking methods still demand tremendous computational cost to take into account the full degree of protein flexibility. In this context, ensemble docking has proven its utility and efficiency for VS studies, but it still needs a rational and efficient method to select and/or generate multiple protein conformations. Molecular dynamics (MD) simulations are useful to produce distinct protein conformations without abundant experimental structures. In this study, we present a novel strategy that makes use of cosolvent-based molecular dynamics (CMD) simulations for ensemble docking. By mixing small organic molecules into a solvent, CMD can stimulate dynamic protein motions and induce partial conformational changes of binding pocket residues appropriate for the binding of diverse ligands. The present method has been applied to six diverse target proteins and assessed by VS experiments using many actives and decoys of DEKOIS 2.0. The simulation results have revealed that the CMD is beneficial for ensemble docking. Utilizing cosolvent simulation allows the generation of druggable protein conformations, improving the VS performance compared with the use of a single experimental structure or ensemble docking by standard MD with pure water as the solvent.
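The scoring step of ensemble docking can be sketched as follows. The sketch assumes that each ligand has already been docked against every conformation in the ensemble, that lower scores are better, and that a ligand is assigned its best score over the ensemble; the abstract does not state how scores were combined, and all names and numbers are hypothetical.

```python
def ensemble_rank(scores):
    """Rank ligands by their best docking score across an ensemble.

    scores: {ligand: {conformation: score}}, lower score = better (assumed).
    Returns a list of (ligand, (best_conformation, best_score)), best first.
    """
    best = {}
    for ligand, per_conformation in scores.items():
        conformation, score = min(per_conformation.items(), key=lambda kv: kv[1])
        best[ligand] = (conformation, score)
    return sorted(best.items(), key=lambda kv: kv[1][1])


# Hypothetical scores against one experimental and one simulated conformation.
example_scores = {
    "lig1": {"xray": -6.0, "cmd_1": -8.5},
    "lig2": {"xray": -7.0, "cmd_1": -6.5},
}
ranking = ensemble_rank(example_scores)
```

With a single experimental structure, lig2 would rank first; adding the second conformation lets lig1 find a better-fitting pocket and move ahead, which is the effect ensemble docking is meant to capture.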
2012-01-01
Background The NCBI Conserved Domain Database (CDD) consists of a collection of multiple sequence alignments of protein domains that are at various stages of being manually curated into evolutionary hierarchies based on conserved and divergent sequence and structural features. These domain models are annotated to provide insights into the relationships between sequence, structure and function via web-based BLAST searches. Results Here we automate the generation of conserved domain (CD) hierarchies using a combination of heuristic and Markov chain Monte Carlo (MCMC) sampling procedures and starting from a (typically very large) multiple sequence alignment. This procedure relies on statistical criteria to define each hierarchy based on the conserved and divergent sequence patterns associated with protein functional-specialization. At the same time this facilitates the sequence and structural annotation of residues that are functionally important. These statistical criteria also provide a means to objectively assess the quality of CD hierarchies, a non-trivial task considering that the protein subgroups are often very distantly related—a situation in which standard phylogenetic methods can be unreliable. Our aim here is to automatically generate (typically sub-optimal) hierarchies that, based on statistical criteria and visual comparisons, are comparable to manually curated hierarchies; this serves as the first step toward the ultimate goal of obtaining optimal hierarchical classifications. A plot of runtimes for the most time-intensive (non-parallelizable) part of the algorithm indicates a nearly linear time complexity so that, even for the extremely large Rossmann fold protein class, results were obtained in about a day. Conclusions This approach automates the rapid creation of protein domain hierarchies and thus will eliminate one of the most time consuming aspects of conserved domain database curation. 
At the same time, it also facilitates protein domain annotation by identifying those pattern residues that most distinguish each protein domain subgroup from other related subgroups. PMID:22726767
Hafsa, Noor E.; Arndt, David; Wishart, David S.
2015-01-01
The Chemical Shift Index or CSI 3.0 (http://csi3.wishartlab.com) is a web server designed to accurately identify the location of secondary and super-secondary structures in protein chains using only nuclear magnetic resonance (NMR) backbone chemical shifts and their corresponding protein sequence data. Unlike earlier versions of CSI, which only identified three types of secondary structure (helix, β-strand and coil), CSI 3.0 now identifies a total of 11 types of secondary and super-secondary structures, including helices, β-strands, coil regions, five common β-turns (type I, II, I′, II′ and VIII), β hairpins as well as interior and edge β-strands. CSI 3.0 accepts experimental NMR chemical shift data in multiple formats (NMR Star 2.1, NMR Star 3.1 and SHIFTY) and generates colorful CSI plots (bar graphs) and secondary/super-secondary structure assignments. The output can be readily used as constraints for structure determination and refinement or the images may be used for presentations and publications. CSI 3.0 uses a pipeline of several well-tested, previously published programs to identify the secondary and super-secondary structures in protein chains. Comparisons with secondary and super-secondary structure assignments made via standard coordinate analysis programs such as DSSP, STRIDE and VADAR on high-resolution protein structures solved by X-ray and NMR show >90% agreement with those made by CSI 3.0. PMID:25979265
Srinivasan, E; Rajasekaran, R
2017-07-25
The genetic substitution mutation of Cys146Arg in the SOD1 protein is predominantly found in the Japanese population suffering from familial amyotrophic lateral sclerosis (FALS). A complete study of the biophysical aspects of this particular missense mutation through conformational analysis and producing free energy landscapes could provide an insight into the pathogenic mechanism of ALS disease. In this study, we utilized general molecular dynamics simulations along with computational predictions to assess the structural characterization of the protein as well as the conformational preferences of monomeric wild type and mutant SOD1. Our static analysis, accomplished through multiple programs, predicted the deleterious and destabilizing effect of mutant SOD1. Subsequently, comparative molecular dynamic studies performed on the wild type and mutant SOD1 indicated a loss in the protein conformational stability and flexibility. We observed the mutational consequences not only in local but also in long-range variations in the structural properties of the SOD1 protein. Long-range intramolecular protein interactions decrease upon mutation, resulting in less compact structures in the mutant protein rather than in the wild type, suggesting that the mutant structures are less stable than the wild type SOD1. We also presented the free energy landscape to study the collective motion of protein conformations through principal component analysis for the wild type and mutant SOD1. Overall, the study assisted in revealing the cause of the structural destabilization and protein misfolding via structural characterization, secondary structure composition and free energy landscapes. Hence, the computational framework in our study provides a valuable direction for the search for the cure against fatal FALS.
Cocco, Simona; Monasson, Remi; Weigt, Martin
2013-01-01
Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant ‘patterns’ of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold. PMID:23990764
Construction of protocellular structures under simulated primitive earth conditions
NASA Astrophysics Data System (ADS)
Yanagawa, Hiroshi; Ogawa, Yoko; Kojima, Kiyotsugu; Ito, Masahiko
1988-09-01
We have developed experimental approaches for the construction of protocellular structures under simulated primitive earth conditions and studied their formation and characteristics. Three types of envelopes (protein envelopes, lipid envelopes, and lipid-protein envelopes) are considered as candidates for protocellular structures. Simple protein envelopes and lipid envelopes are presumed to have originated at an early stage of chemical evolution, interacted with each other and then evolved into more complex envelopes composed of both lipids and proteins. Three kinds of protein envelopes were constructed in situ from amino acids under simulated primitive earth conditions such as a fresh water tide pool, a warm sea, and a submarine hydrothermal vent. One protein envelope was formed from a mixture of amino acid amides at 80 °C using multiple hydration-dehydration cycles. Marigranules, protein envelope structures, were produced from mixtures of glycine and acidic, basic and aromatic amino acids at 105 °C in a modified sea medium enriched with essential transition elements. Thermostable microspheres were also formed from a mixture of glycine, alanine, valine, and aspartic acid at 250 °C and above. The microspheres did not form at lower temperatures and consist of silicates and peptide-like polymers containing imide bonds and amino acid residues enriched in valine. Amphiphilic proteins with molecular weights of 2000 were necessary for the formation of the protein envelopes. Stable lipid envelopes were formed from different dialkyl phospholipids and fatty acids. Large, stable, lipid-protein envelopes were formed from egg lecithin and the solubilized marigranules. Polycations such as polylysine and polyhistidine, or basic proteins such as lysozyme and cytochrome c also stabilized lipid-protein envelopes.
NASA Astrophysics Data System (ADS)
Korkmaz, Nuriye; Ostermann, Kai; Rödel, Gerhard
2011-03-01
Surface layer proteins have the appealing property to self-assemble in nanosized arrays in solution and on solid substrates. In this work, we characterize the formation of assembly structures of the recombinant surface layer protein SbsC of Geobacillus stearothermophilus ATTC 12980, which was tagged with enhanced green fluorescent protein and expressed in the yeast Saccharomyces cerevisiae. The tubular structures formed by the protein in vivo are retained upon bursting the cells by osmotic shock; however, their average length is decreased. During dialysis, monomers obtained by treatment with chaotropic chemicals recrystallize again to form tube-like structures. This process is strictly dependent on calcium (Ca2+) ions, with an optimal concentration of 10 mM. Further increase of the Ca2+ concentration results in multiple non-productive nucleation points. We further show that the lengths of the S-layer assemblies increase with time and can be controlled by pH. After 48 h, the average length at pH 9.0 is 4.13 µm compared to 2.69 µm at pH 5.5. Successful chemical deposition of platinum indicates the potential of recrystallized mSbsC-eGFP structures for nanobiotechnological applications.
A non-canonical DNA structure enables homologous recombination in various genetic systems.
Masuda, Tokiha; Ito, Yutaka; Terada, Tohru; Shibata, Takehiko; Mikawa, Tsutomu
2009-10-30
Homologous recombination, which is critical to genetic diversity, depends on homologous pairing (HP). HP is the switch from parental to recombinant base pairs, which requires expansion of inter-base pair spaces. This expansion unavoidably causes untwisting of the parental double-stranded DNA. RecA/Rad51-catalyzed ATP-dependent HP is extensively stimulated in vitro by negative supercoils, which compensates for untwisting. However, in vivo, double-stranded DNA is relaxed by bound proteins and thus is an unfavorable substrate for RecA/Rad51. In contrast, Mhr1, an ATP-independent HP protein required for yeast mitochondrial homologous recombination, catalyzes HP without the net untwisting of double-stranded DNA. Therefore, we questioned whether Mhr1 uses a novel strategy to promote HP. Here, we found that, like RecA, Mhr1 induced the extension of bound single-stranded DNA. In addition, this structure was induced by all evolutionarily and structurally distinct HP proteins so far tested, including bacterial RecO, viral RecT, and human Rad51. Thus, HP includes the common non-canonical DNA structure and uses a common core mechanism, independent of the species of HP proteins. We discuss the significance of multiple types of HP proteins.
Struct2Net: a web service to predict protein–protein interactions using a structure-based approach
Singh, Rohit; Park, Daniel; Xu, Jinbo; Hosur, Raghavendra; Berger, Bonnie
2010-01-01
Struct2Net is a web server for predicting interactions between arbitrary protein pairs using a structure-based approach. Prediction of protein–protein interactions (PPIs) is a central area of interest and successful prediction would provide leads for experiments and drug design; however, the experimental coverage of the PPI interactome remains inadequate. We believe that Struct2Net is the first community-wide resource to provide structure-based PPI predictions that go beyond homology modeling. Also, most web-resources for predicting PPIs currently rely on functional genomic data (e.g. GO annotation, gene expression, cellular localization, etc.). Our structure-based approach is independent of such methods and only requires the sequence information of the proteins being queried. The web service allows multiple querying options, aimed at maximizing flexibility. For the most commonly studied organisms (fly, human and yeast), predictions have been pre-computed and can be retrieved almost instantaneously. For proteins from other species, users have the option of getting a quick-but-approximate result (using orthology over pre-computed results) or having a full-blown computation performed. The web service is freely available at http://struct2net.csail.mit.edu. PMID:20513650
Xue, Yi; Skrynnikov, Nikolai R
2014-01-01
Currently, the best existing molecular dynamics (MD) force fields cannot accurately reproduce the global free-energy minimum which realizes the experimental protein structure. As a result, long MD trajectories tend to drift away from the starting coordinates (e.g., crystallographic structures). To address this problem, we have devised a new simulation strategy aimed at protein crystals. An MD simulation of protein crystal is essentially an ensemble simulation involving multiple protein molecules in a crystal unit cell (or a block of unit cells). To ensure that average protein coordinates remain correct during the simulation, we introduced crystallography-based restraints into the MD protocol. Because these restraints are aimed at the ensemble-average structure, they have only minimal impact on conformational dynamics of the individual protein molecules. So long as the average structure remains reasonable, the proteins move in a native-like fashion as dictated by the original force field. To validate this approach, we have used the data from solid-state NMR spectroscopy, which is the orthogonal experimental technique uniquely sensitive to protein local dynamics. The new method has been tested on the well-established model protein, ubiquitin. The ensemble-restrained MD simulations produced lower crystallographic R factors than conventional simulations; they also led to more accurate predictions for crystallographic temperature factors, solid-state chemical shifts, and backbone order parameters. The predictions for 15N R1 relaxation rates are at least as accurate as those obtained from conventional simulations. Taken together, these results suggest that the presented trajectories may be among the most realistic protein MD simulations ever reported. In this context, the ensemble restraints based on high-resolution crystallographic data can be viewed as protein-specific empirical corrections to the standard force fields. PMID:24452989
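The ensemble-average restraint described in the abstract above can be illustrated with a minimal Python sketch. The harmonic form, the force constant `k`, and the function name `ensemble_restraint_energy` are assumptions made for illustration and are not taken from the paper. The sketch only shows why a penalty on the average over molecules leaves individual molecules free to move, as long as their mean stays near the reference.

```python
def ensemble_restraint_energy(replicas, reference, k=1.0):
    """Harmonic penalty on ensemble-average coordinates (hypothetical form).

    replicas  : list of conformations, each a list of (x, y, z) tuples
    reference : list of (x, y, z) tuples, e.g. crystallographic positions
    k         : force constant in arbitrary units (assumed)
    """
    n_rep = len(replicas)
    energy = 0.0
    for atom_idx, ref in enumerate(reference):
        for axis in range(3):
            # Average this coordinate over all molecules in the ensemble.
            mean = sum(rep[atom_idx][axis] for rep in replicas) / n_rep
            energy += k * (mean - ref[axis]) ** 2
    return energy


reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
# Two molecules displaced in opposite directions: the average is unchanged.
replica_a = [(0.3, 0.0, 0.0), (1.8, 0.0, 0.0)]
replica_b = [(-0.3, 0.0, 0.0), (1.2, 0.0, 0.0)]
# A second molecule displaced in the same direction: the average drifts.
drifted = [(0.3, 0.0, 0.0), (1.8, 0.0, 0.0)]

e_balanced = ensemble_restraint_energy([replica_a, replica_b], reference)
e_drifted = ensemble_restraint_energy([replica_a, drifted], reference)
print(e_balanced, e_drifted)
```

In this toy case the balanced pair incurs essentially no penalty even though each molecule deviates from the reference, whereas a collective drift of the ensemble is penalized.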
Du, Yushen; Wu, Nicholas C.; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting
2016-01-01
ABSTRACT Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. PMID:27803181
Multiple Replica Repulsion Technique for Efficient Conformational Sampling of Biological Systems
Malevanets, Anatoly; Wodak, Shoshana J.
2011-01-01
Here, we propose a technique for sampling complex molecular systems with many degrees of freedom. The technique, termed “multiple replica repulsion” (MRR), does not suffer from poor scaling with the number of degrees of freedom associated with common replica exchange procedures and does not require sampling at high temperatures. The algorithm involves creation of multiple copies (replicas) of the system, which interact with one another through a repulsive potential that can be applied to the system as a whole or to portions of it. The proposed scheme prevents oversampling of the most populated states and provides accurate descriptions of conformational perturbations typically associated with sampling ground-state energy wells. The performance of MRR is illustrated for three systems of increasing complexity. A two-dimensional toy potential surface is used to probe the sampling efficiency as a function of key parameters of the procedure. MRR simulations of the Met-enkephalin pentapeptide, and the 76-residue protein ubiquitin, performed in presence of explicit water molecules and totaling 32 ns each, investigate the ability of MRR to characterize the conformational landscape of the peptide, and the protein native basin, respectively. Results obtained for the enkephalin peptide reflect more closely the extensive conformational flexibility of this peptide than previously reported simulations. Those obtained for ubiquitin show that conformational ensembles sampled by MRR largely encompass structural fluctuations relevant to biological recognition, which occur on the microsecond timescale, or are observed in crystal structures of ubiquitin complexes with other proteins. MRR thus emerges as a very promising simple and versatile technique for modeling the structural plasticity of complex biological systems. PMID:21843487
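The core idea of the abstract above, replicas that interact through a repulsive potential, can be sketched on a one-dimensional toy surface. The Gaussian form of the repulsion, its `strength` and `width` parameters, and the double-well potential are all assumptions chosen for illustration; the abstract does not specify the functional form used in MRR.

```python
import math


def double_well(x):
    """Toy 1D potential with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def repulsion(x_i, x_j, strength=0.5, width=0.5):
    """Gaussian repulsion between two replicas (assumed functional form)."""
    return strength * math.exp(-((x_i - x_j) ** 2) / (2.0 * width ** 2))


def mrr_energy(replicas):
    """Sum of single-replica energies plus all pairwise repulsions."""
    total = sum(double_well(x) for x in replicas)
    for i in range(len(replicas)):
        for j in range(i + 1, len(replicas)):
            total += repulsion(replicas[i], replicas[j])
    return total


same_well = mrr_energy([-1.0, -1.0])    # both replicas in one minimum
split_wells = mrr_energy([-1.0, 1.0])   # one replica per minimum
print(same_well, split_wells)
```

With these assumed parameters, two replicas sitting in the same minimum carry a higher combined energy than replicas occupying different minima, which is the mechanism that discourages oversampling of the most populated state.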
NASA Astrophysics Data System (ADS)
Gaines, J. C.; Clark, A. H.; Regan, L.; O'Hern, C. S.
2017-07-01
Proteins are biological polymers that underlie all cellular functions. The first high-resolution protein structures were determined by x-ray crystallography in the 1960s. Since then, there has been continued interest in understanding and predicting protein structure and stability. It is well-established that a large contribution to protein stability originates from the sequestration from solvent of hydrophobic residues in the protein core. How are such hydrophobic residues arranged in the core; how can one best model the packing of these residues, and are residues loosely packed with multiple allowed side chain conformations or densely packed with a single allowed side chain conformation? Here we show that to properly model the packing of residues in protein cores it is essential that amino acids are represented by appropriately calibrated atom sizes, and that hydrogen atoms are explicitly included. We show that protein cores possess a packing fraction of φ ≈ 0.56, which is significantly less than the typically quoted value of 0.74 obtained using the extended atom representation. We also compare the results for the packing of amino acids in protein cores to results obtained for jammed packings from discrete element simulations of spheres, elongated particles, and composite particles with bumpy surfaces. We show that amino acids in protein cores pack as densely as disordered jammed packings of particles with similar values for the aspect ratio and bumpiness as found for amino acids. Knowing the structural properties of protein cores is of both fundamental and practical importance. Practically, it enables the assessment of changes in the structure and stability of proteins arising from amino acid mutations (such as those identified as a result of the massive human genome sequencing efforts) and the design of new folded, stable proteins and protein-protein interactions with tunable specificity and affinity.
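For orientation, a packing fraction is the occupied volume divided by the available volume. The short Python sketch below computes it for the textbook case of close-packed equal spheres, whose value π/(3√2) ≈ 0.7405 is numerically close to the 0.74 quoted above. The function names are illustrative, and the sketch assumes non-overlapping spheres; a calculation for real protein cores would have to account for overlapping bonded atoms, which is not attempted here.

```python
import math


def sphere_volume(radius):
    """Volume of a single sphere."""
    return 4.0 / 3.0 * math.pi * radius ** 3


def packing_fraction(occupied_volume, total_volume):
    """Fraction of the available volume that is occupied."""
    return occupied_volume / total_volume


r = 1.0
# Face-centered cubic cell: 4 spheres per cube, touching along face diagonals,
# so the cube edge is 2 * sqrt(2) * r.
edge = 2.0 * math.sqrt(2.0) * r
phi_fcc = packing_fraction(4 * sphere_volume(r), edge ** 3)
print(round(phi_fcc, 4))
```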
Disorder and function: a review of the dehydrin protein family
Graether, Steffen P.; Boddington, Kelly F.
2014-01-01
Dehydration proteins (dehydrins) are group 2 members of the late embryogenesis abundant (LEA) protein family. The protein architecture of dehydrins can be described by the presence of three types of conserved sequence motifs that have been named the K-, Y-, and S-segments. By definition, a dehydrin must contain at least one copy of the lysine-rich K-segment. Abiotic stresses such as drought, cold, and salinity cause the upregulation of dehydrin mRNA and protein levels. Despite the large body of genetic and protein evidence of the importance of these proteins in stress response, the in vivo protective mechanism is not fully known. In vitro experimental evidence from biochemical assays and localization experiments suggests multiple roles for dehydrins, including membrane protection, cryoprotection of enzymes, and protection from reactive oxygen species. Membrane binding by dehydrins is likely to be as a peripheral membrane protein, since the protein sequences are highly hydrophilic and contain many charged amino acids. Because of this, dehydrins in solution are intrinsically disordered proteins, that is, they have no well-defined secondary or tertiary structure. Despite their disorder, dehydrins have been shown to gain structure when bound to ligands such as membranes, and to possibly change their oligomeric state when bound to ions. We review what is currently known about dehydrin sequences and their structures, and examine the various ligands that have been shown to bind to this family of proteins. PMID:25400646
Cryo-Electron Microscopy of Viruses Infecting Bacterium
NASA Astrophysics Data System (ADS)
Chiu, Wah
2010-03-01
Single particle cryo-EM can yield structures of infectious bacterial viruses with and without imposed icosahedral symmetry at subnanometer resolution. Reconstructions of infectious and empty phage particles show substantial differences in the portal vertex protein complex at one of the 12 pentameric vertices in the icosahedral virus particle through which the viral genomes are packaged or released. In addition, electron cryo-tomography of viruses infecting their bacterial host cell displayed multiple conformations of the tail fiber of the virus. Our structural observations by single particle and tomographic reconstructions suggest a mechanism whereby the viral tail fibers, upon binding to the host cell, induce a cascade of structural alterations of the portal vertex protein complex that triggers DNA release.
Mechanisms of amyloid formation revealed by solution NMR
Karamanos, Theodoros K.; Kalverda, Arnout P.; Thompson, Gary S.; Radford, Sheena E.
2015-01-01
Amyloid fibrils are proteinaceous elongated aggregates involved in more than fifty human diseases. Recent advances in electron microscopy and solid state NMR have allowed the characterization of fibril structures to different extents of refinement. However, structural details about the mechanism of fibril formation remain relatively poorly defined. This is mainly due to the complex, heterogeneous and transient nature of the species responsible for assembly; properties that make them difficult to detect and characterize in structural detail using biophysical techniques. The ability of solution NMR spectroscopy to investigate exchange between multiple protein states, to characterize transient and low-population species, and to study high molecular weight assemblies, render NMR an invaluable technique for studies of amyloid assembly. In this article we review state-of-the-art solution NMR methods for investigations of: (a) protein dynamics that lead to the formation of aggregation-prone species; (b) amyloidogenic intrinsically disordered proteins; and (c) protein–protein interactions on pathway to fibril formation. Together, these topics highlight the power and potential of NMR to provide atomic level information about the molecular mechanisms of one of the most fascinating problems in structural biology. PMID:26282197
Ligand-biased ensemble receptor docking (LigBEnD): a hybrid ligand/receptor structure-based approach
NASA Astrophysics Data System (ADS)
Lam, Polo C.-H.; Abagyan, Ruben; Totrov, Maxim
2018-01-01
Ligand docking to flexible protein molecules can be efficiently carried out through ensemble docking to multiple protein conformations, either from experimental X-ray structures or from in silico simulations. The success of ensemble docking often requires the careful selection of complementary protein conformations, through docking and scoring of known co-crystallized ligands. False positives, in which a ligand in a wrong pose achieves a better docking score than that of native pose, arise as additional protein conformations are added. In the current study, we developed a new ligand-biased ensemble receptor docking method and composite scoring function which combine the use of ligand-based atomic property field (APF) method with receptor structure-based docking. This method helps us to correctly dock 30 out of 36 ligands presented by the D3R docking challenge. For the six mis-docked ligands, the cognate receptor structures prove to be too different from the 40 available experimental Pocketome conformations used for docking and could be identified only by receptor sampling beyond experimentally explored conformational subspace.
Docking and scoring protein interactions: CAPRI 2009.
Lensink, Marc F; Wodak, Shoshana J
2010-11-15
Protein docking algorithms are assessed by evaluating blind predictions performed during 2007-2009 in Rounds 13-19 of the community-wide experiment on critical assessment of predicted interactions (CAPRI). We evaluated the ability of these algorithms to sample docking poses and to single out specific association modes in 14 targets, representing 11 distinct protein complexes. These complexes play important biological roles in RNA maturation, G-protein signal processing, and enzyme inhibition and function. One target involved protein-RNA interactions not previously considered in CAPRI, several others were hetero-oligomers, or featured multiple interfaces between the same protein pair. For most targets, predictions started from the experimentally determined structures of the free (unbound) components, or from models built from known structures of related or similar proteins. To succeed they therefore needed to account for conformational changes and model inaccuracies. In total, 64 groups and 12 web-servers submitted docking predictions of which 4420 were evaluated. Overall our assessment reveals that 67% of the groups, more than ever before, produced acceptable models or better for at least one target, with many groups submitting multiple high- and medium-accuracy models for two to six targets. Forty-one groups including four web-servers participated in the scoring experiment with 1296 evaluated models. Scoring predictions also show signs of progress evidenced from the large proportion of correct models submitted. But singling out the best models remains a challenge, which also adversely affects the ability to correctly rank docking models. With the increased interest in translating abstract protein interaction networks into realistic models of protein assemblies, the growing CAPRI community is actively developing more efficient and reliable docking and scoring methods for everyone to use. © 2010 Wiley-Liss, Inc.
Brylinski, Michal; Skolnick, Jeffrey
2010-01-01
The rapid accumulation of gene sequences, many of which are hypothetical proteins with unknown function, has stimulated the development of accurate computational tools for protein function prediction with evolution/structure-based approaches showing considerable promise. In this paper, we present FINDSITE-metal, a new threading-based method designed specifically to detect metal binding sites in modeled protein structures. Comprehensive benchmarks using different quality protein structures show that weakly homologous protein models provide sufficient structural information for quite accurate annotation by FINDSITE-metal. Combining structure/evolutionary information with machine learning results in highly accurate metal binding annotations; for protein models constructed by TASSER, whose average Cα RMSD from the native structure is 8.9 Å, 59.5% (71.9%) of the best of top five predicted metal locations are within 4 Å (8 Å) from a bound metal in the crystal structure. For most of the targets, multiple metal binding sites are detected with the best predicted binding site at rank 1 and within the top 2 ranks in 65.6% and 83.1% of the cases, respectively. Furthermore, for iron, copper, zinc, calcium and magnesium ions, the binding metal can be predicted with high, typically 70-90%, accuracy. FINDSITE-metal also provides a set of confidence indexes that help assess the reliability of predictions. Finally, we describe the proteome-wide application of FINDSITE-metal that quantifies the metal binding complement of the human proteome. FINDSITE-metal is freely available to the academic community at http://cssb.biology.gatech.edu/findsite-metal/. PMID:21287609
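The accuracy figures in the abstract above are of the form "fraction of targets whose best top-ranked prediction lies within a distance cutoff of the bound metal". A minimal Python sketch of that kind of metric follows; the data are invented for illustration, and the function name `success_rate` and the record layout are assumptions, not part of FINDSITE-metal.

```python
import math


def success_rate(targets, cutoff, top_n=5):
    """Fraction of targets whose closest top-N prediction is within cutoff (Å)."""
    hits = 0
    for target in targets:
        best = min(math.dist(p, target["metal"])
                   for p in target["predicted"][:top_n])
        if best <= cutoff:
            hits += 1
    return hits / len(targets)


# Hypothetical targets: ranked predicted sites and the observed metal position.
targets = [
    {"predicted": [(0, 0, 0), (10, 0, 0)], "metal": (3, 0, 0)},   # best: 3 Å
    {"predicted": [(0, 0, 0)], "metal": (0, 6, 0)},               # best: 6 Å
    {"predicted": [(0, 0, 0)], "metal": (0, 0, 20)},              # best: 20 Å
    {"predicted": [(1, 1, 1)], "metal": (1, 1, 1)},               # best: 0 Å
]

rate_4 = success_rate(targets, cutoff=4.0)
rate_8 = success_rate(targets, cutoff=8.0)
print(rate_4, rate_8)
```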
Müller, Boje; Groscurth, Sira; Menzel, Matthias; Rüping, Boris A.; Twyman, Richard M.; Prüfer, Dirk; Noll, Gundula A.
2014-01-01
Background and Aims Forisomes are specialized structural phloem proteins that mediate sieve element occlusion after wounding exclusively in papilionoid legumes, but most studies of forisome structure and function have focused on the Old World clade rather than the early lineages. A comprehensive phylogenetic, molecular, structural and functional analysis of forisomes from species covering a broad spectrum of the papilionoid legumes was therefore carried out, including the first analysis of Dipteryx panamensis forisomes, representing the earliest branch of the Papilionoideae lineage. The aim was to study the molecular, structural and functional conservation among forisomes from different tribes and to establish the roles of individual forisome subunits. Methods Sequence analysis and bioinformatics were combined with structural and functional analysis of native forisomes and artificial forisome-like protein bodies, the latter produced by expressing forisome genes from different legumes in a heterologous background. The structure of these bodies was analysed using a combination of confocal laser scanning microscopy (CLSM), scanning electron microscopy (SEM) and transmission electron microscopy (TEM), and the function of individual subunits was examined by combinatorial expression, micromanipulation and light microscopy. Key Results Dipteryx panamensis native forisomes and homomeric protein bodies assembled from the single sieve element occlusion by forisome (SEO-F) subunit identified in this species were structurally and functionally similar to forisomes from the Old World clade. In contrast, homomeric protein bodies assembled from individual SEO-F subunits from Old World species yielded artificial forisomes differing in proportion to their native counterparts, suggesting that multiple SEO-F proteins are required for forisome assembly in these plants. 
Structural differences between Medicago truncatula native forisomes, homomeric protein bodies and heteromeric bodies containing all possible subunit combinations suggested that combinations of SEO-F proteins may fine-tune the geometric proportions and reactivity of forisomes. Conclusions It is concluded that forisome structure and function have been strongly conserved during evolution and that species-dependent subsets of SEO-F proteins may have evolved to fine-tune the structure of native forisomes. PMID:24694827
Protein space: a natural method for realizing the nature of protein universe.
Yu, Chenglong; Deng, Mo; Cheng, Shiu-Yuen; Yau, Shek-Chung; He, Rong L; Yau, Stephen S-T
2013-02-07
Current methods cannot tell us what the nature of the protein universe is concretely. They are based on different models of amino acid substitution and multiple sequence alignment which is an NP-hard problem and requires manual intervention. Protein structural analysis also gives a direction for mapping the protein universe. Unfortunately, now only a minuscule fraction of proteins' 3-dimensional structures are known. Furthermore, the phylogenetic tree representations are not unique for any existing tree construction methods. Here we develop a novel method to realize the nature of protein universe. We show the protein universe can be realized as a protein space in 60-dimensional Euclidean space using a distance based on a normalized distribution of amino acids. Every protein is in one-to-one correspondence with a point in protein space, where proteins with similar properties stay close together. Thus the distance between two points in protein space represents the biological distance of the corresponding two proteins. We also propose a natural graphical representation for inferring phylogenies. The representation is natural and unique based on the biological distances of proteins in protein space. This will solve the fundamental question of how proteins are distributed in the protein universe. Copyright © 2012 Elsevier Ltd. All rights reserved.
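One way to picture a 60-dimensional embedding of this kind is three numbers for each of the 20 standard amino acids. The Python sketch below assumes those three numbers are the count, the mean sequence position, and a normalized second central moment of the positions; this choice, the normalization, and the function names are assumptions for illustration, since the abstract only states that the distance is based on a normalized distribution of amino acids.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def protein_vector(sequence):
    """Map a sequence to 60 numbers: 20 counts, 20 means, 20 moments (assumed)."""
    n = len(sequence)
    counts, means, moments = [], [], []
    for aa in AMINO_ACIDS:
        positions = [i + 1 for i, res in enumerate(sequence) if res == aa]
        n_k = len(positions)
        if n_k == 0:
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        # Second central moment of the positions, normalized by n_k * n.
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def distance(u, v):
    """Euclidean distance between two protein vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


vec_a = protein_vector("AAC")
vec_b = protein_vector("ACA")
print(len(vec_a), distance(vec_a, vec_b))
```

The two toy sequences have identical composition but different residue order, and the vectors still differ, which is what allows a distance on such vectors to carry more information than composition alone.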
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Sheng; Yang, Feng; Petyuk, Vladislav A.
Protein modification by O-linked beta-N-acetylglucosamine (O-GlcNAc) is emerging as an important factor in the pathogenesis of sporadic Alzheimer’s disease. Herein we report the most comprehensive, quantitative proteomics analysis for protein O-GlcNAcylation in post-mortem human brains with and without Alzheimer’s using isobaric tandem mass tags labeling, chemoenzymatic photocleavage enrichment and liquid chromatography coupled to mass spectrometry. A total of 1,850 O-GlcNAc peptides covering 1,094 O-GlcNAcylation sites were identified from 530 proteins in the human brain. 128 O-GlcNAc peptides covering 78 proteins were altered significantly in Alzheimer’s brain as compared to controls (q<0.05). Moreover, alteration of the O-GlcNAc peptide abundance could be attributed more to O-GlcNAcylation level than to protein level changes. The altered O-GlcNAcylated proteins belong to several structural and functional categories, including synaptic proteins, cytoskeleton proteins, and memory-associated proteins. These findings suggest that dysregulation of O-GlcNAcylation of multiple brain proteins may be involved in the development of sporadic Alzheimer’s disease.
POLYVIEW-MM: web-based platform for animation and analysis of molecular simulations
Porollo, Aleksey; Meller, Jaroslaw
2010-01-01
Molecular simulations offer important mechanistic and functional clues in studies of proteins and other macromolecules. However, interpreting the results of such simulations increasingly requires tools that can combine information from multiple structural databases and other web resources, and provide highly integrated and versatile analysis tools. Here, we present a new web server that integrates high-quality animation of molecular motion (MM) with structural and functional analysis of macromolecules. The new tool, dubbed POLYVIEW-MM, enables animation of trajectories generated by molecular dynamics and related simulation techniques, as well as visualization of alternative conformers, e.g. obtained as a result of protein structure prediction methods or small molecule docking. To facilitate structural analysis, POLYVIEW-MM combines interactive view and analysis of conformational changes using Jmol and its tailored extensions, publication quality animation using PyMol, and customizable 2D summary plots that provide an overview of MM, e.g. in terms of changes in secondary structure states and relative solvent accessibility of individual residues in proteins. Furthermore, POLYVIEW-MM integrates visualization with various structural annotations, including automated mapping of known interaction sites from structural homologs, mapping of cavities and ligand binding sites, transmembrane regions and protein domains. URL: http://polyview.cchmc.org/conform.html. PMID:20504857
Structure and function of archaeal prefoldin, a co-chaperone of group II chaperonin.
Ohtaki, Akashi; Noguchi, Keiichi; Yohda, Masafumi
2010-01-01
Molecular chaperones are key cellular components involved in the maintenance of protein homeostasis and other unrelated functions. Prefoldin is a chaperone that acts as a co-factor of group II chaperonins in eukaryotes and archaea. It assists proper folding of proteins by capturing nonnative proteins and delivering them to the group II chaperonin. Eukaryotic prefoldin is a multiple subunit complex composed of six different polypeptide chains. Archaeal prefoldin, on the other hand, is a heterohexameric complex composed of two alpha and four beta subunits, and forms a double beta barrel assembly with six long coiled coils protruding from it like a jellyfish with six tentacles. Based on the structural information of the archaeal prefoldin, substrate recognition and prefoldin-chaperonin binding mechanisms have been investigated. In this paper, we review a series of studies on the molecular mechanisms of archaeal PFD function. Particular emphasis will be placed on the molecular structures revealed by X-ray crystallography and molecular dynamics induced by binding to nonnative protein substrates.
New strategy for protein interactions and application to structure-based drug design
NASA Astrophysics Data System (ADS)
Zou, Xiaoqin
One of the greatest challenges in computational biophysics is to predict interactions between biological molecules, which play critical roles in biological processes and rational design of therapeutic drugs. Biomolecular interactions involve delicate interplay between multiple interactions, including electrostatic interactions, van der Waals interactions, solvent effect, and conformational entropic effect. Accurate determination of these complex and subtle interactions is challenging. Moreover, a biological molecule such as a protein usually consists of thousands of atoms, and thus occupies a huge conformational space. The large degrees of freedom pose further challenges for accurate prediction of biomolecular interactions. Here, I will present our development of physics-based theory and computational modeling on protein interactions with other molecules. The major strategy is to extract microscopic energetics from the information embedded in the experimentally-determined structures of protein complexes. I will also present applications of the methods to structure-based therapeutic design. Supported by NSF CAREER Award DBI-0953839, NIH R01GM109980, and the American Heart Association (Midwest Affiliate) [13GRNT16990076].
Jowitt, Thomas A; Murdoch, Alan D; Baldock, Clair; Berry, Richard; Day, Joanna M; Hardingham, Timothy E
2010-01-01
Structural investigation of proteins containing large stretches of sequences without predicted secondary structure is the focus of much increased attention. Here, we have produced an unglycosylated 30 kDa peptide from the chondroitin sulphate (CS)-attachment region of human aggrecan (CS-peptide), which was predicted to be intrinsically disordered and compared its structure with the adjacent aggrecan G3 domain. Biophysical analyses, including analytical ultracentrifugation, light scattering, and circular dichroism showed that the CS-peptide had an elongated and stiffened conformation in contrast to the globular G3 domain. The results suggested that it contained significant secondary structure, which was sensitive to urea, and we propose that the CS-peptide forms an elongated wormlike molecule based on a dynamic range of energetically equivalent secondary structures stabilized by hydrogen bonds. The dimensions of the structure predicted from small-angle X-ray scattering analysis were compatible with EM images of fully glycosylated aggrecan and a partly glycosylated aggrecan CS2-G3 construct. The semiordered structure identified in CS-peptide was not predicted by common structural algorithms and identified a potentially distinct class of semiordered structure within sequences currently identified as disordered. Sequence comparisons suggested some evidence for comparable structures in proteins encoded by other genes (PRG4, MUC5B, and CBP). The function of these semiordered sequences may serve to spatially position attached folded modules and/or to present polypeptides for modification, such as glycosylation, and to provide templates for the multiple pleiotropic interactions proposed for disordered proteins. Proteins 2010. © 2010 Wiley-Liss, Inc. PMID:20806220
P³DB 3.0: From plant phosphorylation sites to protein networks.
Yao, Qiuming; Ge, Huangyi; Wu, Shangquan; Zhang, Ning; Chen, Wei; Xu, Chunhui; Gao, Jianjiong; Thelen, Jay J; Xu, Dong
2014-01-01
In the past few years, the Plant Protein Phosphorylation Database (P(3)DB, http://p3db.org) has become one of the most significant in vivo data resources for studying plant phosphoproteomics. We have substantially updated P(3)DB with respect to format, new datasets and analytic tools. In P(3)DB 3.0, there are altogether 47 923 phosphosites in 16 477 phosphoproteins curated across nine plant organisms from 32 studies, which have met our multiple quality standards for acquisition of in vivo phosphorylation site data. Centered on these phosphorylation data, multiple related data and annotations are provided, including protein-protein interaction (PPI), gene ontology, protein tertiary structures, orthologous sequences, kinase/phosphatase classification and Kinase Client Assay (KiC Assay) data, all of which provide context for the phosphorylation event. In addition, P(3)DB 3.0 incorporates multiple network viewers for the above features, such as PPI network, kinase-substrate network, phosphatase-substrate network, and domain co-occurrence network to help study phosphorylation from a systems point of view. Furthermore, the new P(3)DB reflects a community-based design through which users can share datasets and automate data depository processes for publication purposes. Each of these new features supports the goal of making P(3)DB a comprehensive, systematic and interactive platform for phosphoproteomics research.
AlignMe—a membrane protein sequence alignment web server
Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.
2014-01-01
We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425
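The family-averaged hydropathy profile used in the HP mode can be illustrated with a minimal sketch. It assumes the Kyte-Doolittle hydropathy scale and a simple per-column mean that skips gaps; the scale actually used by AlignMe, any window smoothing, and the function name `averaged_hydropathy_profile` are assumptions made for illustration, not details from the abstract.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def averaged_hydropathy_profile(msa, scale=KYTE_DOOLITTLE):
    """Average the hydropathy of each alignment column over all homologs.

    Gaps and unknown symbols are skipped; a column with no scorable
    residue yields None.
    """
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for i in range(length):
        values = [scale[seq[i]] for seq in msa if seq[i] in scale]
        profile.append(sum(values) / len(values) if values else None)
    return profile


# Hypothetical toy alignment of three homologs.
toy_msa = ["ILK-", "VLKD", "IL-E"]
profile = averaged_hydropathy_profile(toy_msa)
```

Two such profiles, one per input alignment, would then be aligned against each other; that alignment step is not shown here.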
Prchal, Jan; Junkova, Petra; Strmiskova, Miroslava; Lipov, Jan; Hynek, Radovan; Ruml, Tomas; Hrabal, Richard
2011-09-01
Matrix proteins play multiple roles both in early and late stages of the viral replication cycle. Their N-terminal myristoylation is important for interaction with the host cell membrane during virus budding. We used Escherichia coli, carrying N-myristoyltransferase gene, for the expression of the myristoylated His-tagged matrix protein of Mason-Pfizer monkey virus. An efficient, single-step purification procedure eliminating all contaminating proteins including, importantly, the non-myristoylated matrix protein was designed. The comparison of NMR spectra of matrix protein with its myristoylated form revealed substantial structural changes induced by this fatty acid modification. Copyright © 2011 Elsevier Inc. All rights reserved.
Applications of NMR and computational methodologies to study protein dynamics.
Narayanan, Chitra; Bafna, Khushboo; Roux, Louise D; Agarwal, Pratul K; Doucet, Nicolas
2017-08-15
Overwhelming evidence now illustrates the defining role of atomic-scale protein flexibility in biological events such as allostery, cell signaling, and enzyme catalysis. Over the years, spin relaxation nuclear magnetic resonance (NMR) has provided significant insights on the structural motions occurring on multiple time frames over the course of a protein life span. The present review article aims to illustrate to the broader community how this technique continues to shape many areas of protein science and engineering, in addition to being an indispensable tool for studying atomic-scale motions and functional characterization. Continuing developments in underlying NMR technology alongside software and hardware developments for complementary computational approaches now enable methodologies to routinely provide spatial directionality and structural representations traditionally harder to achieve solely using NMR spectroscopy. In addition to its well-established role in structural elucidation, we present recent examples that illustrate the combined power of selective isotope labeling, relaxation dispersion experiments, chemical shift analyses, and computational approaches for the characterization of conformational sub-states in proteins and enzymes. Copyright © 2017 Elsevier Inc. All rights reserved.
Study of Binding Interaction between Pif80 Protein Fragment and Aragonite
NASA Astrophysics Data System (ADS)
Du, Yuan-Peng; Chang, Hsun-Hui; Yang, Sheng-Yu; Huang, Shing-Jong; Tsai, Yu-Ju; Huang, Joseph Jen-Tse; Chan, Jerry Chun Chung
2016-08-01
Pif is a crucial protein for the formation of the nacreous layer in Pinctada fucata. Three non-acidic peptide fragments of the aragonite-binding domain (Pif80), which contain multiple copies of the repeat sequence DDRK, were selected to study the interaction between non-acidic peptides and aragonite. The polypeptides DDRKDDRKGGK (Pif80-11) and DDRKDDRKGGKDDRKDDRKGGK (Pif80-22) have similar binding affinity to aragonite. Solid-state NMR data indicate that the backbones of Pif80-11 and Pif80-22 peptides bound on aragonite adopt a random-coil conformation. Pif80-11 is considerably more effective than Pif80-22 in promoting the nucleation of aragonite on the substrate of β-chitin. Our results suggest that the structural arrangement at a protein-mineral interface depends on the surface structure of the mineral substrate and the protein sequence. The side chains of the basic residues, which function as anchors to the aragonite surface, have uniform structures. Basic residues acting as anchors in protein-mineral interactions may play an important role in biomineralization.
Scop3D: three-dimensional visualization of sequence conservation.
Vermeire, Tessa; Vermaere, Stijn; Schepens, Bert; Saelens, Xavier; Van Gucht, Steven; Martens, Lennart; Vandermarliere, Elien
2015-04-01
The integration of a protein's structure with its known sequence variation provides insight on how that protein evolves, for instance in terms of (changing) function or immunogenicity. Yet, collating the corresponding sequence variants into a multiple sequence alignment, calculating each position's conservation, and mapping this information back onto a relevant structure is not straightforward. We therefore built the Sequence Conservation on Protein 3D structure (scop3D) tool to perform these tasks automatically. The output consists of two modified PDB files in which the B-values for each position are replaced by the percentage sequence conservation, or the information entropy for each position, respectively. Furthermore, text files with absolute and relative amino acid occurrences for each position are also provided, along with snapshots of the protein from six distinct directions in space. The visualization provided by scop3D can for instance be used as an aid in vaccine development or to identify antigenic hotspots, which we here demonstrate based on an analysis of the fusion proteins of human respiratory syncytial virus and mumps virus. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
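The two per-position quantities mentioned above, percentage conservation and information entropy, can be sketched for a toy alignment as follows. This is an illustrative sketch only: the function name `column_stats`, the treatment of gaps as an ordinary symbol, and the use of base-2 logarithms are assumptions, and the code does not reproduce how scop3D itself computes or writes these values.

```python
import math
from collections import Counter


def column_stats(alignment):
    """Return (percent_conservation, shannon_entropy_bits) per column.

    Conservation is the share of the most frequent symbol in the column;
    entropy is the Shannon entropy of the symbol frequencies in bits.
    Gaps are counted like any other symbol (a simplifying assumption).
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    stats = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        most_common_count = counts.most_common(1)[0][1]
        conservation = 100.0 * most_common_count / total
        entropy = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
        stats.append((conservation, entropy))
    return stats


# Hypothetical four-sequence alignment.
toy_alignment = ["MKVL", "MKIL", "MRVL", "MKVL"]
stats = column_stats(toy_alignment)
```

In this toy case the first and last columns are fully conserved, while the two middle columns each have one variant out of four sequences. Values like these are what would be written into the B-value column of a PDB file for coloring by conservation.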
El Omari, Kamel; Sutton, Geoff; Ravantti, Janne J; Zhang, Hanwen; Walter, Thomas S; Grimes, Jonathan M; Bamford, Dennis H; Stuart, David I; Mancini, Erika J
2013-08-06
The hallmark of a virus is its capsid, which harbors the viral genome and is formed from protein subunits, which assemble following precise geometric rules. dsRNA viruses use an unusual protein multiplicity (120 copies) to form their closed capsids. We have determined the atomic structure of the capsid protein (P1) from the dsRNA cystovirus Φ8. In the crystal P1 forms pentamers, very similar in shape to facets of empty procapsids, suggesting an unexpected assembly pathway that proceeds via a pentameric intermediate. Unlike the elongated proteins used by dsRNA mammalian reoviruses, P1 has a compact trapezoid-like shape and a distinct arrangement in the shell, with two near-identical conformers in nonequivalent structural environments. Nevertheless, structural similarity with the analogous protein from the mammalian viruses suggests a common ancestor. The unusual shape of the molecule may facilitate dramatic capsid expansion during phage maturation, allowing P1 to switch interaction interfaces to provide capsid plasticity. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
3DNALandscapes: a database for exploring the conformational features of DNA.
Zheng, Guohui; Colasanti, Andrew V; Lu, Xiang-Jun; Olson, Wilma K
2010-01-01
3DNALandscapes, located at: http://3DNAscapes.rutgers.edu, is a new database for exploring the conformational features of DNA. In contrast to most structural databases, which archive the Cartesian coordinates and/or derived parameters and images for individual structures, 3DNALandscapes enables searches of conformational information across multiple structures. The database contains a wide variety of structural parameters and molecular images, computed with the 3DNA software package and known to be useful for characterizing and understanding the sequence-dependent spatial arrangements of the DNA sugar-phosphate backbone, sugar-base side groups, base pairs, base-pair steps, groove structure, etc. The data comprise all DNA-containing structures--both free and bound to proteins, drugs and other ligands--currently available in the Protein Data Bank. The web interface allows the user to link, report, plot and analyze this information from numerous perspectives and thereby gain insight into DNA conformation, deformability and interactions in different sequence and structural contexts. The data accumulated from known, well-resolved DNA structures can serve as useful benchmarks for the analysis and simulation of new structures. The collective data can also help to understand how DNA deforms in response to proteins and other molecules and undergoes conformational rearrangements.
Evol and ProDy for bridging protein sequence evolution and structural dynamics.
Bakan, Ahmet; Dutta, Anindita; Mao, Wenzhi; Liu, Ying; Chennubhotla, Chakra; Lezon, Timothy R; Bahar, Ivet
2014-09-15
Correlations between sequence evolution and structural dynamics are of utmost importance in understanding the molecular mechanisms of function and their evolution. We have integrated Evol, a new package for fast and efficient comparative analysis of evolutionary patterns and conformational dynamics, into ProDy, a computational toolbox designed for inferring protein dynamics from experimental and theoretical data. Using information-theoretic approaches, Evol coanalyzes conservation and coevolution profiles extracted from multiple sequence alignments of protein families with their inferred dynamics. ProDy and Evol are open-source and freely available under MIT License from http://prody.csb.pitt.edu/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
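As an illustration of an information-theoretic coevolution measure, the sketch below computes the mutual information between two columns of a multiple sequence alignment. Mutual information is assumed here as a representative measure; the abstract does not name the specific quantities Evol computes, and the plain-Python function `mutual_information` is hypothetical and is not part of the ProDy or Evol API.

```python
import math
from collections import Counter


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Probabilities are estimated from raw symbol frequencies, without
    pseudocounts or sequence weighting (simplifying assumptions).
    """
    n = len(msa)
    freq_i = Counter(seq[i] for seq in msa)
    freq_j = Counter(seq[j] for seq in msa)
    freq_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical alignment: columns 0 and 1 covary, column 2 is invariant.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
mi_covarying = mutual_information(toy_msa, 0, 1)
mi_invariant = mutual_information(toy_msa, 0, 2)
```

Columns that always change together score high, whereas a fully conserved column carries no mutual information with any other column, which is why conservation and coevolution profiles are usually analyzed side by side.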
Lu, Shun-Wen; Chen, Shiyan; Wang, Jianying; Yu, Hang; Chronis, Demosthenis; Mitchum, Melissa G; Wang, Xiaohong
2009-09-01
Plant CLAVATA3/ESR-related (CLE) peptides have diverse roles in plant growth and development. Here, we report the isolation and functional characterization of five new CLE genes from the potato cyst nematode Globodera rostochiensis. Unlike typical plant CLE peptides that contain a single CLE motif, four of the five Gr-CLE genes encode CLE proteins with multiple CLE motifs. These Gr-CLE genes were found to be specifically expressed within the dorsal esophageal gland cell of nematode parasitic stages, suggesting a role for their encoded proteins in plant parasitism. Overexpression phenotypes of Gr-CLE genes in Arabidopsis mimicked those of plant CLE genes, and Gr-CLE proteins could rescue the Arabidopsis clv3-2 mutant phenotype when expressed within meristems. A short root phenotype was observed when synthetic GrCLE peptides were exogenously applied to roots of Arabidopsis or potato similar to the overexpression of Gr-CLE genes in Arabidopsis and potato hairy roots. These results reveal that G. rostochiensis CLE proteins with either single or multiple CLE motifs function similarly to plant CLE proteins and that CLE signaling components are conserved in both Arabidopsis and potato roots. Furthermore, our results provide evidence to suggest that the evolution of multiple CLE motifs may be an important mechanism for generating functional diversity in nematode CLE proteins to facilitate parasitism.
Ghosh, Pritha; Mathew, Oommen K; Sowdhamini, Ramanathan
2016-10-07
RNA-binding proteins (RBPs) interact with their cognate RNA(s) to form large biomolecular assemblies. They are versatile in their functionality and are involved in a myriad of processes inside the cell. RBPs with similar structural features and common biological functions are grouped together into families and superfamilies. It will be useful to obtain an early understanding and association of RNA-binding property of sequences of gene products. Here, we report a web server, RStrucFam, to predict the structure, type of cognate RNA(s) and function(s) of proteins, where possible, from mere sequence information. The web server employs Hidden Markov Model scan (hmmscan) to enable association to a back-end database of structural and sequence families. The database (HMMRBP) comprises 437 HMMs of RBP families of known structure that have been generated using structure-based sequence alignments and 746 sequence-centric RBP family HMMs. The input protein sequence is associated with structural or sequence domain families, if structure or sequence signatures exist. In case of association of the protein with a family of known structures, output features such as a multiple structure-based sequence alignment (MSSA) of the query with all other members of that family are provided. Further, cognate RNA partner(s) for that protein, Gene Ontology (GO) annotations, if any, and a homology model of the protein can be obtained. The users can also browse through the database for details pertaining to each family, protein or RNA and their related information based on keyword search or RNA motif search. RStrucFam is a web server that exploits structurally conserved features of RBPs, derived from known family members and imprinted in mathematical profiles, to predict putative RBPs from sequence information. Proteins that fail to associate with such structure-centric families are further queried against the sequence-centric RBP family HMMs in the HMMRBP database. Further, all other essential information pertaining to an RBP, such as overall function annotations, is provided. The web server can be accessed at the following link: http://caps.ncbs.res.in/rstrucfam .
2012-01-01
Background: Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) has become an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. Results: We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize the binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions: SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery.
PMID:23281852
Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon
2012-01-01
Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) has become an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize the binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery.
Perras, Alexandra K.; Daum, Bertram; Ziegler, Christine; Takahashi, Lynelle K.; Ahmed, Musahid; Wanner, Gerhard; Klingl, Andreas; Leitinger, Gerd; Kolb-Lenz, Dagmar; Gribaldo, Simonetta; Auerbach, Anna; Mora, Maximilian; Probst, Alexander J.; Bellack, Annett; Moissl-Eichinger, Christine
2015-01-01
The uncultivated “Candidatus Altiarchaeum hamiconexum” (formerly known as SM1 Euryarchaeon) carries highly specialized nano-grappling hooks (“hami”) on its cell surface. Until now, little has been known about the major protein forming these structured fibrous cell surface appendages, the genes involved, or the membrane anchoring of these filaments. These aspects were analyzed in depth in this study using environmental transcriptomics combined with imaging methods. Since a laboratory culture of this archaeon is not yet available, natural biofilm samples with high Ca. A. hamiconexum abundance were used for the entire analyses. The filamentous surface appendages spanned both membranes of the cell, which are composed of glycosyl-archaeol. The hami consisted of multiple copies of the same protein, the corresponding gene of which was identified via metagenome-mapped transcriptome analysis. The hamus subunit proteins, which are likely to self-assemble due to their predicted beta sheet topology, revealed no similarity to known microbial flagella-, archaella-, fimbriae- or pili-proteins, but a high similarity to known S-layer proteins of the archaeal domain at their N-terminal region (44–47% identity). Our results provide new insights into the structure of the unique hami and their major protein and indicate their divergent evolution with S-layer proteins. PMID:26106369
Ou, Horng D.; Deerinck, Thomas J.; Bushong, Eric; Ellisman, Mark H.; O’Shea, Clodagh C.
2015-01-01
Structural studies of viral proteins most often use high-resolution techniques such as X-ray crystallography, nuclear magnetic resonance, single particle negative stain, or cryo-electron microscopy (EM) to reveal atomic interactions of soluble, homogeneous viral proteins or viral protein complexes. Once viral proteins or complexes are separated from their host’s cellular environment, their natural in-situ structure and details of how they interact with other cellular components may be lost. EM has been an invaluable tool in virology since its introduction in the late 1940’s and subsequent application to cells in the 1950’s. EM studies have expanded our knowledge of viral entry, viral replication, alteration of cellular components, and viral lysis. Most of these early studies were focused on conspicuous morphological cellular changes, because classic EM metal stains were designed to highlight classes of cellular structures rather than specific molecular structures. Much later, to identify viral proteins inducing specific structural configurations at the cellular level, immunostaining with a primary antibody followed by colloidal gold secondary antibody was employed to mark the location of specific viral proteins. This technique can suffer from artifacts in cellular ultrastructure due to compromises required to provide access to the immuno-reagents. Immunolocalization methods also require the generation of highly specific antibodies, which may not be available for every viral protein. Here we discuss new methods to visualize viral proteins and structures at high resolutions in-situ using correlated light and electron microscopy (CLEM). We discuss the use of genetically encoded protein fusions that oxidize diaminobenzidine (DAB) into an osmiophilic polymer that can be visualized by EM. 
Detailed protocols for applying the genetically encoded photo-oxidizing protein MiniSOG to a viral protein, photo-oxidation of the fusion protein to yield DAB polymer staining, and preparation of photo-oxidized samples for TEM and serial block-face scanning EM (SBEM) for large-scale volume EM data acquisition are also presented. As an example, we discuss the recent multi-scale analysis of Adenoviral protein E4-ORF3 that reveals a new type of multi-functional polymer that disrupts multiple cellular proteins. This new capability to visualize unambiguously specific viral protein structures at high resolutions in the native cellular environment is revealing new insights into how they usurp host proteins and functions to drive pathological viral replication. PMID:26066760
Ou, Horng D; Deerinck, Thomas J; Bushong, Eric; Ellisman, Mark H; O'Shea, Clodagh C
2015-11-15
Structural studies of viral proteins most often use high-resolution techniques such as X-ray crystallography, nuclear magnetic resonance, single particle negative stain, or cryo-electron microscopy (EM) to reveal atomic interactions of soluble, homogeneous viral proteins or viral protein complexes. Once viral proteins or complexes are separated from their host's cellular environment, their natural in situ structure and details of how they interact with other cellular components may be lost. EM has been an invaluable tool in virology since its introduction in the late 1940's and subsequent application to cells in the 1950's. EM studies have expanded our knowledge of viral entry, viral replication, alteration of cellular components, and viral lysis. Most of these early studies were focused on conspicuous morphological cellular changes, because classic EM metal stains were designed to highlight classes of cellular structures rather than specific molecular structures. Much later, to identify viral proteins inducing specific structural configurations at the cellular level, immunostaining with a primary antibody followed by colloidal gold secondary antibody was employed to mark the location of specific viral proteins. This technique can suffer from artifacts in cellular ultrastructure due to compromises required to provide access to the immuno-reagents. Immunolocalization methods also require the generation of highly specific antibodies, which may not be available for every viral protein. Here we discuss new methods to visualize viral proteins and structures at high resolutions in situ using correlated light and electron microscopy (CLEM). We discuss the use of genetically encoded protein fusions that oxidize diaminobenzidine (DAB) into an osmiophilic polymer that can be visualized by EM. 
Detailed protocols for applying the genetically encoded photo-oxidizing protein MiniSOG to a viral protein, photo-oxidation of the fusion protein to yield DAB polymer staining, and preparation of photo-oxidized samples for TEM and serial block-face scanning EM (SBEM) for large-scale volume EM data acquisition are also presented. As an example, we discuss the recent multi-scale analysis of Adenoviral protein E4-ORF3 that reveals a new type of multi-functional polymer that disrupts multiple cellular proteins. This new capability to visualize unambiguously specific viral protein structures at high resolutions in the native cellular environment is revealing new insights into how they usurp host proteins and functions to drive pathological viral replication. Copyright © 2015 Elsevier Inc. All rights reserved.
Innovative FT-IR imaging of protein film secondary structure before and after heat treatment.
Bonwell, Emily S; Wetzel, David L
2009-11-11
Changes in the secondary structure of globular protein occur during thermal processing. An infrared reflecting mirrored optical substrate that is unaffected by heat allows recording infrared spectra of protein films in a reflection absorption mode on the stage of an FT-IR microspectrometer. Hydrated films of myoglobin protein cast from solution on the mirrored substrate are interrogated before and after thermal denaturation to allow a direct comparison. Focal plane array imaging of 280 protein films allowed selection of the same area in the image from which to extract spectra. After treatment, 110 of 140 spectra from multiple films showed a dramatic shift from the alpha-helix form (1650 +/- 5 cm(-1)) to aggregated forms on either side of the original band. Seventy maxima were near 1625 cm(-1), and 40 shifted in the direction of 1670 cm(-1). The method developed was applied to films cast from two other commercial animal and plant protein sources.
Role of Matricellular Proteins in Disorders of the Central Nervous System.
Jayakumar, A R; Apeksha, A; Norenberg, M D
2017-03-01
Matricellular proteins (MCPs) are actively expressed non-structural proteins present in the extracellular matrix, which rapidly turn over and possess regulatory roles, as well as mediate cell-cell interactions. MCPs characteristically contain binding sites for other extracellular proteins, cell surface receptors, growth factors, cytokines and proteases, that provide structural support for surrounding cells. MCPs are present in most organs, including brain, and play a major role in cell-cell interactions and tissue repair. The MCPs found in brain include thrombospondin-1/2, secreted protein acidic and rich in cysteine family (SPARC), including Hevin/SC1, Tenascin C and CYR61/Connective Tissue Growth Factor/Nov family of proteins, glypicans, galectins, plasminogen activator inhibitor (PAI-1), autotaxin, fibulin and periostin. This review summarizes the potential role of MCPs in the pathogenesis of major neurological disorders, including Alzheimer's disease, amyotrophic lateral sclerosis, ischemia, trauma, hepatic encephalopathy, Down's syndrome, autism, multiple sclerosis, brain neoplasms, Parkinson's disease and epilepsy. Potential therapeutic opportunities of MCPs for these disorders are also considered in this review.
Jin, Lily L.; Wybenga-Groot, Leanne E.; Tong, Jiefei; Taylor, Paul; Minden, Mark D.; Trudel, Suzanne; McGlade, C. Jane; Moran, Michael F.
2015-01-01
Src homology 2 (SH2) domains are modular protein structures that bind phosphotyrosine (pY)-containing polypeptides and regulate cellular functions through protein-protein interactions. Proteomics analysis showed that the SH2 domains of Src family kinases are themselves tyrosine phosphorylated in blood system cancers, including acute myeloid leukemia, chronic lymphocytic leukemia, and multiple myeloma. Using the Src family kinase Lyn SH2 domain as a model, we found that phosphorylation at the conserved SH2 domain residue Y194 impacts the affinity and specificity of SH2 domain binding to pY-containing peptides and proteins. Analysis of the Lyn SH2 domain crystal structure supports a model wherein phosphorylation of Y194 on the EF loop modulates the binding pocket that engages amino acid side chains at the pY+2/+3 position. These data indicate another level of regulation wherein SH2-mediated protein-protein interactions are modulated by SH2 kinases and phosphatases. PMID:25587033
Profiling Synaptic Proteins Identifies Regulators of Insulin Secretion and Lifespan
Kaplan, Joshua M.
2008-01-01
Cells are organized into distinct compartments to perform specific tasks with spatial precision. In neurons, presynaptic specializations are biochemically complex subcellular structures dedicated to neurotransmitter secretion. Activity-dependent changes in the abundance of presynaptic proteins are thought to endow synapses with different functional states; however, relatively little is known about the rules that govern changes in the composition of presynaptic terminals. We describe a genetic strategy to systematically analyze protein localization at Caenorhabditis elegans presynaptic specializations. Nine presynaptic proteins were GFP-tagged, allowing visualization of multiple presynaptic structures. Changes in the distribution and abundance of these proteins were quantified in 25 mutants that alter different aspects of neurotransmission. Global analysis of these data identified novel relationships between particular presynaptic components and provides a new method to compare gene functions by identifying shared protein localization phenotypes. Using this strategy, we identified several genes that regulate secretion of insulin-like growth factors (IGFs) and influence lifespan in a manner dependent on insulin/IGF signaling. PMID:19043554
Effects of proline cis-trans isomerization on TB domain secondary structure.
Yuan, X.; Werner, J. M.; Knott, V.; Handford, P. A.; Campbell, I. D.; Downing, K.
1998-01-01
The transforming growth factor beta (TGF-beta) binding protein-like (TB) domain is found principally in proteins localized to extracellular matrix fibrils, including human fibrillin-1, the defective protein in the Marfan syndrome. Analysis of the nuclear magnetic resonance (NMR) data for the sixth TB module from human fibrillin-1 has revealed the existence of two stable conformers that differ in the isomerization states of two proline residues. Unusually, the two isoforms do not readily interconvert and are stable on the time scale of milliseconds. We have computed independent structures of the major and minor conformers of TB6 to assess how the domain fold adjusts to incorporate alternatively cis- or trans-prolines. Based on previous observations, it has been suggested that multiple conformers can only be accommodated in flexible regions of protein structure. In contrast, P22, which exists in trans in the major form and cis in the minor form of TB6, is in a rigid region of the domain, which is confirmed by backbone dynamics measurements. Overall, the structures of the major and minor conformers are similar. However, the secondary structure topologies of the two forms differ as a direct consequence of the changes in proline conformation. PMID:9792099
Functional and Structural Characterization of Zebrafish ASC.
Li, Yajuan; Huang, Yi; Cao, Xiaocong; Yin, Xueying; Jin, Xiangyu; Liu, Sheng; Jiang, Jiansheng; Jiang, Wei; Xiao, Tsan Sam; Zhou, Rongbin; Cai, Gang; Hu, Bing; Jin, Tengchuan
2018-05-23
The zebrafish genome encodes homologs for most of the proteins involved in inflammatory pathways; however, the molecular components and activation mechanisms of fish inflammasomes are largely unknown. ASC (apoptosis-associated speck-like protein containing a caspase-recruitment domain (CARD)) is the only adaptor involved in the formation of multiple types of inflammasomes. Here, we demonstrate that zASC is also involved in inflammasome activation in zebrafish. When overexpressed in vitro and in vivo in zebrafish, both the zASC and zASC pyrin domain (PYD) proteins form speck and filament structures. Importantly, the crystal structures of the N-terminal PYD and C-terminal CARD of zebrafish ASC were determined independently as two separate entities fused to maltose-binding protein (MBP). Structure-guided mutagenesis revealed the functional relevance of the PYD hydrophilic surface found in the crystal lattice. Finally, the fish caspase-1 homolog Caspy, but not the caspase-4/11 homolog Caspy2, interacts with zASC through homotypic PYD-PYD interactions, which differ from those in mammals. These observations establish the conserved and unique structural/functional features of the zASC-dependent inflammasome pathway. This article is protected by copyright. All rights reserved.
Accelerating the Conformational Sampling of Intrinsically Disordered Proteins.
Do, Trang Nhu; Choy, Wing-Yiu; Karttunen, Mikko
2014-11-11
Intrinsically disordered proteins (IDPs) are a class of proteins lacking a well-defined secondary structure. Instead, they are able to attain multiple conformations, bind to multiple targets, and respond to changes in their surroundings. Functionally, IDPs have been associated with molecular recognition, cell regulation, and signal transduction. The dynamic conformational ensemble of IDPs depends strongly on the environment and on binding partners, rendering the characterization of IDPs extremely challenging. Here, we compare the sampling efficiencies of conventional molecular dynamics (MD), well-tempered metadynamics (WT-META), and bias-exchange metadynamics (BE-META). The total simulation time was over 10 μs, and a 20-mer peptide derived from the Neh2 domain of the Nuclear factor erythroid 2-related factor 2 (Nrf2) protein was simulated. BE-META, with a neutral replica and seven biased replicas employing a set of seven relevant collective variables (CVs), provided the most reliable and efficient sampling. Finally, we propose a free-energy reconstruction method based on the probability distribution of the secondary structure contents. This postprocessing analysis confirms the presence of not only the β-hairpin conformation of the free Neh2 peptide but also its rare bound-state-like conformation, both of which have been experimentally observed. In addition, our simulations also predict other possible conformations to be verified with future experiments.
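The abstract does not spell out the reconstruction, so the following is only a minimal sketch of the generic idea of turning a sampled probability distribution into a free-energy profile by Boltzmann inversion, F = -kT ln(P / P_max). It is not the authors' method: the function name, the bin width, and the sample values are hypothetical, and the sketch assumes unbiased (or already reweighted) samples of a secondary-structure fraction.

```python
import math
from collections import Counter

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol*K)


def free_energy_profile(values, bin_width, temperature):
    """Boltzmann-invert a histogram of sampled values.

    Returns {bin_index: free energy in kJ/mol}, shifted so that the most
    populated bin has a free energy of zero.
    """
    counts = Counter(math.floor(v / bin_width) for v in values)
    max_count = max(counts.values())
    kT = K_B * temperature
    # F = -kT ln(P / P_max) = kT ln(max_count / count)
    return {b: kT * math.log(max_count / c) for b, c in sorted(counts.items())}


# Hypothetical beta-content fractions from seven frames (illustrative only).
samples = [0.05] * 4 + [0.45] * 2 + [0.15] * 1
profile = free_energy_profile(samples, bin_width=0.1, temperature=300.0)
for bin_index, energy in profile.items():
    print(f"bin {bin_index}: {energy:.2f} kJ/mol")
```

Rarely visited bins come out with higher free energy, which is why a rare bound-state-like conformation would show up as a shallow secondary minimum rather than disappear, provided it is sampled at all.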
Computational Identification of Genomic Features That Influence 3D Chromatin Domain Formation.
Mourad, Raphaël; Cuvier, Olivier
2016-05-01
Recent advances in long-range Hi-C contact mapping have revealed the importance of the 3D structure of chromosomes in gene expression. A current challenge is to identify the key molecular drivers of this 3D structure. Several genomic features, such as architectural proteins and functional elements, were shown to be enriched at topological domain borders using classical enrichment tests. Here we propose multiple logistic regression to identify those genomic features that positively or negatively influence domain border establishment or maintenance. The model is flexible, and can account for statistical interactions among multiple genomic features. Using both simulated and real data, we show that our model outperforms enrichment test and non-parametric models, such as random forests, for the identification of genomic features that influence domain borders. Using Drosophila Hi-C data at a very high resolution of 1 kb, our model suggests that, among architectural proteins, BEAF-32 and CP190 are the main positive drivers of 3D domain borders. In humans, our model identifies well-known architectural proteins CTCF and cohesin, as well as ZNF143 and Polycomb group proteins as positive drivers of domain borders. The model also reveals the existence of several negative drivers that counteract the presence of domain borders including P300, RXRA, BCL11A and ELK1.
Computational Identification of Genomic Features That Influence 3D Chromatin Domain Formation
Mourad, Raphaël; Cuvier, Olivier
2016-01-01
Recent advances in long-range Hi-C contact mapping have revealed the importance of the 3D structure of chromosomes in gene expression. A current challenge is to identify the key molecular drivers of this 3D structure. Several genomic features, such as architectural proteins and functional elements, were shown to be enriched at topological domain borders using classical enrichment tests. Here we propose multiple logistic regression to identify those genomic features that positively or negatively influence domain border establishment or maintenance. The model is flexible, and can account for statistical interactions among multiple genomic features. Using both simulated and real data, we show that our model outperforms enrichment test and non-parametric models, such as random forests, for the identification of genomic features that influence domain borders. Using Drosophila Hi-C data at a very high resolution of 1 kb, our model suggests that, among architectural proteins, BEAF-32 and CP190 are the main positive drivers of 3D domain borders. In humans, our model identifies well-known architectural proteins CTCF and cohesin, as well as ZNF143 and Polycomb group proteins as positive drivers of domain borders. The model also reveals the existence of several negative drivers that counteract the presence of domain borders including P300, RXRA, BCL11A and ELK1. PMID:27203237
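As a rough illustration of the multiple logistic regression idea described above, the sketch below fits a border/non-border model with two binary features and their interaction term using plain gradient ascent. Everything in it is an assumption made for illustration: the two features ("protein A", "protein B"), the grouped counts, and the fitting settings are invented, and the code is not the authors' implementation.

```python
import math

# Hypothetical grouped data, invented for illustration only.
# Each row: (protein_A_bound, protein_B_bound, n_border_bins, n_total_bins)
CELLS = [
    (0, 0, 27, 100),
    (1, 0, 73, 100),
    (0, 1, 8, 100),
    (1, 1, 38, 100),
]


def design_row(a, b):
    """Intercept, two main effects, and their statistical interaction."""
    return [1.0, float(a), float(b), float(a * b)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(cells, learning_rate=0.5, n_iter=20000):
    """Maximize the binomial log-likelihood by plain gradient ascent."""
    total = sum(n_total for _, _, _, n_total in cells)
    beta = [0.0, 0.0, 0.0, 0.0]
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for a, b, n_border, n_total in cells:
            x = design_row(a, b)
            p = sigmoid(sum(w * xi for w, xi in zip(beta, x)))
            residual = n_border - n_total * p  # observed minus expected borders
            for j, xj in enumerate(x):
                grad[j] += residual * xj
        beta = [w + learning_rate * g / total for w, g in zip(beta, grad)]
    return beta


intercept, beta_a, beta_b, beta_ab = fit_logistic(CELLS)
print(f"A: {beta_a:+.2f}  B: {beta_b:+.2f}  A:B interaction: {beta_ab:+.2f}")
```

In this toy setting a positive coefficient marks a feature that raises the odds of a border and a negative coefficient marks one that lowers them, each adjusted for the other features in the model; that adjustment is what a one-feature-at-a-time enrichment test does not provide.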
Mms1 is an assistant for regulating G-quadruplex DNA structures.
Schwindt, Eike; Paeschke, Katrin
2018-06-01
The preservation of genome stability is fundamental for every cell. Genomic integrity is constantly challenged, among other things by non-canonical nucleic acid structures. In recent years, scientists became aware of the impact of G-quadruplex (G4) structures on genome stability. It has been shown that folded G4-DNA structures cause changes in the cell, such as transcriptional up/down-regulation, replication stalling, or enhanced genome instability. Multiple helicases have been identified that regulate G4 structures and thereby preserve genome stability. Interestingly, although these helicases are mostly ubiquitously expressed, they show specificity for G4 regulation in certain cellular processes (e.g., DNA replication). To date, it is not clear how this process and target specificity of helicases is achieved. Recently, Mms1, a ubiquitin ligase complex protein, was identified as a novel G4-DNA-binding protein that supports genome stability by aiding Pif1 helicase binding to these regions. In this perspective review, we discuss whether G4-DNA-interacting proteins are fundamental for helicase function and specificity at G4-DNA structures.
FireProt: Energy- and Evolution-Based Computational Design of Thermostable Multiple-Point Mutants.
Bednar, David; Beerens, Koen; Sebestova, Eva; Bendl, Jaroslav; Khare, Sagar; Chaloupkova, Radka; Prokop, Zbynek; Brezovsky, Jan; Baker, David; Damborsky, Jiri
2015-11-01
There is great interest in increasing proteins' stability to enhance their utility as biocatalysts, therapeutics, diagnostics and nanomaterials. Directed evolution is a powerful, but experimentally strenuous approach. Computational methods offer attractive alternatives. However, due to the limited reliability of predictions and potentially antagonistic effects of substitutions, only single-point mutations are usually predicted in silico, experimentally verified and then recombined in multiple-point mutants. Thus, substantial screening is still required. Here we present FireProt, a robust computational strategy for predicting highly stable multiple-point mutants that combines energy- and evolution-based approaches with smart filtering to identify additive stabilizing mutations. FireProt's reliability and applicability were demonstrated by validating its predictions against 656 mutations from the ProTherm database. We demonstrate that thermostability of the model enzymes haloalkane dehalogenase DhaA and γ-hexachlorocyclohexane dehydrochlorinase LinA can be substantially increased (ΔTm = 24°C and 21°C) by constructing and characterizing only a handful of multiple-point mutants. FireProt can be applied to any protein for which a tertiary structure and homologous sequences are available, and will facilitate the rapid development of robust proteins for biomedical and biotechnological applications.
On the origins of the weak folding cooperativity of a designed ββα ultrafast protein FSD-1.
Wu, Chun; Shea, Joan-Emma
2010-11-18
FSD-1, a designed small ultrafast folder with a ββα fold, has been actively studied in the last few years as a model system for studying protein folding mechanisms and for testing the accuracy of computational models. The suitability of this protein to describe the folding of naturally occurring α/β proteins has recently been challenged based on the observation that the melting transition is very broad, with ill-resolved baselines. Using molecular dynamics simulations with the AMBER protein force field (ff96) coupled with the implicit solvent model (IGB = 5), we shed new light on the nature of this transition and resolve the experimental controversies. We show that the melting transition corresponds to the melting of the protein as a whole, and not solely to the helix-coil transition. The breadth of the folding transition arises from the spread in the melting temperatures (from ∼325 K to ∼302 K) of the individual transitions: formation of the hydrophobic core, β-hairpin and tertiary fold, with the helix formed earlier. Our simulations initiated from an extended chain accurately predict the native structure, provide a reasonable estimate of the transition barrier height, and explicitly demonstrate the existence of multiple pathways and multiple transition states for folding. Our exhaustive sampling enables us to assess the quality of the AMBER ff96/IGB = 5 combination and reveals that while this force field can predict the correct native fold, it nonetheless overstabilizes the α-helix portion of the protein (Tm = ∼387 K) as well as the denatured structures.
Kaufman, Brett A.; Durisic, Nela; Mativetsky, Jeffrey M.; Costantino, Santiago; Hancock, Mark A.; Grutter, Peter
2007-01-01
Packaging DNA into condensed structures is integral to the transmission of genomes. The mammalian mitochondrial genome (mtDNA) is a high copy, maternally inherited genome in which mutations cause a variety of multisystem disorders. In all eukaryotic cells, multiple mtDNAs are packaged with protein into spheroid bodies called nucleoids, which are the fundamental units of mtDNA segregation. The mechanism of nucleoid formation, however, remains unknown. Here, we show that the mitochondrial transcription factor TFAM, an abundant and highly conserved High Mobility Group box protein, binds DNA cooperatively with nanomolar affinity as a homodimer and that it is capable of coordinating and fully compacting several DNA molecules together to form spheroid structures. We use noncontact atomic force microscopy, which achieves near cryo-electron microscope resolution, to reveal the structural details of protein–DNA compaction intermediates. The formation of these complexes involves the bending of the DNA backbone, and DNA loop formation, followed by the filling in of proximal available DNA sites until the DNA is compacted. These results indicate that TFAM alone is sufficient to organize mitochondrial chromatin and provide a mechanism for nucleoid formation. PMID:17581862
Structural biology contributions to the discovery of drugs to treat chronic myelogenous leukaemia
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cowan-Jacob, Sandra W., E-mail: sandra.jacob@novartis.com; Fendrich, Gabriele; Floersheimer, Andreas
2007-01-01
A case study showing how the determination of multiple cocrystal structures of the protein tyrosine kinase c-Abl was used to support drug discovery, resulting in a compound effective in the treatment of chronic myelogenous leukaemia. Chronic myelogenous leukaemia (CML) results from the Bcr-Abl oncoprotein, which has a constitutively activated Abl tyrosine kinase domain. Although most chronic phase CML patients treated with imatinib as first-line therapy maintain excellent durable responses, patients who have progressed to advanced-stage CML frequently fail to respond or lose their response to therapy owing to the emergence of drug-resistant mutants of the protein. More than 40 such point mutations have been observed in imatinib-resistant patients. The crystal structures of wild-type and mutant Abl kinase in complex with imatinib and other small-molecule Abl inhibitors were determined, with the aim of understanding the molecular basis of resistance and to aid in the design and optimization of inhibitors active against the resistance mutants. These results are presented in a way which illustrates the approaches used to generate multiple structures, the type of information that can be gained and the way that this information is used to support drug discovery.
Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures
Manolakos, Elias S.
2015-01-01
Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub. PMID:26605332
Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures.
Sharma, Anuj; Manolakos, Elias S
2015-01-01
Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub.
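The quoted figures can be checked with the usual definition of parallel efficiency, speedup divided by the number of cores. The short sketch below is only a worked arithmetic check under that assumed definition (the abstract does not state how efficiency was computed); the function name is hypothetical.

```python
def parallel_efficiency(speedup, n_cores):
    """Fraction of ideal linear speedup achieved on n_cores cores."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    return speedup / n_cores


# Figures quoted in the abstract: a speedup of 42 on the 48-core SCC.
efficiency = parallel_efficiency(42, 48)
print(f"efficiency = {efficiency:.3f}")
```

Under this definition 42/48 evaluates to 0.875, which is consistent with the reported efficiency of 0.9 after rounding to one decimal place.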
3Drefine: an interactive web server for efficient protein structure refinement.
Bhattacharya, Debswapna; Nowotny, Jackson; Cao, Renzhi; Cheng, Jianlin
2016-07-08
3Drefine is an interactive web server for consistent and computationally efficient protein structure refinement with the capability to perform web-based statistical and visual analysis. The 3Drefine refinement protocol utilizes iterative optimization of the hydrogen bonding network combined with atomic-level energy minimization on the optimized model using composite physics- and knowledge-based force fields for efficient protein structure refinement. The method has been extensively evaluated on blind CASP experiments as well as on large-scale and diverse benchmark datasets and exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine web server allows for convenient protein structure refinement through a text or file input submission, email notification, provided example submission and is freely available without any registration requirement. The server also provides comprehensive analysis of submissions through various energy and statistical feedback and interactive visualization of multiple refined models through the JSmol applet that is equipped with numerous protein model analysis tools. The web server has been extensively tested and used by many users. As a result, the 3Drefine web server conveniently provides a useful tool easily accessible to the community. The 3Drefine web server has been made publicly available at the URL: http://sysbio.rnet.missouri.edu/3Drefine/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Lu, Liang; Gu, Xiaorong; Hong, Li; Laird, James; Jaffe, Keeve; Choi, Jaewoo; Crabb, John; Salomon, Robert G
2009-11-01
Protein modifications in which the epsilon-amino group of lysyl residues is incorporated into a 2-(omega-carboxyethyl)pyrrole (CEP) are mediators of age-related macular degeneration (AMD). They promote both angiogenesis into the retina ('wet AMD') and geographic retinal atrophy ('dry AMD'). Blood levels of CEPs are biomarkers for clinical prognosis of the disease. To enable mechanistic studies of their role in promoting AMD, for example, through the activation of B- and T-cells, interaction with receptors, or binding with complement proteins, we developed an efficient synthesis of CEP derivatives, that is especially effective for proteins. The structures of tryptic peptides derived from CEP-modified proteins were also determined. A key finding is that 4,7-dioxoheptanoic acid 9-fluorenylmethyl ester reacts with primary amines to provide 9-fluorenylmethyl esters of CEP-modified proteins that can be deprotected in situ with 1,8-diazabicyclo[5.4.0]undec-7-ene without causing protein denaturation. The introduction of multiple CEP-modifications with a wide variety of CEP:protein ratios is readily achieved using this strategy.
Diversity and functions of protein glycosylation in insects.
Walski, Tomasz; De Schutter, Kristof; Van Damme, Els J M; Smagghe, Guy
2017-04-01
The majority of proteins is modified with carbohydrate structures. This modification, called glycosylation, was shown to be crucial for protein folding, stability and subcellular location, as well as protein-protein interactions, recognition and signaling. Protein glycosylation is involved in multiple physiological processes, including embryonic development, growth, circadian rhythms, cell attachment as well as maintenance of organ structure, immunity and fertility. Although the general principles of glycosylation are similar among eukaryotic organisms, insects synthesize a distinct repertoire of glycan structures compared to plants and vertebrates. Consequently, a number of unique insect glycans mediate functions specific to this class of invertebrates. For instance, the core α1,3-fucosylation of N-glycans is absent in vertebrates, while in insects this modification is crucial for the development of wings and the nervous system. At present, most of the data on insect glycobiology comes from research in Drosophila. Yet, progressively more information on the glycan structures and the importance of glycosylation in other insects like beetles, caterpillars, aphids and bees is becoming available. This review gives a summary of the current knowledge and recent progress related to glycan diversity and function(s) of protein glycosylation in insects. We focus on N- and O-glycosylation, their synthesis, physiological role(s), as well as the molecular and biochemical basis of these processes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Crystallization of Proteins from Crude Bovine Rod Outer Segments
Baker, Bo Y.; Gulati, Sahil; Shi, Wuxian; Wang, Benlian; Stewart, Phoebe L.; Palczewski, Krzysztof
2015-01-01
Obtaining protein crystals suitable for X-ray diffraction studies comprises the greatest challenge in the determination of protein crystal structures, especially for membrane proteins and protein complexes. Although high purity has been broadly accepted as one of the most significant requirements for protein crystallization, a recent study of the Escherichia coli proteome showed that many proteins have an inherent propensity to crystallize and do not require a highly homogeneous sample (Totir et al., 2012). As exemplified by RPE65 (Kiser, Golczak, Lodowski, Chance, & Palczewski, 2009), there also are cases of mammalian proteins crystallized from less purified samples. To test whether this phenomenon can be applied more broadly to the study of proteins from higher organisms, we investigated the protein crystallization profile of bovine rod outer segment (ROS) crude extracts. Interestingly, multiple protein crystals readily formed from such extracts, some of them diffracting to high resolution that allowed structural determination. A total of seven proteins were crystallized, one of which was a membrane protein. Successful crystallization of proteins from heterogeneous ROS extracts demonstrates that many mammalian proteins also have an intrinsic propensity to crystallize from complex biological mixtures. By providing an alternative approach to heterologous expression to achieve crystallization, this strategy could be useful for proteins and complexes that are difficult to purify or obtain by recombinant techniques. PMID:25950977
Structures and Mechanism of the Monoamine Oxidase Family
Gaweska, Helena; Fitzpatrick, Paul F.
2011-01-01
Members of the monoamine oxidase family of flavoproteins catalyze the oxidation of primary and secondary amines, polyamines, amino acids, and methylated lysine side chains in proteins. The enzymes have similar overall structures, with conserved FAD-binding domains and varied substrate-binding sites. Multiple mechanisms have been proposed for the catalytic reactions of these enzymes. The present review compares the structures of different members of the family and the various mechanistic proposals. PMID:22022344
SLX4 Assembles a Telomere Maintenance Toolkit by Bridging Multiple Endonucleases with Telomeres
Wan, Bingbing; Yin, Jinhu; Horvath, Kent; Sarkar, Jaya; Chen, Yong; Wu, Jian; Wan, Ke; Lu, Jian; Gu, Peili; Yu, Eun Young; Lue, Neal F.; Chang, Sandy
2014-01-01
SLX4 interacts with several endonucleases to resolve structural barriers in DNA metabolism. SLX4 also interacts with telomeric protein TRF2 in human cells. The molecular mechanism of these interactions at telomeres remains unknown. Here, we report the crystal structure of the TRF2-binding motif of SLX4 (SLX4TBM) in complex with the TRFH domain of TRF2 (TRF2TRFH) and map the interactions of SLX4 with endonucleases SLX1, XPF, and MUS81. TRF2 recognizes a unique HxLxP motif on SLX4 via the peptide-binding site in its TRFH domain. Telomeric localization of SLX4 and associated nucleases depends on the SLX4-endonuclease and SLX4-TRF2 interactions and the protein levels of SLX4 and TRF2. SLX4 assembles an endonuclease toolkit that negatively regulates telomere length via SLX1-catalyzed nucleolytic resolution of telomere DNA structures. We propose that the SLX4-TRF2 complex serves as a double-layer scaffold bridging multiple endonucleases with telomeres for recombination-based telomere maintenance. PMID:24012755
Sequential Release of Proteins from Structured Multishell Microcapsules.
Shimanovich, Ulyana; Michaels, Thomas C T; De Genst, Erwin; Matak-Vinkovic, Dijana; Dobson, Christopher M; Knowles, Tuomas P J
2017-10-09
In nature, a wide range of functional materials is based on proteins. Increasing attention is also turning to the use of proteins as artificial biomaterials in the form of films, gels, particles, and fibrils that offer great potential for applications in areas ranging from molecular medicine to materials science. To date, however, most such applications have been limited to single component materials despite the fact that their natural analogues are composed of multiple types of proteins with a variety of functionalities that are coassembled in a highly organized manner on the micrometer scale, a process that is currently challenging to achieve in the laboratory. Here, we demonstrate the fabrication of multicomponent protein microcapsules where the different components are positioned in a controlled manner. We use molecular self-assembly to generate multicomponent structures on the nanometer scale and droplet microfluidics to bring together the different components on the micrometer scale. Using this approach, we synthesize a wide range of multiprotein microcapsules containing three well-characterized proteins: glucagon, insulin, and lysozyme. The localization of each protein component in multishell microcapsules has been detected by labeling protein molecules with different fluorophores, and the final three-dimensional microcapsule structure has been resolved by using confocal microscopy together with image analysis techniques. In addition, we show that these structures can be used to tailor the release of such functional proteins in a sequential manner. Moreover, our observations demonstrate that the protein release mechanism from multishell capsules is driven by the kinetic control of mass transport of the cargo and by the dissolution of the shells. 
The ability to generate artificial materials that incorporate a variety of different proteins with distinct functionalities increases the breadth of the potential applications of artificial protein-based materials and provides opportunities to design more refined functional protein delivery systems.
Antonczak, Alicja K; Milholland, Kedric; Tippmann, Eric M
2018-05-01
The target protein, Hcp1, was first described as part of the bacterial Type VI secretion system from Pseudomonas aeruginosa. The protein first self-assembles into a hexamer and then the hexamers further stack into a nanotubular structure. Hcp1 monomers were targeted for mutagenesis with two widely used photoactivatable amino acids: para-benzoyl phenylalanine or para-azidophenylalanine. The ability of these amino acids to form covalent adducts within the Hcp1 self-assembled system was investigated. Multiple residues, putatively at equal distances across the monomer-monomer interface, were targeted. The efficiency of each amino acid to covalently link self-assembled hexamers was determined. The results demonstrate the choice and role of genetically encoded tools applied to complicated biological processes such as self-assembly and also suggest some structural dynamics of the Hcp1 protein not obvious from crystallographic structures.
Synergistic interactions of lipids and myelin basic protein
NASA Astrophysics Data System (ADS)
Hu, Yufang; Doudevski, Ivo; Wood, Denise; Moscarello, Mario; Husted, Cynthia; Genain, Claude; Zasadzinski, Joseph A.; Israelachvili, Jacob
2004-09-01
This report describes force measurements and atomic force microscope imaging of lipid-protein interactions that determine the structure of a model membrane system that closely mimics the myelin sheath. Our results suggest that noncovalent, mainly electrostatic and hydrophobic, interactions are responsible for the multilamellar structure and stability of myelin. We find that myelin basic protein acts as a lipid coupler between two apposed bilayers and as a lipid "hole-filler," effectively preventing defect holes from developing. From our protein-mediated-adhesion and force-distance measurements, we develop a simple quantitative model that gives a reasonably accurate picture of the molecular mechanism and adhesion of bilayer-bridging proteins by means of noncovalent interactions. The results and model indicate that optimum myelin adhesion and stability depend on the difference between, rather than the product of, the opposite charges on the lipid bilayers and myelin basic protein, as well as on the repulsive forces associated with membrane fluidity, and that small changes in any of these parameters away from the synergistically optimum values can lead to large changes in the adhesion or even its total elimination. Our results also show that the often-asked question of which membrane species, the lipids or the proteins, are the "important ones" may be misplaced. Both components work synergistically to provide the adhesion and overall structure. A better appreciation of the mechanism of this synergy may allow for a better understanding of stacked and especially myelin membrane structures and may lead to better treatments for demyelinating diseases such as multiple sclerosis. lipid-protein interactions | myelin membrane structure | membrane adhesion | membrane regeneration/healing | demyelinating diseases
Landscape of Pleiotropic Proteins Causing Human Disease: Structural and System Biology Insights.
Ittisoponpisan, Sirawit; Alhuzimi, Eman; Sternberg, Michael J E; David, Alessia
2017-03-01
Pleiotropy is the phenomenon by which the same gene can result in multiple phenotypes. Pleiotropic proteins are emerging as important contributors to rare and common disorders. Nevertheless, little is known about the mechanisms underlying pleiotropy and the characteristics of pleiotropic proteins. We analyzed disease-causing proteins reported in UniProt and observed that 12% are pleiotropic (variants in the same protein cause more than one disease). Pleiotropic proteins were enriched in deleterious and rare variants, but not in common variants. Pleiotropic proteins were more likely to be involved in the pathogenesis of neoplasms, neurological and circulatory diseases, and congenital malformations, whereas non-pleiotropic proteins were more likely to be involved in endocrine and metabolic disorders. Pleiotropic proteins were more essential and had a higher number of interacting partners compared with non-pleiotropic proteins. Significantly more pleiotropic than non-pleiotropic proteins contained at least one intrinsically long disordered region (P < 0.001). Deleterious variants occurring in structurally disordered regions were more commonly found in pleiotropic than in non-pleiotropic proteins. In conclusion, pleiotropic proteins are an important contributor to human disease. They represent a biologically different class of proteins compared with non-pleiotropic proteins, and a better understanding of their characteristics and genetic variants can greatly aid in the interpretation of genetic studies and drug design. © 2016 WILEY PERIODICALS, INC.
Highly sensitive detection of individual HEAT and ARM repeats with HHpred and COACH.
Kippert, Fred; Gerloff, Dietlind L
2009-09-24
HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. 
It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains.
Highly Sensitive Detection of Individual HEAT and ARM Repeats with HHpred and COACH
Kippert, Fred; Gerloff, Dietlind L.
2009-01-01
Background HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Methodology and Principal Findings Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. 
Significance A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains. PMID:19777061
DOE Office of Scientific and Technical Information (OSTI.GOV)
McMahon, Roisin M., E-mail: r.mcmahon1@uq.edu.au; Coinçon, Mathieu; Tay, Stephanie
The crystal structure of a P. aeruginosa DsbA1 variant is more suitable for fragment-based lead discovery efforts to identify inhibitors of this antimicrobial drug target. In the reported structures the active site of the protein can simultaneously bind multiple ligands introduced in the crystallization solution or via soaking. Pseudomonas aeruginosa is an opportunistic human pathogen for which new antimicrobial drug options are urgently sought. P. aeruginosa disulfide-bond protein A1 (PaDsbA1) plays a pivotal role in catalyzing the oxidative folding of multiple virulence proteins and as such holds great promise as a drug target. As part of a fragment-based lead discovery approach to PaDsbA1 inhibitor development, the identification of a crystal form of PaDsbA1 that was more suitable for fragment-soaking experiments was sought. A previously identified crystallization condition for this protein was unsuitable, as in this crystal form of PaDsbA1 the active-site surface loops are engaged in the crystal packing, occluding access to the target site. A single residue involved in crystal-packing interactions was substituted with an amino acid commonly found at this position in closely related enzymes, and this variant was successfully used to generate a new crystal form of PaDsbA1 in which the active-site surface is more accessible for soaking experiments. The PaDsbA1 variant displays identical redox character and in vitro activity to wild-type PaDsbA1 and is structurally highly similar. Two crystal structures of the PaDsbA1 variant were determined in complex with small molecules bound to the protein active site. These small molecules (MES, glycerol and ethylene glycol) were derived from the crystallization or cryoprotectant solutions and provide a proof of principle that the reported crystal form will be amenable to co-crystallization and soaking with small molecules designed to target the protein active-site surface.
Improve the prediction of RNA-binding residues using structural neighbours.
Li, Quan; Cao, Zanxia; Liu, Haiyan
2010-03-01
The interactions of RNA-binding proteins (RBPs) with RNA play key roles in managing some of the cell's basic functions. The identification and prediction of RNA-binding sites is important for understanding the RNA-binding mechanism. Computational approaches are being developed to predict RNA-binding residues based on sequence- or structure-derived features. To achieve higher prediction accuracy, improvements on current prediction methods are necessary. We found that the structural neighbors of RNA-binding and non-RNA-binding residues have different amino acid compositions. Combining this structure-derived feature with evolutionary (PSSM) and other structural information (secondary structure and solvent accessibility) significantly improves the predictions over existing methods. Using a multiple linear regression approach and 6-fold cross validation, our best model achieves an overall correct rate of 87.8% and an MCC of 0.47, with a specificity of 93.4%, and correctly predicts 52.4% of the RNA-binding residues for a dataset containing 107 non-homologous RNA-binding proteins. Compared with existing methods, including the amino acid compositions of structural neighbors leads to a clear improvement. A web server was developed for predicting RNA-binding residues in a protein sequence (or structure), which is available at http://mcgill.3322.org/RNA/.
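The figures quoted in this abstract (overall correct rate, specificity, fraction of binding residues recovered, and MCC) are all functions of a binary confusion matrix. The following is a minimal sketch of how such metrics are usually computed; the function name and the counts in the usage note are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn count binding residues predicted correctly/incorrectly;
    tn/fp count non-binding residues predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Sensitivity: fraction of true binding residues that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true non-binding residues that were recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }
```

For example, `binary_metrics(50, 10, 90, 50)` uses made-up counts in which binding residues are the minority-recovered class. Because MCC uses all four cells of the matrix, it is less flattering than the overall correct rate on imbalanced data such as binding-site prediction.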
Lambert, Matthias; Richard, Elodie; Duban-Deweer, Sophie; Krzewinski, Frederic; Deracinois, Barbara; Dupont, Erwan; Bastide, Bruno; Cieniewski-Bernard, Caroline
2016-09-01
The sarcomere structure of skeletal muscle is determined through multiple protein-protein interactions within an intricate sarcomeric cytoskeleton network. The molecular mechanisms involved in the regulation of this sarcomeric organization, essential to muscle function, remain unclear. O-GlcNAcylation, a post-translational modification modifying several key structural proteins and previously described as a modulator of the contractile activity, was never considered to date in the sarcomeric organization. C2C12 skeletal myotubes were treated with Thiamet-G (OGA inhibitor) in order to increase the global O-GlcNAcylation level. Our data clearly showed a modulation of the O-GlcNAc level more sensitive and dynamic in the myofilament-enriched fraction than total proteome. This fine O-GlcNAc level modulation was closely related to changes of the sarcomeric morphometry. Indeed, the dark-band and M-line widths increased, while the I-band width and the sarcomere length decreased according to the myofilament O-GlcNAc level. Some structural proteins of the sarcomere such as desmin, αB-crystallin, α-actinin, moesin and filamin-C have been identified within modulated protein complexes through O-GlcNAc level variations. Their interactions seemed to be changed, especially for desmin and αB-crystallin. For the first time, our findings clearly demonstrate that O-GlcNAcylation, through dynamic regulations of the structural interactome, could be an important modulator of the sarcomeric structure and may provide new insights in the understanding of molecular mechanisms of neuromuscular diseases characterized by a disorganization of the sarcomeric structure. In the present study, we demonstrated a role of O-GlcNAcylation in the sarcomeric structure modulation. Copyright © 2016 Elsevier B.V. All rights reserved.
MultiSeq: unifying sequence and structure data for evolutionary analysis
Roberts, Elijah; Eargle, John; Wright, Dan; Luthey-Schulten, Zaida
2006-01-01
Background Since the publication of the first draft of the human genome in 2000, bioinformatic data have been accumulating at an overwhelming pace. Currently, more than 3 million sequences and 35 thousand structures of proteins and nucleic acids are available in public databases. Finding correlations in and between these data to answer critical research questions is extremely challenging. This problem needs to be approached from several directions: information science to organize and search the data; information visualization to assist in recognizing correlations; mathematics to formulate statistical inferences; and biology to analyze chemical and physical properties in terms of sequence and structure changes. Results Here we present MultiSeq, a unified bioinformatics analysis environment that allows one to organize, display, align and analyze both sequence and structure data for proteins and nucleic acids. While special emphasis is placed on analyzing the data within the framework of evolutionary biology, the environment is also flexible enough to accommodate other usage patterns. The evolutionary approach is supported by the use of predefined metadata, adherence to standard ontological mappings, and the ability for the user to adjust these classifications using an electronic notebook. MultiSeq contains a new algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of a homologous group of distantly related proteins. The method, based on the multidimensional QR factorization of multiple sequence and structure alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. 
Conclusion MultiSeq is a major extension of the Multiple Alignment tool that is provided as part of VMD, a structural visualization program for analyzing molecular dynamics simulations. Both are freely distributed by the NIH Resource for Macromolecular Modeling and Bioinformatics and MultiSeq is included with VMD starting with version 1.8.5. The MultiSeq website has details on how to download and use the software: PMID:16914055
Protein Conformational Dynamics Probed by Single-Molecule Electron Transfer
NASA Astrophysics Data System (ADS)
Yang, Haw; Luo, Guobin; Karnchanaphanurach, Pallop; Louie, Tai-Man; Rech, Ivan; Cova, Sergio; Xun, Luying; Xie, X. Sunney
2003-10-01
Electron transfer is used as a probe for angstrom-scale structural changes in single protein molecules. In a flavin reductase, the fluorescence of flavin is quenched by a nearby tyrosine residue by means of photo-induced electron transfer. By probing the fluorescence lifetime of the single flavin on a photon-by-photon basis, we were able to observe the variation of flavin-tyrosine distance over time. We could then determine the potential of mean force between the flavin and the tyrosine, and a correlation analysis revealed conformational fluctuation at multiple time scales spanning from hundreds of microseconds to seconds. This phenomenon suggests the existence of multiple interconverting conformers related to the fluctuating catalytic reactivity.
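A potential of mean force is conventionally obtained from a distance distribution by Boltzmann inversion, W(r) = -kT ln P(r). The sketch below illustrates that generic relation on a histogram of distances; it assumes the distances have already been derived from the measured lifetimes and is not the authors' analysis procedure. The function name and the choice to report energies relative to the most populated bin are assumptions made for illustration.

```python
import math


def potential_of_mean_force(counts, kT=1.0):
    """Boltzmann-invert a distance histogram into a potential of mean force.

    counts: occupancy of each distance bin (non-negative numbers).
    kT: thermal energy; the result is in the same units.
    Returns W per bin, shifted so the most populated bin has W = 0.
    Empty bins are assigned +infinity because they were never visited.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one observation")
    probs = [c / total for c in counts]
    p_max = max(probs)
    return [
        -kT * math.log(p / p_max) if p > 0 else math.inf
        for p in probs
    ]
```

With a hypothetical histogram `[1, 4, 1]`, the central bin is the minimum of the potential and the flanking bins lie kT ln 4 above it.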
Hybrid Methods Reveal Multiple Flexibly Linked DNA Polymerases within the Bacteriophage T7 Replisome
Wallen, Jamie R.; Zhang, Hao; Weis, Caroline; ...
2017-01-03
The physical organization of DNA enzymes at a replication fork enables efficient copying of two antiparallel DNA strands, yet dynamic protein interactions within the replication complex complicate replisome structural studies. We employed a combination of crystallographic, native mass spectrometry and small-angle X-ray scattering experiments to capture alternative structures of a model replication system encoded by bacteriophage T7. Two molecules of DNA polymerase bind the ring-shaped primase-helicase in a conserved orientation and provide structural insight into how the acidic C-terminal tail of the primase-helicase contacts the DNA polymerase to facilitate loading of the polymerase onto DNA. A third DNA polymerase binds the ring in an offset manner that may enable polymerase exchange during replication. Alternative polymerase binding modes are also detected by small-angle X-ray scattering with DNA substrates present. The collective results unveil complex motions within T7 replisome higher-order structures that are underpinned by multivalent protein-protein interactions with functional implications.
Hybrid Methods Reveal Multiple Flexibly Linked DNA Polymerases within the Bacteriophage T7 Replisome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wallen, Jamie R.; Zhang, Hao; Weis, Caroline
The physical organization of DNA enzymes at a replication fork enables efficient copying of two antiparallel DNA strands, yet dynamic protein interactions within the replication complex complicate replisome structural studies. We employed a combination of crystallographic, native mass spectrometry and small-angle X-ray scattering experiments to capture alternative structures of a model replication system encoded by bacteriophage T7. Two molecules of DNA polymerase bind the ring-shaped primase-helicase in a conserved orientation and provide structural insight into how the acidic C-terminal tail of the primase-helicase contacts the DNA polymerase to facilitate loading of the polymerase onto DNA. A third DNA polymerase binds the ring in an offset manner that may enable polymerase exchange during replication. Alternative polymerase binding modes are also detected by small-angle X-ray scattering with DNA substrates present. The collective results unveil complex motions within T7 replisome higher-order structures that are underpinned by multivalent protein-protein interactions with functional implications.
Rissanen, Ilona; Grimes, Jonathan M.; Pawlowski, Alice; Mäntynen, Sari; Harlos, Karl; Bamford, Jaana K.H.; Stuart, David I.
2013-01-01
Summary It has proved difficult to classify viruses unless they are closely related since their rapid evolution hinders detection of remote evolutionary relationships in their genetic sequences. However, structure varies more slowly than sequence, allowing deeper evolutionary relationships to be detected. Bacteriophage P23-77 is an example of a newly identified viral lineage, with members inhabiting extreme environments. We have solved multiple crystal structures of the major capsid proteins VP16 and VP17 of bacteriophage P23-77. They fit the 14 Å resolution cryo-electron microscopy reconstruction of the entire virus exquisitely well, allowing us to propose a model for both the capsid architecture and viral assembly, quite different from previously published models. The structures of the capsid proteins and their mode of association to form the viral capsid suggest that the P23-77-like and adeno-PRD1 lineages of viruses share an extremely ancient common ancestor. PMID:23623731
[Expression and Preliminary Research on the Soluble Domain of EV-D68 3A Protein].
Li, Ting; Kong, Jia; Yu, Xiao-fang; Han, Xue
2015-11-01
To understand the structure of the soluble region of the Enterovirus 68 3A protein, we constructed a prokaryotic expression vector expressing the soluble region of the EV-D68 3A protein and identified the forms of the expression product after purification. The EV-D68 3A(1-61) gene was amplified by PCR and then cloned into the expression vector pET-28a-His-SUMO. The recombinant plasmid was transformed into Escherichia coli BL21, and expression of the fusion protein His-SUMO-3A(1-61) was induced with IPTG. The recombinant protein was purified on Ni-NTA agarose and cleaved by ULP protease to remove the His-SUMO tag. The target protein 3A(1-61) was then purified by a series of methods including Ni-NTA, anion exchange chromatography and gel filtration chromatography. A chemical cross-linking assay was used to determine the multimerization state of the 3A soluble region. A prokaryotic expression vector, pET28a-His-SUMO-3A(1-61), expressing the soluble region of EV-D68 3A was successfully constructed, and a large amount of highly pure target protein was obtained through multiple purification steps. About 5 mg of protein with purity > 95% was obtained from 1 L of Escherichia coli BL21 culture. The results also showed that the soluble 3A construct forms homomultimers. These data demonstrate that an expression and purification system for the soluble region of 3A was successfully established, and they provide basic knowledge for research on the 3A crystal structure and for the development of antiviral drugs that target 3A to block viral replication.
Mudgal, Richa; Srinivasan, Narayanaswamy; Chandra, Nagasuma
2017-07-01
Functional annotation is seldom straightforward with complexities arising due to functional divergence in protein families or functional convergence between non-homologous protein families, leading to mis-annotations. An enzyme may contain multiple domains and not all domains may be involved in a given function, adding to the complexity in function annotation. To address this, we use binding site information from bound cognate ligands and catalytic residues, since it can help in resolving fold-function relationships at a finer level and with higher confidence. A comprehensive database of 2,020 fold-function-binding site relationships has been systematically generated. A network-based approach is employed to capture the complexity in these relationships, from which different types of associations are deciphered, that identify versatile protein folds performing diverse functions, same function associated with multiple folds and one-to-one relationships. Binding site similarity networks integrated with fold, function, and ligand similarity information are generated to understand the depth of these relationships. Apart from the observed continuity in the functional site space, network properties of these revealed versatile families with topologically different or dissimilar binding sites and structural families that perform very similar functions. As a case study, subtle changes in the active site of a set of evolutionarily related superfamilies are studied using these networks. Tracing of such similarities in evolutionarily related proteins provide clues into the transition and evolution of protein functions. Insights from this study will be helpful in accurate and reliable functional annotations of uncharacterized proteins, poly-pharmacology, and designing enzymes with new functional capabilities. Proteins 2017; 85:1319-1335. © 2017 Wiley Periodicals, Inc.
Tan, Kemin; Chang, Changsoo; Cuff, Marianne; Osipiuk, Jerzy; Landorf, Elizabeth; Mack, Jamey C; Zerbs, Sarah; Joachimiak, Andrzej; Collart, Frank R
2013-10-01
Lignin comprises 15-25% of plant biomass and represents a major environmental carbon source for utilization by soil microorganisms. Access to this energy resource requires the action of fungal and bacterial enzymes to break down the lignin polymer into a complex assortment of aromatic compounds that can be transported into the cells. To improve our understanding of the utilization of lignin by microorganisms, we characterized the molecular properties of solute binding proteins of ATP-binding cassette transporter proteins that interact with these compounds. A combination of functional screens and structural studies characterized the binding specificity of the solute binding proteins for aromatic compounds derived from lignin such as p-coumarate, 3-phenylpropionic acid and compounds with more complex ring substitutions. A ligand screen based on thermal stabilization identified several binding protein clusters that exhibit preferences based on the size or number of aromatic ring substituents. Multiple X-ray crystal structures of protein-ligand complexes for these clusters identified the molecular basis of the binding specificity for the lignin-derived aromatic compounds. The screens and structural data provide new functional assignments for these solute-binding proteins which can be used to infer their transport specificity. This knowledge of the functional roles and molecular binding specificity of these proteins will support the identification of the specific enzymes and regulatory proteins of peripheral pathways that funnel these compounds to central metabolic pathways and will improve the predictive power of sequence-based functional annotation methods for this family of proteins. Copyright © 2013 Wiley Periodicals, Inc.
Tan, Kemin; Chang, Changsoo; Cuff, Marianne; Osipiuk, Jerzy; Landorf, Elizabeth; Mack, Jamey C.; Zerbs, Sarah; Joachimiak, Andrzej; Collart, Frank R.
2013-01-01
Lignin comprises 15-25% of plant biomass and represents a major environmental carbon source for utilization by soil microorganisms. Access to this energy resource requires the action of fungal and bacterial enzymes to break down the lignin polymer into a complex assortment of aromatic compounds that can be transported into the cells. To improve our understanding of the utilization of lignin by microorganisms, we characterized the molecular properties of solute binding proteins of ATP-binding cassette transporter proteins that interact with these compounds. A combination of functional screens and structural studies characterized the binding specificity of the solute binding proteins for aromatic compounds derived from lignin such as p-coumarate, 3-phenylpropionic acid and compounds with more complex ring substitutions. A ligand screen based on thermal stabilization identified several binding protein clusters that exhibit preferences based on the size or number of aromatic ring substituents. Multiple X-ray crystal structures of protein-ligand complexes for these clusters identified the molecular basis of the binding specificity for the lignin-derived aromatic compounds. The screens and structural data provide new functional assignments for these solute-binding proteins which can be used to infer their transport specificity. This knowledge of the functional roles and molecular binding specificity of these proteins will support the identification of the specific enzymes and regulatory proteins of peripheral pathways that funnel these compounds to central metabolic pathways and will improve the predictive power of sequence-based functional annotation methods for this family of proteins. PMID:23606130
Protein (multi-)location prediction: using location inter-dependencies in a probabilistic framework
2014-01-01
Motivation Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins. Most such methods are based on the over-simplifying assumption that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems attempt to predict multiple locations of proteins, their performance leaves much room for improvement. Moreover, they typically treat locations as independent and do not attempt to utilize possible inter-dependencies among locations. Our hypothesis is that directly incorporating inter-dependencies among locations into both the classifier-learning and the prediction process can improve location prediction performance. Results We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the location-prediction process of multiply-localized proteins. Our method is based on a collection of Bayesian network classifiers, where each classifier is used to predict a single location. Learning the structure of each Bayesian network classifier takes into account inter-dependencies among locations, and the prediction process uses estimates involving multiple locations. We evaluate our system on a dataset of single- and multi-localized proteins (the most comprehensive protein multi-localization dataset currently available, derived from the DBMLoc dataset). Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without being restricted only to location-combinations present in the training set. PMID:24646119
Protein (multi-)location prediction: using location inter-dependencies in a probabilistic framework.
Simha, Ramanuja; Shatkay, Hagit
2014-03-19
Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins. Most such methods are based on the over-simplifying assumption that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems attempt to predict multiple locations of proteins, their performance leaves much room for improvement. Moreover, they typically treat locations as independent and do not attempt to utilize possible inter-dependencies among locations. Our hypothesis is that directly incorporating inter-dependencies among locations into both the classifier-learning and the prediction process can improve location prediction performance. We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the location-prediction process of multiply-localized proteins. Our method is based on a collection of Bayesian network classifiers, where each classifier is used to predict a single location. Learning the structure of each Bayesian network classifier takes into account inter-dependencies among locations, and the prediction process uses estimates involving multiple locations. We evaluate our system on a dataset of single- and multi-localized proteins (the most comprehensive protein multi-localization dataset currently available, derived from the DBMLoc dataset). Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without being restricted only to location-combinations present in the training set.
Shao, Wei; Liu, Mingxia; Zhang, Daoqiang
2016-01-01
The systematic study of subcellular location patterns is very important for fully characterizing the human proteome. Nowadays, with the great advances in automated microscopic imaging, accurate bioimage-based classification methods to predict protein subcellular locations are highly desired. All existing models were constructed on the independent parallel hypothesis, where the cellular component classes are positioned independently in a multi-class classification engine. As a result, the important structural information of cellular compartments is lost. To address this problem and develop more accurate models, we propose a novel cell structure-driven classifier construction approach (SC-PSorter) that employs prior biological structural information in the learning model. Specifically, the structural relationship among the cellular components is reflected by a new codeword matrix under the error correcting output coding framework. Then, we construct multiple SC-PSorter-based classifiers corresponding to the columns of the error correcting output coding codeword matrix using a multi-kernel support vector machine classification approach. Finally, we perform the classifier ensemble by combining those multiple SC-PSorter-based classifiers via majority voting. We evaluate our method on a collection of 1636 immunohistochemistry images from the Human Protein Atlas database. The experimental results show that our method achieves an overall accuracy of 89.0%, which is 6.4% higher than the state-of-the-art method. The dataset and code can be downloaded from https://github.com/shaoweinuaa/. Contact: dqzhang@nuaa.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
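The error correcting output coding idea in this abstract can be illustrated with a minimal sketch. The class names, the codeword matrix, and the classifier outputs below are hypothetical and are not taken from SC-PSorter; the sketch shows only the generic decoding step, in which one binary classifier per column produces a bit and the predicted class is the one whose codeword is nearest in Hamming distance.

```python
# Hypothetical codeword matrix: one row per class, one column per binary classifier.
CLASSES = ["nucleus", "cytoplasm", "mitochondria", "golgi"]
CODEWORDS = [
    [1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
]


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(bits):
    """Return the class whose codeword is closest to the observed bits."""
    distances = [hamming(bits, row) for row in CODEWORDS]
    return CLASSES[distances.index(min(distances))]


# One classifier output flipped relative to the "nucleus" codeword.
predicted = ecoc_decode([1, 1, 0, 0, 0])
```

In this toy matrix every pair of codewords differs in at least three positions, so a single wrong classifier output is still decoded to the intended class.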
Wachsmuth, Leah M; Johnson, Meredith G; Gavenonis, Jason
2017-06-01
Parasitic diseases caused by kinetoplastid parasites of the genera Trypanosoma and Leishmania are an urgent public health crisis in the developing world. These closely related species possess a number of multimeric enzymes in highly conserved pathways involved in vital functions, such as redox homeostasis and nucleotide synthesis. Computational alanine scanning of these protein-protein interfaces has revealed a host of potentially ligandable sites on several established and emerging anti-parasitic drug targets. Analysis of interfaces with multiple clustered hotspots has suggested several potentially inhibitable protein-protein interactions that may have been overlooked by previous large-scale analyses focusing solely on secondary structure. These protein-protein interactions provide a promising lead for the development of new peptide and macrocycle inhibitors of these enzymes.
Burai, Ritwik; Ait-Bouziad, Nadine; Chiki, Anass; Lashuel, Hilal A
2015-04-22
Parkinson's disease (PD) is characterized by the loss of dopaminergic neurons in the substantia nigra and the presence of intraneuronal inclusions consisting of aggregated and post-translationally modified α-synuclein (α-syn). Despite advances in the chemical synthesis of α-syn and other proteins, the generation of site-specifically nitrated synthetic proteins has not been reported. Consequently, it has not been possible to determine the roles of nitration at specific residues in regulating the physiological and pathogenic properties of α-syn. Here we report, for the first time, the site-specific incorporation of 3-nitrotyrosine at different regions of α-syn using native chemical ligation combined with a novel desulfurization strategy. This strategy enabled us to investigate the role of nitration at single or multiple tyrosine residues in regulating α-syn structure, membrane binding, oligomerization, and fibrils formation. We demonstrate that different site-specifically nitrated α-syn species exhibit distinct structural and aggregation properties and exhibit reduced affinity to negatively charged vesicle membranes. We provide evidence that intermolecular interactions between the N- and C-terminal regions of α-syn play critical roles in mediating nitration-induced α-syn oligomerization. For example, when Y39 is not available for nitration (Y39F and Y39/125F), the extent of cross-linking is limited mostly to dimer formation, whereas mutants in which Y39 along with one or multiple C-terminal tyrosines (Y125F, Y133F, Y136F and Y133/136F) can still undergo nitration readily to form higher-order oligomers. Our semisynthetic strategy for generating site-specifically nitrated proteins opens up new possibilities for investigating the role of nitration in regulating protein structure and function in health and disease.
Hafsa, Noor E; Arndt, David; Wishart, David S
2015-07-01
The Chemical Shift Index or CSI 3.0 (http://csi3.wishartlab.com) is a web server designed to accurately identify the location of secondary and super-secondary structures in protein chains using only nuclear magnetic resonance (NMR) backbone chemical shifts and their corresponding protein sequence data. Unlike earlier versions of CSI, which only identified three types of secondary structure (helix, β-strand and coil), CSI 3.0 now identifies a total of 11 types of secondary and super-secondary structures, including helices, β-strands, coil regions, five common β-turns (type I, II, I', II' and VIII), β hairpins as well as interior and edge β-strands. CSI 3.0 accepts experimental NMR chemical shift data in multiple formats (NMR Star 2.1, NMR Star 3.1 and SHIFTY) and generates colorful CSI plots (bar graphs) and secondary/super-secondary structure assignments. The output can be readily used as constraints for structure determination and refinement or the images may be used for presentations and publications. CSI 3.0 uses a pipeline of several well-tested, previously published programs to identify the secondary and super-secondary structures in protein chains. Comparisons with secondary and super-secondary structure assignments made via standard coordinate analysis programs such as DSSP, STRIDE and VADAR on high-resolution protein structures solved by X-ray and NMR show >90% agreement with those made by CSI 3.0. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Kubota, Sho; Morii, Mariko; Yuki, Ryuzaburo; Yamaguchi, Noritaka; Yamaguchi, Hiromi; Aoyama, Kazumasa; Kuga, Takahisa; Tomonaga, Takeshi; Yamaguchi, Naoto
2015-04-24
Protein-tyrosine phosphorylation regulates a wide variety of cellular processes at the plasma membrane. Recently, we showed that nuclear tyrosine kinases induce global nuclear structure changes, which we called chromatin structural changes. However, the mechanisms are not fully understood. In this study we identify protein kinase A anchoring protein 8 (AKAP8/AKAP95), which associates with chromatin and the nuclear matrix, as a nuclear tyrosine-phosphorylated protein. Tyrosine phosphorylation of AKAP8 is induced by several tyrosine kinases, such as Src, Fyn, and c-Abl but not Syk. Nucleus-targeted Lyn and c-Src strongly dissociate AKAP8 from chromatin and the nuclear matrix in a kinase activity-dependent manner. The levels of tyrosine phosphorylation of AKAP8 are decreased by substitution of multiple tyrosine residues on AKAP8 into phenylalanine. Importantly, the phenylalanine mutations of AKAP8 inhibit its dissociation from nuclear structures, suggesting that the association/dissociation of AKAP8 with/from nuclear structures is regulated by its tyrosine phosphorylation. Furthermore, the phenylalanine mutations of AKAP8 suppress the levels of nuclear tyrosine kinase-induced chromatin structural changes. In contrast, AKAP8 knockdown increases the levels of chromatin structural changes. Intriguingly, stimulation with hydrogen peroxide induces chromatin structural changes accompanied by the dissociation of AKAP8 from nuclear structures. These results suggest that AKAP8 is involved in the regulation of chromatin structural changes through nuclear tyrosine phosphorylation. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso
2016-01-01
Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach. PMID:27965389
Prytkova, Vera; Heyden, Matthias; Khago, Domarin; Freites, J Alfredo; Butts, Carter T; Martin, Rachel W; Tobias, Douglas J
2016-08-25
We present a novel multi-conformation Monte Carlo simulation method that enables the modeling of protein-protein interactions and aggregation in crowded protein solutions. This approach is relevant to a molecular-scale description of realistic biological environments, including the cytoplasm and the extracellular matrix, which are characterized by high concentrations of biomolecular solutes (e.g., 300-400 mg/mL for proteins and nucleic acids in the cytoplasm of Escherichia coli). Simulation of such environments necessitates the inclusion of a large number of protein molecules. Therefore, computationally inexpensive methods, such as rigid-body Brownian dynamics (BD) or Monte Carlo simulations, can be particularly useful. However, as we demonstrate herein, the rigid-body representation typically employed in simulations of many-protein systems gives rise to certain artifacts in protein-protein interactions. Our approach allows us to incorporate molecular flexibility in Monte Carlo simulations at low computational cost, thereby eliminating ambiguities arising from structure selection in rigid-body simulations. We benchmark and validate the methodology using simulations of hen egg white lysozyme in solution, a well-studied system for which extensive experimental data, including osmotic second virial coefficients, small-angle scattering structure factors, and multiple structures determined by X-ray and neutron crystallography and solution NMR, as well as rigid-body BD simulation results, are available for comparison.
Song, Jiangning; Burrage, Kevin; Yuan, Zheng; Huber, Thomas
2006-03-09
The majority of peptide bonds in proteins are found to occur in the trans conformation. However, for proline residues, a considerable fraction of Prolyl peptide bonds adopt the cis form. Proline cis/trans isomerization is known to play a critical role in protein folding, splicing, cell signaling and transmembrane active transport. Accurate prediction of proline cis/trans isomerization in proteins would have many important applications towards the understanding of protein structure and function. In this paper, we propose a new approach to predict the proline cis/trans isomerization in proteins using support vector machine (SVM). The preliminary results indicated that using Radial Basis Function (RBF) kernels could lead to better prediction performance than that of polynomial and linear kernel functions. We used single sequence information of different local window sizes, amino acid compositions of different local sequences, multiple sequence alignment obtained from PSI-BLAST and the secondary structure information predicted by PSIPRED. We explored these different sequence encoding schemes in order to investigate their effects on the prediction performance. The training and testing of this approach was performed on a newly enlarged dataset of 2424 non-homologous proteins determined by X-Ray diffraction method using 5-fold cross-validation. Selecting the window size 11 provided the best performance for determining the proline cis/trans isomerization based on the single amino acid sequence. It was found that using multiple sequence alignments in the form of PSI-BLAST profiles could significantly improve the prediction performance, the prediction accuracy increased from 62.8% with single sequence to 69.8% and Matthews Correlation Coefficient (MCC) improved from 0.26 with single local sequence to 0.40. 
Furthermore, if coupled with the secondary structure information predicted by PSIPRED, our method yielded a prediction accuracy of 71.5% and MCC of 0.43, 9% and 0.17 higher than the accuracy achieved based on the single sequence information, respectively. A new method has been developed to predict the proline cis/trans isomerization in proteins based on support vector machine, which used the single amino acid sequence with different local window sizes, the amino acid compositions of local sequence flanking centered proline residues, the position-specific scoring matrices (PSSMs) extracted by PSI-BLAST and the predicted secondary structures generated by PSIPRED. The successful application of the SVM approach in this study reinforced that SVM is a powerful tool in predicting proline cis/trans isomerization in proteins and biological sequence analysis.
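Two ingredients of this abstract lend themselves to a short sketch: the local sequence window centered on a proline, and the two reported performance measures. The code below is illustrative only. The padding symbol "X" for windows that run past the chain ends and all function names are assumptions, since the abstract does not specify the encoding details; the accuracy and Matthews correlation coefficient follow their standard confusion-matrix definitions.

```python
import math


def window(seq, center, size=11, pad="X"):
    """Return the size-residue window centered on seq[center].

    Positions beyond either end of the chain are filled with pad
    (an assumed convention, not taken from the paper).
    """
    half = size // 2
    padded = pad * half + seq + pad * half
    return padded[center:center + size]


def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Window of 11 residues around the proline at index 6 of a made-up sequence.
example_window = window("ACDEFGPIKLMNQ", 6)
```

MCC is the more informative of the two measures here because cis prolines are much rarer than trans prolines, and accuracy alone can look good for a classifier that mostly predicts the majority class.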
Electron transport and light-harvesting switches in cyanobacteria
Mullineaux, Conrad W.
2014-01-01
Cyanobacteria possess multiple mechanisms for regulating the pathways of photosynthetic and respiratory electron transport. Electron transport may be regulated indirectly by controlling the transfer of excitation energy from the light-harvesting complexes, or it may be more directly regulated by controlling the stoichiometry, localization, and interactions of photosynthetic and respiratory electron transport complexes. Regulation of the extent of linear vs. cyclic electron transport is particularly important for controlling the redox balance of the cell. This review discusses what is known of the regulatory mechanisms and the timescales on which they occur, with particular regard to the structural reorganization needed and the constraints imposed by the limited mobility of membrane-integral proteins in the crowded thylakoid membrane. Switching mechanisms requiring substantial movement of integral thylakoid membrane proteins occur on slower timescales than those that require the movement only of cytoplasmic or extrinsic membrane proteins. This difference is probably due to the restricted diffusion of membrane-integral proteins. Multiple switching mechanisms may be needed to regulate electron transport on different timescales. PMID:24478787
Akhter, Nasrin; Shehu, Amarda
2018-01-19
Due to the essential role that the three-dimensional conformation of a protein plays in regulating interactions with molecular partners, wet and dry laboratories seek biologically-active conformations of a protein to decode its function. Computational approaches are gaining prominence due to the labor and cost demands of wet laboratory investigations. Template-free methods can now compute thousands of conformations known as decoys, but selecting native conformations from the generated decoys remains challenging. Repeatedly, research has shown that the protein energy functions whose minima are sought in the generation of decoys are unreliable indicators of nativeness. The prevalent approach ignores energy altogether and clusters decoys by conformational similarity. Complementary recent efforts design protein-specific scoring functions or train machine learning models on labeled decoys. In this paper, we show that an informative consideration of energy can be carried out under the energy landscape view. Specifically, we leverage local structures known as basins in the energy landscape probed by a template-free method. We propose and compare various strategies of basin-based decoy selection that we demonstrate are superior to clustering-based strategies. The presented results point to further directions of research for improving decoy selection, including the ability to properly consider the multiplicity of native conformations of proteins.
Baker, Max O D G; Shanmugam, Nirukshan; Pham, Chi L L; Strange, Merryn; Steain, Megan; Sunde, Margaret
2018-05-05
The Receptor-interacting protein kinase Homotypic Interaction Motif (RHIM) is an amino acid sequence that mediates multiple protein:protein interactions in the mammalian programmed cell death pathway known as necroptosis. At least one key RHIM-based complex has been shown to have a functional amyloid fibril structure, which provides a stable hetero-oligomeric platform for downstream signaling. RHIMs and related motifs are present in immunity-related proteins across nature, from viruses to fungi to metazoans. Necroptosis is a hallmark feature of cellular clearance of infection. For this reason, numerous pathogens, including viruses and bacteria, have developed varied methods to modulate necroptosis, focusing on inhibiting RHIM:RHIM interactions, and thus their downstream cell death effects. This review will discuss current understanding of RHIM:RHIM interactions in normal cellular activation of necroptosis, from a structural and cell biology perspective. It will compare the mechanisms by which pathogens subvert these interactions in order to maintain their replicative and infective cycles and consider the similarities between RHIMs and other functional amyloid-forming proteins associated with cell death and innate immunity. It will discuss the implications of the heteromeric nature and structure of RHIM-based amyloid complexes in the context of other functional amyloids. Copyright © 2018. Published by Elsevier Ltd.
Energy landscape paving simulations of the trp-cage protein.
Schug, Alexander; Wenzel, Wolfgang; Hansmann, Ulrich H E
2005-05-15
We evaluate the efficiency of multiple variants of energy landscape paving in all-atom simulations of the trp-cage protein using a recently developed force field. In particular, we introduce a temperature-free variant of the method and demonstrate that it allows fast scanning of the energy landscape. Native-like structures are found in less time than by other techniques. The sampled low-energy configurations indicate a funnel-like energy landscape.
Trindade, Inês B.; Fonseca, Bruno M.; Matias, Pedro M.; Louro, Ricardo O.; Moe, Elin
2016-01-01
Siderophore-binding proteins (SIPs) perform a key role in iron acquisition in multiple organisms. In the genome of the marine bacterium Shewanella frigidimarina NCIMB 400, the gene tagged as SFRI_RS12295 encodes a protein from this family. Here, the cloning, expression, purification and crystallization of this protein are reported, together with its preliminary X-ray crystallographic analysis to 1.35 Å resolution. The SIP crystals belonged to the monoclinic space group P21, with unit-cell parameters a = 48.04, b = 78.31, c = 67.71 Å, α = 90, β = 99.94, γ = 90°, and are predicted to contain two molecules per asymmetric unit. Structure determination by molecular replacement and the use of previously determined ∼2 Å resolution SIP structures with ∼30% sequence identity as templates are ongoing. PMID:27599855
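As a worked example, the unit-cell volume implied by the reported parameters can be computed with the general triclinic formula, which reduces to V = a·b·c·sin β for a monoclinic cell. This is a sketch for orientation only; the function name is an assumption, and the volume itself is not stated in the abstract.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Unit-cell volume in cubic angstroms from cell edges (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


# Parameters reported for the SIP crystals (space group P21).
volume = unit_cell_volume(48.04, 78.31, 67.71, 90.0, 99.94, 90.0)
```

The result is roughly 2.5 × 10^5 Å^3. Estimating the solvent content or checking the predicted two molecules per asymmetric unit would additionally require the protein's molecular weight, which the abstract does not give.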
Natural supramolecular building blocks: from virus coat proteins to viral nanoparticles.
Liu, Zhi; Qiao, Jing; Niu, Zhongwei; Wang, Qian
2012-09-21
Viruses belong to a fascinating class of natural supramolecular structures, composed of multiple copies of coat proteins (CPs) that assemble into different shapes with a variety of sizes from tens to hundreds of nanometres. Because of their advantages including simple/economic production, well-defined structural features, unique shapes and sizes, genetic programmability and robust chemistries, recently viruses and virus-like nanoparticles (VLPs) have been used widely in biomedical applications and materials synthesis. In this critical review, we highlight recent advances in the use of virus coat proteins (VCPs) and viral nanoparticles (VNPs) as building blocks in self-assembly studies and materials development. We first discuss the self-assembly of VCPs into VLPs, which can efficiently incorporate a variety of different materials as cores inside the viral protein shells. Then, the self-assembly of VNPs at surfaces or interfaces is summarized. Finally, we discuss the co-assembly of VNPs with different functional materials (178 references).
Structural evolution of glycan recognition by a family of potent HIV antibodies.
Garces, Fernando; Sok, Devin; Kong, Leopold; McBride, Ryan; Kim, Helen J; Saye-Francisco, Karen F; Julien, Jean-Philippe; Hua, Yuanzi; Cupo, Albert; Moore, John P; Paulson, James C; Ward, Andrew B; Burton, Dennis R; Wilson, Ian A
2014-09-25
The HIV envelope glycoprotein (Env) is densely covered with self-glycans that should help shield it from recognition by the human immune system. Here, we examine how a particularly potent family of broadly neutralizing antibodies (Abs) has evolved common and distinct structural features to counter the glycan shield and interact with both glycan and protein components of HIV Env. The inferred germline antibody already harbors potential binding pockets for a glycan and a short protein segment. Affinity maturation then leads to divergent evolutionary branches that either focus on a single glycan and protein segment (e.g., Ab PGT124) or engage multiple glycans (e.g., Abs PGT121-123). Furthermore, other surrounding glycans are avoided by selecting an appropriate initial antibody shape that prevents steric hindrance. Such molecular recognition lessons are important for engineering proteins that can recognize or accommodate glycans. Copyright © 2014 Elsevier Inc. All rights reserved.
Amino acid sequence analysis of the annexin super-gene family of proteins.
Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J
1991-06-15
The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried saltbridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. 
The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.
Bioinformatic prediction and in vivo validation of residue-residue interactions in human proteins
NASA Astrophysics Data System (ADS)
Jordan, Daniel; Davis, Erica; Katsanis, Nicholas; Sunyaev, Shamil
2014-03-01
Identifying residue-residue interactions in protein molecules is important for understanding both protein structure and function in the context of evolutionary dynamics and medical genetics. Such interactions can be difficult to predict using existing empirical or physical potentials, especially when residues are far from each other in sequence space. Using a multiple sequence alignment of 46 diverse vertebrate species we explore the space of allowed sequences for orthologous protein families. Amino acid changes that are known to damage protein function allow us to identify specific changes that are likely to have interacting partners. We fit the parameters of the continuous-time Markov process used in the alignment to conclude that these interactions are primarily pairwise, rather than higher order. Candidates for sites under pairwise epistasis are predicted, which can then be tested by experiment. We report the results of an initial round of in vivo experiments in a zebrafish model that verify the presence of multiple pairwise interactions predicted by our model. These experimentally validated interactions are novel, distant in sequence, and are not readily explained by known biochemical or biophysical features.
FASMA: a service to format and analyze sequences in multiple alignments.
Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M
2007-12-01
Multiple sequence alignments are successfully applied in many studies for understanding the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed to visualize and analyze the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.
Deep sequencing methods for protein engineering and design.
Wrenbeck, Emily E; Faber, Matthew S; Whitehead, Timothy A
2017-08-01
The advent of next-generation sequencing (NGS) has revolutionized protein science, and the development of complementary methods enabling NGS-driven protein engineering has followed. In general, these experiments address the functional consequences of thousands of protein variants in a massively parallel manner using genotype-phenotype linked high-throughput functional screens followed by DNA counting via deep sequencing. We highlight the use of information-rich datasets to engineer protein molecular recognition. Examples include the creation of multiple dual-affinity Fabs targeting structurally dissimilar epitopes and engineering of a broad germline-targeted anti-HIV-1 immunogen. Additionally, we highlight the generation of enzyme fitness landscapes for conducting fundamental studies of protein behavior and evolution. We conclude with discussion of technological advances. Copyright © 2016 Elsevier Ltd. All rights reserved.
2015-01-01
Background: Computer-aided drug design has a long history of being applied to discover new molecules to treat various cancers, but it has always been focused on single targets. The development of systems biology has let scientists reveal more hidden mechanisms of cancers, but attempts to apply systems biology to cancer therapies remain at preliminary stages. Our lab has successfully developed various systems biology models for several cancers. Based on these achievements, we present the first attempt to combine multiple-target therapy with systems biology. Methods: In our previous study, we identified 28 significant proteins--i.e., common core network markers--of four types of cancers as house-keeping proteins of these cancers. In this study, we ranked these proteins by summing their carcinogenesis relevance values (CRVs) across the four cancers, and then performed docking and pharmacophore modeling to do virtual screening on the NCI database for anti-cancer drugs. We also performed pathway analysis on these proteins using Panther and MetaCore to reveal more mechanisms of these cancer house-keeping proteins. Results: We designed several approaches to discover targets for multiple-target cocktail therapies. In the first one, we identified the top 20 drugs for each of the 28 cancer house-keeping proteins, and analyzed the docking pose to further understand the interaction mechanisms of these drugs. After screening for duplicates, we found that 13 of these drugs could target 11 proteins simultaneously. In the second approach, we chose the top 5 proteins with the highest summed CRVs and used them as the drug targets. We built a pharmacophore and applied it to do virtual screening against the Life-Chemical library for anti-cancer drugs. Based on these results, wet-lab bio-scientists could freely investigate combinations of these drugs for multiple-target therapy for cancers, in contrast to the traditional single target therapy. 
Conclusions: Combination of systems biology with computer-aided drug design could help us develop novel drug cocktails with multiple targets. We believe this will enhance the efficiency of therapeutic practice and lead to new directions for cancer therapy. PMID:26680552
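The ranking step described in the Methods (summing each protein's CRVs across the four cancers and taking the highest totals as targets) can be sketched as follows. The protein names and CRV numbers are invented for illustration; the abstract reports neither, and the function name is an assumption.

```python
# Hypothetical CRVs: one value per cancer type (four cancers) for each protein.
crv = {
    "protA": [0.9, 0.8, 0.7, 0.6],
    "protB": [0.1, 0.2, 0.1, 0.2],
    "protC": [0.5, 0.5, 0.5, 0.5],
    "protD": [0.3, 0.9, 0.2, 0.4],
    "protE": [0.6, 0.6, 0.7, 0.6],
    "protF": [0.2, 0.3, 0.4, 0.3],
}


def top_targets(crv_by_protein, k=5):
    """Rank proteins by their CRV summed over all cancers and keep the top k."""
    totals = {name: sum(values) for name, values in crv_by_protein.items()}
    return sorted(totals, key=totals.get, reverse=True)[:k]


candidate_targets = top_targets(crv, k=5)
```

Summing across cancers favors proteins that are relevant in all four types over proteins that score highly in only one, which matches the stated goal of finding shared targets.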
DePorter, Sandra M; McNaughton, Brian R
2014-09-17
The size, well-defined structure, and relatively high folding energies of most proteins allow them to recognize disease-relevant receptors that present a challenge to small molecule reagents. While multiple challenges must be overcome in order to fully exploit the use of protein reagents in basic research and medicine, perhaps the greatest challenge is their intracellular delivery to a particular diseased cell. Here, we describe the genetic and enzymatic manipulation of prostate cancer cell-penetrating M13 bacteriophage to generate nanocarriers for the intracellular delivery of functional exogenous proteins to a human prostate cancer cell line.
Structural principles within the human-virus protein-protein interaction network
Franzosa, Eric A.; Xia, Yu
2011-01-01
General properties of the antagonistic biomolecular interactions between viruses and their hosts (exogenous interactions) remain poorly understood, and may differ significantly from known principles governing the cooperative interactions within the host (endogenous interactions). Systems biology approaches have been applied to study the combined interaction networks of virus and human proteins, but such efforts have so far revealed only low-resolution patterns of host-virus interaction. Here, we layer curated and predicted 3D structural models of human-virus and human-human protein complexes on top of traditional interaction networks to reconstruct the human-virus structural interaction network. This approach reveals atomic resolution, mechanistic patterns of host-virus interaction, and facilitates systematic comparison with the host’s endogenous interactions. We find that exogenous interfaces tend to overlap with and mimic endogenous interfaces, thereby competing with endogenous binding partners. The endogenous interfaces mimicked by viral proteins tend to participate in multiple endogenous interactions which are transient and regulatory in nature. While interface overlap in the endogenous network results largely from gene duplication followed by divergent evolution, viral proteins frequently achieve interface mimicry without any sequence or structural similarity to an endogenous binding partner. Finally, while endogenous interfaces tend to evolve more slowly than the rest of the protein surface, exogenous interfaces—including many sites of endogenous-exogenous overlap—tend to evolve faster, consistent with an evolutionary “arms race” between host and pathogen. These significant biophysical, functional, and evolutionary differences between host-pathogen and within-host protein-protein interactions highlight the distinct consequences of antagonism versus cooperation in biological networks. PMID:21680884
Introduction to bioinformatics.
Can, Tolga
2014-01-01
Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data-intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: (1) collect statistics from biological data; (2) build a computational model; (3) solve a computational modeling problem; and (4) test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistical analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.
Populations of the Minor α-Conformation in AcGXGNH2 and the α-Helical Nucleation Propensities
NASA Astrophysics Data System (ADS)
Zhou, Yanjun; He, Liu; Zhang, Wenwen; Hu, Jingjing; Shi, Zhengshuang
2016-06-01
Intrinsic backbone conformational preferences of different amino acids are important for understanding the local structure of unfolded protein chains. Recent evidence suggests that the α-structure is relatively minor among the three major backbone conformations for unfolded proteins. Yet α-helices are the dominant structures in many proteins. For these proteins, how could the α-structure go from the least populated in the unfolded state to the most populated in the folded state? Populations of the minor α-conformation in model peptides provide vital information. Reliable determination of the populations of the α-conformers in these peptides, which exist in multiple equilibria of different conformations, remains a challenge. Combined analyses on data from AcGXPNH2 and AcGXGNH2 peptides allow us to derive the populations of PII, β and α in AcGXGNH2. Our results show that on average residue X in AcGXGNH2 adopts PII, β, and α 44.7%, 44.5% and 10.8% of the time, respectively. The contents of α-conformations for different amino acids define an α-helix nucleation propensity scale. With derived PII, β and α-contents, we can construct a free energy-conformation diagram on each AcGXGNH2 in aqueous solution for the three major backbone conformations. Our results would have broad implications on early-stage events of protein folding.
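As a rough illustration of how the reported average populations could be placed on a free energy scale, the sketch below applies the Boltzmann relation ΔG_i = RT ln(p_ref / p_i). The choice of PII as the reference state, the temperature of 298 K, and the function name are assumptions made for this example; the abstract does not say how the authors build their free energy-conformation diagram.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def relative_free_energies(populations, reference, temperature=298.0):
    """Return dG_i = RT * ln(p_ref / p_i) in kcal/mol for every state.

    Assumes the states are in equilibrium, so populations follow a
    Boltzmann distribution (an assumption of this sketch).
    """
    p_ref = populations[reference]
    rt = R_KCAL * temperature
    return {state: rt * math.log(p_ref / p) for state, p in populations.items()}


# Average populations quoted in the abstract for residue X in AcGXGNH2.
avg = {"PII": 0.447, "beta": 0.445, "alpha": 0.108}
dg = relative_free_energies(avg, reference="PII")
for state, value in dg.items():
    print(f"{state}: {value:+.2f} kcal/mol relative to PII")
```

Under these assumptions PII and β come out nearly degenerate, while the α state lies a little under 1 kcal/mol higher, consistent with its description as the minor conformation.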
Guaitoli, Giambattista; Raimondi, Francesco; Gilsbach, Bernd K.; Gómez-Llorente, Yacob; Deyaert, Egon; Renzi, Fabiana; Li, Xianting; Schaffner, Adam; Jagtap, Pravin Kumar Ankush; Boldt, Karsten; von Zweydorf, Felix; Gotthardt, Katja; Lorimer, Donald D.; Yue, Zhenyu; Burgin, Alex; Janjic, Nebojsa; Sattler, Michael; Versées, Wim; Ueffing, Marius; Ubarretxena-Belandia, Iban; Kortholt, Arjan; Gloeckner, Christian Johannes
2016-01-01
Leucine-rich repeat kinase 2 (LRRK2) is a large, multidomain protein containing two catalytic domains: a Ras of complex proteins (Roc) G-domain and a kinase domain. Mutations associated with familial and sporadic Parkinson’s disease (PD) have been identified in both catalytic domains, as well as in several of its multiple putative regulatory domains. Several of these mutations have been linked to increased kinase activity. Despite the role of LRRK2 in the pathogenesis of PD, little is known about its overall architecture and how PD-linked mutations alter its function and enzymatic activities. Here, we have modeled the 3D structure of dimeric, full-length LRRK2 by combining domain-based homology models with multiple experimental constraints provided by chemical cross-linking combined with mass spectrometry, negative-stain EM, and small-angle X-ray scattering. Our model reveals dimeric LRRK2 has a compact overall architecture with a tight, multidomain organization. Close contacts between the N-terminal ankyrin and C-terminal WD40 domains, and their proximity—together with the LRR domain—to the kinase domain suggest an intramolecular mechanism for LRRK2 kinase activity regulation. Overall, our studies provide, to our knowledge, the first structural framework for understanding the role of the different domains of full-length LRRK2 in the pathogenesis of PD. PMID:27357661
NASA Astrophysics Data System (ADS)
Poornima, C. S.; Dean, P. M.
1995-12-01
Water molecules are known to play an important rôle in mediating protein-ligand interactions. If water molecules are conserved at the ligand-binding sites of homologous proteins, such a finding may suggest the structural importance of water molecules in ligand binding. Structurally conserved water molecules change the conventional definition of 'binding sites' by changing the shape and complementarity of these sites. Such conserved water molecules can be important for site-directed ligand/drug design. Therefore, five different sets of homologous protein/protein-ligand complexes have been examined to identify the conserved water molecules at the ligand-binding sites. Our analysis reveals that there are as many as 16 conserved water molecules at the FAD binding site of glutathione reductase between the crystal structures obtained from human and E. coli. In the remaining four sets of high-resolution crystal structures, 2-4 water molecules have been found to be conserved at the ligand-binding sites. The majority of these conserved water molecules are either bound in deep grooves at the protein-ligand interface or completely buried in cavities between the protein and the ligand. All these water molecules, conserved between the protein/protein-ligand complexes from different species, have identical or similar apolar and polar interactions in a given set. The site residues interacting with the conserved water molecules at the ligand-binding sites have been found to be highly conserved among proteins from different species; they are more conserved compared to the other site residues interacting with the ligand. These water molecules, in general, make multiple polar contacts with protein-site residues.
Bhardwaj, Deepak; Lakhanpaul, Suman; Tuteja, Narendra
2012-09-01
Climate change is a major concern, especially in view of the increasing global population and food security. Plant scientists need to look for genetic tools whose appropriate usage can contribute to sustainable food availability. G-proteins have been identified as some of the potential genetic tools that could be useful for protecting plants from various stresses. Heterotrimeric G-proteins consisting of three subunits Gα, Gβ and Gγ are important components of a number of signalling pathways. Their structure and functions are already well studied in animals but their potential in plants is now gaining attention for their role in stress tolerance. Earlier we reported that overexpressing pea Gβ conferred heat tolerance in tobacco plants. Here we report the interacting partners (proteins) of the Gβ subunit of Pisum sativum and their putative role in stress and development. Out of 90 transformants isolated from the yeast two-hybrid (Y2H) screening, seven were chosen for further investigation due to their recurrence in multiple experiments. These interacting partners were confirmed using β-galactosidase colony filter lift and ONPG (O-nitrophenyl-β-D-galactopyranoside) assays. These partners include thioredoxin H, histidine-containing phosphotransfer protein 5-like, pathogenesis-related protein, glucan endo-beta-1,3-glucosidase (acidic isoform), glycine rich RNA binding protein, cold and drought-regulated protein (corA gene) and soluble inorganic pyrophosphatase 1. This study suggests the role of pea Gβ subunit in stress signal transduction and development pathways owing to its capability to interact with a wide range of proteins of multiple functions. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
RPG: the Ribosomal Protein Gene database
Nakao, Akihiro; Yoshihama, Maki; Kenmochi, Naoya
2004-01-01
RPG (http://ribosome.miyazaki-med.ac.jp/) is a new database that provides detailed information about ribosomal protein (RP) genes. It contains data from humans and other organisms, including Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Methanococcus jannaschii and Escherichia coli. Users can search the database by gene name and organism. Each record includes sequences (genomic, cDNA and amino acid sequences), intron/exon structures, genomic locations and information about orthologs. In addition, users can view and compare the gene structures of the above organisms and make multiple amino acid sequence alignments. RPG also provides information on small nucleolar RNAs (snoRNAs) that are encoded in the introns of RP genes. PMID:14681386
Immunoelectron Microscopy of Cryofixed Freeze-Substituted Yeast Saccharomyces cerevisiae.
Fišerová, Jindřiška; Richardson, Christine; Goldberg, Martin W
2016-01-01
Immunolabeling electron microscopy is a challenging technique with demands for perfect ultrastructural and antigen preservation. High-pressure freezing offers an excellent way to fix cellular structure. However, its use for immunolabeling has remained limited because of the low frequency of labeling due to loss of protein antigenicity or accessibility. Here we present a protocol for immunogold labeling of the yeast Saccharomyces cerevisiae that gives specific and multiple labeling while keeping the finest structural details. We use the protocol to reveal the organization of individual nuclear pore complex proteins and the position of transport factors in the yeast Saccharomyces cerevisiae in relation to actual transport events.
Resolving the ambiguity: Making sense of intrinsic disorder when PDB structures disagree.
DeForte, Shelly; Uversky, Vladimir N
2016-03-01
Missing regions in X-ray crystal structures in the Protein Data Bank (PDB) have played a foundational role in the study of intrinsically disordered protein regions (IDPRs), especially in the development of in silico predictors of intrinsic disorder. However, a missing region is only a weak indication of intrinsic disorder, and this uncertainty is compounded by the presence of ambiguous regions, where more than one structure of the same protein sequence "disagrees" in terms of the presence or absence of missing residues. The question is this: are these ambiguous regions intrinsically disordered, or are they the result of static disorder that arises from experimental conditions, ensembles of structures, or domain wobbling? A novel way of looking at ambiguous regions in terms of the pattern between multiple PDB structures has been demonstrated. It was found that the propensity for intrinsic disorder increases as the level of ambiguity decreases. However, it is also shown that ambiguity is more likely to occur as the protein region is placed within different environmental conditions, and even the most ambiguous regions as a set display compositional bias that suggests flexibility. The results suggested that ambiguity is a natural result for many IDPRs crystallized under different conditions and that static disorder and wobbling domains are relatively rare. Instead, it is more likely that ambiguity arises because many of these regions were conditionally or partially disordered. © 2016 The Protein Society.
Structural Insights into Helicobacter pylori Cag Protein Interactions with Host Cell Factors.
Bergé, Célia; Terradot, Laurent
2017-01-01
The most virulent strains of Helicobacter pylori carry a genomic island (cagPAI) containing a set of 27-31 genes. The encoded proteins assemble a syringe-like apparatus to inject the cytotoxin-associated gene A (CagA) protein into gastric cells. This molecular device belongs to the type IV secretion system (T4SS) family albeit with unique characteristics. The cagPAI-encoded T4SS and its effector protein CagA have an intricate relationship with the host cell, with multiple interactions that are only starting to be deciphered from a structural point of view. On the one hand, the major roles of the interactions between CagL and CagA (and perhaps CagI and CagY) and host cell factors are to facilitate H. pylori adhesion and to mediate the injection of the CagA oncoprotein. On the other hand, CagA interactions with host cell partners interfere with cellular pathways to subvert cell defences and to promote H. pylori infection. Although a clear mechanism for CagA translocation is still lacking, the structural definition of the CagA and CagL domains involved in interactions with signalling proteins is progressively coming to light. In this chapter, we will focus on the structural aspects of Cag protein interactions with host cell molecules, critical molecular events precluding H. pylori-mediated gastric cancer development.
A Stochastic Point Cloud Sampling Method for Multi-Template Protein Comparative Modeling.
Li, Jilong; Cheng, Jianlin
2016-05-10
Generating tertiary structural models for a target protein from the known structure of its homologous template proteins and their pairwise sequence alignment is a key step in protein comparative modeling. Here, we developed a new stochastic point cloud sampling method, called MTMG, for multi-template protein model generation. The method first superposes the backbones of template structures, and the Cα atoms of the superposed templates form a point cloud for each position of a target protein, which is represented by a three-dimensional multivariate normal distribution. MTMG stochastically resamples the positions for Cα atoms of the residues whose positions are uncertain from the distribution, and accepts or rejects each new position according to a simulated annealing protocol, which effectively removes atomic clashes commonly encountered in multi-template comparative modeling. We benchmarked MTMG on 1,033 sequence alignments generated for CASP9, CASP10 and CASP11 targets, respectively. Using multiple templates with MTMG improves the GDT-TS score and TM-score of structural models by 2.96-6.37% and 2.42-5.19% on the three datasets over using single templates. MTMG's performance was comparable to Modeller in terms of GDT-TS score, TM-score, and GDT-HA score, while the average RMSD was improved by a new sampling approach. The MTMG software is freely available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/mtmg.html.
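The sampling idea described above can be sketched in a few lines. The code below is not the MTMG implementation: it is a minimal illustration that fits a 3D normal distribution to each per-residue point cloud, redraws one residue position at a time, and accepts or rejects the move with a Metropolis rule under a cooling temperature. The clash penalty, the 3.0 Å cutoff, the geometric cooling schedule, the decision to resample every residue rather than only the uncertain ones, and all function names are assumptions made for this example.

```python
import math
import random


def mean_and_cov(points):
    """Sample mean and covariance of a cloud of 3D points (needs >= 2 points)."""
    n = len(points)
    mu = [sum(p[k] for p in points) / n for k in range(3)]
    cov = [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in points) / (n - 1)
            for j in range(3)] for i in range(3)]
    return mu, cov


def cholesky3(a, jitter=1e-6):
    """Lower-triangular L with L L^T close to a; jitter guards flat clouds."""
    low = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(max(s, 0.0) + jitter) if i == j else s / low[j][j]
    return low


def sample_position(mu, low, rng):
    """Draw one point from the normal distribution N(mu, L L^T)."""
    z = [rng.gauss(0.0, 1.0) for _ in range(3)]
    return [mu[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(3)]


def clash_energy(coords, min_dist=3.0):
    """Hypothetical penalty: number of residue pairs closer than min_dist."""
    return sum(1 for a in range(len(coords)) for b in range(a + 1, len(coords))
               if math.dist(coords[a], coords[b]) < min_dist)


def metropolis_accept(old_e, new_e, temperature, rng):
    """Always accept downhill moves; accept uphill moves with Boltzmann odds."""
    if new_e <= old_e:
        return True
    return rng.random() < math.exp(-(new_e - old_e) / temperature)


def anneal(clouds, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Resample per-residue positions while the temperature cools geometrically."""
    rng = random.Random(seed)
    models = [(mu, cholesky3(cov)) for mu, cov in map(mean_and_cov, clouds)]
    coords = [list(mu) for mu, _ in models]  # start from the cloud centroids
    energy = clash_energy(coords)
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(len(coords))
        old = coords[i]
        coords[i] = sample_position(*models[i], rng)
        new_energy = clash_energy(coords)
        if metropolis_accept(energy, new_energy, temperature, rng):
            energy = new_energy
        else:
            coords[i] = old  # reject: restore the previous position
    return coords, energy
```

Here `clouds` is a list with one entry per target residue, each entry holding the superposed template Cα coordinates for that position. A real pipeline would also need the superposition step and a far more realistic energy function.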
A Stochastic Point Cloud Sampling Method for Multi-Template Protein Comparative Modeling
Li, Jilong; Cheng, Jianlin
2016-01-01
Generating tertiary structural models for a target protein from the known structure of its homologous template proteins and their pairwise sequence alignment is a key step in protein comparative modeling. Here, we developed a new stochastic point cloud sampling method, called MTMG, for multi-template protein model generation. The method first superposes the backbones of template structures, and the Cα atoms of the superposed templates form a point cloud for each position of a target protein, which is represented by a three-dimensional multivariate normal distribution. MTMG stochastically resamples the positions for Cα atoms of the residues whose positions are uncertain from the distribution, and accepts or rejects each new position according to a simulated annealing protocol, which effectively removes atomic clashes commonly encountered in multi-template comparative modeling. We benchmarked MTMG on 1,033 sequence alignments generated for CASP9, CASP10 and CASP11 targets, respectively. Using multiple templates with MTMG improves the GDT-TS score and TM-score of structural models by 2.96–6.37% and 2.42–5.19% on the three datasets over using single templates. MTMG’s performance was comparable to Modeller in terms of GDT-TS score, TM-score, and GDT-HA score, while the average RMSD was improved by a new sampling approach. The MTMG software is freely available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/mtmg.html. PMID:27161489
Hsp90: a novel target for the disruption of multiple signaling cascades.
Bishop, Stephanie C; Burlison, Joseph A; Blagg, Brian S J
2007-06-01
The 90 kDa heat shock proteins (Hsp90) are proving to be an excellent target for the development of novel anti-cancer agents designed to selectively block the growth and proliferation of tumor cells. Since Hsp90 is a molecular chaperone and is responsible for folding numerous oncogenic proteins, its inhibition represents a novel approach toward the simultaneous disruption of multiple signaling cascades. This review summarizes recent literature implicating Hsp90 as a key facilitator for the maturation of proteins represented in all six hallmarks of cancer: 1) growth signal self-sufficiency, 2) anti-growth signal insensitivity, 3) evasion of apoptosis, 4) unlimited replicative potential, 5) metastasis and tissue invasion, and 6) sustained angiogenesis. Also described are recent advances towards the development of novel Hsp90 inhibitors via structure-based drug design that have contributed to the number of compounds undergoing clinical development.
Christensen, Signe; Horowitz, Scott; Bardwell, James C.A.; Olsen, Johan G.; Willemoës, Martin; Lindorff-Larsen, Kresten; Ferkinghoff-Borg, Jesper; Hamelryck, Thomas; Winther, Jakob R.
2017-01-01
Despite the development of powerful computational tools, the full-sequence design of proteins still remains a challenging task. To investigate the limits and capabilities of computational tools, we conducted a study of the ability of the program Rosetta to predict sequences that recreate the authentic fold of thioredoxin. Focusing on the influence of conformational details in the template structures, we based our study on 8 experimentally determined template structures and generated 120 designs from each. For experimental evaluation, we chose six sequences from each of the eight templates by objective criteria. The 48 selected sequences were evaluated based on their progressive ability to (1) produce soluble protein in Escherichia coli and (2) yield stable monomeric protein, and (3) on the ability of the stable, soluble proteins to adopt the target fold. Of the 48 designs, we were able to synthesize 32, 20 of which resulted in soluble protein. Of these, only two were sufficiently stable to be purified. An X-ray crystal structure was solved for one of the designs, revealing a close resemblance to the target structure. We found a significant difference among the eight template structures to realize the above three criteria despite their high structural similarity. Thus, in order to improve the success rate of computational full-sequence design methods, we recommend that multiple template structures are used. Furthermore, this study shows that special care should be taken when optimizing the geometry of a structure prior to computational design when using a method that is based on rigid conformations. PMID:27659562
Johansson, Kristoffer E; Tidemand Johansen, Nicolai; Christensen, Signe; Horowitz, Scott; Bardwell, James C A; Olsen, Johan G; Willemoës, Martin; Lindorff-Larsen, Kresten; Ferkinghoff-Borg, Jesper; Hamelryck, Thomas; Winther, Jakob R
2016-10-23
Despite the development of powerful computational tools, the full-sequence design of proteins still remains a challenging task. To investigate the limits and capabilities of computational tools, we conducted a study of the ability of the program Rosetta to predict sequences that recreate the authentic fold of thioredoxin. Focusing on the influence of conformational details in the template structures, we based our study on 8 experimentally determined template structures and generated 120 designs from each. For experimental evaluation, we chose six sequences from each of the eight templates by objective criteria. The 48 selected sequences were evaluated based on their progressive ability to (1) produce soluble protein in Escherichia coli and (2) yield stable monomeric protein, and (3) on the ability of the stable, soluble proteins to adopt the target fold. Of the 48 designs, we were able to synthesize 32, 20 of which resulted in soluble protein. Of these, only two were sufficiently stable to be purified. An X-ray crystal structure was solved for one of the designs, revealing a close resemblance to the target structure. We found a significant difference among the eight template structures to realize the above three criteria despite their high structural similarity. Thus, in order to improve the success rate of computational full-sequence design methods, we recommend that multiple template structures are used. Furthermore, this study shows that special care should be taken when optimizing the geometry of a structure prior to computational design when using a method that is based on rigid conformations. Copyright © 2016 Elsevier Ltd. All rights reserved.
Calponin-Like Chd64 Is Partly Disordered
Jakób, Michał; Szpotkowski, Kamil; Wojtas, Magdalena; Rymarczyk, Grzegorz; Ożyhar, Andrzej
2014-01-01
20-hydroxyecdysone (20E) and juvenile hormone (JH) signaling pathways interact to regulate insect development. Recently, two proteins, a calponin-like Chd64 and immunophilin FKBP39, have been found to play a pivotal role in the cross-talk between 20E and JH, although the molecular basis of the interaction remains unknown. The aim of this work was to identify the structural features that would provide an understanding of the role of Chd64 in the multiple, dynamic complex that cross-links the signaling pathways. Here, we present the results of in silico and in vitro analyses of the structural organization of Chd64 from Drosophila melanogaster and its homologue from Tribolium castaneum. Computational analysis predicted the existence of disordered regions on the termini of both proteins, while the central region appeared to be globular, probably corresponding to the calponin homology (CH) domain. In vitro analyses of the hydrodynamic properties of the proteins from analytical size-exclusion chromatography and analytical ultracentrifugation revealed that DmChd64 and TcChd64 had an asymmetrical, elongated shape, which was further confirmed by small angle X-ray scattering (SAXS). The Kratky plot indicated disorder in both Chd64 proteins, which could possibly be on the protein termini and which would give rise to specific hydrodynamic properties. Disordered tails are often involved in diverse interactions. Therefore, it is highly possible that there are intrinsically disordered regions (IDRs) on both termini of the Chd64 proteins that serve as platforms for multiple interactions with various partners and constitute the foundation for their regulatory function. PMID:24805353
Structural Basis for Activation of Fatty Acid-binding Protein 4
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gillilan,R.; Ayers, S.; Noy, N.
2007-01-01
Fatty acid-binding protein 4 (FABP4) delivers ligands from the cytosol to the nuclear receptor PPARγ in the nucleus, thereby enhancing the transcriptional activity of the receptor. Notably, FABP4 binds multiple ligands with a similar affinity but its nuclear translocation is activated only by specific compounds. To gain insight into the structural features that underlie the ligand-specificity in activation of the nuclear import of FABP4, we solved the crystal structures of the protein complexed with two compounds that induce its nuclear translocation, and compared these to the apo-protein and to FABP4 structures bound to non-activating ligands. Examination of these structures indicates that activation coincides with closure of a portal loop phenylalanine side-chain, contraction of the binding pocket, a subtle shift in a helical domain containing the nuclear localization signal of the protein, and a resultant change in oligomeric state that exposes the nuclear localization signal to the solution. Comparisons of backbone displacements induced by activating ligands with a measure of mobility derived from translation, libration, screw (TLS) refinement, and with a composite of slowest normal modes of the apo state suggest that the helical motion associated with the activation of the protein is part of the repertoire of the equilibrium motions of the apo-protein, i.e. that ligand binding does not induce the activated configuration but serves to stabilize it. Nuclear import of FABP4 can thus be understood in terms of the pre-existing equilibrium hypothesis of ligand binding.
Bera, Krishnendu; Rani, Priyanka; Kishor, Gaurav; Agarwal, Shikha; Kumar, Antresh; Singh, Durg Vijay
2017-09-20
ATP-binding cassette (ABC) transporters play an extensive role in the translocation of diverse sets of biologically important molecules across membranes. EchinocandinB (antifungal) and the EcdL protein of Aspergillus rugulosus are encoded by the same cluster of genes. Co-expression of EcdL and echinocandinB reflects tightly linked biological functions. EcdL belongs to the Multidrug Resistance associated Protein (MRP) subfamily of ABC transporters, with an extra transmembrane domain zero (TMD0). The complete structure of an MRP subfamily member comprising the TMD0 domain is not known at atomic resolution. We hypothesized that the transportation of echinocandinB is mediated via the EcdL protein. Hence, it is pertinent to know the topological arrangement of TMD0 with respect to the other domains of the protein and its possible role in the transportation of echinocandinB. The absence of an effective template for the TMD0 domain led us to model it with I-TASSER; the structure was then refined by multiple-template modelling using homologous templates of the remaining domains (TMD1, NBD1, TMD2, NBD2). The modelled structure has been validated for packing, folding and stereochemical properties. MD simulation for 0.1 μs has been carried out in the biphasic environment for refinement of the modelled protein. Non-redundant structures have been extracted by clustering of the MD trajectory. The structural alignment of the modelled structure has shown Z-score -37.9; 31.6, 31.5 with RMSD; 2.4, 4.2, 4.8 with ABC transporters; PDB ID 4F4C, 4M1M, 4M2T, respectively, reflecting the correctness of the structure. EchinocandinB has been docked to the modelled as well as to the clustered structures, which reveals interaction of echinocandinB with TMD0 and other TM helices in the translocation path built of TMDs.
Invited review: the coiled coil silk of bees, ants, and hornets.
Sutherland, Tara D; Weisman, Sarah; Walker, Andrew A; Mudie, Stephen T
2012-06-01
In this article, we review current knowledge about the silk produced by the larvae of bees, ants, and hornets [Apoidea and Vespoidea: Hymenoptera]. Different species use the silk either alone or in composites for a variety of purposes including mechanical reinforcement, thermal regulation, or humidification. The characteristic molecular structure of this silk is α-helical proteins assembled into tetrameric coiled coils. Gene sequences from seven species are available, and each species possesses a copy of each of four related silk genes that encode proteins predicted to form coiled coils. The proteins are ordered at multiple length scales within the labial gland of the final larval instar before spinning. The insects control the morphology of the silk during spinning to produce either fibers or sheets. The silk proteins are small and non repetitive and have been produced artificially at high levels by fermentation in E. coli. The artificial silk proteins can be fabricated into materials with structural and mechanical properties similar to those of native silks. Copyright © 2011 Wiley Periodicals, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tirado-Lee, Leidamarie; Lee, Allen; Rees, Douglas C.
2014-10-02
molA (HI1472) from H. influenzae encodes a periplasmic binding protein (PBP) that delivers substrate to the ABC transporter MolB₂C₂ (formerly HI1470/71). The structures of MolA with molybdate and tungstate in the binding pocket were solved to 1.6 and 1.7 Å resolution, respectively. The MolA-binding protein binds molybdate and tungstate, but not other oxyanions such as sulfate and phosphate, making it the first class III molybdate-binding protein structurally solved. The ∼100 μM binding affinity for tungstate and molybdate is significantly lower than observed for the class II ModA molybdate-binding proteins that have nanomolar to low micromolar affinity for molybdate. The presence of two molybdate loci in H. influenzae suggests multiple transport systems for one substrate, with molABC constituting a low-affinity molybdate locus.
NASA Astrophysics Data System (ADS)
Lerner, Michael G.; Meagher, Kristin L.; Carlson, Heather A.
2008-10-01
Use of solvent mapping, based on multiple-copy minimization (MCM) techniques, is common in structure-based drug discovery. The minima of small-molecule probes define locations for complementary interactions within a binding pocket. Here, we present improved methods for MCM. In particular, a Jarvis-Patrick (JP) method is outlined for grouping the final locations of minimized probes into physical clusters. This algorithm has been tested through a study of protein-protein interfaces, showing the process to be robust, deterministic, and fast in the mapping of protein "hot spots." Improvements in the initial placement of probe molecules are also described. A final application to HIV-1 protease shows how our automated technique can be used to partition data too complicated to analyze by hand. These new automated methods may be easily and quickly extended to other protein systems, and our clustering methodology may be readily incorporated into other clustering packages.
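For readers unfamiliar with Jarvis-Patrick clustering, the sketch below shows the textbook form of the algorithm applied to 3D probe positions: two points join the same cluster when each appears in the other's k-nearest-neighbour list and the two lists share at least `k_min` members. The parameter values, the function name, and the toy coordinates are assumptions made for this example; the abstract does not give the authors' settings or any modifications they made.

```python
import math


def jarvis_patrick(points, k=3, k_min=2):
    """Cluster points by shared nearest neighbours (Jarvis-Patrick).

    Returns a sorted list of clusters, each a list of point indices.
    """
    n = len(points)
    # k nearest neighbours of every point, excluding itself; ties break by index.
    nbrs = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: (math.dist(points[i], points[j]), j))
        nbrs.append(set(order[:k]))

    parent = list(range(n))  # union-find forest over point indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            mutual = i in nbrs[j] and j in nbrs[i]
            if mutual and len(nbrs[i] & nbrs[j]) >= k_min:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy example: two well-separated groups of minimized probe positions.
probes = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1),
          (10, 10, 10), (11, 10, 10), (10, 11, 10), (10, 10, 11)]
clusters = jarvis_patrick(probes, k=3, k_min=2)
print(clusters)
```

Because neighbour lists and tie-breaking are fixed by the input, the result is deterministic, which matches the property the abstract highlights.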
Keedy, Daniel A; Hill, Zachary B; Biel, Justin T; Kang, Emily; Rettenmaier, T Justin; Brandao-Neto, Jose; Pearce, Nicholas M; von Delft, Frank; Wells, James A; Fraser, James S
2018-06-07
Allostery is an inherent feature of proteins, but it remains challenging to reveal the mechanisms by which allosteric signals propagate. A clearer understanding of this intrinsic circuitry would afford new opportunities to modulate protein function. Here we have identified allosteric sites in protein tyrosine phosphatase 1B (PTP1B) by combining multiple-temperature X-ray crystallography experiments and structure determination from hundreds of individual small-molecule fragment soaks. New modeling approaches reveal 'hidden' low-occupancy conformational states for protein and ligands. Our results converge on allosteric sites that are conformationally coupled to the active-site WPD loop and are hotspots for fragment binding. Targeting one of these sites with covalently tethered molecules or mutations allosterically inhibits enzyme activity. Overall, this work demonstrates how the ensemble nature of macromolecular structure, revealed here by multitemperature crystallography, can elucidate allosteric mechanisms and open new doors for long-range control of protein function. © 2018, Keedy et al.
Leishmania replication protein A-1 binds in vivo single-stranded telomeric DNA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Neto, J.L. Siqueira; Instituto de Biologia, UNICAMP, Campinas, SP; Lira, C.B.B.
Replication protein A (RPA) is a highly conserved heterotrimeric single-stranded DNA-binding protein involved in different events of DNA metabolism. In yeast, subunits 1 (RPA-1) and 2 (RPA-2) work also as telomerase recruiters and, in humans, the complex unfolds G-quartet structures formed by the 3' G-rich telomeric strand. In most eukaryotes, RPA-1 and RPA-2 bind DNA using multiple OB fold domains. In trypanosomatids, including Leishmania, RPA-1 has a canonical OB fold and a truncated RFA-1 structural domain. In Leishmania amazonensis, RPA-1 alone can form a complex in vitro with the telomeric G-rich strand. In this work, we show that LaRPA-1 is a nuclear protein that associates in vivo with Leishmania telomeres. We mapped the boundaries of the OB fold DNA-binding domain using deletion mutants. Since Leishmania and other trypanosomatids lack homologues of known telomere end binding proteins, our results raise questions about the function of RPA-1 in parasite telomeres.
Multiscale weighted colored graphs for protein flexibility and rigidity analysis
NASA Astrophysics Data System (ADS)
Bramer, David; Wei, Guo-Wei
2018-02-01
Protein structural fluctuation, measured by Debye-Waller factors or B-factors, is known to correlate to protein flexibility and function. A variety of methods has been developed for protein Debye-Waller factor prediction and related applications to domain separation, docking pose ranking, entropy calculation, hinge detection, stability analysis, etc. Nevertheless, none of the current methodologies are able to deliver an accuracy of 0.7 in terms of the Pearson correlation coefficients averaged over a large set of proteins. In this work, we introduce a paradigm-shifting geometric graph model, multiscale weighted colored graph (MWCG), to provide a new generation of computational algorithms to significantly change the current status of protein structural fluctuation analysis. Our MWCG model divides a protein graph into multiple subgraphs based on interaction types between graph nodes and represents the protein rigidity by generalized centralities of subgraphs. MWCGs not only predict the B-factors of protein residues but also accurately analyze the flexibility of all atoms in a protein. The MWCG model is validated over a number of protein test sets and compared with many standard methods. An extensive numerical study indicates that the proposed MWCG offers an accuracy of over 0.8 and thus provides perhaps the first reliable method for estimating protein flexibility and B-factors. It also simultaneously predicts all-atom flexibility in a molecule.
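The accuracy figures quoted in this abstract are Pearson correlation coefficients between predicted and experimental B-factors. The sketch below shows how such a comparison can be set up. It is not the MWCG model: it assumes a much simpler single-kernel rigidity index (a generalized exponential kernel summed over all other atoms), and the names `rigidity_index` and `pearson`, the parameters `eta` and `kappa`, and the toy coordinates and B-factors are all hypothetical.

```python
import math


def rigidity_index(coords, eta=3.0, kappa=2.0):
    """Per-atom rigidity: sum of exp(-(r/eta)**kappa) over all other atoms.

    Assumed illustrative kernel; eta (length scale, in the same units as the
    coordinates) and kappa (decay power) are hypothetical parameters.
    """
    index = []
    for i, ri in enumerate(coords):
        total = 0.0
        for j, rj in enumerate(coords):
            if i != j:
                total += math.exp(-((math.dist(ri, rj) / eta) ** kappa))
        index.append(total)
    return index


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Toy chain of five atoms spaced 3.8 units apart along the x axis.
coords = [(3.8 * i, 0.0, 0.0) for i in range(5)]
rigidity = rigidity_index(coords)

# Flexibility is taken here as the inverse of rigidity (an assumption).
flexibility = [1.0 / r for r in rigidity]

# Made-up "experimental" B-factors: chain ends more mobile than the middle.
b_factors = [30.0, 20.0, 15.0, 20.0, 30.0]
print(pearson(flexibility, b_factors))
```

In practice the predicted flexibility is usually fitted to the experimental B-factors by linear regression before the correlation is reported; that step is omitted here because the Pearson coefficient is unchanged by a positive linear rescaling.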
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tewary, Sunil K.; Liang, Lingfei; Lin, Zihan
Members of the Parvoviridae family all encode a non-structural protein 1 (NS1) that directs replication of single-stranded viral DNA, packages viral DNA into capsid, and serves as a potent transcriptional activator. Here we report the X-ray structure of the minute virus of mice (MVM) NS1 N-terminal domain at 1.45 Å resolution, showing that sites for dsDNA binding, ssDNA binding and cleavage, nuclear localization, and other functions are integrated on a canonical fold of the histidine-hydrophobic-histidine superfamily of nucleases, including elements specific for this Protoparvovirus but distinct from its Bocaparvovirus or Dependoparvovirus orthologs. High resolution structural analysis reveals a nickase active site with an architecture that allows highly versatile metal ligand binding. The structures support a unified mechanism of replication origin recognition for homotelomeric and heterotelomeric parvoviruses, mediated by a basic-residue-rich hairpin and an adjacent helix in the initiator proteins and by tandem tetranucleotide motifs in the replication origins. - Highlights: • The structure of a parvovirus replication initiator protein has been determined; • The structure sheds light on mechanisms of ssDNA binding and cleavage; • The nickase active site is preconfigured for versatile metal ligand binding; • The binding site for the double-stranded replication origin DNA is identified; • A single domain integrates multiple functions in virus replication.
Cao, Renzhi; Bhattacharya, Debswapna; Adhikari, Badri; Li, Jilong; Cheng, Jianlin
2016-09-01
Model evaluation and selection is an important step and a big challenge in template-based protein structure prediction. Individual model quality assessment methods designed for recognizing some specific properties of protein structures often fail to consistently select good models from a model pool because of their limitations. Therefore, combining multiple complementary quality assessment methods is useful for improving model ranking and consequently tertiary structure prediction. Here, we report the performance and analysis of our human tertiary structure predictor (MULTICOM) based on the massive integration of 14 diverse complementary quality assessment methods that was successfully benchmarked in the 11th Critical Assessment of Techniques of Protein Structure prediction (CASP11). The predictions of MULTICOM for 39 template-based domains were rigorously assessed by six scoring metrics covering global topology of Cα trace, local all-atom fitness, side chain quality, and physical reasonableness of the model. The results show that the massive integration of complementary, diverse single-model and multi-model quality assessment methods can effectively leverage the strength of single-model methods in distinguishing quality variation among similar good models and the advantage of multi-model quality assessment methods of identifying reasonable average-quality models. The overall excellent performance of the MULTICOM predictor demonstrates that integrating a large number of model quality assessment methods in conjunction with model clustering is a useful approach to improve the accuracy, diversity, and consequently robustness of template-based protein structure prediction. Proteins 2016; 84(Suppl 1):247-259. © 2015 Wiley Periodicals, Inc.
Critical Features of Fragment Libraries for Protein Structure Prediction
Trevizani, Raphael; Custódio, Fábio Lima; dos Santos, Karina Baptista; Dardenne, Laurent Emmanuel
2017-01-01
The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. In particular, we analyze the effect of using secondary structure prediction to guide fragment selection, of different fragment sizes, and of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than longer ones. Libraries composed of multiple fragment lengths generate even better structures, with longer fragments proving more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction. PMID:28085928
Shah, Khyati H; Nostramo, Regina; Zhang, Bo; Varia, Sapna N; Klett, Bethany M; Herman, Paul K
2014-12-01
The cytoplasm of the eukaryotic cell is subdivided into distinct functional domains by the presence of a variety of membrane-bound organelles. The remaining aqueous space may be further partitioned by the regulated assembly of discrete ribonucleoprotein (RNP) complexes that contain particular proteins and messenger RNAs. These RNP granules are conserved structures whose importance is highlighted by studies linking them to human disorders like amyotrophic lateral sclerosis. However, relatively little is known about the diversity, composition, and physiological roles of these cytoplasmic structures. To begin to address these issues, we examined the cytoplasmic granules formed by a key set of signaling molecules, the protein kinases of the budding yeast Saccharomyces cerevisiae. Interestingly, a significant fraction of these proteins, almost 20%, was recruited to cytoplasmic foci specifically as cells entered into the G0-like quiescent state, stationary phase. Colocalization studies demonstrated that these foci corresponded to eight different granules, including four that had not been reported previously. All of these granules were found to rapidly disassemble upon the resumption of growth, and the presence of each was correlated with cell viability in the quiescent cultures. Finally, this work also identified new constituents of known RNP granules, including the well-characterized processing body and stress granule. The composition of these latter structures is therefore more varied than previously thought and could be an indicator of additional biological activities being associated with these complexes. Altogether, these observations indicate that quiescent yeast cells contain multiple distinct cytoplasmic granules that may make important contributions to their long-term survival. Copyright © 2014 by the Genetics Society of America.
Template based protein structure modeling by global optimization in CASP11.
Joo, Keehyoung; Joung, InSuk; Lee, Sun Young; Kim, Jong Yun; Cheng, Qianyi; Manavalan, Balachandran; Joung, Jong Young; Heo, Seungryong; Lee, Juyong; Nam, Mikyung; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung
2016-09-01
For the template-based modeling (TBM) of CASP11 targets, we have developed three new protein modeling protocols (nns for server prediction and LEE and LEER for human prediction) by improving upon our previous CASP protocols (CASP7 through CASP10). We applied the powerful global optimization method of conformational space annealing to three stages of optimization, including multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain remodeling. For more successful fold recognition, a new alignment method called CRFalign was developed. It can incorporate sensitive positional and environmental dependence in alignment scores as well as strong nonlinear correlations among various features. Modifications and adjustments were made to the form of the energy function and weight parameters pertaining to the chain building procedure. For the side-chain remodeling step, residue-type dependence was introduced to the cutoff value that determines the entry of a rotamer to the side-chain modeling library. The improved performance of the nns server method is attributed to successful fold recognition achieved by combining several methods including CRFalign and to the current modeling formulation that can incorporate native-like structural aspects present in multiple templates. The LEE protocol is identical to the nns one except that CASP11-released server models are used as templates. The success of LEE in utilizing CASP11 server models indicates that proper template screening and template clustering assisted by appropriate cluster ranking promises a new direction to enhance protein 3D modeling. Proteins 2016; 84(Suppl 1):221-232. © 2015 Wiley Periodicals, Inc.
Alvarez, Sophie; Roy Choudhury, Swarup; Hicks, Leslie M; Pandey, Sona
2013-03-01
Abscisic acid (ABA) is proposed to be perceived by multiple receptors in plants. We have previously reported on the role of two GPCR-type G-proteins (GTG proteins) as plasma membrane-localized ABA receptors in Arabidopsis thaliana. However, due to the presence of multiple transmembrane domains, detailed structural and biochemical characterization of GTG proteins remains limited. Since ABA induces substantial changes in the proteome of plants, a labeling LC-based quantitative proteomics approach was applied to elucidate the global effects and possible downstream targets of GTG1/GTG2 proteins. Quantitative differences in protein abundance between wild-type and gtg1gtg2 were analyzed for evaluation of the effect of ABA on the root proteome and its dependence on the presence of functional GTG1/GTG2 proteins. The results presented in this study reveal the most comprehensive ABA-responsive root proteome reported to date in Arabidopsis. Notably, the majority of ABA-responsive proteins required the presence of GTG proteins, supporting their key role in ABA signaling. These observations were further confirmed by additional experiments. Overall, comparison of the ABA-dependent protein abundance changes in wild-type versus gtg1gtg2 provides clues to their possible links with some of the well-established effectors of the ABA signaling pathways and their role in mediating phytohormone cross-talk.
Structural mechanism of JH delivery in hemolymph by JHBP of silkworm, Bombyx mori
Suzuki, Rintaro; Fujimoto, Zui; Shiotsuki, Takahiro; Tsuchiya, Wataru; Momma, Mitsuru; Tase, Akira; Miyazawa, Mitsuhiro; Yamazaki, Toshimasa
2011-01-01
Juvenile hormone (JH) plays crucial roles in many aspects of insect life. All JH actions are initiated by transport of JH in the hemolymph, as a complex with JH-binding protein (JHBP), to target tissues. Here, we report the structural mechanism of JH delivery by JHBP based upon the crystal and solution structures of apo and JH-bound JHBP. In solution, apo-JHBP exists in an equilibrium of multiple conformations with different orientations of the gate helix for the hormone-binding pocket, ranging from closed to open forms. JH binding to the gate-open form results in the fully closed JHBP-JH complex structure, where the bound JH is completely buried inside the protein. JH-bound JHBP opens the gate helix to release the bound hormone, likely by sensing the less polar environment at the membrane surface of target cells. This is the first report that provides structural insight into JH signaling. PMID:22355650
Ichikawa, Muneyoshi; Liu, Dinan; Kastritis, Panagiotis L.; Basu, Kaustuv; Hsu, Tzu Chin; Yang, Shunkai; Bui, Khanh Huy
2017-01-01
Cilia are ubiquitous, hair-like appendages found in eukaryotic cells that carry out functions of cell motility and sensory reception. Cilia contain an intriguing cytoskeletal structure, termed the axoneme that consists of nine doublet microtubules radially interlinked and longitudinally organized in multiple specific repeat units. Little is known, however, about how the axoneme allows cilia to be both actively bendable and sturdy or how it is assembled. To answer these questions, we used cryo-electron microscopy to structurally analyse several of the repeating units of the doublet at sub-nanometre resolution. This structural detail enables us to unambiguously assign α- and β-tubulins in the doublet microtubule lattice. Our study demonstrates the existence of an inner sheath composed of different kinds of microtubule inner proteins inside the doublet that likely stabilizes the structure and facilitates the specific building of the B-tubule. PMID:28462916
Wang, Ludi; Clarke, Lisa A; Eason, Russell J; Parker, Christopher C; Qi, Baoxiu; Scott, Rod J; Doughty, James
2017-01-01
The establishment of pollen-pistil compatibility is strictly regulated by factors derived from both male and female reproductive structures. Highly diverse small cysteine-rich proteins (CRPs) have been found to play multiple roles in plant reproduction, including the earliest stages of the pollen-stigma interaction. Secreted CRPs found in the pollen coat of members of the Brassicaceae, the pollen coat proteins (PCPs), are emerging as important signalling molecules that regulate the pollen-stigma interaction. Using a combination of protein characterization, expression and phylogenetic analyses we identified a novel class of Arabidopsis thaliana pollen-borne CRPs, the PCP-Bs (for pollen coat protein B-class) that are related to embryo surrounding factor (ESF1) developmental regulators. Single and multiple PCP-B mutant lines were utilized in bioassays to assess effects on pollen hydration, adhesion and pollen tube growth. Our results revealed that pollen hydration is severely impaired when multiple PCP-Bs are lost from the pollen coat. The hydration defect also resulted in reduced pollen adhesion and delayed pollen tube growth in all mutants studied. These results demonstrate that AtPCP-Bs are key regulators of the hydration 'checkpoint' in establishment of pollen-stigma compatibility. In addition, we propose that interspecies diversity of PCP-Bs may contribute to reproductive barriers in the Brassicaceae. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Probing protein flexibility reveals a mechanism for selective promiscuity
Pabon, Nicolas A; Camacho, Carlos J
2017-01-01
Many eukaryotic regulatory proteins adopt distinct bound and unbound conformations, and use this structural flexibility to bind specifically to multiple partners. However, we lack an understanding of how an interface can select some ligands, but not others. Here, we present a molecular dynamics approach to identify and quantitatively evaluate the interactions responsible for this selective promiscuity. We apply this approach to the anticancer target PD-1 and its ligands PD-L1 and PD-L2. We discover that while unbound PD-1 exhibits a hard-to-drug hydrophilic interface, conserved specific triggers encoded in the cognate ligands activate a promiscuous binding pathway that reveals a flexible hydrophobic binding cavity. Specificity is then established by additional contacts that stabilize the PD-1 cavity into distinct bound-like modes. Collectively, our studies provide insight into the structural basis and evolution of multiple binding partners, and also suggest a biophysical approach to exploit innate binding pathways to drug seemingly undruggable targets. DOI: http://dx.doi.org/10.7554/eLife.22889.001 PMID:28432789
Structural and functional analyses of genes encoding VQ proteins in apple.
Dong, Qinglong; Zhao, Shuang; Duan, Dingyue; Tian, Yi; Wang, Yanpeng; Mao, Ke; Zhou, Zongshan; Ma, Fengwang
2018-07-01
Recent studies with Arabidopsis and soybean have shown that a class of valine-glutamine (VQ) motif-containing proteins interacts with some WRKY transcription factors. However, little is known about the evolution, structures, and functions of those proteins in apple. Here, we examined their features and identified 49 apple VQ genes. Our evolutionary analysis revealed that the proteins could be clustered into nine groups together with their homologues in 33 species. Historically, the main characteristics of proteins in Groups I, V, VI, VII, IX, and X were thought to have been generated before the monocot-dicot split, whereas those in Groups II, III + IV, and VIII were generated after that split. In the structural analysis, apple MdVQ proteins appeared to bind only with Group I and IIc MdWRKY proteins. Meanwhile, MdVQ1, MdVQ10, MdVQ15, and MdVQ36 interacted with multiple MdVQ proteins to form heterodimers but MdVQ15 formed a homodimer. The functional analysis indicated that overexpression of some apple MdVQs in Arabidopsis and tobacco plants affected their vegetative and reproductive growth. These results provide important information about the characteristics of apple MdVQ genes and can serve as a solid foundation for further studies about the role of WRKY-VQ interactions in regulating apple developmental and defense mechanisms. Copyright © 2018 Elsevier B.V. All rights reserved.
Olivier-Mason, Anique; Wojtyniak, Martin; Bowie, Rachel V; Nechipurenko, Inna V; Blacque, Oliver E; Sengupta, Piali
2013-04-01
The structure and function of primary cilia are critically dependent on intracellular trafficking pathways that transport ciliary membrane and protein components. The mechanisms by which these trafficking pathways are regulated are not fully characterized. Here we identify the transmembrane protein OSTA-1 as a new regulator of the trafficking pathways that shape the morphology and protein composition of sensory cilia in C. elegans. osta-1 encodes an organic solute transporter alpha-like protein, mammalian homologs of which have been implicated in membrane trafficking and solute transport, although a role in regulating cilia structure has not previously been demonstrated. We show that mutations in osta-1 result in altered ciliary membrane volume, branch length and complexity, as well as defects in localization of a subset of ciliary transmembrane proteins in different sensory cilia types. OSTA-1 is associated with transport vesicles, localizes to a ciliary compartment shown to house trafficking proteins, and regulates both retrograde and anterograde flux of the endosome-associated RAB-5 small GTPase. Genetic epistasis experiments with sensory signaling, exocytic and endocytic proteins further implicate OSTA-1 as a crucial regulator of ciliary architecture via regulation of cilia-destined trafficking. Our findings suggest that regulation of transport pathways in a cell type-specific manner contributes to diversity in sensory cilia structure and might allow dynamic remodeling of ciliary architecture via multiple inputs.
Johnson, Jennifer L; Entzminger, Kevin C; Hyun, Jeongmin; Kalyoncu, Sibel; Heaner, David P; Morales, Ivan A; Sheppard, Aly; Gumbart, James C; Maynard, Jennifer A; Lieberman, Raquel L
2015-04-01
Crystallization chaperones are attracting increasing interest as a route to crystal growth and structure elucidation of difficult targets such as membrane proteins. While strategies to date have typically employed protein-specific chaperones, a peptide-specific chaperone to crystallize multiple cognate peptide epitope-containing client proteins is envisioned. This would eliminate the target-specific chaperone-production step and streamline the co-crystallization process. Previously, protein engineering and directed evolution were used to generate a single-chain variable (scFv) antibody fragment with affinity for the peptide sequence EYMPME (scFv/EE). This report details the conversion of scFv/EE to an anti-EE Fab format (Fab/EE) followed by its biophysical characterization. The addition of constant chains increased the overall stability and had a negligible impact on the antigen affinity. The 2.0 Å resolution crystal structure of Fab/EE reveals contacts with larger surface areas than those of scFv/EE. Surface plasmon resonance, an enzyme-linked immunosorbent assay, and size-exclusion chromatography were used to assess Fab/EE binding to EE-tagged soluble and membrane test proteins: namely, the β-barrel outer membrane protein intimin and α-helical A2a G protein-coupled receptor (A2aR). Molecular-dynamics simulation of the intimin constructs with and without Fab/EE provides insight into the energetic complexities of the co-crystallization approach.
Loquet, Antoine; Tolchard, James; Berbon, Melanie; Martinez, Denis; Habenstein, Birgit
2017-09-17
Supramolecular protein assemblies play fundamental roles in biological processes ranging from host-pathogen interaction and viral infection to the propagation of neurodegenerative disorders. Such assemblies consist of multiple protein subunits organized in a non-covalent way to form large macromolecular objects that can execute a variety of cellular functions or cause detrimental consequences. Atomic insights into the assembly mechanisms and the functioning of those macromolecular assemblies often remain scarce, since their inherent insolubility and non-crystallinity often drastically reduce the quality of the data obtained from most techniques used in structural biology, such as X-ray crystallography and solution Nuclear Magnetic Resonance (NMR). We here present magic-angle spinning solid-state NMR spectroscopy (SSNMR) as a powerful method to investigate structures of macromolecular assemblies at atomic resolution. SSNMR can reveal atomic details on the assembled complex without size and solubility limitations. The protocol presented here describes the essential steps from the production of 13C/15N isotope-labeled macromolecular protein assemblies to the acquisition of standard SSNMR spectra and their analysis and interpretation. As an example, we show the pipeline of a SSNMR structural analysis of a filamentous protein assembly.
Manik, Mohammad Kawsar; Yang, Huiseon; Tong, Junsen; Im, Young Jun
2017-04-04
Yeast Osh1 belongs to the oxysterol-binding protein (OSBP) family of proteins and contains multiple targeting modules optimized for lipid transport at the nucleus-vacuole junction (NVJ). The key determinants for NVJ targeting and the role of Osh1 at NVJs have remained elusive because of unknown lipid specificities. In this study, we determined the structures of the ankyrin repeat domain (ANK), and OSBP-related domain (ORD) of Osh1, in complex with Nvj1 and ergosterol, respectively. The Osh1 ANK forms a unique bi-lobed structure that recognizes a cytosolic helical segment of Nvj1. We discovered that Osh1 ORD binds ergosterol and phosphatidylinositol 4-phosphate PI(4)P in a competitive manner, suggesting counter-transport function of the two lipids. Ergosterol is bound to the hydrophobic pocket in a head-down orientation, and the structure of the PI(4)P-binding site in Osh1 is well conserved. Our results suggest that Osh1 performs non-vesicular transport of ergosterol and PI(4)P at the NVJ. Copyright © 2017 Elsevier Ltd. All rights reserved.
Vassall, Kenrick A; Bamm, Vladimir V; Harauz, George
2015-11-15
The classic isoforms of myelin basic protein (MBP, 14-21.5 kDa) are essential to formation of the multilamellar myelin sheath of the mammalian central nervous system (CNS). The predominant 18.5-kDa isoform links together the cytosolic surfaces of oligodendrocytes, but additionally participates in cytoskeletal turnover and membrane extension, Fyn-mediated signalling pathways, sequestration of phosphoinositides and maintenance of calcium homoeostasis. All MBP isoforms are intrinsically disordered proteins (IDPs) that interact via molecular recognition fragments (MoRFs), which thereby undergo local disorder-to-order transitions. Their conformations and associations are modulated by environment and by a dynamic barcode of post-translational modifications, particularly phosphorylation by mitogen-activated and other protein kinases and deimination [a hallmark of demyelination in multiple sclerosis (MS)]. The MBPs are thus to myelin what basic histones are to chromatin. Originally thought to be merely structural proteins forming an inert spool, histones are now known to be dynamic entities involved in epigenetic regulation and diseases such as cancer. Analogously, the MBPs are not mere adhesives of compact myelin, but active participants in oligodendrocyte proliferation and in membrane process extension and stabilization during myelinogenesis. A central segment of these proteins is pivotal in membrane-anchoring and SH3 domain (Src homology 3) interaction. We discuss in the present review advances in our understanding of conformational conversions of this classic basic protein upon membrane association, including new thermodynamic analyses of transitions into different structural ensembles and how a shift in the pattern of its post-translational modifications is associated with the pathogenesis and potentially onset of demyelination in MS. © 2015 Authors; published by Portland Press Limited.
Protein intrinsic disorder in plants.
Pazos, Florencio; Pietrosemoli, Natalia; García-Martín, Juan A; Solano, Roberto
2013-09-12
To some extent contradicting the classical paradigm of the relationship between protein 3D structure and function, now it is clear that large portions of the proteomes, especially in higher organisms, lack a fixed structure and still perform very important functions. Proteins completely or partially unstructured in their native (functional) form are involved in key cellular processes underlain by complex networks of protein interactions. The intrinsic conformational flexibility of these disordered proteins allows them to bind multiple partners in transient interactions of high specificity and low affinity. In concordance, in plants this type of proteins has been found in processes requiring these complex and versatile interaction networks. These include transcription factor networks, where disordered proteins act as integrators of different signals or link different transcription factor subnetworks due to their ability to interact (in many cases simultaneously) with different partners. Similarly, they also serve as signal integrators in signaling cascades, such as those related to response to external stimuli. Disordered proteins have also been found in plants in many stress-response processes, acting as protein chaperones or protecting other cellular components and structures. In plants, it is especially important to have complex and versatile networks able to quickly and efficiently respond to changing environmental conditions since these organisms cannot escape and have no other choice than adapting to them. Consequently, protein disorder can play an especially important role in plants, providing them with a fast mechanism to obtain complex, interconnected and versatile molecular networks. PMID:24062761
Peterson, Lenna X; Shin, Woong-Hee; Kim, Hyungrae; Kihara, Daisuke
2018-03-01
We report our group's performance for protein-protein complex structure prediction and scoring in Round 37 of the Critical Assessment of PRediction of Interactions (CAPRI), an objective assessment of protein-protein complex modeling. We demonstrated noticeable improvement in both prediction and scoring compared to previous rounds of CAPRI, with our human predictor group near the top of the rankings and our server scorer group at the top. This is the first time in CAPRI that a server has been the top scorer group. To predict protein-protein complex structures, we used both multi-chain template-based modeling (TBM) and our protein-protein docking program, LZerD. LZerD represents protein surfaces using 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. Because 3DZD are a soft representation of the protein surface, LZerD is tolerant to small conformational changes, making it well suited to docking unbound and TBM structures. The key to our improved performance in CAPRI Round 37 was to combine multi-chain TBM and docking. As opposed to our previous strategy of performing docking for all target complexes, we used TBM when multi-chain templates were available and docking otherwise. We also describe the combination of multiple scoring functions used by our server scorer group, which achieved the top rank for the scorer phase. © 2017 Wiley Periodicals, Inc.
Cao, Renzhi; Bhattacharya, Debswapna; Adhikari, Badri; Li, Jilong; Cheng, Jianlin
2015-01-01
Model evaluation and selection is an important step and a big challenge in template-based protein structure prediction. Individual model quality assessment methods designed for recognizing some specific properties of protein structures often fail to consistently select good models from a model pool because of their limitations. Therefore, combining multiple complementary quality assessment methods is useful for improving model ranking and consequently tertiary structure prediction. Here, we report the performance and analysis of our human tertiary structure predictor (MULTICOM) based on the massive integration of 14 diverse complementary quality assessment methods that was successfully benchmarked in the 11th Critical Assessment of Techniques for Protein Structure Prediction (CASP11). The predictions of MULTICOM for 39 template-based domains were rigorously assessed by six scoring metrics covering global topology of Cα trace, local all-atom fitness, side chain quality, and physical reasonableness of the model. The results show that the massive integration of complementary, diverse single-model and multi-model quality assessment methods can effectively leverage the strength of single-model methods in distinguishing quality variation among similar good models and the advantage of multi-model quality assessment methods of identifying reasonable average-quality models. The overall excellent performance of the MULTICOM predictor demonstrates that integrating a large number of model quality assessment methods in conjunction with model clustering is a useful approach to improve the accuracy, diversity, and consequently robustness of template-based protein structure prediction. PMID:26369671
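As a rough illustration of how scores from several quality assessment methods might be merged into one ranking, the sketch below averages each model's rank across methods. The function name, the method and model labels, and plain rank averaging are all assumptions made for illustration; the abstract does not say how MULTICOM actually integrates its 14 methods.

```python
from statistics import mean

def consensus_rank(scores_by_method):
    """Order models by their mean rank across methods (higher score = better).

    scores_by_method maps a method name to a {model_name: score} dict.
    This is a hypothetical helper, not the MULTICOM procedure.
    """
    models = sorted(next(iter(scores_by_method.values())))
    ranks = {model: [] for model in models}
    for scores in scores_by_method.values():
        ordered = sorted(models, key=lambda m: scores[m], reverse=True)
        for position, model in enumerate(ordered, start=1):
            ranks[model].append(position)
    # A lower mean rank means the model is preferred by more methods.
    return sorted(models, key=lambda m: mean(ranks[m]))

# Made-up scores for three models from three hypothetical methods.
example = {
    "qa_method_a": {"model1": 0.62, "model2": 0.71, "model3": 0.40},
    "qa_method_b": {"model1": 0.58, "model2": 0.66, "model3": 0.61},
    "qa_method_c": {"model1": 0.70, "model2": 0.69, "model3": 0.35},
}
ranking = consensus_rank(example)
```

Averaging ranks rather than raw scores avoids having to put differently scaled methods on a common scale, at the cost of discarding how large the score gaps are.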
Aghera, Nilesh; Udgaonkar, Jayant B
2012-07-13
Determining whether or not a protein uses multiple pathways to fold is an important goal in protein folding studies. When multiple pathways are present, defined by transition states that differ in their compactness and structure but not significantly in energy, they may manifest themselves by causing the dependence on denaturant concentration of the logarithm of the observed rate constant of folding to have an upward curvature. In this study, the folding mechanism of heterodimeric monellin [double-chain monellin (dcMN)] has been studied over a range of protein and guanidine hydrochloride (GdnHCl) concentrations, using the intrinsic tryptophan fluorescence of the protein as the probe for the folding reaction. Refolding is shown to occur in multiple kinetic phases. In the first stage of refolding, which is silent to any change in intrinsic fluorescence, the two chains of monellin bind to one another to form an encounter complex. Interrupted folding experiments show that the initial encounter complex folds to native dcMN via two folding routes. A productive folding intermediate population is identified on one route but not on both of these routes. Two intermediate subpopulations appear to form in a fast kinetic phase, and native dcMN forms in a slow kinetic phase. The chevron arms for both the fast and slow phases of refolding are shown to have upward curvatures, suggesting that at least two pathways each defined by a different intermediate are operational during these kinetic phases of structure formation. Refolding switches from one pathway to the other as the GdnHCl concentration is increased. Copyright © 2012 Elsevier Ltd. All rights reserved.
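The link between parallel pathways and an upward-curving chevron arm can be sketched numerically. The example assumes that the observed refolding rate constant is the sum of the rate constants of the parallel pathways, each decaying exponentially with denaturant concentration; the function name and all parameter values are hypothetical and are not taken from the study.

```python
import math

def ln_k_obs(denaturant, pathways):
    """Natural log of the observed rate constant for parallel pathways.

    pathways is a list of (k0, m) pairs, with k_i = k0 * exp(-m * [D]).
    Assumes rate constants of parallel pathways simply add.
    """
    return math.log(sum(k0 * math.exp(-m * denaturant) for k0, m in pathways))

def second_differences(values):
    """Discrete curvature of equally spaced samples."""
    return [values[i - 1] - 2 * values[i] + values[i + 1]
            for i in range(1, len(values) - 1)]

concentrations = [0.0, 0.5, 1.0, 1.5, 2.0]  # hypothetical GdnHCl values (M)

# Two pathways with different denaturant dependences (made-up numbers).
two_paths = [ln_k_obs(c, [(100.0, 4.0), (5.0, 1.0)]) for c in concentrations]
# A single pathway gives a straight chevron arm.
one_path = [ln_k_obs(c, [(100.0, 4.0)]) for c in concentrations]

curvature_two = second_differences(two_paths)
curvature_one = second_differences(one_path)
```

With two pathways every second difference is positive, so the arm curves upward as the dominant pathway changes with denaturant concentration; with one pathway the arm is linear.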
Assenberg, R; Delmas, O; Morin, B; Graham, S C; De Lamballerie, X; Laubert, C; Coutard, B; Grimes, J M; Neyts, J; Owens, R J; Brandt, B W; Gorbalenya, A; Tucker, P; Stuart, D I; Canard, B; Bourhy, H
2010-08-01
Some mammalian rhabdoviruses may infect humans, and also infect invertebrates, dogs, and bats, which may act as vectors transmitting viruses among different host species. The VIZIER programme, an EU-funded FP6 program, has characterized viruses that belong to the Vesiculovirus, Ephemerovirus and Lyssavirus genera of the Rhabdoviridae family to perform ground-breaking research on the identification of potential new drug targets against these RNA viruses through comprehensive structural characterization of the replicative machinery. The VIZIER programme contributed in several ways. First, it contributed substantially to research aimed at understanding the origin, evolution and diversity of rhabdoviruses. This diversity was then used to obtain further structural information on the proteins involved in replication. Two strategies were used to produce recombinant proteins by expression of full-length or domain constructs in either E. coli or insect cells, using the baculovirus system. In both cases, parallel cloning and small-scale expression screening of multiple constructs based on different viruses, including the addition of fusion tags, was key to the rapid generation of expression data. As a result, some progress has been made in the VIZIER programme towards dissecting the multi-functional L protein into components suitable for structural and functional studies. However, the phosphoprotein polymerase co-factor and the structural matrix protein, which play a number of roles during viral replication and drive viral assembly, have both proved much more amenable to structural biology. Applying the multi-construct/multi-virus approach central to protein production processes in VIZIER has yielded new structural information which may ultimately be exploitable in the derivation of novel ways of intervening in viral replication. Copyright 2010 Elsevier B.V. All rights reserved.
Probing structures of large protein complexes using zero-length cross-linking.
Rivera-Santiago, Roland F; Sriswasdi, Sira; Harper, Sandra L; Speicher, David W
2015-11-01
Structural mass spectrometry (MS) is a field with growing applicability for addressing complex biophysical questions regarding proteins and protein complexes. One of the major structural MS approaches involves the use of chemical cross-linking coupled with MS analysis (CX-MS) to identify proximal sites within macromolecules. Identified cross-linked sites can be used to probe novel protein-protein interactions or the derived distance constraints can be used to verify and refine molecular models. This review focuses on recent advances of "zero-length" cross-linking. Zero-length cross-linking reagents do not add any atoms to the cross-linked species due to the lack of a spacer arm. This provides a major advantage in the form of providing more precise distance constraints as the cross-linkable groups must be within salt bridge distances in order to react. However, identification of cross-linked peptides using these reagents presents unique challenges. We discuss recent efforts by our group to minimize these challenges by using multiple cycles of LC-MS/MS analysis and software specifically developed and optimized for identification of zero-length cross-linked peptides. Representative data utilizing our current protocol are presented and discussed. Copyright © 2015 Elsevier Inc. All rights reserved.
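Using cross-link-derived distance constraints to verify a molecular model can be sketched as a simple distance check. The function name, the residue labels, the coordinates and the cutoff value below are all assumptions chosen for illustration; the abstract gives no numerical cutoff and this is not the authors' software.

```python
import math

def satisfied_crosslinks(coords, crosslinks, max_distance):
    """Return the cross-links whose two sites lie within max_distance (Å).

    coords maps a residue label to an (x, y, z) position in the model;
    crosslinks is a list of (label_a, label_b) pairs.
    """
    return [(a, b) for a, b in crosslinks
            if math.dist(coords[a], coords[b]) <= max_distance]

# Hypothetical model coordinates in Å.
coords = {
    "A:K10": (0.0, 0.0, 0.0),
    "A:E45": (3.0, 4.0, 0.0),
    "B:D12": (20.0, 0.0, 0.0),
}
crosslinks = [("A:K10", "A:E45"), ("A:K10", "B:D12")]

# The cutoff is an illustrative placeholder, not a value from the review.
consistent = satisfied_crosslinks(coords, crosslinks, max_distance=12.0)
```

Cross-links that fail the check point to regions where the model may need refinement, or to an interaction the model does not yet include.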
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dyer, K.D.; Handen, J.S.; Rosenberg, H.F.
The Charcot-Leyden crystal (CLC) protein, or eosinophil lysophospholipase, is a characteristic protein of human eosinophils and basophils; recent work has demonstrated that the CLC protein is both structurally and functionally related to the galectin family of β-galactoside binding proteins. The galectins as a group share a number of features in common, including a linear ligand binding site encoded on a single exon. In this work, we demonstrate that the intron-exon structure of the gene encoding CLC is analogous to those encoding the galectins. The coding sequence of the CLC gene is divided into four exons, with the entire β-galactoside binding site encoded by exon III. We have isolated CLC β-galactoside binding sites from both orangutan (Pongo pygmaeus) and murine (Mus musculus) genomic DNAs, both encoded on single exons, and noted conservation of the amino acids shown to interact directly with the β-galactoside ligand. The most likely interpretation of these results suggests the occurrence of one or more exon duplication and insertion events, resulting in the distribution of this lectin domain to CLC as well as to the multiple galectin genes. 35 refs., 3 figs.
Neuregulin: First Steps Towards a Structure
NASA Technical Reports Server (NTRS)
Ferree, D. S.; Malone, C. C.; Karr, L. J.
2003-01-01
Neuregulins are growth factor domain proteins with diverse bioactivities, such as cell proliferation, receptor binding, and differentiation. Neuregulin-1 binds to two members of the ErbB class I tyrosine kinase receptors, ErbB3 and ErbB4. A number of human cancers overexpress the ErbB receptors, and neuregulin can modulate the growth of certain cancer types. Neuregulin-1 has been shown to promote the migration of invasive gliomas of the central nervous system. Neuregulin has also been implicated in schizophrenia, multiple sclerosis and abortive cardiac abnormalities. The full function of neuregulin-1 is not known. In this study we are inserting a cDNA clone obtained from American Type Culture Collection into E. coli expression vectors to express neuregulin-1 protein. Metal chelate affinity chromatography is used for recombinant protein purification. Crystallization screening will proceed for X-ray diffraction studies following expression, optimization, and protein purification. In spite of medical and scholarly interest in the neuregulins, there are currently no high-resolution structures available for these proteins. Here we present the first steps toward attaining a high-resolution structure of neuregulin-1, which will help enable us to better understand its function.
Masso, Majid; Vaisman, Iosif I
2014-01-01
The AUTO-MUTE 2.0 stand-alone software package includes a collection of programs for predicting functional changes to proteins upon single residue substitutions, developed by combining structure-based features with trained statistical learning models. Three of the predictors evaluate changes to protein stability upon mutation, each complementing a distinct experimental approach. Two additional classifiers are available, one for predicting activity changes due to residue replacements and the other for determining the disease potential of mutations associated with nonsynonymous single nucleotide polymorphisms (nsSNPs) in human proteins. These five command-line driven tools, as well as all the supporting programs, complement those that run our AUTO-MUTE web-based server. Nevertheless, all the codes have been rewritten and substantially altered for the new portable software, and they incorporate several new features based on user feedback. Included among these upgrades is the ability to perform three highly requested tasks: to run "big data" batch jobs; to generate predictions using modified protein data bank (PDB) structures, and unpublished personal models prepared using standard PDB file formatting; and to utilize NMR structure files that contain multiple models.
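Handling NMR structure files that contain multiple models comes down to splitting the file on the MODEL and ENDMDL records of the PDB format. The sketch below shows one way to do that; the function name is hypothetical and this is not AUTO-MUTE code.

```python
def split_models(pdb_text):
    """Group ATOM/HETATM lines by model serial number.

    Coordinate lines that appear outside any MODEL/ENDMDL pair are
    assigned to model 1, which covers ordinary single-model files.
    """
    models = {}
    current = None
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = int(line[6:].split()[0])
            models[current] = []
        elif record == "ENDMDL":
            current = None
        elif record in ("ATOM", "HETATM"):
            key = 1 if current is None else current
            models.setdefault(key, []).append(line)
    return models

# A tiny made-up two-model fragment in PDB column layout.
sample = "\n".join([
    "MODEL        1",
    "ATOM      1  CA  ALA A   1       0.000   0.000   0.000  1.00  0.00           C",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  CA  ALA A   1       0.500   0.100   0.000  1.00  0.00           C",
    "ENDMDL",
    "END",
])
models = split_models(sample)
```

Each model can then be passed separately to a predictor that expects a single structure.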
NASA Astrophysics Data System (ADS)
Miao, Xijiang; Mukhopadhyay, Rishi; Valafar, Homayoun
2008-10-01
Advances in NMR instrumentation and pulse sequence design have resulted in easier acquisition of Residual Dipolar Coupling (RDC) data. However, computational and theoretical analysis of this type of data has continued to challenge the international community of investigators because of their complexity and rich information content. Contemporary use of RDC data has required a-priori assignment, which significantly increases the overall cost of structural analysis. This article introduces a novel algorithm that utilizes unassigned RDC data acquired from multiple alignment media (nD-RDC, n ≥ 3) for simultaneous extraction of the relative order tensor matrices and reconstruction of the interacting vectors in space. Estimation of the relative order tensors and reconstruction of the interacting vectors can be invaluable in a number of endeavors. An example application has been presented where the reconstructed vectors have been used to quantify the fitness of a template protein structure to the unknown protein structure. This work has other important direct applications such as verification of the novelty of an unknown protein and validation of the accuracy of an available protein structure model in drug design. More importantly, the presented work has the potential to bridge the gap between experimental and computational methods of structure determination.
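For orientation, the forward relation between an order tensor and an RDC can be written in a few lines: the coupling for an internuclear unit vector v is a scaled quadratic form, D = D_max · vᵀ S v, with S a symmetric, traceless 3×3 matrix. The sketch below only evaluates this relation; it does not reproduce the article's algorithm for unassigned data, and the tensor elements and D_max value are hypothetical.

```python
def rdc(order_tensor, unit_vector, d_max):
    """Evaluate D = d_max * v^T S v for a 3x3 order tensor S and unit vector v."""
    return d_max * sum(
        unit_vector[i] * order_tensor[i][j] * unit_vector[j]
        for i in range(3) for j in range(3)
    )

# Hypothetical diagonal, traceless order tensor (principal alignment frame).
S = [
    [-2.0e-4, 0.0, 0.0],
    [0.0, -3.0e-4, 0.0],
    [0.0, 0.0, 5.0e-4],
]
D_MAX = 20000.0  # illustrative scaling constant in Hz, not a measured value

along_z = rdc(S, (0.0, 0.0, 1.0), D_MAX)
along_x = rdc(S, (1.0, 0.0, 0.0), D_MAX)
```

Because each alignment medium has its own order tensor, the same vector gives a different coupling in each medium, which is what makes data from several media informative about vector orientation.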
Interactions among tobacco sieve element occlusion (SEO) proteins.
Jekat, Stephan B; Ernst, Antonia M; Zielonka, Sascia; Noll, Gundula A; Prüfer, Dirk
2012-12-01
Angiosperms transport their photoassimilates through sieve tubes, which comprise longitudinally-connected sieve elements. In dicots and also some monocots, the sieve elements contain parietal structural proteins known as phloem proteins or P-proteins. Following injury, P-proteins disperse and accumulate as viscous plugs at the sieve plates to prevent the loss of valuable transport sugars. Tobacco (Nicotiana tabacum) P-proteins are multimeric complexes comprising subunits encoded by members of the SEO (sieve element occlusion) gene family. The existence of multiple subunits suggests that P-protein assembly involves interactions between SEO proteins, but this process is largely uncharacterized and it is unclear whether the different subunits perform unique roles or are redundant. We therefore extended our analysis of the tobacco P-proteins NtSEO1 and NtSEO2 to investigate potential interactions between them, and found that both proteins can form homomeric and heteromeric complexes in planta.
Ankyrin-repeat containing proteins of microbes: a conserved structure with functional diversity
Al-Khodor, Souhaila; Price, Christopher T.; Kalia, Awdhesh; Kwaik, Yousef Abu
2009-01-01
Summary The ankyrin repeat (ANK) is the most common protein-protein interaction motif in nature and predominantly found in eukaryotic proteins. The genome sequencing of various pathogenic or symbiotic bacteria and eukaryotic viruses identified numerous genes encoding ANK-containing proteins that were proposed to have been acquired from eukaryotes by horizontal gene transfer. However, the recent discovery of additional ANK-containing proteins encoded in the genomes of archaea and free-living bacteria suggests either a more ancient origin of the ANK motif or multiple convergent evolution events. Many bacterial pathogens employ various types of secretion systems to deliver ANK-containing proteins into eukaryotic cells where they mimic or manipulate various host functions. Understanding the molecular and biochemical functions of this family of proteins will enhance our understanding of important host-microbe interactions. PMID:19962898
Carvajal, Felipe; Vallejos, Maricarmen; Walters, Beth; Contreras, Nataly; Hertz, Marla I; Olivares, Eduardo; Cáceres, Carlos J; Pino, Karla; Letelier, Alejandro; Thompson, Sunnie R; López-Lastra, Marcelo
2016-07-01
The 5' leader of the HIV-1 genomic RNA is a multifunctional region that folds into secondary/tertiary structures that regulate multiple processes during viral replication including translation initiation. In this work, we examine the internal ribosome entry site (IRES) located in the 5' leader that drives translation initiation of the viral Gag protein under conditions that hinder cap-dependent translation initiation. We show that activity of the HIV-1 IRES relies on ribosomal protein S25 (eS25). Additionally, a mechanistic and mutational analysis revealed that the HIV-1 IRES is modular in nature and that once the 40S ribosomal subunit is recruited to the IRES, translation initiates without the need of ribosome scanning. These findings elucidate a mechanism of initiation by the HIV-1 IRES whereby a number of highly structured sites present within the HIV-1 5' leader leads to the recruitment of the 40S subunit directly at the site of initiation of protein synthesis. © 2016 Federation of European Biochemical Societies.
Structure and function of POTRA domains of Omp85/TPS superfamily.
Simmerman, Richard F; Dave, Ashita M; Bruce, Barry D
2014-01-01
The Omp85/TPS (outer-membrane protein of 85 kDa/two-partner secretion) superfamily is a ubiquitous and major class of β-barrel proteins. This superfamily is restricted to the outer membranes of gram-negative bacteria, mitochondria, and chloroplasts. The common architecture, with an N-terminus consisting of repeats of soluble polypeptide-transport-associated (POTRA) domains and a C-terminal β-barrel pore is highly conserved. The structures of multiple POTRA domains and one full-length TPS protein have been solved, yet discovering roles of individual POTRA domains has been difficult. This review focuses on similarities and differences between POTRA structures, emphasizing POTRA domains in autotrophic organisms including plants and cyanobacteria. Unique roles, specific for certain POTRA domains, are examined in the context of POTRA location with respect to their attachment to the β-barrel pore, and their degree of biological dispensability. Finally, because many POTRA domains may have the ability to interact with thousands of partner proteins, possible modes of these interactions are also explored. © 2014 Elsevier Inc. All rights reserved.
Prediction of pi-turns in proteins using PSI-BLAST profiles and secondary structure information.
Wang, Yan; Xue, Zhi-Dong; Shi, Xiao-Hong; Xu, Jin
2006-09-01
Due to the structural and functional importance of tight turns, some methods have been proposed to predict gamma-turns, beta-turns, and alpha-turns in proteins. In the past, studies of pi-turns were made, but not a single prediction approach has been developed so far. It will be useful to develop a method for identifying pi-turns in a protein sequence. In this paper, the support vector machine (SVM) method has been introduced to predict pi-turns from the amino acid sequence. The training and testing of this approach is performed with a newly collected data set of 640 non-homologous protein chains containing 1931 pi-turns. Different sequence encoding schemes have been explored in order to investigate their effects on the prediction performance. With multiple sequence alignment and predicted secondary structure, the final SVM model yields a Matthews correlation coefficient (MCC) of 0.556 by a 7-fold cross-validation. A web server implementing the prediction method is available at the following URL: http://210.42.106.80/piturn/.
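The Matthews correlation coefficient quoted above is computed from the four cells of a binary confusion matrix. A minimal sketch follows; the function name and the counts in the example are illustrative and are not taken from the paper.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from true/false positive and negative counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator

perfect = matthews_corrcoef(tp=10, tn=10, fp=0, fn=0)
chance = matthews_corrcoef(tp=5, tn=5, fp=5, fn=5)
inverted = matthews_corrcoef(tp=0, tn=0, fp=10, fn=10)
```

MCC ranges from -1 to 1 and stays informative when one class is much rarer than the other, which is the usual situation for per-residue turn prediction.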
Carvajal, Felipe; Vallejos, Maricarmen; Walters, Beth A.; Contreras, Nataly; Hertz, Marla I.; Olivares, Eduardo; Cáceres, C. Joaquín; Pino, Karla; Letelier, Alejandro; Thompson, Sunnie R.; López-Lastra, Marcelo
2016-01-01
The 5′ leader of the HIV-1 genomic RNA is a multifunctional region that folds into secondary/tertiary structures that regulate multiple processes during viral replication including translation initiation. In this work we examine the internal ribosome entry site (IRES) located in the 5′ leader that drives translation initiation of the viral Gag protein under conditions that hinder cap-dependent translation initiation. We show that activity of the HIV-1 IRES relies on ribosomal protein S25 (eS25). Additionally, a mechanistic and mutational analysis revealed that the HIV-1 IRES is modular in nature and that once the 40S ribosomal subunit is recruited to the IRES, translation initiates without the need of ribosome scanning. These findings elucidate a mechanism of initiation by the HIV-1 IRES whereby a number of highly structured sites present within the HIV-1 5′ leader leads to the recruitment of the 40S subunit directly at the site of initiation of protein synthesis. PMID:27191820
A Survey of Protein Structures from Archaeal Viruses
Dellas, Nikki; Lawrence, C. Martin; Young, Mark J.
2013-01-01
Viruses that infect the third domain of life, Archaea, are a newly emerging field of interest. To date, all characterized archaeal viruses infect archaea that thrive in extreme conditions, such as halophilic, hyperthermophilic, and methanogenic environments. Viruses in general, especially those replicating in extreme environments, contain highly mosaic genomes with open reading frames (ORFs) whose sequences are often dissimilar to all other known ORFs. It has been estimated that approximately 85% of virally encoded ORFs do not match known sequences in the nucleic acid databases, and this percentage is even higher for archaeal viruses (typically 90%–100%). This statistic suggests that either virus genomes represent a larger segment of sequence space and/or that viruses encode genes of novel fold and/or function. Because the overall three-dimensional fold of a protein evolves more slowly than its sequence, efforts have been geared toward structural characterization of proteins encoded by archaeal viruses in order to gain insight into their potential functions. In this short review, we provide multiple examples where structural characterization of archaeal viral proteins has indeed provided significant functional and evolutionary insight. PMID:25371334
Darboe, Numukunda; Kenjale, Roma; Picking, Wendy L; Picking, William D; Middaugh, C Russell
2006-03-01
Shigella and Salmonella use similar type III secretion systems for delivering effector proteins into host cells. This secretion system consists of a base anchored in both bacterial membranes and an extracellular "needle" that forms a rod-like structure exposed on the pathogen surface. The needle is composed of multiple subunits of a single protein and makes direct contact with host cells to facilitate protein delivery. The proteins that make up the needle of Shigella and Salmonella are MxiH and PrgI, respectively. These proteins are attractive vaccine candidates because of their essential role in virulence and surface exposure. We therefore isolated, purified, and characterized the monomeric forms of MxiH and PrgI. Their far-UV circular dichroism spectra show structural similarities with hints of subtle differences in their secondary structure. Both proteins are highly helical and thermally unstable, with PrgI having a midpoint of thermal unfolding (Tm) near 37 °C and MxiH having a value near 42 °C. The two proteins also have comparable intrinsic stabilities as measured by chemically induced (urea) unfolding. MxiH, however, with a free energy of unfolding (ΔG°0,un) of 1.6 kcal/mol, is slightly more stable than PrgI (1.2 kcal/mol). The relatively low m-values obtained for the urea-induced unfolding of the proteins suggest that they undergo only a small change in solvent-accessible surface area. This argues that when MxiH and PrgI are incorporated into the needle complex, they obtain a more stable structural state through the introduction of protein-protein interactions.
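The relation between a free energy of unfolding, an m-value and the unfolded fraction can be sketched with the standard two-state linear extrapolation model, ΔG([urea]) = ΔG° − m·[urea]. The 1.6 and 1.2 kcal/mol values come from the abstract; the m-value and the temperature below are assumptions, since the abstract reports neither, and the two-state model itself is assumed rather than confirmed by the source.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol·K)

def fraction_unfolded(urea, dg0, m_value, temperature=298.15):
    """Unfolded fraction for a two-state protein at a given urea concentration.

    dg0 is the unfolding free energy without denaturant (kcal/mol) and
    m_value its linear dependence on urea (kcal/(mol·M)).
    """
    dg = dg0 - m_value * urea
    return 1.0 / (1.0 + math.exp(dg / (R_KCAL * temperature)))

def midpoint(dg0, m_value):
    """Urea concentration at which half of the protein is unfolded."""
    return dg0 / m_value

M_VALUE = 0.8  # hypothetical m-value, chosen only for illustration

mxih_at_zero = fraction_unfolded(0.0, dg0=1.6, m_value=M_VALUE)
prgi_at_zero = fraction_unfolded(0.0, dg0=1.2, m_value=M_VALUE)
mxih_midpoint = midpoint(1.6, M_VALUE)
```

With free energies this small, a noticeable fraction of each monomer is unfolded even without urea, in line with the marginal stability described in the abstract.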
Wienk, Hans; Slootweg, Jack C.; Speerstra, Sietske; Kaptein, Robert; Boelens, Rolf; Folkers, Gert E.
2013-01-01
To maintain the integrity of the genome, multiple DNA repair systems exist to repair damaged DNA. Recognition of altered DNA, including bulky adducts, pyrimidine dimers and interstrand crosslinks (ICL), partially depends on proteins containing helix-hairpin-helix (HhH) domains. To understand how ICL is specifically recognized by the Fanconi anemia proteins FANCM and FAAP24, we determined the structure of the HhH domain of FAAP24. Although it resembles other HhH domains, the FAAP24 domain contains a canonical hairpin motif followed by a distorted motif. The HhH domain can bind various DNA substrates; using nuclear magnetic resonance titration experiments, we demonstrate that the canonical HhH motif is required for double-stranded DNA (dsDNA) binding, whereas the unstructured N-terminus can interact with single-stranded DNA. Both DNA binding surfaces are used for binding to ICL-like single/double-strand junction-containing DNA substrates. A structural model for FAAP24 bound to dsDNA has been made based on homology with the translesion polymerase iota. Site-directed mutagenesis, sequence conservation and charge distribution support the dsDNA-binding model. Analogous to other HhH domain-containing proteins, we suggest that multiple FAAP24 regions together contribute to binding to single/double-strand junctions, which could contribute to specificity in ICL DNA recognition. PMID:23661679
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mei, Yang; Glover, Karen; Su, Minfei
BECN1 (Beclin 1), a highly conserved eukaryotic protein, is a key regulator of autophagy, a cellular homeostasis pathway, and also participates in vacuolar protein sorting, endocytic trafficking, and apoptosis. BECN1 is important for embryonic development, the innate immune response, tumor suppression, and protection against neurodegenerative disorders, diabetes, and heart disease. BECN1 mediates autophagy as a core component of the class III phosphatidylinositol 3-kinase complexes. However, the exact mechanism by which it regulates the activity of these complexes, or mediates its other diverse functions is unclear. BECN1 interacts with several diverse protein partners, perhaps serving as a scaffold or interaction hub for autophagy. Based on extensive structural, biophysical and bioinformatics analyses, BECN1 consists of an intrinsically disordered region (IDR), which includes a BH3 homology domain (BH3D); a flexible helical domain (FHD); a coiled-coil domain (CCD); and a β-α-repeated autophagy-specific domain (BARAD). Each of these BECN1 domains mediates multiple diverse interactions that involve concomitant conformational changes. Thus, BECN1 conformational flexibility likely plays a key role in facilitating diverse protein interactions. Further, BECN1 conformation and interactions are also modulated by numerous post-translational modifications. A better structure-based understanding of the interplay between different BECN1 conformational and binding states, and the impact of post-translational modifications will be essential to elucidating the mechanism of its multiple biological roles.
Shahinyan, Grigor; Margaryan, Armine; Panosyan, Hovik; Trchounian, Armen
2017-05-02
Among the huge diversity of thermophilic bacteria, mainly bacilli have been reported as active thermostable lipase producers. Geothermal springs serve as the main source for isolation of thermostable lipase-producing bacilli. Thermostable lipolytic enzymes, functioning in harsh conditions, have promising applications in processing of organic chemicals, detergent formulation, synthesis of biosurfactants, pharmaceutical processing, etc. In order to study the distribution of lipase-producing thermophilic bacilli and their specific lipase protein primary structures, three lipase producers from different genera were isolated from mesothermal (27.5-70 °C) springs distributed on the territory of Armenia and Nagorno Karabakh. Based on phenotypic characteristics and 16S rRNA gene sequencing, the isolates were identified as Geobacillus sp., Bacillus licheniformis and Anoxibacillus flavithermus strains. The lipase genes of the isolates were sequenced using initially designed primer sets. Multiple alignments generated from primary structures of the lipase proteins and annotated lipase protein sequences, conserved regions analysis and amino acid composition have illustrated the similarity (98-99%) of the lipases with true lipases (family I) and GDSL esterase family (family II). A conserved sequence block that determines the thermostability has been identified in the multiple alignments of the lipase proteins. The results shed light on the distribution of lipase-producing bacilli in geothermal springs in Armenia and Nagorno Karabakh. Newly isolated bacilli strains could be a prospective source of thermostable lipases and their genes.
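Similarity figures such as those quoted above are typically derived from aligned sequences. As a hedged sketch, the function below computes percent identity over alignment columns in which neither sequence has a gap; this is only one of several conventions, the abstract does not state which measure was used, and the sequences in the example are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)

# Invented aligned fragments; not real lipase sequences.
identity = percent_identity("AHSMGG-LV", "AHSQGGALV")
```

Blocks of consecutive columns that are identical across all aligned sequences are the kind of conserved regions such an analysis would then single out.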
Multiple forms of statherin in human salivary secretions.
Jensen, J L; Lamkin, M S; Troxler, R F; Oppenheim, F G
1991-01-01
Sequential chromatography of hydroxyapatite-adsorbed salivary proteins from submandibular/sublingual secretions on Sephadex G-50 and reversed-phase HPLC resulted in the purification of statherin and several statherin variants. Amino acid analysis, Edman degradation and carboxypeptidase digestion of the obtained protein fractions led to the determination of the complete primary structures of statherin SV1, statherin SV2, and statherin SV3. SV1 is identical to statherin but lacks the carboxyl-terminal phenylalanine residue. SV2, lacking residues 6-15, is otherwise identical to statherin. SV3 is identical to SV2 but lacks the carboxyl-terminal phenylalanine. These results provide the first evidence for multiple forms of statherin which are probably derived both by post-translational modification and alternative splicing of the statherin gene.
Guenot, J.; Kollman, P. A.
1992-01-01
Although aqueous simulations with periodic boundary conditions more accurately describe protein dynamics than in vacuo simulations, these are computationally intensive for most proteins. Trp repressor dynamic simulations with a small water shell surrounding the starting model yield protein trajectories that are markedly improved over gas phase, yet computationally efficient. Explicit water in molecular dynamics simulations maintains surface exposure of protein hydrophilic atoms and burial of hydrophobic atoms by opposing the otherwise asymmetric protein-protein forces. This properly orients protein surface side chains, reduces protein fluctuations, and lowers the overall root mean square deviation from the crystal structure. For simulations with crystallographic waters only, a linear or sigmoidal distance-dependent dielectric yields a much better trajectory than does a constant dielectric model. As more water is added to the starting model, the differences between using distance-dependent and constant dielectric models become smaller, although the linear distance-dependent dielectric yields an average structure closer to the crystal structure than does a constant dielectric model. Multiplicative constants greater than one, for the linear distance-dependent dielectric simulations, produced trajectories that are progressively worse in describing trp repressor dynamics. Simulations of bovine pancreatic trypsin inhibitor were used to ensure that the trp repressor results were not protein dependent and to explore the effect of the nonbonded cutoff on the distance-dependent and constant dielectric simulation models. The nonbonded cutoff markedly affected the constant but not distance-dependent dielectric bovine pancreatic trypsin inhibitor simulations.
As with trp repressor, the distance-dependent dielectric model with a shell of water surrounding the protein produced a trajectory in better agreement with the crystal structure than a constant dielectric model, and the physical properties of the trajectory average structure, both with and without a nonbonded cutoff, were comparable. PMID:1304396
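The contrast between a constant and a linear distance-dependent dielectric can be illustrated with a minimal sketch of the pairwise Coulomb term. This is not the authors' simulation code. The function names, the default parameters, and the approximate conversion constant (about 332.06 kcal·Å/(mol·e²)) are assumptions chosen for illustration, and the sigmoidal form mentioned in the abstract is not shown.

```python
# Illustrative sketch only: pairwise Coulomb energy under two dielectric models.
# COULOMB_K is an approximate conversion constant (assumption), giving energies
# in kcal/mol for charges in units of e and distances in angstroms.
COULOMB_K = 332.06


def coulomb_constant_dielectric(qi, qj, r, eps=1.0):
    """Coulomb energy with a constant dielectric: E = k*qi*qj / (eps * r)."""
    return COULOMB_K * qi * qj / (eps * r)


def coulomb_linear_dielectric(qi, qj, r, c=1.0):
    """Coulomb energy with a linear distance-dependent dielectric.

    The dielectric is eps(r) = c * r, where c is the multiplicative
    constant, so the energy falls off as 1/r**2 instead of 1/r.
    """
    eps_r = c * r
    return COULOMB_K * qi * qj / (eps_r * r)


if __name__ == "__main__":
    # Compare the two models for a +1/-1 charge pair at several distances.
    for r in (2.0, 4.0, 8.0):
        e_const = coulomb_constant_dielectric(1.0, -1.0, r)
        e_lin = coulomb_linear_dielectric(1.0, -1.0, r)
        print(f"r = {r:4.1f} A  constant: {e_const:9.3f}  linear: {e_lin:9.3f}")
```

With eps = 1 and c = 1 the two models agree at r = 1 Å. Beyond that distance the linear model damps the interaction by an extra factor of 1/r, a crude way of mimicking the screening that explicit solvent would otherwise provide.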
Anding, A L; Baehrecke, E H
2015-03-01
Autophagy is a catabolic process used to deliver cellular material to the lysosome for degradation. The core Vps34/class III phosphatidylinositol 3-kinase (PI3K) complex, consisting of Atg6, Vps15, and Vps34, is highly conserved throughout evolution, critical for recruiting autophagy-related proteins to the preautophagosomal structure and for other vesicular trafficking processes, including vacuolar protein sorting. Atg6 and Vps34 have been well characterized, but the Vps15 kinase remains poorly characterized with most studies focusing on nutrient deprivation-induced autophagy. Here, we investigate the function of Vps15 in different cellular contexts and find that it is necessary for both stress-induced and developmentally programmed autophagy in various tissues in Drosophila melanogaster. Vps15 is required for autophagy that is induced by multiple forms of stress, including nutrient deprivation, hypoxia, and oxidative stress. Furthermore, autophagy that is triggered by physiological stimuli during development in the fat body, intestine, and salivary gland also requires the function of Vps15. In addition, we show that Vps15 is necessary for efficient salivary gland protein secretion. These data illustrate the broad importance of Vps15 in multiple forms of autophagy in different animal cells, and also highlight the pleiotropic function of this kinase in multiple vesicle-trafficking pathways.
A novel approach to multiple sequence alignment using Hadoop data grids.
Sudha Sadasivam, G; Baktavatchalam, G
2010-01-01
Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are the speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computationally intensive. In this paper we propose a time-efficient approach to sequence alignment that also produces high-quality alignments. The dynamic nature of the algorithm, coupled with the data and computational parallelism of Hadoop data grids, improves the accuracy and speed of sequence alignment. The principle of block splitting in Hadoop, coupled with its scalability, facilitates alignment of very large sequences.
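To make the cost of dynamic programming concrete, the following is a minimal sketch of textbook pairwise global alignment (Needleman-Wunsch). It is not the Hadoop-based method of the paper. The function name and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen for illustration. The table has (n+1) x (m+1) cells, so time and memory grow with the product of the sequence lengths.

```python
# Illustrative sketch only: textbook Needleman-Wunsch global alignment.
# Scoring values are assumptions, not taken from the paper.
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: O(n * m) time and memory.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, x, y = needleman_wunsch("ACGT", "AGT")
    print(s)
    print(x)
    print(y)
```

Because every cell depends only on its upper, left, and upper-left neighbours, cells on the same anti-diagonal are independent of each other. This is one common way such tables are split across workers, although the abstract does not say how the authors partition the work.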
Disruption of Inhibitory Function in the Ts65Dn Mouse Hippocampus Through Overexpression of GIRK2
2007-10-24
are prominent (Galdzicki and Siarey, 2003). We found that GIRK2 mRNA and protein subunits are highly overexpressed in multiple CNS structures ... STRUCTURE GIRK channels are members of the large family of potassium inward rectifiers (Kir). The seven subfamilies of Kir channels (Kir1-7) differ as...This ability to discriminate against the smaller Na+ (atomic radius: 0.95 Å) was elucidated by examining the pore structure of the bacterial KcsA