Sample records for folded protein structures

  1. Replica exchange molecular dynamics simulation of structure variation from α/4β-fold to 3α-fold protein.

    PubMed

    Lazim, Raudah; Mei, Ye; Zhang, Dawei

    2012-03-01

    Replica exchange molecular dynamics (REMD) simulation provides an efficient conformational sampling tool for the study of protein folding. In this study, we explore the mechanism directing the structure variation from α/4β-fold protein to 3α-fold protein after mutation by conducting REMD simulation on 42 replicas with temperatures ranging from 270 K to 710 K. The simulation began from a protein possessing the primary structure of GA88 but the tertiary structure of GB88, two G proteins with "high sequence identity." Despite the large Cα root mean square deviation (RMSD) of the folded protein (4.34 Å at 270 K and 4.75 Å at 304 K), a variation in tertiary structure was observed. Together with secondary structure assignment, cluster analysis, and principal component analysis, the simulation provides insights into the folding pathway of the 3α-fold protein and the unfolding pathway of the α/4β-fold protein, paving the way toward understanding the events that occur during conformational variation.
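
The replica setup described in this abstract can be illustrated with a minimal sketch. The abstract gives only the number of replicas (42) and the temperature range (270 K to 710 K); the geometric spacing used here is an assumption, as are the function names and the example energies. The acceptance rule is the standard Metropolis criterion for temperature replica exchange, not necessarily the exact protocol the authors used.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def geometric_ladder(t_min, t_max, n_replicas):
    """Temperatures spaced geometrically from t_min to t_max (assumed spacing)."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]

def exchange_probability(e_i, e_j, t_i, t_j):
    """Metropolis probability of swapping configurations between replicas i and j.

    e_i, e_j are potential energies (kcal/mol); t_i, t_j are temperatures (K).
    P = min(1, exp[(beta_i - beta_j) * (e_i - e_j)])
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)

if __name__ == "__main__":
    ladder = geometric_ladder(270.0, 710.0, 42)
    print(f"{len(ladder)} replicas, first {ladder[0]:.1f} K, last {ladder[-1]:.1f} K")
    # Hypothetical energies for two neighbouring replicas
    print(exchange_probability(-120.0, -100.0, ladder[0], ladder[1]))
```

Geometric spacing is a common starting point because it tends to even out acceptance rates between neighbouring replicas when the heat capacity is roughly constant; a production run would usually tune the ladder further.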

  2. Shaping up the protein folding funnel by local interaction: lesson from a structure prediction study.

    PubMed

    Chikenji, George; Fujitsuka, Yoshimi; Takada, Shoji

    2006-02-28

    Predicting protein tertiary structure by folding-like simulations is one of the most stringent tests of how much we understand the principle of protein folding. Currently, the most successful method for folding-based structure prediction is the fragment assembly (FA) method. Here, we address why the FA method is so successful and its lesson for the folding problem. To do so, using the FA method, we designed a structure prediction test of "chimera proteins." In the chimera proteins, local structural preference is specific to the target sequences, whereas nonlocal interactions are only sequence-independent compaction forces. We find that these chimera proteins can find the native folds of the intact sequences with high probability indicating dominant roles of the local interactions. We further explore roles of local structural preference by exact calculation of the HP lattice model of proteins. From these results, we suggest principles of protein folding: For small proteins, compact structures that are fully compatible with local structural preference are few, one of which is the native fold. These local biases shape up the funnel-like energy landscape.
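
The "exact calculation of the HP lattice model" mentioned in this abstract can be illustrated by brute-force enumeration. The sketch below is an assumption-laden toy: it uses a 2D square lattice and the conventional energy of -1 per non-bonded H-H contact, and it leaves out the local structural preference term that the study investigates. The abstract does not state the lattice or energy function the authors used.

```python
def enumerate_hp(sequence):
    """Enumerate all self-avoiding walks of an HP sequence on a 2D square lattice.

    Returns (lowest_energy, number_of_walks_at_that_energy, total_walks).
    Symmetry-related walks (rotations, reflections) are counted separately.
    """
    n = len(sequence)
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))
    result = {"best": None, "count": 0, "total": 0}

    def contact_energy(path):
        index_at = {site: i for i, site in enumerate(path)}
        energy = 0
        for i, (x, y) in enumerate(path):
            if sequence[i] != "H":
                continue
            # Look only right and up so each lattice contact is counted once
            for dx, dy in ((1, 0), (0, 1)):
                j = index_at.get((x + dx, y + dy))
                if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                    energy -= 1
        return energy

    def grow(path, occupied):
        if len(path) == n:
            result["total"] += 1
            e = contact_energy(path)
            if result["best"] is None or e < result["best"]:
                result["best"], result["count"] = e, 1
            elif e == result["best"]:
                result["count"] += 1
            return
        x, y = path[-1]
        for dx, dy in moves:
            site = (x + dx, y + dy)
            if site not in occupied:
                path.append(site)
                occupied.add(site)
                grow(path, occupied)
                occupied.remove(site)
                path.pop()

    grow([(0, 0)], {(0, 0)})
    return result["best"], result["count"], result["total"]

if __name__ == "__main__":
    print(enumerate_hp("HPPH"))
```

Exhaustive enumeration grows exponentially with chain length, so this approach is only practical for short chains.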

  3. Shaping up the protein folding funnel by local interaction: Lesson from a structure prediction study

    PubMed Central

    Chikenji, George; Fujitsuka, Yoshimi; Takada, Shoji

    2006-01-01

    Predicting protein tertiary structure by folding-like simulations is one of the most stringent tests of how much we understand the principle of protein folding. Currently, the most successful method for folding-based structure prediction is the fragment assembly (FA) method. Here, we address why the FA method is so successful and its lesson for the folding problem. To do so, using the FA method, we designed a structure prediction test of “chimera proteins.” In the chimera proteins, local structural preference is specific to the target sequences, whereas nonlocal interactions are only sequence-independent compaction forces. We find that these chimera proteins can find the native folds of the intact sequences with high probability indicating dominant roles of the local interactions. We further explore roles of local structural preference by exact calculation of the HP lattice model of proteins. From these results, we suggest principles of protein folding: For small proteins, compact structures that are fully compatible with local structural preference are few, one of which is the native fold. These local biases shape up the funnel-like energy landscape. PMID:16488978

  4. Course 12: Proteins: Structural, Thermodynamic and Kinetic Aspects

    NASA Astrophysics Data System (ADS)

    Finkelstein, A. V.

    1 Introduction
    2 Overview of protein architectures and discussion of physical background of their natural selection
      2.1 Protein structures
      2.2 Physical selection of protein structures
    3 Thermodynamic aspects of protein folding
      3.1 Reversible denaturation of protein structures
      3.2 What do denatured proteins look like?
      3.3 Why denaturation of a globular protein is the first-order phase transition
      3.4 "Gap" in energy spectrum: The main characteristic that distinguishes protein chains from random polymers
    4 Kinetic aspects of protein folding
      4.1 Protein folding in vivo
      4.2 Protein folding in vitro (in the test-tube)
      4.3 Theory of protein folding rates and solution of the Levinthal paradox

  5. General mechanism of two-state protein folding kinetics.

    PubMed

    Rollins, Geoffrey C; Dill, Ken A

    2014-08-13

    We describe here a general model of the kinetic mechanism of protein folding. In the Foldon Funnel Model, proteins fold in units of secondary structures, which form sequentially along the folding pathway, stabilized by tertiary interactions. The model predicts that the free energy landscape has a volcano shape, rather than a simple funnel, that folding is two-state (single-exponential) when secondary structures are intrinsically unstable, and that each structure along the folding path is a transition state for the previous structure. It shows how sequential pathways are consistent with multiple stochastic routes on funnel landscapes, and it agrees well with the 9-order-of-magnitude dependence of folding rates on protein size for a set of 93 proteins while remaining consistent with the near independence of the folding equilibrium constant on size. This model gives estimates of folding rates of proteomes, leading to a median folding time in Escherichia coli of about 5 s.

  6. Extant fold-switching proteins are widespread.

    PubMed

    Porter, Lauren L; Looger, Loren L

    2018-06-05

    A central tenet of biology is that globular proteins have a unique 3D structure under physiological conditions. Recent work has challenged this notion by demonstrating that some proteins switch folds, a process that involves remodeling of secondary structure in response to a few mutations (evolved fold switchers) or cellular stimuli (extant fold switchers). To date, extant fold switchers have been viewed as rare byproducts of evolution, but their frequency has been neither quantified nor estimated. By systematically and exhaustively searching the Protein Data Bank (PDB), we found ∼100 extant fold-switching proteins. Furthermore, we gathered multiple lines of evidence suggesting that these proteins are widespread in nature. Based on these lines of evidence, we hypothesized that the frequency of extant fold-switching proteins may be underrepresented by the structures in the PDB. Thus, we sought to identify other putative extant fold switchers with only one solved conformation. To do this, we identified two characteristic features of our ∼100 extant fold-switching proteins, incorrect secondary structure predictions and likely independent folding cooperativity, and searched the PDB for other proteins with similar features. Reassuringly, this method identified dozens of other proteins in the literature with indication of a structural change but only one solved conformation in the PDB. Thus, we used it to estimate that 0.5-4% of PDB proteins switch folds. These results demonstrate that extant fold-switching proteins are likely more common than the PDB reflects, which has implications for cell biology, genomics, and human health. Copyright © 2018 the Author(s). Published by PNAS.

  7. Amino Acid Distribution Rules Predict Protein Fold: Protein Grammar for Beta-Strand Sandwich-Like Structures

    PubMed Central

    Kister, Alexander

    2015-01-01

    We present an alternative approach to protein 3D folding prediction based on determination of rules that specify distribution of “favorable” residues, that are mainly responsible for a given fold formation, and “unfavorable” residues, that are incompatible with that fold, in polypeptide sequences. The process of determining favorable and unfavorable residues is iterative. The starting assumptions are based on the general principles of protein structure formation as well as structural features peculiar to a protein fold under investigation. The initial assumptions are tested one-by-one for a set of all known proteins with a given structure. The assumption is accepted as a “rule of amino acid distribution” for the protein fold if it holds true for all, or near all, structures. If the assumption is not accepted as a rule, it can be modified to better fit the data and then tested again in the next step of the iterative search algorithm, or rejected. We determined the set of amino acid distribution rules for a large group of beta sandwich-like proteins characterized by a specific arrangement of strands in two beta sheets. It was shown that this set of rules is highly sensitive (~90%) and very specific (~99%) for identifying sequences of proteins with specified beta sandwich fold structure. The advantage of the proposed approach is that it does not require that query proteins have a high degree of homology to proteins with known structure. So long as the query protein satisfies residue distribution rules, it can be confidently assigned to its respective protein fold. Another advantage of our approach is that it allows for a better understanding of which residues play an essential role in protein fold formation. It may, therefore, facilitate rational protein engineering design. PMID:25625198

  8. Protein folding, protein structure and the origin of life: Theoretical methods and solutions of dynamical problems

    NASA Technical Reports Server (NTRS)

    Weaver, D. L.

    1982-01-01

    Theoretical methods and solutions of the dynamics of protein folding, protein aggregation, protein structure, and the origin of life are discussed. The elements of a dynamic model representing the initial stages of protein folding are presented. The calculation and experimental determination of the model parameters are discussed. The use of computer simulation for modeling protein folding is considered.

  9. How a Spatial Arrangement of Secondary Structure Elements Is Dispersed in the Universe of Protein Folds

    PubMed Central

    Minami, Shintaro; Sawada, Kengo; Chikenji, George

    2014-01-01

    It has been known that topologically different proteins of the same class sometimes share the same spatial arrangement of secondary structure elements (SSEs). However, the frequency by which topologically different structures share the same spatial arrangement of SSEs is unclear. It is important to estimate this frequency because it provides both a deeper understanding of the geometry of protein folds and a valuable suggestion for predicting protein structures with novel folds. Here we clarified the frequency with which protein folds share the same SSE packing arrangement with other folds, the types of spatial arrangement of SSEs that are frequently observed across different folds, and the diversity of protein folds that share the same spatial arrangement of SSEs with a given fold, using a protein structure alignment program MICAN, which we have been developing. By performing comprehensive structural comparison of SCOP fold representatives, we found that approximately 80% of protein folds share the same spatial arrangement of SSEs with other folds. We also observed that many protein pairs that share the same spatial arrangement of SSEs belong to the different classes, often with an opposing N- to C-terminal direction of the polypeptide chain. The most frequently observed spatial arrangement of SSEs was the 2-layer α/β packing arrangement and it was dispersed among as many as 27% of SCOP fold representatives. These results suggest that the same spatial arrangements of SSEs are adopted by a wide variety of different folds and that the spatial arrangement of SSEs is highly robust against the N- to C-terminal direction of the polypeptide chain. PMID:25243952

  10. Protein domain definition should allow for conditional disorder

    PubMed Central

    Yegambaram, Kavestri; Bulloch, Esther MM; Kingston, Richard L

    2013-01-01

    Proteins are often classified in a binary fashion as either structured or disordered. However, this approach has several deficits. Firstly, protein folding is always conditional on the physicochemical environment. A protein which is structured in some circumstances will be disordered in others. Secondly, it hides a fundamental asymmetry in behavior. While all structured proteins can be unfolded through a change in environment, not all disordered proteins have the capacity for folding. Failure to accommodate these complexities confuses the definition of both protein structural domains and intrinsically disordered regions. We illustrate these points with an experimental study of a family of small binding domains, drawn from the RNA polymerase of mumps virus and its closest relatives. Assessed at face value, the domains fall on a structural continuum, with folded, partially folded, and nearly unstructured members. Yet the disorder present in the family is conditional, and these closely related polypeptides can access the same folded state under appropriate conditions. Any heuristic definition of the protein domain emphasizing conformational stability divides this domain family in two, in a way that makes no biological sense. Structural domains would be better defined by their ability to adopt a specific tertiary structure: a structure that may or may not be realized, dependent on the circumstances. This explicitly allows for the conditional nature of protein folding, and more clearly demarcates structural domains from intrinsically disordered regions that may function without folding. PMID:23963781

  11. Mathematics, thermodynamics, and modeling to address ten common misconceptions about protein structure, folding, and stability.

    PubMed

    Robic, Srebrenka

    2010-01-01

    To fully understand the roles proteins play in cellular processes, students need to grasp complex ideas about protein structure, folding, and stability. Our current understanding of these topics is based on mathematical models and experimental data. However, protein structure, folding, and stability are often introduced as descriptive, qualitative phenomena in undergraduate classes. In the process of learning about these topics, students often form incorrect ideas. For example, by learning about protein folding in the context of protein synthesis, students may come to the incorrect conclusion that, once synthesized on the ribosome, a protein spends its entire cellular lifetime in its fully folded native conformation. This is clearly not true; proteins are dynamic structures that undergo both local fluctuations and global unfolding events. To prevent and address such misconceptions, basic concepts of protein science can be introduced in the context of simple mathematical models and hands-on explorations of publicly available data sets. Ten common misconceptions about proteins are presented, along with suggestions for using equations, models, sequence, structure, and thermodynamic data to help students gain a deeper understanding of basic concepts relating to protein structure, folding, and stability.

  12. SeqRate: sequence-based protein folding type classification and rates prediction

    PubMed Central

    2010-01-01

    Background: Protein folding rate is an important property of a protein. Predicting protein folding rate is useful for understanding the protein folding process and guiding protein design. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. Most methods also do not distinguish the kinetic nature (two-state folding or multi-state folding) of the proteins. Here we developed a method, SeqRate, to predict both protein folding kinetic type (two-state versus multi-state) and real-value folding rate using sequence length, amino acid composition, contact order, contact number, and secondary structure information predicted from only protein sequence with support vector machines. Results: We systematically studied the contributions of individual features to folding rate prediction. On a standard benchmark dataset, the accuracy of folding kinetic type classification is 80%. The Pearson correlation coefficient and the mean absolute difference between predicted and experimental folding rates (sec^-1) in the base-10 logarithmic scale are 0.81 and 0.79 for two-state protein folders, and 0.80 and 0.68 for three-state protein folders. SeqRate is the first sequence-based method for protein folding type classification, and its accuracy of folding rate prediction is improved over previous sequence-based methods. Its performance can be further enhanced with additional information, such as structure-based geometric contacts, as inputs. Conclusions: Both the web server and the software for predicting folding rate are publicly available at http://casp.rnet.missouri.edu/fold_rate/index.html. PMID:20438647
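
The two evaluation figures quoted in this abstract (Pearson correlation and mean absolute difference on a base-10 logarithmic scale) can be computed as in the sketch below. The function names and the example rate values are hypothetical and are not taken from the SeqRate benchmark.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def mean_abs_diff(xs, ys):
    """Mean absolute difference between paired values."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

if __name__ == "__main__":
    # Hypothetical folding rates in 1/s, for illustration only
    experimental = [1200.0, 35.0, 0.8, 60000.0]
    predicted = [900.0, 80.0, 2.5, 20000.0]
    log_exp = [math.log10(k) for k in experimental]
    log_pred = [math.log10(k) for k in predicted]
    print("r =", round(pearson(log_exp, log_pred), 3))
    print("MAD (log10 units) =", round(mean_abs_diff(log_exp, log_pred), 3))
```

Working in log10 units means a mean absolute difference of 1.0 corresponds to predictions that are off by a factor of ten on average.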

  13. Progress towards mapping the universe of protein folds

    PubMed Central

    Grant, Alastair; Lee, David; Orengo, Christine

    2004-01-01

    Although the precise aims differ between the various international structural genomics initiatives currently aiming to illuminate the universe of protein folds, many selectively target protein families for which the fold is unknown. How well can the current set of known protein families and folds be used to estimate the total number of folds in nature, and will structural genomics initiatives yield representatives for all the major protein families within a reasonable time scale? PMID:15128436

  14. General Mechanism of Two-State Protein Folding Kinetics

    PubMed Central

    Rollins, Geoffrey C.; Dill, Ken A.

    2016-01-01

    We describe here a general model of the kinetic mechanism of protein folding. In the Foldon Funnel Model, proteins fold in units of secondary structures, which form sequentially along the folding pathway, stabilized by tertiary interactions. The model predicts that the free energy landscape has a volcano shape, rather than a simple funnel, that folding is two-state (single-exponential) when secondary structures are intrinsically unstable, and that each structure along the folding path is a transition state for the previous structure. It shows how sequential pathways are consistent with multiple stochastic routes on funnel landscapes, and it agrees well with the 9-order-of-magnitude dependence of folding rates on protein size for a set of 93 proteins while remaining consistent with the near independence of the folding equilibrium constant on size. This model gives estimates of folding rates of proteomes, leading to a median folding time in Escherichia coli of about 5 s. PMID:25056406

  15. A stoichiometry driven universal spatial organization of backbones of folded proteins: are there Chargaff's rules for protein folding?

    PubMed

    Mittal, A; Jayaram, B; Shenoy, Sandhya; Bawa, Tejdeep Singh

    2010-10-01

    Protein folding has been an open problem for at least six decades, since the times of Pauling and Anfinsen. However, the rules of protein folding remain elusive to date. In this work, rigorous analyses of several thousand crystal structures of folded proteins reveal a surprisingly simple unifying principle of backbone organization in protein folding. We find that protein folding is a direct consequence of a narrow band of stoichiometric occurrences of amino acids in primary sequences, regardless of the size and the fold of a protein. We observe that "preferential interactions" between amino acids do not drive protein folding, contrary to all prevalent views. We dedicate our discovery to the seminal contribution of Chargaff, which was one of the major keys to elucidation of the stoichiometry-driven spatially organized double helical structure of DNA.

  16. Mathematics, Thermodynamics, and Modeling to Address Ten Common Misconceptions about Protein Structure, Folding, and Stability

    ERIC Educational Resources Information Center

    Robic, Srebrenka

    2010-01-01

    To fully understand the roles proteins play in cellular processes, students need to grasp complex ideas about protein structure, folding, and stability. Our current understanding of these topics is based on mathematical models and experimental data. However, protein structure, folding, and stability are often introduced as descriptive, qualitative…

  17. Steady-state structural fluctuation is a predictor of the necessity of pausing-mediated co-translational folding for small proteins.

    PubMed

    Huang, Wenxi; Liu, Wanting; Jin, Jingjie; Xiao, Qilan; Lu, Ruibin; Chen, Wei; Xiong, Sheng; Zhang, Gong

    2018-03-25

    Translational pausing coordinates protein synthesis and co-translational folding. It is a common factor that facilitates the correct folding of large, multi-domain proteins. For small proteins, pausing sites rarely occur in the gene body, and the 3'-end pausing sites are essential for the folding of only a fraction of proteins. What determines whether pausing is necessary remains obscure. In this study, we demonstrated that the steady-state structural fluctuation is a predictor of the necessity of pausing-mediated co-translational folding for small proteins. In experiments with 5 model proteins, we found that rigid protein structures do not need 3'-end pausing to fold correctly, whereas flexible structures do. Therefore, rational optimization of translational pausing can improve soluble expression of small proteins with flexible structures, but not of rigid ones. The rigidity of the structure can be quantitatively estimated in silico using molecular dynamics simulation. Nevertheless, we also found that translational pausing optimization increases the fitness of the expression host, and thus benefits recombinant protein production, independent of soluble expression. These results shed light on the structural basis of translational pausing and provide a practical tool for industrial protein fermentation. Copyright © 2017. Published by Elsevier Inc.

  18. Exploring the Sequence-based Prediction of Folding Initiation Sites in Proteins.

    PubMed

    Raimondi, Daniele; Orlando, Gabriele; Pancsa, Rita; Khan, Taushif; Vranken, Wim F

    2017-08-18

    Protein folding is a complex process that can lead to disease when it fails. Especially poorly understood are the very early stages of protein folding, which are likely defined by intrinsic local interactions between amino acids close to each other in the protein sequence. We here present EFoldMine, a method that predicts, from the primary amino acid sequence of a protein, which amino acids are likely involved in early folding events. The method is based on early folding data from hydrogen-deuterium exchange (HDX) NMR pulsed labelling experiments, and uses backbone and sidechain dynamics as well as secondary structure propensities as features. The EFoldMine predictions give insights into the folding process, as illustrated by a qualitative comparison with independent experimental observations. Furthermore, on a quantitative proteome scale, the predicted early folding residues tend to become the residues that interact the most in the folded structure, and they are often residues that display evolutionary covariation. The connection of the EFoldMine predictions with both folding pathway data and the folded protein structure suggests that the initial statistical behavior of the protein chain with respect to local structure formation has a lasting effect on its subsequent states.

  19. A Particle Swarm Optimization-Based Approach with Local Search for Predicting Protein Folding.

    PubMed

    Yang, Cheng-Hong; Lin, Yu-Shiun; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2017-10-01

    The hydrophobic-polar (HP) model is commonly used for predicting protein folding structures and hydrophobic interactions. This study developed a particle swarm optimization (PSO)-based algorithm combined with local search algorithms; specifically, the high exploration PSO (HEPSO) algorithm (which can execute global search processes) was combined with three local search algorithms (hill-climbing algorithm, greedy algorithm, and Tabu table), yielding the proposed HE-L-PSO algorithm. By using 20 known protein structures, we evaluated the performance of the HE-L-PSO algorithm in predicting protein folding in the HP model. The proposed HE-L-PSO algorithm exhibited favorable performance in predicting both short and long amino acid sequences with high reproducibility and stability, compared with seven reported algorithms. The HE-L-PSO algorithm yielded optimal solutions for all predicted protein folding structures. All HE-L-PSO-predicted protein folding structures possessed a hydrophobic core that is similar to normal protein folding.

  20. Role of Tryptophan Side Chain Dynamics on the Trp-Cage Mini-Protein Folding Studied by Molecular Dynamics Simulations

    PubMed Central

    Kannan, Srinivasaraghavan; Zacharias, Martin

    2014-01-01

    The 20-residue Trp-cage mini-protein is one of the smallest proteins that adopt a stable folded structure that also contains well-defined secondary structure elements. The hydrophobic core is arranged around a single central Trp residue. Despite several experimental and simulation studies, the detailed folding mechanism of the Trp-cage protein is still not completely understood. Starting from fully extended as well as from partially folded Trp-cage structures, a series of molecular dynamics simulations in explicit solvent and using four different force fields was performed. All simulations resulted in rapid collapse of the protein to states that were, on average, relatively compact. The simulations indicate a significant dependence of the speed of folding to near-native states on the side chain rotamer state of the central Trp residue. Whereas the majority of intermediate start structures with the central Trp side chain in a near-native rotameric state folded successfully within less than 100 ns, only a fraction of start structures with an initially non-native Trp side chain rotamer state reached near-native folded states. Weak restraining of the Trp side chain dihedral angles to the state in the folded protein resulted in significant acceleration of folding, whether starting from fully extended or from intermediate conformations. The results indicate that the side chain conformation of the central Trp residue can create a significant barrier for controlling transitions to a near-native folded structure. Similar mechanisms might be of importance for the folding of other protein structures. PMID:24563686

  1. Protein folding: Over half a century lasting quest. Comment on "There and back again: Two views on the protein folding puzzle" by Alexei V. Finkelstein et al.

    NASA Astrophysics Data System (ADS)

    Krokhotin, Andrey; Dokholyan, Nikolay V.

    2017-07-01

    Most proteins fold into unique three-dimensional (3D) structures that determine their biological functions, such as catalytic activity or macromolecular binding. Misfolded proteins can pose a threat through aberrant interactions with other proteins, leading to a number of diseases including Alzheimer's disease, Parkinson's disease, and amyotrophic lateral sclerosis [1,2]. What determines the 3D structure of proteins? The first clue to this question came more than fifty years ago, when Anfinsen demonstrated that unfolded proteins can spontaneously fold to their native 3D structures [3,4]. Anfinsen's experiments led to the conclusion that proteins fold to a unique native structure corresponding to the stable and kinetically accessible free energy minimum, and that a protein's native structure is solely determined by its amino acid sequence. The question of how exactly proteins find their free energy minimum proved to be a difficult problem. One of the puzzles, initially pointed out by Levinthal, was an inconsistency between observed protein folding times and theoretical estimates. A self-avoiding polymer model of a globular protein 100 residues in length on a cubic lattice can sample at least 10^47 states. Based on the assumption that conformational sampling occurs at the highest vibrational mode of proteins (∼picoseconds), the folding time predicted for a search among all possible conformations is ∼10^27 years (much larger than the age of the universe) [5]. In contrast, observed protein folding times range from microseconds to minutes. Owing to the tremendous theoretical progress achieved in the protein folding field over the past decades, the source of this inconsistency is now understood, as thoroughly described in the review by Finkelstein et al. [6].
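
The Levinthal-style estimate quoted in this abstract (about 10^47 states, sampled at roughly one state per picosecond) can be checked with a few lines of arithmetic. The function name is hypothetical, and the one-picosecond sampling time is the order-of-magnitude figure the abstract assumes.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.16e7 s

def exhaustive_search_years(n_states, seconds_per_state):
    """Time in years to visit every state once at a fixed sampling rate."""
    return n_states * seconds_per_state / SECONDS_PER_YEAR

if __name__ == "__main__":
    years = exhaustive_search_years(1e47, 1e-12)
    print(f"Exhaustive search would take about {years:.1e} years")
```

The result lands in the 10^27-year range, which is why an exhaustive random search cannot be how real proteins fold in microseconds to minutes.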

  2. Improvement on a simplified model for protein folding simulation.

    PubMed

    Zhang, Ming; Chen, Changjun; He, Yi; Xiao, Yi

    2005-11-01

    Improvements were made on a simplified protein model, the Ramachandran model, to achieve better computer simulation of protein folding. To check the validity of such improvements, we chose the ultrafast folding protein Engrailed Homeodomain as an example and explored several aspects of its folding. The engrailed homeodomain is a mainly alpha-helical protein of 61 residues from Drosophila melanogaster. We found that the simplified model of Engrailed Homeodomain can fold into a global minimum state with a tertiary structure in good agreement with its native structure.

  3. How Does Your Protein Fold? Elucidating the Apomyoglobin Folding Pathway

    PubMed Central

    Dyson, H. Jane; Wright, Peter E.

    2017-01-01

    Conspectus: Although each type of protein fold and in some cases individual proteins within a fold classification can have very different mechanisms of folding, the underlying biophysical and biochemical principles that operate to cause a linear polypeptide chain to fold into a globular structure must be the same. In an aqueous solution, the protein takes up the thermodynamically most stable structure, but the pathway along which the polypeptide proceeds in order to reach that structure is a function of the amino acid sequence, which must be the final determining factor, not only in shaping the final folded structure, but in dictating the folding pathway. A number of groups have focused on a single protein or group of proteins, to determine in detail the factors that influence the rate and mechanism of folding in a defined system, with the hope that hypothesis-driven experiments can elucidate the underlying principles governing the folding process. Our research group has focused on the folding of the globin family of proteins, and in particular on the monomeric protein apomyoglobin. Apomyoglobin (apoMb) folds relatively slowly (~2 seconds) via an ensemble of obligatory intermediates that form rapidly after the initiation of folding. The folding pathway can be dissected using rapid-mixing techniques, which can probe processes in the millisecond time range. Stopped-flow measurements detected by circular dichroism (CD) or fluorescence spectroscopy give information on the rates of folding events. Quench-flow experiments utilize the differential rates of hydrogen-deuterium exchange of amide protons protected in parts of the structure that are folded early; protection of amides can be detected by mass spectrometry or proton nuclear magnetic resonance (NMR) spectroscopy. In addition, apoMb forms an intermediate at equilibrium at pH ~ 4, which is sufficiently stable for it to be structurally characterized by solution methods such as CD, fluorescence and NMR spectroscopies, and the conformational ensembles formed in the presence of denaturing agents and low pH can be characterized as models for the unfolded states of the protein. Newer NMR techniques such as measurement of residual dipolar couplings in the various partly folded states, and relaxation dispersion measurements to probe invisible states present at low concentrations, have contributed to providing a detailed picture of the apomyoglobin folding pathway. The research summarized in this review was aimed at characterizing and comparing the equilibrium and kinetic intermediates both structurally and dynamically, as well as delineating the complete folding pathway at a residue-specific level, in order to answer the question “What is it about the amino acid sequence that causes each molecule in the unfolded protein ensemble to start folding, and, once started, to proceed towards the formation of the correctly folded three-dimensional structure?” PMID:28032989

  4. Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs.

    PubMed

    Shamim, Mohammad Tabrez Anwar; Anwaruddin, Mohammad; Nagarajaram, H A

    2007-12-15

    Fold recognition is a key step in the protein structure discovery process, especially when traditional sequence comparison methods fail to yield convincing structural homologies. Although many methods have been developed for protein fold recognition, their accuracies remain low. This can be attributed to insufficient exploitation of fold discriminatory features. We have developed a new method for protein fold recognition using structural information of amino acid residues and amino acid residue pairs. Since protein fold recognition can be treated as a protein fold classification problem, we have developed a Support Vector Machine (SVM) based classifier approach that uses secondary structural state and solvent accessibility state frequencies of amino acids and amino acid pairs as feature vectors. Among the individual properties examined secondary structural state frequencies of amino acids gave an overall accuracy of 65.2% for fold discrimination, which is better than the accuracy by any method reported so far in the literature. Combination of secondary structural state frequencies with solvent accessibility state frequencies of amino acids and amino acid pairs further improved the fold discrimination accuracy to more than 70%, which is approximately 8% higher than the best available method. In this study we have also tested, for the first time, an all-together multi-class method known as Crammer and Singer method for protein fold classification. Our studies reveal that the three multi-class classification methods, namely one versus all, one versus one and Crammer and Singer method, yield similar predictions. Dataset and stand-alone program are available upon request.
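    As a rough illustration of the kind of feature vector this abstract describes, the sketch below computes secondary-structural-state frequencies of amino acids from a sequence and a per-residue state string. The three-state alphabet (H, E, C), the normalization by chain length, and the function name are assumptions made for illustration; the abstract does not specify them, and the pair features and solvent-accessibility features are not covered here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues
SS_STATES = "HEC"                      # assumed states: helix, strand, coil


def ss_state_frequencies(sequence, ss_string):
    """Return a 60-element list of (amino acid, state) frequencies.

    Element order is amino acid major, state minor: (A,H), (A,E), (A,C), (C,H), ...
    Frequencies are normalized by chain length (an assumption of this sketch).
    """
    if len(sequence) != len(ss_string):
        raise ValueError("sequence and state string must have equal length")
    total = len(sequence)
    counts = Counter(zip(sequence, ss_string))
    return [
        counts[(aa, ss)] / total if total else 0.0
        for aa in AMINO_ACIDS
        for ss in SS_STATES
    ]


features = ss_state_frequencies("AAGE", "HHCE")
```

    A vector of this form could then be passed to any multi-class SVM implementation; the choice of kernel and multi-class scheme is left open here.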

  5. Folding and Stabilization of Native-Sequence-Reversed Proteins

    PubMed Central

    Zhang, Yuanzhao; Weber, Jeffrey K; Zhou, Ruhong

    2016-01-01

    Though the problem of sequence-reversed protein folding is largely unexplored, one might speculate that reversed native protein sequences should be significantly more foldable than purely random heteropolymer sequences. In this article, we investigate how the reverse-sequences of native proteins might fold by examining a series of small proteins of increasing structural complexity (α-helix, β-hairpin, α-helix bundle, and α/β-protein). Employing a tandem protein structure prediction algorithmic and molecular dynamics simulation approach, we find that the ability of reverse sequences to adopt native-like folds is strongly influenced by protein size and the flexibility of the native hydrophobic core. For β-hairpins with reverse-sequences that fail to fold, we employ a simple mutational strategy for guiding stable hairpin formation that involves the insertion of amino acids into the β-turn region. This systematic look at reverse sequence duality sheds new light on the problem of protein sequence-structure mapping and may serve to inspire new protein design and protein structure prediction protocols. PMID:27113844

  6. Folding and Stabilization of Native-Sequence-Reversed Proteins

    NASA Astrophysics Data System (ADS)

    Zhang, Yuanzhao; Weber, Jeffrey K.; Zhou, Ruhong

    2016-04-01

    Though the problem of sequence-reversed protein folding is largely unexplored, one might speculate that reversed native protein sequences should be significantly more foldable than purely random heteropolymer sequences. In this article, we investigate how the reverse-sequences of native proteins might fold by examining a series of small proteins of increasing structural complexity (α-helix, β-hairpin, α-helix bundle, and α/β-protein). Employing a tandem protein structure prediction algorithmic and molecular dynamics simulation approach, we find that the ability of reverse sequences to adopt native-like folds is strongly influenced by protein size and the flexibility of the native hydrophobic core. For β-hairpins with reverse-sequences that fail to fold, we employ a simple mutational strategy for guiding stable hairpin formation that involves the insertion of amino acids into the β-turn region. This systematic look at reverse sequence duality sheds new light on the problem of protein sequence-structure mapping and may serve to inspire new protein design and protein structure prediction protocols.

  7. High-Resolution Mapping of a Repeat Protein Folding Free Energy Landscape.

    PubMed

    Fossat, Martin J; Dao, Thuy P; Jenkins, Kelly; Dellarole, Mariano; Yang, Yinshan; McCallum, Scott A; Garcia, Angel E; Barrick, Doug; Roumestand, Christian; Royer, Catherine A

    2016-12-06

    A complete description of the pathways and mechanisms of protein folding requires a detailed structural and energetic characterization of the conformational ensemble along the entire folding reaction coordinate. Simulations can provide this level of insight for small proteins. In contrast, with the exception of hydrogen exchange, which does not monitor folding directly, experimental studies of protein folding have not yielded such structural and energetic detail. NMR can provide residue specific atomic level structural information, but its implementation in protein folding studies using chemical or temperature perturbation is problematic. Here we present a highly detailed structural and energetic map of the entire folding landscape of the leucine-rich repeat protein, pp32 (Anp32), obtained by combining pressure-dependent site-specific ¹H-¹⁵N HSQC data with coarse-grained molecular dynamics simulations. The results obtained using this equilibrium approach demonstrate that the main barrier to folding of pp32 is quite broad and lies near the unfolded state, with structure apparent only in the C-terminal region. Significant deviation from two-state unfolding under pressure reveals an intermediate on the folded side of the main barrier in which the N-terminal region is disordered. A nonlinear temperature dependence of the population of this intermediate suggests a large heat capacity change associated with its formation. The combination of pressure, which favors the population of folding intermediates relative to chemical denaturants; NMR, which allows their observation; and constrained structure-based simulations yields unparalleled insight into protein folding mechanisms. Copyright © 2016 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  8. A protein block based fold recognition method for the annotation of twilight zone sequences.

    PubMed

    Suresh, V; Ganesan, K; Parthasarathy, S

    2013-03-01

    The description of the protein backbone was recently improved with groups of structural fragments called Structural Alphabets, which replace the regular three-state (Helix, Sheet and Coil) secondary structure description. Protein Blocks is one of the Structural Alphabets used to describe every region of the protein backbone, including the coil. According to de Brevern (2000), Protein Blocks comprises 16 structural fragments, each 5 residues in length. Protein Blocks fragments are highly informative among the available Structural Alphabets and have been used for many applications. Here, we present a protein fold recognition method based on Protein Blocks for the annotation of twilight zone sequences. In our method, we align the predicted Protein Blocks of a query amino acid sequence with a library of assigned Protein Blocks of 953 known folds using local pair-wise alignment. Alignment results with z-value ≥ 2.5 and P-value ≤ 0.08 are predicted as possible folds. Our method is able to recognize the possible folds for nearly 35.5% of the twilight zone sequences from their predicted Protein Block sequences obtained by pb_prediction, which is available at the Protein Block Export server.
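    The local pair-wise alignment step mentioned above can be sketched with a standard Smith-Waterman score computation over two Protein Block strings (letters a to p). The match, mismatch and gap values below are placeholder assumptions, not the substitution scores used by the authors, and the z-value and P-value filtering from the abstract is not reproduced.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative defaults, not a Protein Block
    substitution matrix.
    """
    prev = [0] * (len(b) + 1)   # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical query and library Protein Block strings.
score = local_alignment_score("mmmmnop", "ddmmmmdd")
```

    In a fold recognition setting, a score like this would be computed against every library entry and then converted to a significance measure before thresholding.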

  9. Right- and left-handed three-helix proteins. I. Experimental and simulation analysis of differences in folding and structure.

    PubMed

    Glyakina, Anna V; Pereyaslavets, Leonid B; Galzitskaya, Oxana V

    2013-09-01

    Despite the large number of publications on three-helix protein folding, there is no study devoted to the influence of handedness on the rate of three-helix protein folding. From the experimental studies, we make a conclusion that the left-handed three-helix proteins fold faster than the right-handed ones. What may explain this difference? An important question arising in this paper is whether the modeling of protein folding can catch the difference between the protein folding rates of proteins with similar structures but with different folding mechanisms. To answer this question, the folding of eight three-helix proteins (four right-handed and four left-handed), which are similar in size, was modeled using the Monte Carlo and dynamic programming methods. The studies allowed us to determine the orders of folding of the secondary-structure elements in these domains and amino acid residues which are important for the folding. The obtained data are in good correlation with each other and with the experimental data. Structural analysis of these proteins demonstrated that the left-handed domains have a lesser number of contacts per residue and a smaller radius of cross section than the right-handed domains. This may be one of the explanations of the observed fact. The same tendency is observed for the large dataset consisting of 332 three-helix proteins (238 right- and 94 left-handed). From our analysis, we found that the left-handed three-helix proteins have some less-dense packing that should result in faster folding for some proteins as compared to the case of right-handed proteins. Copyright © 2013 Wiley Periodicals, Inc.

  10. Protein classification using sequential pattern mining.

    PubMed

    Exarchos, Themis P; Papaloukas, Costas; Lampros, Christos; Fotiadis, Dimitrios I

    2006-01-01

    Protein classification in terms of fold recognition can be employed to determine the structural and functional properties of a newly discovered protein. In this work sequential pattern mining (SPM) is utilized for sequence-based fold recognition. One of the most efficient SPM algorithms, cSPADE, is employed for protein primary structure analysis. Then a classifier uses the extracted sequential patterns for classifying proteins of unknown structure in the appropriate fold category. The proposed methodology exhibited an overall accuracy of 36% in a multi-class problem of 17 candidate categories. The classification performance reaches up to 65% when the three most probable protein folds are considered.
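    The two accuracy figures quoted above correspond to top-1 and top-3 scoring of a ranked multi-class prediction. A minimal sketch of that metric follows; the function name and the toy fold labels are hypothetical and are not taken from the study.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of samples whose true label is among the k top-ranked guesses.

    ranked_predictions: one list per sample, most probable class first.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need one ranked list per true label")
    if not true_labels:
        raise ValueError("no samples given")
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy example with three hypothetical fold categories.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"], ["a", "c", "b"]]
truth = ["a", "a", "a", "b"]
top1 = top_k_accuracy(ranked, truth, 1)
top3 = top_k_accuracy(ranked, truth, 3)
```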

  11. Robustness of atomistic Gō models in predicting native-like folding intermediates

    NASA Astrophysics Data System (ADS)

    Estácio, S. G.; Fernandes, C. S.; Krobath, H.; Faísca, P. F. N.; Shakhnovich, E. I.

    2012-08-01

    Gō models are exceedingly popular tools in computer simulations of protein folding. These models are native-centric, i.e., they are directly constructed from the protein's native structure. Therefore, it is important to understand up to which extent the atomistic details of the native structure dictate the folding behavior exhibited by Gō models. Here we address this challenge by performing exhaustive discrete molecular dynamics simulations of a Gō potential combined with a full atomistic protein representation. In particular, we investigate the robustness of this particular type of Gō models in predicting the existence of intermediate states in protein folding. We focus on the N47G mutational form of the Spc-SH3 folding domain (x-ray structure) and compare its folding pathway with that of alternative native structures produced in silico. Our methodological strategy comprises equilibrium folding simulations, structural clustering, and principal component analysis.

  12. Complete fold annotation of the human proteome using a novel structural feature space.

    PubMed

    Middleton, Sarah A; Illuminati, Joseph; Kim, Junhyong

    2017-04-13

    Recognition of protein structural fold is the starting point for many structure prediction tools and protein function inference. Fold prediction is computationally demanding and recognizing novel folds is difficult such that the majority of proteins have not been annotated for fold classification. Here we describe a new machine learning approach using a novel feature space that can be used for accurate recognition of all 1,221 currently known folds and inference of unknown novel folds. We show that our method achieves better than 94% accuracy even when many folds have only one training example. We demonstrate the utility of this method by predicting the folds of 34,330 human protein domains and showing that these predictions can yield useful insights into potential biological function, such as prediction of RNA-binding ability. Our method can be applied to de novo fold prediction of entire proteomes and identify candidate novel fold families.

  13. Complete fold annotation of the human proteome using a novel structural feature space

    PubMed Central

    Middleton, Sarah A.; Illuminati, Joseph; Kim, Junhyong

    2017-01-01

    Recognition of protein structural fold is the starting point for many structure prediction tools and protein function inference. Fold prediction is computationally demanding and recognizing novel folds is difficult such that the majority of proteins have not been annotated for fold classification. Here we describe a new machine learning approach using a novel feature space that can be used for accurate recognition of all 1,221 currently known folds and inference of unknown novel folds. We show that our method achieves better than 94% accuracy even when many folds have only one training example. We demonstrate the utility of this method by predicting the folds of 34,330 human protein domains and showing that these predictions can yield useful insights into potential biological function, such as prediction of RNA-binding ability. Our method can be applied to de novo fold prediction of entire proteomes and identify candidate novel fold families. PMID:28406174

  14. Energetic frustrations in protein folding at residue resolution: a homologous simulation study of Im9 proteins.

    PubMed

    Sun, Yunxiang; Ming, Dengming

    2014-01-01

    Energetic frustration is becoming an important topic for understanding the mechanisms of protein folding, which is a long-standing big biological problem usually investigated by the free energy landscape theory. Despite the significant advances in probing the effects of folding frustrations on the overall features of protein folding pathways and folding intermediates, detailed characterizations of folding frustrations at an atomic or residue level are still lacking. In addition, how and to what extent folding frustrations interact with protein topology in determining folding mechanisms remains unclear. In this paper, we tried to understand energetic frustrations in the context of protein topology structures or native-contact networks by comparing the energetic frustrations of five homologous Im9 alpha-helix proteins that share very similar topology structures but have a single hydrophilic-to-hydrophobic mutual mutation. The folding simulations were performed using a coarse-grained Gō-like model, while non-native hydrophobic interactions were introduced as energetic frustrations using a Lennard-Jones potential function. Energetic frustrations were then examined at residue level based on φ-value analyses of the transition state ensemble structures and mapped back to native-contact networks. Our calculations show that energetic frustrations have highly heterogeneous influences on the folding of the four helices of the examined structures depending on the local environment of the frustration centers. Also, the closer the introduced frustration is to the center of the native-contact network, the larger the changes in the protein folding. Our findings add a new dimension to the understanding of topology determination in protein folding, in that energetic frustrations work closely with native-contact networks to affect protein folding.

  15. Accelerated molecular dynamics simulations of protein folding.

    PubMed

    Miao, Yinglong; Feixas, Ferran; Eun, Changsun; McCammon, J Andrew

    2015-07-30

    Folding of four fast-folding proteins, including chignolin, Trp-cage, villin headpiece and WW domain, was simulated via accelerated molecular dynamics (aMD). In comparison with hundred-of-microsecond timescale conventional molecular dynamics (cMD) simulations performed on the Anton supercomputer, aMD captured complete folding of the four proteins in significantly shorter simulation time. The folded protein conformations were found within 0.2-2.1 Å of the native NMR or X-ray crystal structures. Free energy profiles calculated through improved reweighting of the aMD simulations using cumulant expansion to the second-order are in good agreement with those obtained from cMD simulations. This allows us to identify distinct conformational states (e.g., unfolded and intermediate) other than the native structure and the protein folding energy barriers. Detailed analysis of protein secondary structures and local key residue interactions provided important insights into the protein folding pathways. Furthermore, the selections of force fields and aMD simulation parameters are discussed in detail. Our work shows usefulness and accuracy of aMD in studying protein folding, providing basic references in using aMD in future protein-folding studies. © 2015 Wiley Periodicals, Inc.
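    The reweighting by cumulant expansion to second order mentioned above can be illustrated as follows. For the frames falling in one histogram bin, the logarithm of the ensemble-averaged boost factor, ln⟨exp(βΔV)⟩, is approximated by βC1 + (β²/2)C2, where C1 and C2 are the mean and variance of the boost potential ΔV. This is a minimal sketch under the assumption that ΔV and 1/β share the same energy units; the function names are hypothetical and the binning of frames is left to the caller.

```python
def cumulant_reweighting_factor(boosts, beta):
    """Second-order cumulant approximation of ln<exp(beta * dV)>.

    boosts: boost potential values dV for the frames in one bin.
    beta:   1 / (kB * T), in units inverse to those of dV.
    """
    n = len(boosts)
    if n == 0:
        raise ValueError("bin contains no frames")
    c1 = sum(boosts) / n                           # first cumulant: mean
    c2 = sum((dv - c1) ** 2 for dv in boosts) / n  # second cumulant: variance
    return beta * c1 + 0.5 * beta ** 2 * c2


def reweighted_free_energy(biased_free_energy, boosts, beta):
    """Correct a bin's biased free energy using the cumulant factor."""
    return biased_free_energy - cumulant_reweighting_factor(boosts, beta) / beta
```

    When every frame in a bin carries the same boost, the variance term vanishes and the approximation is exact; the truncation matters most for bins with a wide boost distribution.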

  16. Physics of protein folding

    NASA Astrophysics Data System (ADS)

    Finkelstein, A. V.; Galzitskaya, O. V.

    2004-04-01

    Protein physics is grounded on three fundamental experimental facts: protein, this long heteropolymer, has a well defined compact three-dimensional structure; this structure can spontaneously arise from the unfolded protein chain in appropriate environment; and this structure is separated from the unfolded state of the chain by the “all-or-none” phase transition, which ensures robustness of protein structure and therefore of its action. The aim of this review is to consider modern understanding of physical principles of self-organization of protein structures and to overview such important features of this process, as finding out the unique protein structure among zillions alternatives, nucleation of the folding process and metastable folding intermediates. Towards this end we will consider the main experimental facts and simple, mostly phenomenological theoretical models. We will concentrate on relatively small (single-domain) water-soluble globular proteins (whose structure and especially folding are much better studied and understood than those of large or membrane and fibrous proteins) and consider kinetic and structural aspects of transition of initially unfolded protein chains into their final solid (“native”) 3D structures.

  17. Three key residues form a critical contact network in a protein folding transition state

    NASA Astrophysics Data System (ADS)

    Vendruscolo, Michele; Paci, Emanuele; Dobson, Christopher M.; Karplus, Martin

    2001-02-01

    Determining how a protein folds is a central problem in structural biology. The rate of folding of many proteins is determined by the transition state, so that a knowledge of its structure is essential for understanding the protein folding reaction. Here we use mutation measurements-which determine the role of individual residues in stabilizing the transition state-as restraints in a Monte Carlo sampling procedure to determine the ensemble of structures that make up the transition state. We apply this approach to the experimental data for the 98-residue protein acylphosphatase, and obtain a transition-state ensemble with the native-state topology and an average root-mean-square deviation of 6Å from the native structure. Although about 20 residues with small positional fluctuations form the structural core of this transition state, the native-like contact network of only three of these residues is sufficient to determine the overall fold of the protein. This result reveals how a nucleation mechanism involving a small number of key residues can lead to folding of a polypeptide chain to its unique native-state structure.

  18. How the folding rates of two- and multistate proteins depend on the amino acid properties.

    PubMed

    Huang, Jitao T; Huang, Wei; Huang, Shanran R; Li, Xin

    2014-10-01

    Proteins fold by either a two-state or a multistate kinetic mechanism. We observe that amino acids play different roles in the different mechanisms. Many residues that easily form regular secondary structures (α helices, β sheets and turns) can promote the two-state folding reactions of small proteins. Most hydrophilic residues can speed up the multistate folding reactions of large proteins. Folding rates of large proteins are equally responsive to the flexibility of partial amino acids. Other properties of amino acids (including volume, polarity, accessible surface, exposure degree, isoelectric point, and phase transfer energy) have contributed little to folding kinetics of the proteins. Cysteine is a special residue: it triggers the two-state folding reaction but inhibits the multistate folding reaction. These findings not only provide a new insight into protein structure prediction, but also could be used to direct the point mutations that can change folding rate. © 2014 Wiley Periodicals, Inc.

  19. Processing of Cholinesterase-like α/β-Hydrolase Fold Proteins: Alterations Associated with Congenital Disorders

    PubMed Central

    De Jaco, Antonella; Comoletti, Davide; Dubi, Noga; Camp, Shelley; Taylor, Palmer

    2016-01-01

    The α/β hydrolase fold family is perhaps the largest group of proteins presenting significant structural homology with divergent functions, ranging from catalytic hydrolysis to heterophilic cell adhesive interactions to chaperones in hormone production. All the proteins of the family share a common three-dimensional core structure containing the α/β-hydrolase fold domain that is crucial for proper protein function. Several mutations associated with congenital diseases or disorders have been reported in conserved residues within the α/β-hydrolase fold domain of cholinesterase-like proteins, neuroligins, butyrylcholinesterase and thyroglobulin. These mutations are known to disrupt the architecture of the common structural domain either globally or locally. Characterization of the natural mutations affecting the α/β-hydrolase fold domain in these proteins has shown that they mainly impair processing and trafficking along the secretory pathway causing retention of the mutant protein in the endoplasmic reticulum. Studying the processing of α/β-hydrolase fold mutant proteins should uncover new functions for this domain, that in some cases require structural integrity for both export of the protein from the ER and for facilitating subunit dimerization. A comparative study of homologous mutations in proteins that are closely related family members, along with the definition of new three-dimensional crystal structures, will identify critical residues for the assembly of the α/β-hydrolase fold. PMID:21933121

  20. An Evolution-Based Approach to De Novo Protein Design and Case Study on Mycobacterium tuberculosis

    PubMed Central

    Brender, Jeffrey R.; Czajka, Jeff; Marsh, David; Gray, Felicia; Cierpicki, Tomasz; Zhang, Yang

    2013-01-01

    Computational protein design is a reverse procedure of protein folding and structure prediction, where constructing structures from evolutionarily related proteins has been demonstrated to be the most reliable method for protein 3-dimensional structure prediction. Following this spirit, we developed a novel method to design new protein sequences based on evolutionarily related protein families. For a given target structure, a set of proteins having similar fold are identified from the PDB library by structural alignments. A structural profile is then constructed from the protein templates and used to guide the conformational search of amino acid sequence space, where physicochemical packing is accommodated by single-sequence based solvation, torsion angle, and secondary structure predictions. The method was tested on a computational folding experiment based on a large set of 87 protein structures covering different fold classes, which showed that the evolution-based design significantly enhances the foldability and biological functionality of the designed sequences compared to the traditional physics-based force field methods. Without using homologous proteins, the designed sequences can be folded with an average root-mean-square-deviation of 2.1 Å to the target. As a case study, the method is extended to redesign all 243 structurally resolved proteins in the pathogenic bacteria Mycobacterium tuberculosis, which is the second leading cause of death from infectious disease. On a smaller scale, five sequences were randomly selected from the design pool and subjected to experimental validation. The results showed that all the designed proteins are soluble with distinct secondary structure and three have well ordered tertiary structure, as demonstrated by circular dichroism and NMR spectroscopy. 
Together, these results demonstrate a new avenue in computational protein design that uses knowledge of evolutionary conservation from protein structural families to engineer new protein molecules of improved fold stability and biological functionality. PMID:24204234

  1. Metamorphic Proteins: Emergence of Dual Protein Folds from One Primary Sequence.

    PubMed

    Lella, Muralikrishna; Mahalakshmi, Radhakrishnan

    2017-06-20

    Every amino acid exhibits a different propensity for distinct structural conformations. Hence, decoding how the primary amino acid sequence undergoes the transition to a defined secondary structure and its final three-dimensional fold is presently considered predictable with reasonable certainty. However, protein sequences that defy the first principles of secondary structure prediction (they attain two different folds) have recently been discovered. Such proteins, aptly named metamorphic proteins, decrease the conformational constraint by increasing flexibility in the secondary structure and thereby result in efficient functionality. In this review, we discuss the major factors driving the conformational switch related both to protein sequence and to structure using illustrative examples. We discuss the concept of an evolutionary transition in sequence and structure, the functional impact of the tertiary fold, and the pressure of intrinsic and external factors that give rise to metamorphic proteins. We mainly focus on the major components of protein architecture, namely, the α-helix and β-sheet segments, which are involved in conformational switching within the same or highly similar sequences. These chameleonic sequences are widespread in both cytosolic and membrane proteins, and these folds are equally important for protein structure and function. We discuss the implications of metamorphic proteins and chameleonic peptide sequences in de novo peptide design.

  2. Structure of a Trypanosoma Brucei Alpha/Beta-Hydrolase Fold Protein With Unknown Function

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Merritt, E.A.; Holmes, M.; Buckner, F.S.

    2009-05-26

    The structure of a structural genomics target protein, Tbru020260AAA from Trypanosoma brucei, has been determined to a resolution of 2.2 Å using multiple-wavelength anomalous diffraction at the Se K edge. This protein belongs to Pfam sequence family PF08538 and is only distantly related to previously studied members of the α/β-hydrolase fold family. Structural superposition onto representative α/β-hydrolase fold proteins of known function indicates that a possible catalytic nucleophile, Ser116 in the T. brucei protein, lies at the expected location. However, the present structure and by extension the other trypanosomatid members of this sequence family have neither sequence nor structural similarity at the location of other active-site residues typical for proteins with this fold. Together with the presence of an additional domain between strands β6 and β7 that is conserved in trypanosomatid genomes, this suggests that the function of these homologs has diverged from other members of the fold family.

  3. On the Origin of Protein Superfamilies and Superfolds

    NASA Astrophysics Data System (ADS)

    Magner, Abram; Szpankowski, Wojciech; Kihara, Daisuke

    2015-02-01

    Distributions of protein families and folds in genomes are highly skewed, having a small number of prevalent superfamilies/superfolds and a large number of families/folds of a small size. Why are the distributions of protein families and folds skewed? Why are there only a limited number of protein families? Here, we employ an information theoretic approach to investigate the protein sequence-structure relationship that leads to the skewed distributions. We consider that protein sequences and folds constitute an information theoretic channel and compute the most efficient distribution of sequences that code all protein folds. The identified distributions of sequences and folds are found to follow a power law, consistent with those observed for proteins in nature. Importantly, the skewed distributions of sequences and folds are suggested to have different origins: the skewed distribution of sequences is due to evolutionary pressure to achieve efficient coding of necessary folds, whereas that of folds is based on the thermodynamic stability of folds. The current study provides a new information theoretic framework for proteins that could be widely applied for understanding protein sequences, structures, functions, and interactions.

  4. Structural Characteristic of the Initial Unfolded State on Refolding Determines Catalytic Efficiency of the Folded Protein in Presence of Osmolytes

    PubMed Central

    Warepam, Marina; Sharma, Gurumayum Suraj; Dar, Tanveer Ali; Khan, Md. Khurshid Alam; Singh, Laishram Rajendrakumar

    2014-01-01

    Osmolytes are low molecular weight organic molecules accumulated by organisms to assist proper protein folding, and to provide protection to the structural integrity of proteins under denaturing stress conditions. It is known that osmolyte-induced protein folding is brought by unfavorable interaction of osmolytes with the denatured/unfolded states. The interaction of osmolyte with the native state does not significantly contribute to the osmolyte-induced protein folding. We have therefore investigated if different denatured states of a protein (generated by different denaturing agents) interact differently with the osmolytes to induce protein folding. We observed that osmolyte-assisted refolding of protein obtained from heat-induced denatured state produces native molecules with higher enzyme activity than those initiated from GdmCl- or urea-induced denatured state indicating that the structural property of the initial denatured state during refolding by osmolytes determines the catalytic efficiency of the folded protein molecule. These conclusions have been reached from the systematic measurements of enzymatic kinetic parameters (K_m and k_cat), thermodynamic stability (T_m and ΔH_m) and secondary and tertiary structures of the folded native proteins obtained from refolding of various denatured states (due to heat-, urea- and GdmCl-induced denaturation) of RNase-A in the presence of various osmolytes. PMID:25313668

  5. Direct Observation of Parallel Folding Pathways Revealed Using a Symmetric Repeat Protein System

    PubMed Central

    Aksel, Tural; Barrick, Doug

    2014-01-01

    Although progress has been made to determine the native fold of a polypeptide from its primary structure, the diversity of pathways that connect the unfolded and folded states has not been adequately explored. Theoretical and computational studies predict that proteins fold through parallel pathways on funneled energy landscapes, although experimental detection of pathway diversity has been challenging. Here, we exploit the high translational symmetry and the direct length variation afforded by linear repeat proteins to directly detect folding through parallel pathways. By comparing folding rates of consensus ankyrin repeat proteins (CARPs), we find a clear increase in folding rates with increasing size and repeat number, although the size of the transition states (estimated from denaturant sensitivity) remains unchanged. The increase in folding rate with chain length, as opposed to a decrease expected from typical models for globular proteins, is a clear demonstration of parallel pathways. This conclusion is not dependent on extensive curve-fitting or structural perturbation of protein structure. By globally fitting a simple parallel-Ising pathway model, we have directly measured nucleation and propagation rates in protein folding, and have quantified the fluxes along each path, providing a detailed energy landscape for folding. This finding of parallel pathways differs from results from kinetic studies of repeat-proteins composed of sequence-variable repeats, where modest repeat-to-repeat energy variation coalesces folding into a single, dominant channel. Thus, for globular proteins, which have much higher variation in local structure and topology, parallel pathways are expected to be the exception rather than the rule. PMID:24988356

  6. Improving Protein Fold Recognition by Deep Learning Networks.

    PubMed

    Jo, Taeho; Hou, Jie; Eickholt, Jesse; Cheng, Jianlin

    2015-12-04

    For accurate recognition of protein folds, a deep learning network method (DN-Fold) was developed to predict if a given query-template protein pair belongs to the same structural fold. The input used stemmed from the protein sequence and structural features extracted from the protein pair. We evaluated the performance of DN-Fold along with 18 different methods on Lindahl's benchmark dataset and on a large benchmark set extracted from SCOP 1.75 consisting of about one million protein pairs, at three different levels of fold recognition (i.e., protein family, superfamily, and fold) depending on the evolutionary distance between protein sequences. The correct recognition rate of ensembled DN-Fold for Top 1 predictions is 84.5%, 61.5%, and 33.6% and for Top 5 is 91.2%, 76.5%, and 60.7% at family, superfamily, and fold levels, respectively. We also evaluated the performance of single DN-Fold (DN-FoldS), which showed comparable results at the family and superfamily levels relative to ensemble DN-Fold. Finally, we extended the binary classification problem of fold recognition to a real-value regression task, which also showed promising performance. DN-Fold is freely available through a web server at http://iris.rnet.missouri.edu/dnfold.
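
    The Top 1 and Top 5 recognition rates quoted above are ranking metrics: a query counts as correct when a matching template appears among its k best-scored candidates. A minimal sketch follows, assuming each query has a ranked list of candidate labels and a single true label. The labels and toy data are hypothetical, and the benchmark's own matching criteria (same family, superfamily, or fold) are simplified to label equality.

```python
def top_k_recognition_rate(ranked_predictions, true_labels, k):
    """Fraction of queries whose true label is among the first k predictions."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("one ranked list is required per query")
    if not true_labels:
        raise ValueError("at least one query is required")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical toy data: four queries, each with candidates ranked best-first.
ranked = [
    ["a.1", "b.2", "c.3"],
    ["b.2", "a.1", "c.3"],
    ["c.3", "b.2", "a.1"],
    ["b.2", "c.3", "a.1"],
]
truth = ["a.1", "a.1", "a.1", "c.3"]

print(top_k_recognition_rate(ranked, truth, 1))
print(top_k_recognition_rate(ranked, truth, 2))
```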

  7. Solvent effect on the folding dynamics and structure of E6-associated protein characterized from ab initio protein folding simulations

    NASA Astrophysics Data System (ADS)

    Xu, Zhijun; Lazim, Raudah; Sun, Tiedong; Mei, Ye; Zhang, Dawei

    2012-04-01

    Solvent effects on the conformation and folding mechanism of the E6-associated protein (E6ap) peptide are investigated using a recently developed charge update scheme termed adaptive hydrogen bond-specific charge (AHBC). Given the close agreement between the helix contents calculated from AHBC simulations and experimental results, the simulations indicate that the two ends of the peptide may simultaneously take part in the formation of the helical structure at the early stage of folding and finally merge to form a helix with a lowest backbone RMSD of about 0.9 Å in 40% 2,2,2-trifluoroethanol solution. However, in pure water, the folding may start at the center of the peptide sequence instead of at the two opposite ends. The analysis of the free energy landscape indicates that the solvent may determine the folding clusters of E6ap, which subsequently leads to different final folded structures. The current study provides new insight into the role of solvent in the determination of protein structure and folding dynamics.

  8. Protein folding and misfolding: mechanism and principles

    PubMed Central

    Englander, S. Walter; Mayne, Leland; Krishna, Mallela M. G.

    2012-01-01

    Two fundamentally different views of how proteins fold are now being debated. Do proteins fold through multiple unpredictable routes directed only by the energetically downhill nature of the folding landscape or do they fold through specific intermediates in a defined pathway that systematically puts predetermined pieces of the target native protein into place? It has now become possible to determine the structure of protein folding intermediates, evaluate their equilibrium and kinetic parameters, and establish their pathway relationships. Results obtained for many proteins have serendipitously revealed a new dimension of protein structure. Cooperative structural units of the native protein, called foldons, unfold and refold repeatedly even under native conditions. Much evidence obtained by hydrogen exchange and other methods now indicates that cooperative foldon units and not individual amino acids account for the unit steps in protein folding pathways. The formation of foldons and their ordered pathway assembly systematically puts native-like foldon building blocks into place, guided by a sequential stabilization mechanism in which prior native-like structure templates the formation of incoming foldons with complementary structure. Thus the same propensities and interactions that specify the final native state, encoded in the amino-acid sequence of every protein, determine the pathway for getting there. Experimental observations that have been interpreted differently, in terms of multiple independent pathways, appear to be due to chance misfolding errors that cause different population fractions to block at different pathway points, populate different pathway intermediates, and fold at different rates. This paper summarizes the experimental basis for these three determining principles and their consequences. Cooperative native-like foldon units and the sequential stabilization process together generate predetermined stepwise pathways. Optional misfolding errors are responsible for 3-state and heterogeneous kinetic folding. PMID:18405419

  9. Complete fold annotation of the human proteome using a novel structural feature space

    DOE PAGES

    Middleton, Sarah A.; Illuminati, Joseph; Kim, Junhyong

    2017-04-13

    Recognition of protein structural fold is the starting point for many structure prediction tools and protein function inference. Fold prediction is computationally demanding and recognizing novel folds is difficult such that the majority of proteins have not been annotated for fold classification. Here we describe a new machine learning approach using a novel feature space that can be used for accurate recognition of all 1,221 currently known folds and inference of unknown novel folds. We show that our method achieves better than 94% accuracy even when many folds have only one training example. We demonstrate the utility of this method by predicting the folds of 34,330 human protein domains and showing that these predictions can yield useful insights into potential biological function, such as prediction of RNA-binding ability. Finally, our method can be applied to de novo fold prediction of entire proteomes and identify candidate novel fold families.

  10. How the Sequence of a Gene Specifies Structural Symmetry in Proteins

    PubMed Central

    Shen, Xiaojuan; Huang, Tongcheng; Wang, Guanyu; Li, Guanglin

    2015-01-01

    Internal symmetry is commonly observed in the majority of fundamental protein folds. Meanwhile, sufficient evidence suggests that nascent polypeptide chains of proteins have the potential to start the co-translational folding process and this process allows mRNA to contain additional information on protein structure. In this paper, we study the relationship between gene sequences and protein structures from the viewpoint of symmetry to explore how gene sequences code for structural symmetry in proteins. We found that, for a set of two-fold symmetric proteins from left-handed beta-helix fold, intragenic symmetry always exists in their corresponding gene sequences. Meanwhile, codon usage bias and local mRNA structure might be involved in modulating translation speed for the formation of structural symmetry: a major decrease of local codon usage bias in the middle of the codon sequence can be identified as a common feature; and major or consecutive decreases in local mRNA folding energy near the boundaries of the symmetric substructures can also be observed. The results suggest that gene duplication and fusion may be an evolutionarily conserved process for this protein fold. In addition, the usage of rare codons and the formation of higher order of secondary structure near the boundaries of symmetric substructures might have coevolved as conserved mechanisms to slow down translation elongation and to facilitate effective folding of symmetric substructures. These findings provide valuable insights into our understanding of the mechanisms of translation and its evolution, as well as the design of proteins via symmetric modules. PMID:26641668

  11. Amyloidogenesis of Natively Unfolded Proteins

    PubMed Central

    Uversky, Vladimir N.

    2009-01-01

    Aggregation and subsequent development of protein deposition diseases originate from conformational changes in corresponding amyloidogenic proteins. The accumulated data support the model where protein fibrillogenesis proceeds via the formation of a relatively unfolded amyloidogenic conformation, which shares many structural properties with the pre-molten globule state, a partially folded intermediate first found during the equilibrium and kinetic (un)folding studies of several globular proteins and later described as one of the structural forms of natively unfolded proteins. The flexibility of this structural form is essential for the conformational rearrangements driving the formation of the core cross-beta structure of the amyloid fibril. Obviously, molecular mechanisms describing amyloidogenesis of ordered and natively unfolded proteins are different. For ordered protein to fibrillate, its unique and rigid structure has to be destabilized and partially unfolded. On the other hand, fibrillogenesis of a natively unfolded protein involves the formation of partially folded conformation; i.e., partial folding rather than unfolding. In this review recent findings are surveyed to illustrate some unique features of the natively unfolded proteins amyloidogenesis. PMID:18537543

  12. Exploring the Universe of Protein Structures beyond the Protein Data Bank

    PubMed Central

    Cossio, Pilar; Trovato, Antonio; Pietrucci, Fabio; Seno, Flavio; Maritan, Amos; Laio, Alessandro

    2010-01-01

    It is currently believed that the atlas of existing protein structures is faithfully represented in the Protein Data Bank. However, whether this atlas covers the full universe of all possible protein structures is still a highly debated issue. By using a sophisticated numerical approach, we performed an exhaustive exploration of the conformational space of a 60 amino acid polypeptide chain described with an accurate all-atom interaction potential. We generated a database of around 30,000 compact folds with at least of secondary structure corresponding to local minima of the potential energy. This ensemble plausibly represents the universe of protein folds of similar length; indeed, all the known folds are represented in the set with good accuracy. However, we discover that the known folds form a rather small subset, which cannot be reproduced by choosing random structures in the database. Rather, natural and possible folds differ by the contact order, on average significantly smaller in the former. This suggests the presence of an evolutionary bias, possibly related to kinetic accessibility, towards structures with shorter loops between contacting residues. Beside their conceptual relevance, the new structures open a range of practical applications such as the development of accurate structure prediction strategies, the optimization of force fields, and the identification and design of novel folds. PMID:21079678
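
    The abstract distinguishes natural from merely possible folds by contact order. The abstract does not state which definition was used; a minimal sketch of the commonly used relative contact order (mean sequence separation of contacting residues divided by chain length) is given below. The one-point-per-residue representation, the 8 Å default cutoff, and the `min_separation` parameter are assumptions of this sketch.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_separation=2):
    """Relative contact order of a chain given one (x, y, z) point per residue.

    RCO = sum(|i - j| over contacts) / (chain length * number of contacts).
    Two residues are in contact when their points lie within `cutoff`
    and they are at least `min_separation` apart in sequence.
    """
    length = len(coords)
    n_contacts = 0
    total_separation = 0
    for i in range(length):
        for j in range(i + min_separation, length):
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            dz = coords[i][2] - coords[j][2]
            if math.sqrt(dx * dx + dy * dy + dz * dz) <= cutoff:
                n_contacts += 1
                total_separation += j - i
    if n_contacts == 0:
        return 0.0  # no contacts: report zero rather than divide by zero
    return total_separation / (length * n_contacts)
```

    A low value means contacts are mostly between residues close in sequence, which matches the abstract's description of shorter loops between contacting residues.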

  13. Structural perturbations on huntingtin N17 domain during its folding on 2D-nanomaterials

    NASA Astrophysics Data System (ADS)

    Zhang, Leili; Feng, Mei; Zhou, Ruhong; Luan, Binquan

    2017-09-01

    A globular protein’s folded structure in its physiological environment is largely determined by its amino acid sequence. Recently discovered transformer proteins, as well as intrinsically disordered proteins, may adopt a folding-upon-binding mechanism in which their secondary structures are highly dependent on their binding partners. Due to the various applications of nanomaterials in biological sensors and potential wearable devices, it is important to discover possible conformational changes of proteins on nanomaterials. Here, through molecular dynamics simulations, we show that the first 17 residues of the huntingtin protein (HTT-N17) exhibit appreciable differences during their folding on 2D-nanomaterials, such as graphene and MoS2 nanosheets. Namely, the protein is disordered on the graphene surface but is helical on the MoS2 surface. Although the amphiphilic environment at the nanosheet-water interface promotes the folding of amphipathic proteins (such as HTT-N17), competition between protein-nanosheet and intra-protein interactions yields very different protein conformations. Therefore, as engineered binding partners, nanomaterials might significantly affect the structures of adsorbed proteins.

  14. Stereochemistry and solvent role in protein folding: nuclear magnetic resonance and molecular dynamics studies of poly-L and alternating-L,D homopolypeptides in dimethyl sulfoxide.

    PubMed

    Srivastava, Kinshuk Raj; Kumar, Anil; Goyal, Bhupesh; Durani, Susheel

    2011-05-26

    The competing interactions folding and unfolding protein structure remain obscure. Using homopolypeptides, we ask if poly-L structure may have a role. We mutate the structure to alternating-L,D stereochemistry and substitute water as the fold-promoting solvent with methanol and dimethyl sulfoxide (DMSO) as the fold-denaturing solvents. Circular dichroism and molecular dynamics established previously that, while both isomers were folded in water, the poly-L isomer was unfolded and alternating-L,D isomer folded in methanol. Nuclear magnetic resonance and molecular dynamics establish now that both isomers are unfolded in DMSO. We calculated energetics of folding-unfolding equilibrium with water and methanol as solvents. We have now calculated interactions of unfolded polypeptide structures with DMSO as solvent. Methanol was found to unfold and water fold poly-L structure as a dielectric. DMSO has now been found to unfold both poly-L and alternating-L,D structures by strong solvation of peptides to disrupt their hydrogen bonds. Accordingly, we propose that while linked peptides fold protein structure with hydrogen bonds they unfold the structure electrostatically due to the stereochemical effect of the poly-L structure. Protein folding to ordering of peptide hydrogen bonds with water as canonical solvent may thus involve two specific and independent solvent effects-one, strong screening of electrostatics of poly-L linked peptides, and two, weak dipolar solvation of peptides. Correspondingly, protein denaturation may involve two independent solvent effects-one, weak dielectric to unfold poly-L structure electrostatically, and two, strong polarity to disrupt peptide hydrogen bonds by solvation of peptides.

  15. Protein Folding and Self-Organized Criticality

    NASA Astrophysics Data System (ADS)

    Bajracharya, Arun; Murray, Joelle

    Proteins are known to fold into tertiary structures that determine their functionality in living organisms. However, the complex dynamics of protein folding and the way proteins consistently fold into the same structures are not fully understood. Self-organized criticality (SOC) has provided a framework for understanding complex systems in various settings (earthquakes, forest fires, financial markets, and epidemics) through scale invariance and the associated power law behavior. In this research, we use a simple hydrophobic-polar lattice-bound computational model to investigate self-organized criticality as a possible mechanism for generating complexity in protein folding.

  16. Protein folding simulations: from coarse-grained model to all-atom model.

    PubMed

    Zhang, Jian; Li, Wenfei; Wang, Jun; Qin, Meng; Wu, Lei; Yan, Zhiqiang; Xu, Weixin; Zuo, Guanghong; Wang, Wei

    2009-06-01

    Protein folding is an important and challenging problem in molecular biology. During the last two decades, molecular dynamics (MD) simulation has proved to be a paramount tool and was widely used to study protein structures, folding kinetics and thermodynamics, and structure-stability-function relationships. It was also used to help engineer and design new proteins, and to answer even more general questions such as the minimal number of amino acids or the evolution principle of protein families. Nowadays, MD simulation is still undergoing rapid development. The first trend is toward developing new coarse-grained models and studying larger and more complex molecular systems such as protein-protein complexes and their assembly processes, amyloid related aggregations, and the structure and motion of chaperones, motors, channels and virus capsids; the second trend is toward building high resolution models and exploring more detailed and accurate pictures of protein folding and the associated processes, such as folding involving coordination bonds or disulfide bonds, the polarization, charge transfer and protonation/deprotonation processes involved in metal coupled folding, and ion permeation and its coupling with the kinetics of channels. On these new territories, MD simulations have given many promising results and will continue to offer exciting views. Here, we review several new subjects investigated by using MD simulations as well as the corresponding developments of appropriate protein models. These include but are not limited to the attempt to go beyond the topology based Gō-like model and characterize the energetic factors in protein structures and dynamics, the study of the thermodynamics and kinetics of disulfide bond involved protein folding, the modeling of the interactions between chaperonin and the encapsulated protein and the protein folding under this circumstance, the effort to clarify the important yet still elusive folding mechanism of protein BBL, the development of discrete MD and its application in studying the alpha-beta conformational conversion and oligomer assembling process, and the modeling of metal ion involved protein folding. (c) 2009 IUBMB.

  17. Designing pH induced fold switch in proteins

    NASA Astrophysics Data System (ADS)

    Baruah, Anupaul; Biswas, Parbati

    2015-05-01

    This work investigates the computational design of a pH induced protein fold switch based on a self-consistent mean-field approach by identifying the ensemble averaged characteristics of sequences that encode a fold switch. The primary challenge to balance the alternative sets of interactions present in both target structures is overcome by simultaneously optimizing two foldability criteria corresponding to two target structures. The change in pH is modeled by altering the residual charge on the amino acids. The energy landscape of the fold switch protein is found to be double funneled. The fold switch sequences stabilize the interactions of the sites with similar relative surface accessibility in both target structures. Fold switch sequences have low sequence complexity and hence lower sequence entropy. The pH induced fold switch is mediated by attractive electrostatic interactions rather than hydrophobic-hydrophobic contacts. This study may provide valuable insights to the design of fold switch proteins.

  18. Using Chou's general PseAAC to analyze the evolutionary relationship of receptor associated proteins (RAP) with various folding patterns of protein domains.

    PubMed

    Muthu Krishnan, S

    2018-05-14

    The receptor-associated protein (RAP) is an inhibitor of endocytic receptors that belong to the lipoprotein receptor gene family. In this study, a computational approach was used to find folds evolutionarily related to the RAP proteins. Structural and sequence-based analysis found various protein folds that are very close to the RAP folds. Remote homolog datasets were used to develop different support vector machine (SVM) methods to recognize the homologous RAP fold. This study helps in understanding the relationship of RAP homolog folds based on structure, function and evolutionary history.

  19. Atomic interaction networks in the core of protein domains and their native folds.

    PubMed

    Soundararajan, Venkataramanan; Raman, Rahul; Raguram, S; Sasisekharan, V; Sasisekharan, Ram

    2010-02-23

    Vastly divergent sequences populate a majority of protein folds. In the quest to identify features that are conserved within protein domains belonging to the same fold, we set out to examine the entire protein universe on a fold-by-fold basis. We report that the atomic interaction network in the solvent-unexposed core of protein domains is fold-conserved, extraordinary sequence divergence notwithstanding. Further, we find that this feature, termed protein core atomic interaction network (or PCAIN), is significantly distinguishable across different folds, thus appearing to be a "signature" of a domain's native fold. As part of this study, we computed the PCAINs for 8698 representative protein domains from families across the 1018 known protein folds to construct our seed database and an automated framework was developed for PCAIN-based characterization of the protein fold universe. A test set of randomly selected domains that are not in the seed database was classified with over 97% accuracy, independent of sequence divergence. As an application of this novel fold signature, a PCAIN-based scoring scheme was developed for comparative (homology-based) structure prediction, with 1-2 angstroms (mean 1.61A) C(alpha) RMSD generally observed between computed structures and reference crystal structures. Our results are consistent across the full spectrum of test domains including those from recent CASP experiments and most notably in the 'twilight' and 'midnight' zones wherein <30% and <10% target-template sequence identity prevails (mean twilight RMSD of 1.69A). We further demonstrate the utility of the PCAIN protocol to derive biological insight into protein structure-function relationships, by modeling the structure of the YopM effector novel E3 ligase (NEL) domain from plague-causative bacterium Yersinia pestis and discussing its implications for host adaptive and innate immune modulation by the pathogen. Considering the several high-throughput, sequence-identity-independent applications demonstrated in this work, we suggest that the PCAIN is a fundamental fold feature that could be a valuable addition to the arsenal of protein modeling and analysis tools.

  20. Atomic Interaction Networks in the Core of Protein Domains and Their Native Folds

    PubMed Central

    Soundararajan, Venkataramanan; Raman, Rahul; Raguram, S.; Sasisekharan, V.; Sasisekharan, Ram

    2010-01-01

    Vastly divergent sequences populate a majority of protein folds. In the quest to identify features that are conserved within protein domains belonging to the same fold, we set out to examine the entire protein universe on a fold-by-fold basis. We report that the atomic interaction network in the solvent-unexposed core of protein domains is fold-conserved, extraordinary sequence divergence notwithstanding. Further, we find that this feature, termed protein core atomic interaction network (or PCAIN), is significantly distinguishable across different folds, thus appearing to be a “signature” of a domain's native fold. As part of this study, we computed the PCAINs for 8698 representative protein domains from families across the 1018 known protein folds to construct our seed database and an automated framework was developed for PCAIN-based characterization of the protein fold universe. A test set of randomly selected domains that are not in the seed database was classified with over 97% accuracy, independent of sequence divergence. As an application of this novel fold signature, a PCAIN-based scoring scheme was developed for comparative (homology-based) structure prediction, with 1–2 angstroms (mean 1.61A) Cα RMSD generally observed between computed structures and reference crystal structures. Our results are consistent across the full spectrum of test domains including those from recent CASP experiments and most notably in the ‘twilight’ and ‘midnight’ zones wherein <30% and <10% target-template sequence identity prevails (mean twilight RMSD of 1.69A). We further demonstrate the utility of the PCAIN protocol to derive biological insight into protein structure-function relationships, by modeling the structure of the YopM effector novel E3 ligase (NEL) domain from plague-causative bacterium Yersinia pestis and discussing its implications for host adaptive and innate immune modulation by the pathogen. Considering the several high-throughput, sequence-identity-independent applications demonstrated in this work, we suggest that the PCAIN is a fundamental fold feature that could be a valuable addition to the arsenal of protein modeling and analysis tools. PMID:20186337

  1. A thermodynamic definition of protein domains.

    PubMed

    Porter, Lauren L; Rose, George D

    2012-06-12

    Protein domains are conspicuous structural units in globular proteins, and their identification has been a topic of intense biochemical interest dating back to the earliest crystal structures. Numerous disparate domain identification algorithms have been proposed, all involving some combination of visual intuition and/or structure-based decomposition. Instead, we present a rigorous, thermodynamically-based approach that redefines domains as cooperative chain segments. In greater detail, most small proteins fold with high cooperativity, meaning that the equilibrium population is dominated by completely folded and completely unfolded molecules, with a negligible subpopulation of partially folded intermediates. Here, we redefine structural domains in thermodynamic terms as cooperative folding units, based on m-values, which measure the cooperativity of a protein or its substructures. In our analysis, a domain is equated to a contiguous segment of the folded protein whose m-value is largely unaffected when that segment is excised from its parent structure. Defined in this way, a domain is a self-contained cooperative unit; i.e., its cooperativity depends primarily upon intrasegment interactions, not intersegment interactions. Implementing this concept computationally, the domains in a large representative set of proteins were identified; all exhibit consistency with experimental findings. Specifically, our domain divisions correspond to the experimentally determined equilibrium folding intermediates in a set of nine proteins. The approach was also proofed against a representative set of 71 additional proteins, again with confirmatory results. Our reframed interpretation of a protein domain transforms an indeterminate structural phenomenon into a quantifiable molecular property grounded in solution thermodynamics.
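
    The m-values mentioned above come from the linear extrapolation model of denaturant-induced unfolding, in which the unfolding free energy falls linearly with denaturant concentration. The sketch below shows only that two-state relationship, not the authors' domain-identification procedure; the function names, units (kcal/mol, molar denaturant), and default temperature are assumptions.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def unfolding_free_energy(dg_water, m_value, denaturant):
    """Linear extrapolation model: dG_unfold = dG_water - m * [denaturant]."""
    return dg_water - m_value * denaturant


def fraction_folded(dg_water, m_value, denaturant, temperature=298.15):
    """Two-state folded population at a given denaturant concentration."""
    dg = unfolding_free_energy(dg_water, m_value, denaturant)
    k_unfold = math.exp(-dg / (R_KCAL * temperature))  # K = [U]/[F]
    return 1.0 / (1.0 + k_unfold)


def midpoint_concentration(dg_water, m_value):
    """Denaturant concentration at which half the molecules are folded."""
    return dg_water / m_value
```

    A larger m-value gives a sharper transition around the midpoint, which is the sense in which the abstract uses it as a measure of cooperativity.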

  2. Improving Protein Fold Recognition by Deep Learning Networks

    NASA Astrophysics Data System (ADS)

    Jo, Taeho; Hou, Jie; Eickholt, Jesse; Cheng, Jianlin

    2015-12-01

    For accurate recognition of protein folds, a deep learning network method (DN-Fold) was developed to predict if a given query-template protein pair belongs to the same structural fold. The input used stemmed from the protein sequence and structural features extracted from the protein pair. We evaluated the performance of DN-Fold along with 18 different methods on Lindahl’s benchmark dataset and on a large benchmark set extracted from SCOP 1.75 consisting of about one million protein pairs, at three different levels of fold recognition (i.e., protein family, superfamily, and fold) depending on the evolutionary distance between protein sequences. The correct recognition rate of ensembled DN-Fold for Top 1 predictions is 84.5%, 61.5%, and 33.6% and for Top 5 is 91.2%, 76.5%, and 60.7% at family, superfamily, and fold levels, respectively. We also evaluated the performance of single DN-Fold (DN-FoldS), which showed comparable results at the family and superfamily levels relative to ensemble DN-Fold. Finally, we extended the binary classification problem of fold recognition to a real-value regression task, which also showed promising performance. DN-Fold is freely available through a web server at http://iris.rnet.missouri.edu/dnfold.

  3. Homochiral stereochemistry: the missing link of structure to energetics in protein folding.

    PubMed

    Kumar, Anil; Ramakrishnan, Vibin; Ranbhor, Ranjit; Patel, Kirti; Durani, Susheel

    2009-12-24

    The notion is tested that homochiral stereochemistry being ubiquitous to protein structure could be critical to protein folding as well, causing it to become frustrated energetically providing the basis for its solvent- and sequence-mediated control. The proof in support of the notion is found in a consensus of experiment and computation according to which suitable oligopeptides are in their folding-unfolding equilibria, at both macrostate and microstate levels, susceptible to dielectric because of the conflict of peptide-chain electrostatics with interpeptide hydrogen bonds when the structure is poly-L but not when it is alternating-L,D. The argument is thus made that homochiral stereochemistry may in protein folding provide the unifying basis for its solvent- and sequence-mediated control based on screening of peptide-chain electrostatics under conflict with folding of the chain due to homochiral stereochemistry. Dielectric is brought into spotlight as the effect comparatively obscure but presumably critical to the folding in protein structure for its control.

  4. Direct folding simulation of helical proteins using an effective polarizable bond force field.

    PubMed

    Duan, Lili; Zhu, Tong; Ji, Changge; Zhang, Qinggang; Zhang, John Z H

    2017-06-14

    We report a direct folding study of seven helical proteins (, Trpcage, , C34, N36, , ) ranging from 17 to 53 amino acids through standard molecular dynamics simulations using a recently developed polarizable force field, the Effective Polarizable Bond (EPB) method. The backbone RMSDs, radii of gyration, native contacts and native helix content are in good agreement with the experimental results. Cluster analysis has also verified that the folded structures with the highest population are in good agreement with their corresponding native structures for these proteins. In addition, the free energy landscapes of the seven proteins in the two dimensional space comprised of RMSD and radius of gyration showed that these folded structures are indeed the lowest energy conformations. However, when the corresponding simulations were performed using the standard (nonpolarizable) AMBER force fields, no stable folded structures were observed for these proteins. Comparison of the simulation results based on a polarizable EPB force field and a nonpolarizable AMBER force field clearly demonstrates the importance of polarization in the folding of stable helical structures.
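
    RMSD and radius of gyration are the two coordinates of the free energy landscape described above. A minimal sketch of both quantities follows, assuming equal atomic masses and structures that have already been superimposed (no optimal alignment is performed here); the function names are illustrative and not taken from any simulation package.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points."""
    n = len(coords)
    cx = sum(c[0] for c in coords) / n
    cy = sum(c[1] for c in coords) / n
    cz = sum(c[2] for c in coords) / n
    sq = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords)
    return math.sqrt(sq / n)


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two pre-aligned coordinate sets."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same number of atoms")
    sq = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(sq / len(coords_a))
```

    In practice a trajectory frame would first be fitted onto the reference structure, since without superposition the RMSD also reflects rigid-body translation and rotation.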

  5. Principles of protein folding--a perspective from simple exact models.

    PubMed Central

    Dill, K. A.; Bromberg, S.; Yue, K.; Fiebig, K. M.; Yee, D. P.; Thomas, P. D.; Chan, H. S.

    1995-01-01

    General principles of protein structure, stability, and folding kinetics have recently been explored in computer simulations of simple exact lattice models. These models represent protein chains at a rudimentary level, but they involve few parameters, approximations, or implicit biases, and they allow complete explorations of conformational and sequence spaces. Such simulations have resulted in testable predictions that are sometimes unanticipated: The folding code is mainly binary and delocalized throughout the amino acid sequence. The secondary and tertiary structures of a protein are specified mainly by the sequence of polar and nonpolar monomers. More specific interactions may refine the structure, rather than dominate the folding code. Simple exact models can account for the properties that characterize protein folding: two-state cooperativity, secondary and tertiary structures, and multistage folding kinetics--fast hydrophobic collapse followed by slower annealing. These studies suggest the possibility of creating "foldable" chain molecules other than proteins. The encoding of a unique compact chain conformation may not require amino acids; it may require only the ability to synthesize specific monomer sequences in which at least one monomer type is solvent-averse. PMID:7613459
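
    The "simple exact" models referred to here allow every conformation of a short chain to be enumerated. The sketch below illustrates that idea for a two-dimensional HP (hydrophobic/polar) chain on a square lattice, assuming the usual convention of -1 per non-bonded H-H lattice contact; the function names and the toy sequence are illustrative assumptions, not taken from the article.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def enumerate_walks(n):
    """Yield every self-avoiding walk of n >= 2 residues on the square lattice.

    The first bond is fixed along +x so that rotated copies are not repeated.
    """
    def extend(path, occupied):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                occupied.remove(nxt)
                path.pop()

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))

def hp_energy(sequence, path):
    """-1 for each pair of H residues that touch on the lattice but are not bonded."""
    energy = 0
    for i in range(len(path)):
        for j in range(i + 2, len(path)):
            if sequence[i] == "H" and sequence[j] == "H":
                dx = abs(path[i][0] - path[j][0])
                dy = abs(path[i][1] - path[j][1])
                if dx + dy == 1:
                    energy -= 1
    return energy

def ground_states(sequence):
    """Return the lowest energy and all conformations that reach it."""
    scored = [(hp_energy(sequence, p), p) for p in enumerate_walks(len(sequence))]
    best = min(e for e, _ in scored)
    return best, [p for e, p in scored if e == best]

best_energy, best_paths = ground_states("HPPH")
```

    For the four-residue toy sequence the enumeration covers nine walks, and the two lowest-energy conformations are mirror-image U-shapes that bring the terminal H residues into contact. Exhaustive enumeration scales exponentially with chain length, so this approach is only practical for short chains.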

  6. Roles of beta-turns in protein folding: from peptide models to protein engineering.

    PubMed

    Marcelino, Anna Marie C; Gierasch, Lila M

    2008-05-01

    Reverse turns are a major class of protein secondary structure; they represent sites of chain reversal and thus sites where the globular character of a protein is created. It has been speculated for many years that turns may nucleate the formation of structure in protein folding, as their propensity to occur will favor the approximation of their flanking regions and their general tendency to be hydrophilic will favor their disposition at the solvent-accessible surface. Reverse turns are local features, and it is therefore not surprising that their structural properties have been extensively studied using peptide models. In this article, we review research on peptide models of turns to test the hypothesis that the propensities of turns to form in short peptides will relate to the roles of corresponding sequences in protein folding. Turns with significant stability as isolated entities should actively promote the folding of a protein, and by contrast, turn sequences that merely allow the chain to adopt conformations required for chain reversal are predicted to be passive in the folding mechanism. We discuss results of protein engineering studies of the roles of turn residues in folding mechanisms. Factors that correlate with the importance of turns in folding indeed include their intrinsic stability, as well as their topological context and their participation in hydrophobic networks within the protein's structure.

  7. The Dominant Folding Route Minimizes Backbone Distortion in SH3

    PubMed Central

    Lammert, Heiko; Noel, Jeffrey K.; Onuchic, José N.

    2012-01-01

    Energetic frustration in protein folding is minimized by evolution to create a smooth and robust energy landscape. As a result the geometry of the native structure provides key constraints that shape protein folding mechanisms. Chain connectivity in particular has been identified as an essential component for realistic behavior of protein folding models. We study the quantitative balance of energetic and geometrical influences on the folding of SH3 in a structure-based model with minimal energetic frustration. A decomposition of the two-dimensional free energy landscape for the folding reaction into relevant energy and entropy contributions reveals that the entropy of the chain is not responsible for the folding mechanism. Instead the preferred folding route through the transition state arises from a cooperative energetic effect. Off-pathway structures are penalized by excess distortion in local backbone configurations and contact pair distances. This energy cost is a new ingredient in the malleable balance of interactions that controls the choice of routes during protein folding. PMID:23166485

  8. Selection of stably folded proteins by phage-display with proteolysis.

    PubMed

    Bai, Yawen; Feng, Hanqiao

    2004-05-01

    To facilitate the process of protein design and learn the basic rules that control the structure and stability of proteins, combinatorial methods have been developed to select or screen proteins with desired properties from libraries of mutants. One such method uses phage-display and proteolysis to select stably folded proteins. This method does not rely on specific properties of proteins for selection. Therefore, in principle it can be applied to any protein. Since its first demonstration in 1998, the method has been used to create hyperthermophilic proteins, to evolve novel folded domains from a library generated by combinatorial shuffling of polypeptide segments and to convert a partially unfolded structure to a fully folded protein.

  9. Intermediates and the folding of proteins L and G

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, Scott; Head-Gordon, Teresa

    We use a minimalist protein model, in combination with a sequence design strategy, to determine differences in primary structure for proteins L and G that are responsible for the two proteins folding through distinctly different folding mechanisms. We find that the folding of proteins L and G are consistent with a nucleation-condensation mechanism, each of which is described as helix-assisted β-1 and β-2 hairpin formation, respectively. We determine that the model for protein G exhibits an early intermediate that precedes the rate-limiting barrier of folding and which draws together misaligned secondary structure elements that are stabilized by hydrophobic core contacts involving the third β-strand, and presages the later transition state in which the correct strand alignment of these same secondary structure elements is restored. Finally the validity of the targeted intermediate ensemble for protein G was analyzed by fitting the kinetic data to a two-step first order reversible reaction, proving that protein G folding involves an on-pathway early intermediate, and should be populated and therefore observable by experiment.

  10. Intermediates and the folding of proteins L and G

    PubMed Central

    Brown, Scott; Head-Gordon, Teresa

    2004-01-01

    We use a minimalist protein model, in combination with a sequence design strategy, to determine differences in primary structure for proteins L and G, which are responsible for the two proteins folding through distinctly different folding mechanisms. We find that the folding of proteins L and G are consistent with a nucleation-condensation mechanism, each of which is described as helix-assisted β-1 and β-2 hairpin formation, respectively. We determine that the model for protein G exhibits an early intermediate that precedes the rate-limiting barrier of folding, and which draws together misaligned secondary structure elements that are stabilized by hydrophobic core contacts involving the third β-strand, and presages the later transition state in which the correct strand alignment of these same secondary structure elements is restored. Finally, the validity of the targeted intermediate ensemble for protein G was analyzed by fitting the kinetic data to a two-step first-order reversible reaction, proving that protein G folding involves an on-pathway early intermediate, and should be populated and therefore observable by experiment. PMID:15044729
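
    The abstract mentions fitting kinetic data to a two-step first-order reversible reaction. As a hedged illustration of what such a scheme looks like, the sketch below integrates U ⇌ I ⇌ N with a fourth-order Runge-Kutta step. The rate constants, state labels and function names are hypothetical and are not the values or code used by the authors.

```python
def rates(state, k_ui, k_iu, k_in, k_ni):
    """Time derivatives of the populations (U, I, N) for U <-> I <-> N."""
    u, i, n = state
    du = -k_ui * u + k_iu * i
    dn = k_in * i - k_ni * n
    return (du, -(du + dn), dn)  # total population is conserved

def rk4_step(state, dt, *k):
    """Advance the populations by one classical Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    s1 = rates(state, *k)
    s2 = rates(shifted(state, s1, dt / 2), *k)
    s3 = rates(shifted(state, s2, dt / 2), *k)
    s4 = rates(shifted(state, s3, dt), *k)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, s1, s2, s3, s4)
    )

def simulate(k_ui, k_iu, k_in, k_ni, t_end, dt=0.01):
    """Start fully unfolded and return the list of (U, I, N) at each step."""
    state = (1.0, 0.0, 0.0)
    trajectory = [state]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, k_ui, k_iu, k_in, k_ni)
        trajectory.append(state)
    return trajectory

# Hypothetical rate constants in arbitrary inverse-time units.
trajectory = simulate(k_ui=2.0, k_iu=1.0, k_in=1.0, k_ni=0.5, t_end=30.0)
u_eq, i_eq, n_eq = trajectory[-1]
```

    At long times the populations satisfy detailed balance (I/U = k_ui/k_iu and N/I = k_in/k_ni). A fit to experimental or simulated kinetics would adjust the four rate constants until the computed trajectory matches the observed populations.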

  11. Generation of a consensus protein domain dictionary

    PubMed Central

    Schaeffer, R. Dustin; Jonsson, Amanda L.; Simms, Andrew M.; Daggett, Valerie

    2011-01-01

    Motivation: The discovery of new protein folds is a relatively rare occurrence even as the rate of protein structure determination increases. This rarity reinforces the concept of folds as reusable units of structure and function shared by diverse proteins. If the folding mechanism of proteins is largely determined by their topology, then the folding pathways of members of existing folds could encompass the full set used by globular protein domains. Results: We have used recent versions of three common protein domain dictionaries (SCOP, CATH and Dali) to generate a consensus domain dictionary (CDD). Surprisingly, 40% of the metafolds in the CDD are not composed of autonomous structural domains, i.e. they are not plausible independent folding units. This finding has serious ramifications for bioinformatics studies mining these domain dictionaries for globular protein properties. However, our main purpose in deriving this CDD was to generate an updated CDD to choose targets for MD simulation as part of our dynameomics effort, which aims to simulate the native and unfolding pathways of representatives of all globular protein consensus folds (metafolds). Consequently, we also compiled a list of representative protein targets of each metafold in the CDD. Availability and implementation: This domain dictionary is available at www.dynameomics.org. Contact: daggett@u.washington.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21068000

  12. Visualizing chaperone-assisted protein folding

    DOE PAGES

    Horowitz, Scott; Salmon, Loïc; Koldewey, Philipp; ...

    2016-05-30

    Challenges in determining the structures of heterogeneous and dynamic protein complexes have greatly hampered past efforts to obtain a mechanistic understanding of many important biological processes. One such process is chaperone-assisted protein folding. Obtaining structural ensembles of chaperone–substrate complexes would ultimately reveal how chaperones help proteins fold into their native state. To address this problem, we devised a new structural biology approach based on X-ray crystallography, termed residual electron and anomalous density (READ). READ enabled us to visualize even sparsely populated conformations of the substrate protein immunity protein 7 (Im7) in complex with the Escherichia coli chaperone Spy, and to capture a series of snapshots depicting the various folding states of Im7 bound to Spy. The ensemble shows that Spy-associated Im7 samples conformations ranging from unfolded to partially folded to native-like states and reveals how a substrate can explore its folding landscape while being bound to a chaperone.

  13. Characterization of protein-folding pathways by reduced-space modeling.

    PubMed

    Kmiecik, Sebastian; Kolinski, Andrzej

    2007-07-24

    Ab initio simulations of the folding pathways are currently limited to very small proteins. For larger proteins, some approximations or simplifications in protein models need to be introduced. Protein folding and unfolding are among the basic processes in the cell and are very difficult to characterize in detail by experiment or simulation. Chymotrypsin inhibitor 2 (CI2) and barnase are probably the best characterized experimentally in this respect. For these model systems, initial folding stages were simulated by using CA-CB-side chain (CABS), a reduced-space protein-modeling tool. CABS employs knowledge-based potentials that proved to be very successful in protein structure prediction. With the use of isothermal Monte Carlo (MC) dynamics, initiation sites with a residual structure and weak tertiary interactions were identified. Such structures are essential for the initiation of the folding process through a sequential reduction of the protein conformational space, overcoming the Levinthal paradox in this manner. Furthermore, nucleation sites that initiate a tertiary interactions network were located. The MC simulations correspond perfectly to the results of experimental and theoretical research and bring insights into CI2 folding mechanism: unambiguous sequence of folding events was reported as well as cooperative substructures compatible with those obtained in recent molecular dynamics unfolding studies. The correspondence between the simulation and experiment shows that knowledge-based potentials are not only useful in protein structure predictions but are also capable of reproducing the folding pathways. Thus, the results of this work significantly extend the applicability range of reduced models in the theoretical study of proteins.

  14. Evolution of a protein folding nucleus.

    PubMed

    Xia, Xue; Longo, Liam M; Sutherland, Mason A; Blaber, Michael

    2016-07-01

    The folding nucleus (FN) is a cryptic element within protein primary structure that enables an efficient folding pathway and is the postulated heritable element in the evolution of protein architecture; however, almost nothing is known regarding how the FN structurally changes as complex protein architecture evolves from simpler peptide motifs. We report characterization of the FN of a designed purely symmetric β-trefoil protein by ϕ-value analysis. We compare the structure and folding properties of key foldable intermediates along the evolutionary trajectory of the β-trefoil. The results show structural acquisition of the FN during gene fusion events, incorporating novel turn structure created by gene fusion. Furthermore, the FN is adjusted by circular permutation in response to destabilizing functional mutation. FN plasticity by way of circular permutation is made possible by the intrinsic C3 cyclic symmetry of the β-trefoil architecture, identifying a possible selective advantage that helps explain the prevalence of cyclic structural symmetry in the proteome. © 2015 The Protein Society.

  15. Protein folding optimization based on 3D off-lattice model via an improved artificial bee colony algorithm.

    PubMed

    Li, Bai; Lin, Mu; Liu, Qiao; Li, Ya; Zhou, Changjun

    2015-10-01

    Protein folding is a fundamental topic in molecular biology. Conventional experimental techniques for protein structure identification or protein folding recognition require strict laboratory requirements and heavy operating burdens, which have largely limited their applications. Alternatively, computer-aided techniques have been developed to optimize protein structures or to predict the protein folding process. In this paper, we utilize a 3D off-lattice model to describe the original protein folding scheme as a simplified energy-optimal numerical problem, where all types of amino acid residues are binarized into hydrophobic and hydrophilic ones. We apply a balance-evolution artificial bee colony (BE-ABC) algorithm as the minimization solver, which is featured by the adaptive adjustment of search intensity to cater for the varying needs during the entire optimization process. In this work, we establish a benchmark case set with 13 real protein sequences from the Protein Data Bank database and evaluate the convergence performance of BE-ABC algorithm through strict comparisons with several state-of-the-art ABC variants in short-term numerical experiments. Besides that, our obtained best-so-far protein structures are compared to the ones in comprehensive previous literature. This study also provides preliminary insights into how artificial intelligence techniques can be applied to reveal the dynamics of protein folding.
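
    The abstract does not state the energy function being minimized. As an illustration only, the sketch below assumes the Stillinger-type AB off-lattice energy that is commonly used for binarized hydrophobic (A) and hydrophilic (B) chains: a bending term of (1 - cos θ)/4 per interior residue plus a Lennard-Jones-like term 4(r⁻¹² - C r⁻⁶) over non-adjacent pairs, with C = 1 for A-A, 0.5 for B-B and -0.5 for A-B. The function names are hypothetical, and the paper's exact formulation may differ.

```python
import math

def coupling(a, b):
    """Assumed pair coefficient: A-A attract, B-B attract weakly, A-B repel."""
    if a == "A" and b == "A":
        return 1.0
    if a == "B" and b == "B":
        return 0.5
    return -0.5

def ab_energy(sequence, coords):
    """Energy of a chain of 'A'/'B' residues at 3D positions `coords`."""
    n = len(sequence)

    # Bending term: zero for a straight chain, largest for a full reversal.
    bend = 0.0
    for i in range(1, n - 1):
        u = [coords[i][k] - coords[i - 1][k] for k in range(3)]
        v = [coords[i + 1][k] - coords[i][k] for k in range(3)]
        dot = sum(a * b for a, b in zip(u, v))
        cos_theta = dot / (math.hypot(*u) * math.hypot(*v))
        bend += 0.25 * (1.0 - cos_theta)

    # Pairwise term over residues that are not chain neighbours.
    pair = 0.0
    for i in range(n - 2):
        for j in range(i + 2, n):
            r = math.dist(coords[i], coords[j])
            c = coupling(sequence[i], sequence[j])
            pair += 4.0 * (r ** -12 - c * r ** -6)

    return bend + pair

# Straight three-residue chain with unit bond lengths.
straight = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
e_aaa = ab_energy("AAA", straight)
e_aba = ab_energy("ABA", straight)
```

    An optimizer such as an artificial bee colony variant would search over chain geometries (typically bond and torsion angles with fixed unit bond lengths) to minimize a function of this kind.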

  16. Ab Initio structure prediction for Escherichia coli: towards genome-wide protein structure modeling and fold assignment

    PubMed Central

    Xu, Dong; Zhang, Yang

    2013-01-01

    Genome-wide protein structure prediction and structure-based function annotation have been a long-term goal in molecular biology but have not yet become possible due to difficulties in modeling distant-homology targets. We developed a hybrid pipeline combining ab initio folding and template-based modeling for genome-wide structure prediction applied to the Escherichia coli genome. The pipeline was tested on 43 known sequences, where QUARK-based ab initio folding simulation generated models with a TM-score 17% higher than that of traditional comparative modeling methods. For 495 unknown hard sequences, 72 are predicted to have a correct fold (TM-score > 0.5) and 321 have a substantial portion of structure correctly modeled (TM-score > 0.35). 317 sequences can be reliably assigned to a SCOP fold family based on structural analogy to existing proteins in PDB. The presented results, as a case study of E. coli, represent promising progress towards genome-wide structure modeling and fold family assignment using state-of-the-art ab initio folding algorithms. PMID:23719418
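
    The thresholds quoted above are expressed as TM-scores. For readers unfamiliar with the measure, the sketch below evaluates the standard TM-score sum for one fixed superposition, using the length-dependent scale d0 = 1.24 (L - 15)^(1/3) - 1.8 Å. The full TM-score takes the maximum of this sum over superpositions, which is not attempted here, and the function names are illustrative assumptions.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 in angstroms."""
    if target_length < 19:
        # Below this length the formula gives a non-positive d0.
        raise ValueError("target_length must be at least 19 for this formula")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8

def tm_score_fixed_superposition(distances, target_length):
    """TM-score sum for given per-residue distances (angstroms).

    `distances` holds one distance per aligned residue pair after some
    chosen superposition; residues that are not aligned contribute zero.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length

d0_100 = tm_d0(100)
perfect = tm_score_fixed_superposition([0.0] * 100, 100)
half = tm_score_fixed_superposition([d0_100] * 100, 100)
```

    A model whose aligned residues all sit exactly d0 from their targets scores 0.5, which gives a feel for the "correct fold" threshold used in the abstract.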

  17. Roles of β-Turns in Protein Folding: From Peptide Models to Protein Engineering

    PubMed Central

    Marcelino, Anna Marie C.; Gierasch, Lila M.

    2010-01-01

    Reverse turns are a major class of protein secondary structure; they represent sites of chain reversal and thus sites where the globular character of a protein is created. It has been speculated for many years that turns may nucleate the formation of structure in protein folding, as their propensity to occur will favor the approximation of their flanking regions and their general tendency to be hydrophilic will favor their disposition at the solvent-accessible surface. Reverse turns are local features, and it is therefore not surprising that their structural properties have been extensively studied using peptide models. In this article, we review research on peptide models of turns to test the hypothesis that the propensities of turns to form in short peptides will relate to the roles of corresponding sequences in protein folding. Turns with significant stability as isolated entities should actively promote the folding of a protein, and by contrast, turn sequences that merely allow the chain to adopt conformations required for chain reversal are predicted to be passive in the folding mechanism. We discuss results of protein engineering studies of the roles of turn residues in folding mechanisms. Factors that correlate with the importance of turns in folding indeed include their intrinsic stability, as well as their topological context and their participation in hydrophobic networks within the protein’s structure. PMID:18275088

  18. Evolution of the arginase fold and functional diversity

    PubMed Central

    Dowling, Daniel P.; Costanzo, Luigi Di; Gennadios, Heather A.; Christianson, David W.

    2009-01-01

    The large number of protein structures deposited in the Protein Data Bank allows for the identification of novel structural superfamilies based on conservation of fold in addition to conservation of amino acid sequence. Since sequence diverges more rapidly than fold in protein evolution, proteins with little or no significant sequence identity are occasionally observed to adopt similar folds, thereby reflecting unanticipated evolutionary relationships. Here, we review the unique α/β fold first observed in the manganese metalloenzyme rat liver arginase, consisting of a parallel 8 stranded β-sheet surrounded by several helices, and its evolutionary relationship with the zinc-requiring and/or iron-requiring histone deacetylases and acetylpolyamine amidohydrolases. Structural comparisons reveal key features of the core α/β fold that contribute to the divergent metal ion specificity and stoichiometry required for the chemical and biological functions of these enzymes. PMID:18360740

  19. Simplified Protein Models: Predicting Folding Pathways and Structure Using Amino Acid Sequences

    NASA Astrophysics Data System (ADS)

    Adhikari, Aashish N.; Freed, Karl F.; Sosnick, Tobin R.

    2013-07-01

    We demonstrate the ability of simultaneously determining a protein’s folding pathway and structure using a properly formulated model without prior knowledge of the native structure. Our model employs a natural coordinate system for describing proteins and a search strategy inspired by the observation that real proteins fold in a sequential fashion by incrementally stabilizing nativelike substructures or “foldons.” Comparable folding pathways and structures are obtained for the twelve proteins recently studied using atomistic molecular dynamics simulations [K. Lindorff-Larsen, S. Piana, R. O. Dror, D. E. Shaw, Science 334, 517 (2011)], with our calculations running several orders of magnitude faster. We find that nativelike propensities in the unfolded state do not necessarily determine the order of structure formation, a departure from a major conclusion of the molecular dynamics study. Instead, our results support a more expansive view wherein intrinsic local structural propensities may be enhanced or overridden in the folding process by environmental context. The success of our search strategy validates it as an expedient mechanism for folding both in silico and in vivo.

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Middleton, Sarah A.; Illuminati, Joseph; Kim, Junhyong

    Recognition of protein structural fold is the starting point for many structure prediction tools and protein function inference. Fold prediction is computationally demanding and recognizing novel folds is difficult such that the majority of proteins have not been annotated for fold classification. Here we describe a new machine learning approach using a novel feature space that can be used for accurate recognition of all 1,221 currently known folds and inference of unknown novel folds. We show that our method achieves better than 94% accuracy even when many folds have only one training example. We demonstrate the utility of this method by predicting the folds of 34,330 human protein domains and showing that these predictions can yield useful insights into potential biological function, such as prediction of RNA-binding ability. Finally, our method can be applied to de novo fold prediction of entire proteomes and identify candidate novel fold families.

  1. Unique Features of Halophilic Proteins.

    PubMed

    Arakawa, Tsutomu; Yamaguchi, Rui; Tokunaga, Hiroko; Tokunaga, Masao

    2017-01-01

    Proteins from moderate and extreme halophiles have unique characteristics. They are highly acidic and hydrophilic, similar to intrinsically disordered proteins. These characteristics make the halophilic proteins soluble in water and fold reversibly. In addition to reversible folding, the rate of refolding of halophilic proteins from denatured structure is generally slow, often taking several days, for example, for extremely halophilic proteins. This slow folding rate makes the halophilic proteins a novel model system for folding mechanism analysis. High solubility and reversible folding also make the halophilic proteins excellent fusion partners for soluble expression of recombinant proteins.

  2. Protein Structure Prediction by Protein Threading

    NASA Astrophysics Data System (ADS)

    Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong

    The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.

  3. Probabilistic analysis for identifying the driving force of protein folding

    NASA Astrophysics Data System (ADS)

    Tokunaga, Yoshihiko; Yamamori, Yu; Matubayasi, Nobuyuki

    2018-03-01

    Toward identifying the driving force of protein folding, energetics was analyzed in water for Trp-cage (20 residues), protein G (56 residues), and ubiquitin (76 residues) at their native (folded) and heat-denatured (unfolded) states. All-atom molecular dynamics simulation was conducted, and the hydration effect was quantified by the solvation free energy. The free-energy calculation was done by employing the solution theory in the energy representation, and it was seen that the sum of the protein intramolecular (structural) energy and the solvation free energy is more favorable for a folded structure than for an unfolded one generated by heat. Probabilistic arguments were then developed to determine which of the electrostatic, van der Waals, and excluded-volume components of the interactions in the protein-water system governs the relative stabilities between the folded and unfolded structures. It was found that the electrostatic interaction does not correspond to the preference order of the two structures. The van der Waals and excluded-volume components were shown, on the other hand, to provide the right order of preference at probabilities of almost unity, and it is argued that a useful modeling of protein folding is possible on the basis of the excluded-volume effect.

  4. Use of conserved key amino acid positions to morph protein folds.

    PubMed

    Reddy, Boojala V B; Li, Wilfred W; Bourne, Philip E

    2002-07-15

    By using three-dimensional (3D) structure alignments and a previously published method to determine Conserved Key Amino Acid Positions (CKAAPs) we propose a theoretical method to design mutations that can be used to morph the protein folds. The original Paracelsus challenge, met by several groups, called for the engineering of a stable but different structure by modifying less than 50% of the amino acid residues. We have used the sequences from the Protein Data Bank (PDB) identifiers 1ROP, and 2CRO, which were previously used in the Paracelsus challenge by those groups, and suggest mutation to CKAAPs to morph the protein fold. The total number of mutations suggested is less than 40% of the starting sequence theoretically improving the challenge results. From secondary structure prediction experiments of the proposed mutant sequence structures, we observe that each of the suggested mutant protein sequences likely folds to a different, non-native potentially stable target structure. These results are an early indicator that analyses using structure alignments leading to CKAAPs of a given structure are of value in protein engineering experiments. Copyright 2002 Wiley Periodicals, Inc.

  5. Molecular chaperone function of Mia40 triggers consecutive induced folding steps of the substrate in mitochondrial protein import

    PubMed Central

    Banci, Lucia; Bertini, Ivano; Cefaro, Chiara; Cenacchi, Lucia; Ciofi-Baffoni, Simone; Felli, Isabella Caterina; Gallo, Angelo; Gonnelli, Leonardo; Luchinat, Enrico; Sideris, Dionisia; Tokatlidis, Kostas

    2010-01-01

    Several proteins of the mitochondrial intermembrane space are targeted by internal targeting signals. A class of such proteins with α-helical hairpin structure bridged by two intramolecular disulfides is trapped by a Mia40-dependent oxidative process. Here, we describe the oxidative folding mechanism underpinning this process by an exhaustive structural characterization of the protein in all stages and as a complex with Mia40. Two consecutive induced folding steps are at the basis of the protein-trapping process. In the first one, Mia40 functions as a molecular chaperone assisting α-helical folding of the internal targeting signal of the substrate. Subsequently, in a Mia40-independent manner, folding of the second substrate helix is induced by the folded targeting signal functioning as a folding scaffold. The Mia40-induced folding pathway provides a proof of principle for the general concept that internal targeting signals may operate as a folding nucleus upon compartment-specific activation. PMID:21059946

  6. Molecular dynamics studies of protein folding and aggregation

    NASA Astrophysics Data System (ADS)

    Ding, Feng

    This thesis applies molecular dynamics simulations and statistical mechanics to study: (i) protein folding; and (ii) protein aggregation. Most small proteins fold into their native states via a first-order-like phase transition with a major free energy barrier between the folded and unfolded states. A set of protein conformations corresponding to the free energy barrier, ΔG ≫ kBT, are the folding transition state ensemble (TSE). Due to their evasive nature, TSE conformations are hard to capture (probability ∝ exp(−ΔG/kBT)) and characterize. A coarse-grained discrete molecular dynamics model with realistic steric constraints is constructed to reproduce the experimentally observed two-state folding thermodynamics. A kinetic approach is proposed to identify the folding TSE. A specific set of contacts, common to the TSE conformations, is identified as the folding nuclei, which must form in order for the protein to fold. Interestingly, the amino acids at the site of the identified folding nuclei are highly conserved for homologous proteins sharing the same structures. Such conservation suggests that amino acids that are important for folding kinetics are under selective pressure to be preserved during the course of molecular evolution. In addition, studies of the conformations close to the transition states uncover the importance of topology in the construction of an order parameter for the protein folding transition. Misfolded proteins often form insoluble aggregates, amyloid fibrils, that deposit in the extracellular space and lead to a type of disease known as amyloidosis. Due to its insoluble and non-crystalline nature, the aggregation structure, and thus the aggregation mechanism, has yet to be uncovered. Discrete molecular dynamics studies reveal an aggregate structure with the same structural signatures as in experimental observations and show a nucleation aggregation scenario. The simulations also suggest a generic aggregation mechanism in which globular proteins under a denaturing environment partially unfold and aggregate by forming stabilizing hydrogen bonds between the backbones of the partially folded substructures. Proteins or peptides rich in alpha-helices also aggregate into beta-rich amyloid fibrils. Upon aggregation, the protein or peptide undergoes a conformational transition from alpha-helices to beta-sheets. The transition of alpha-helix to beta-hairpin (two-stranded beta-sheet) is studied in an all-heavy-atom discrete molecular dynamics model of a polyalanine chain. An entropic driving scenario for the alpha-helix to beta-hairpin transition is discovered.

  7. Fold independent structural comparisons of protein-ligand binding sites for exploring functional relationships.

    PubMed

    Gold, Nicola D; Jackson, Richard M

    2006-02-03

    The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that given a known protein-ligand binding site allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamiles is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.

  8. Ab initio folding of proteins using all-atom discrete molecular dynamics

    PubMed Central

    Ding, Feng; Tsao, Douglas; Nie, Huifen; Dokholyan, Nikolay V.

    2008-01-01

    Discrete molecular dynamics (DMD) is a rapid sampling method used in protein folding and aggregation studies. Until now, DMD was used to perform simulations of simplified protein models in conjunction with structure-based force fields. Here, we develop an all-atom protein model and a transferable force field featuring packing, solvation, and environment-dependent hydrogen bond interactions. Using the replica exchange method, we perform folding simulations of six small proteins (20–60 residues) with distinct native structures. In all cases, native or near-native states are reached in simulations. For three small proteins, multiple folding transitions are observed and the computationally-characterized thermodynamics are in quantitative agreement with experiments. The predictive power of all-atom DMD highlights the importance of environment-dependent hydrogen bond interactions in modeling protein folding. The developed approach can be used for accurate and rapid sampling of conformational spaces of proteins and protein-protein complexes, and applied to protein engineering and design of protein-protein interactions. PMID:18611374
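The replica exchange step mentioned in this abstract can be illustrated with the standard Metropolis swap criterion. The following is a minimal sketch, not the authors' DMD implementation: the function names, the kcal/mol energy units and the example temperatures are assumptions made only for illustration.

```python
import math
import random

# Boltzmann constant in kcal/(mol*K); the unit system is an assumption.
K_B = 0.0019872041


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis probability of exchanging replicas i and j.

    Uses the standard criterion min(1, exp[(beta_i - beta_j)(E_i - E_j)]).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the exchange is accepted for this random draw."""
    return rng.random() < swap_probability(energy_i, temp_i, energy_j, temp_j)


if __name__ == "__main__":
    # Hypothetical energies: the colder replica holds the higher energy,
    # so the exchange is always accepted.
    print(swap_probability(-100.0, 300.0, -110.0, 350.0))
    # Reversed energies: the exchange is accepted only occasionally.
    print(swap_probability(-110.0, 300.0, -100.0, 350.0))
```

Exchanges between neighbouring temperatures let a conformation trapped at low temperature escape by visiting a hotter replica, which is why the method speeds up sampling.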

  9. Heterochiral Knottin Protein: Folding and Solution Structure.

    PubMed

    Mong, Surin K; Cochran, Frank V; Yu, Hongtao; Graziano, Zachary; Lin, Yu-Shan; Cochran, Jennifer R; Pentelute, Bradley L

    2017-10-31

    Homochirality is a general feature of biological macromolecules, and Nature includes few examples of heterochiral proteins. Herein, we report on the design, chemical synthesis, and structural characterization of heterochiral proteins possessing loops of amino acids of chirality opposite to that of the rest of a protein scaffold. Using the protein Ecballium elaterium trypsin inhibitor II, we discover that selective β-alanine substitution favors the efficient folding of our heterochiral constructs. Solution nuclear magnetic resonance spectroscopy of one such heterochiral protein reveals a homogeneous global fold. Additionally, steered molecular dynamics simulation indicate β-alanine reduces the free energy required to fold the protein. We also find these heterochiral proteins to be more resistant to proteolysis than homochiral l-proteins. This work informs the design of heterochiral protein architectures containing stretches of both d- and l-amino acids.

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Horowitz, Scott; Salmon, Loïc; Koldewey, Philipp

    Challenges in determining the structures of heterogeneous and dynamic protein complexes have greatly hampered past efforts to obtain a mechanistic understanding of many important biological processes. One such process is chaperone-assisted protein folding. Obtaining structural ensembles of chaperone–substrate complexes would ultimately reveal how chaperones help proteins fold into their native state. To address this problem, we devised a new structural biology approach based on X-ray crystallography, termed residual electron and anomalous density (READ). READ enabled us to visualize even sparsely populated conformations of the substrate protein immunity protein 7 (Im7) in complex with the Escherichia coli chaperone Spy, and to capture a series of snapshots depicting the various folding states of Im7 bound to Spy. The ensemble shows that Spy-associated Im7 samples conformations ranging from unfolded to partially folded to native-like states and reveals how a substrate can explore its folding landscape while being bound to a chaperone.

  11. Congenital hypothyroidism mutations affect common folding and trafficking in the α/β-hydrolase fold proteins

    PubMed Central

    De Jaco, Antonella; Dubi, Noga; Camp, Shelley; Taylor, Palmer

    2017-01-01

    The α/β-hydrolase fold superfamily of proteins is composed of structurally related members that, despite great diversity in their catalytic, recognition, adhesion and chaperone functions, share a common fold governed by homologous residues and conserved disulfide bridges. Non-synonymous single nucleotide polymorphisms within the α/β-hydrolase fold domain in various family members have been found for congenital endocrine, metabolic and nervous system disorders. By examining the amino acid sequence from the various proteins, mutations were found to be prevalent in conserved residues within the α/β-hydrolase fold of the homologous proteins. This is the case for the thyroglobulin mutations linked to congenital hypothyroidism. To address whether correct folding of the common domain is required for protein export, we inserted the thyroglobulin mutations at homologous positions in two correlated but simpler α/β-hydrolase fold proteins known to be exported to the cell surface: neuroligin3 and acetylcholinesterase. Here we show that these mutations in the cholinesterase homologous region alter the folding properties of the α/β-hydrolase fold domain, which are reflected in defects in protein trafficking, folding and function, and ultimately result in retention of the partially processed proteins in the endoplasmic reticulum. Accordingly, mutations at conserved residues may be transferred amongst homologous proteins to produce common processing defects despite disparate functions, protein complexity and tissue-specific expression of the homologous proteins. More importantly, a similar assembly of the α/β-hydrolase fold domain tertiary structure among homologous members of the superfamily is required for correct trafficking of the proteins to their final destination. PMID:23035660

  12. Protein folding by NMR.

    PubMed

    Zhuravleva, Anastasia; Korzhnev, Dmitry M

    2017-05-01

    Protein folding is a highly complex process proceeding through a number of disordered and partially folded nonnative states with various degrees of structural organization. These transiently and sparsely populated species on the protein folding energy landscape play crucial roles in driving folding toward the native conformation, yet some of these nonnative states may also serve as precursors for protein misfolding and aggregation associated with a range of devastating diseases, including neurodegeneration, diabetes and cancer. Therefore, in vivo protein folding is often reshaped co- and post-translationally through interactions with the ribosome, molecular chaperones and/or other cellular components. Owing to developments in instrumentation and methodology, solution NMR spectroscopy has emerged as the central experimental approach for the detailed characterization of the complex protein folding processes in vitro and in vivo. NMR relaxation dispersion and saturation transfer methods provide the means for a detailed characterization of protein folding kinetics and thermodynamics under native-like conditions, as well as modeling high-resolution structures of weakly populated short-lived conformational states on the protein folding energy landscape. Continuing development of isotope labeling strategies and NMR methods to probe high molecular weight protein assemblies, along with advances of in-cell NMR, have recently allowed protein folding to be studied in the context of ribosome-nascent chain complexes and molecular chaperones, and even inside living cells. Here we review solution NMR approaches to investigate the protein folding energy landscape, and discuss selected applications of NMR methodology to studying protein folding in vitro and in vivo. Together, these examples highlight a vast potential of solution NMR in providing atomistic insights into molecular mechanisms of protein folding and homeostasis in health and disease. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. Atomic-level characterization of the structural dynamics of proteins.

    PubMed

    Shaw, David E; Maragakis, Paul; Lindorff-Larsen, Kresten; Piana, Stefano; Dror, Ron O; Eastwood, Michael P; Bank, Joseph A; Jumper, John M; Salmon, John K; Shan, Yibing; Wriggers, Willy

    2010-10-15

    Molecular dynamics (MD) simulations are widely used to study protein motions at an atomic level of detail, but they have been limited to time scales shorter than those of many biologically critical conformational changes. We examined two fundamental processes in protein dynamics--protein folding and conformational change within the folded state--by means of extremely long all-atom MD simulations conducted on a special-purpose machine. Equilibrium simulations of a WW protein domain captured multiple folding and unfolding events that consistently follow a well-defined folding pathway; separate simulations of the protein's constituent substructures shed light on possible determinants of this pathway. A 1-millisecond simulation of the folded protein BPTI reveals a small number of structurally distinct conformational states whose reversible interconversion is slower than local relaxations within those states by a factor of more than 1000.

  14. Developing a molecular dynamics force field for both folded and disordered protein states.

    PubMed

    Robustelli, Paul; Piana, Stefano; Shaw, David E

    2018-05-07

    Molecular dynamics (MD) simulation is a valuable tool for characterizing the structural dynamics of folded proteins and should be similarly applicable to disordered proteins and proteins with both folded and disordered regions. It has been unclear, however, whether any physical model (force field) used in MD simulations accurately describes both folded and disordered proteins. Here, we select a benchmark set of 21 systems, including folded and disordered proteins, simulate these systems with six state-of-the-art force fields, and compare the results to over 9,000 available experimental data points. We find that none of the tested force fields simultaneously provided accurate descriptions of folded proteins, of the dimensions of disordered proteins, and of the secondary structure propensities of disordered proteins. Guided by simulation results on a subset of our benchmark, however, we modified parameters of one force field, achieving excellent agreement with experiment for disordered proteins, while maintaining state-of-the-art accuracy for folded proteins. The resulting force field, a99SB-disp, should thus greatly expand the range of biological systems amenable to MD simulation. A similar approach could be taken to improve other force fields. Copyright © 2018 the Author(s). Published by PNAS.

  15. A hybrid MD-kMC algorithm for folding proteins in explicit solvent.

    PubMed

    Peter, Emanuel Karl; Shea, Joan-Emma

    2014-04-14

    We present a novel hybrid MD-kMC algorithm that is capable of efficiently folding proteins in explicit solvent. We apply this algorithm to the folding of a small protein, Trp-Cage. Different kMC move sets that capture different possible rate limiting steps are implemented. The first uses secondary structure formation as a relevant rate event (a combination of dihedral rotations and hydrogen-bonding formation and breakage). The second uses tertiary structure formation events through formation of contacts via translational moves. Both methods fold the protein, but via different mechanisms and with different folding kinetics. The first method leads to folding via a structured helical state, with kinetics fit by a single exponential. The second method leads to folding via a collapsed loop, with kinetics poorly fit by single or double exponentials. In both cases, folding times are faster than experimentally reported values. The secondary and tertiary move sets are integrated in a third MD-kMC implementation, which now leads to folding of the protein via both pathways, with single and double-exponential fits to the rates, and to folding rates in good agreement with experimental values. The competition between secondary and tertiary structure leads to a longer search for the helix-rich intermediate in the case of the first pathway, and to the emergence of a kinetically trapped long-lived molten-globule collapsed state in the case of the second pathway. The algorithm presented not only captures experimentally observed folding intermediates and kinetics, but yields insights into the relative roles of local and global interactions in determining folding mechanisms and rates.
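The kMC part of such a hybrid scheme rests on choosing one event from a set of competing rate events. The sketch below shows a generic rejection-free kinetic Monte Carlo step, which is an assumption about the general technique and not the move sets or coupling to MD described in the abstract; the event names and rates are hypothetical.

```python
import math
import random


def kmc_step(rates, rng):
    """Pick one event with probability proportional to its rate.

    rates: dict mapping an event name to a non-negative rate.
    Returns (event_name, time_increment), where the time increment is
    drawn from an exponential distribution with the total rate.
    """
    events = [(name, rate) for name, rate in rates.items() if rate > 0.0]
    if not events:
        raise ValueError("no event with a positive rate")
    total = sum(rate for _, rate in events)

    threshold = rng.random() * total
    cumulative = 0.0
    chosen = events[-1][0]  # fallback guards against round-off
    for name, rate in events:
        cumulative += rate
        if threshold < cumulative:
            chosen = name
            break

    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical rates for two kinds of move.
    rates = {"dihedral_rotation": 3.0, "contact_formation": 1.0}
    print(kmc_step(rates, rng))
```

Because the waiting time scales with the inverse of the total rate, adding a second family of moves changes the simulated kinetics as well as the pathway, which is consistent with the different folding times the abstract reports for the different move sets.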

  16. Improving protein fold recognition by extracting fold-specific features from predicted residue-residue contacts.

    PubMed

    Zhu, Jianwei; Zhang, Haicang; Li, Shuai Cheng; Wang, Chao; Kong, Lupeng; Sun, Shiwei; Zheng, Wei-Mou; Bu, Dongbo

    2017-12-01

    Accurate recognition of protein fold types is a key step for template-based prediction of protein structures. The existing approaches to fold recognition mainly exploit the features derived from alignments of query protein against templates. These approaches have been shown to be successful for fold recognition at family level, but usually fail at superfamily/fold levels. To overcome this limitation, one of the key points is to explore more structurally informative features of proteins. Although residue-residue contacts carry abundant structural information, how to thoroughly exploit this information for fold recognition still remains a challenge. In this study, we present an approach (called DeepFR) to improve fold recognition at superfamily/fold levels. The basic idea of our approach is to extract fold-specific features from predicted residue-residue contacts of proteins using a deep convolutional neural network (DCNN). Based on these fold-specific features, we calculated similarity between query protein and templates, and then assigned query protein with fold type of the most similar template. DCNN has shown excellent performance in image feature extraction and image recognition; the rationale underlying the application of DCNN for fold recognition is that contact likelihood maps are essentially analogous to images, as they both display compositional hierarchy. Experimental results on the LINDAHL dataset suggest that even using the extracted fold-specific features alone, our approach achieved a success rate comparable to the state-of-the-art approaches. When further combining these features with traditional alignment-related features, the success rate of our approach increased to 92.3%, 82.5% and 78.8% at family, superfamily and fold levels, respectively, which is about 18% higher than the state-of-the-art approach at fold level, 6% higher at superfamily level and 1% higher at family level. An independent assessment on SCOP_TEST dataset showed consistent performance improvement, indicating robustness of our approach. Furthermore, bi-clustering results of the extracted features are compatible with fold hierarchy of proteins, implying that these features are fold-specific. Together, these results suggest that the features extracted from predicted contacts are orthogonal to alignment-related features, and the combination of them could greatly facilitate fold recognition at superfamily/fold levels and template-based prediction of protein structures. Source code of DeepFR is freely available through https://github.com/zhujianwei31415/deepfr, and a web server is available through http://protein.ict.ac.cn/deepfr. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved.
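The final assignment step described here, giving the query the fold type of its most similar template, amounts to a nearest-neighbour lookup over feature vectors. The sketch below assumes cosine similarity as the similarity measure and uses made-up three-dimensional feature vectors and fold labels; none of these details come from the DeepFR paper or its code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_fold(query_features, templates):
    """Return (fold_label, similarity) of the most similar template.

    templates: list of (fold_label, feature_vector) pairs.
    """
    if not templates:
        raise ValueError("template library is empty")
    scored = [
        (cosine_similarity(query_features, features), label)
        for label, features in templates
    ]
    best_score, best_label = max(scored)
    return best_label, best_score


if __name__ == "__main__":
    # Hypothetical template library with toy fold-specific features.
    library = [
        ("fold_A", [1.0, 0.0, 0.0]),
        ("fold_B", [0.0, 1.0, 0.0]),
    ]
    print(assign_fold([0.9, 0.1, 0.0], library))
```

In practice the feature vectors would come from the trained network, and the similarity could be combined with alignment-based scores as the abstract describes.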

  17. Hydrophobic folding units derived from dissimilar monomer structures and their interactions.

    PubMed

    Tsai, C J; Nussinov, R

    1997-01-01

    We have designed an automated procedure to cut a protein into compact hydrophobic folding units. The hydrophobic units are large enough to contain tertiary non-local interactions, reflecting potential nucleation sites during protein folding. The quality of a hydrophobic folding unit is evaluated by four criteria. The first two correspond to visual characterization of a structural domain, namely, compactness and extent of isolation. We use the definition of Zehfus and Rose (Zehfus MH, Rose GD, 1986, Biochemistry 25:35-340) to calculate the compactness of a cut protein unit. The isolation of a unit is based on the solvent accessible surface area (ASA) originally buried in the interior and exposed to the solvent after cutting. The third quantity is the hydrophobicity, equivalent to the fraction of the buried non-polar ASA with respect to the total non-polar ASA. The last criterion in the evaluation of a folding unit is the number of segments it includes. To conform with the rationale of obtaining hydrophobic units, which may relate to early folding events, the hydrophobic interactions are implicitly and explicitly applied in their generation and assessment. We follow Holm and Sander (Holm L, Sander C, 1994, Proteins 19:256-268) to reduce the multiple cutting-point problem to a one-dimensional search for all reasonable trial cuts. However, as here we focus on the hydrophobic cores, the contact matrix used to obtain the first non-trivial eigenvector contains only hydrophobic contacts, rather than all interactions, hydrophobic and hydrophilic. This dataset of hydrophobic folding units, derived from structurally dissimilar single chain monomers, is particularly useful for investigations of the mechanism of protein folding. For cases where there are kinetic data, the one or more hydrophobic folding units generated for a protein correlate with the two-state or three-state folding process observed. We carry out extensive amino acid sequence order independent structural comparisons to generate a structurally non-redundant set of hydrophobic folding units for fold recognition and for statistical purposes.

  18. Characterisation of transition state structures for protein folding using 'high', 'medium' and 'low' Φ-values.

    PubMed

    Geierhaas, Christian D; Salvatella, Xavier; Clarke, Jane; Vendruscolo, Michele

    2008-03-01

    It has been suggested that Phi-values, which allow structural information about transition states (TSs) for protein folding to be obtained, are most reliably interpreted when divided into three classes (high, medium and low). High Phi-values indicate almost completely folded regions in the TS, intermediate Phi-values regions with a detectable amount of structure and low Phi-values indicate mostly unstructured regions. To explore the extent to which this classification can be used to characterise in detail the structure of TSs for protein folding, we used Phi-values divided into these classes as restraints in molecular dynamics simulations. This type of procedure is related to that used in NMR spectroscopy to define the structure of native proteins from the measurement of inter-proton distances derived from nuclear Overhauser effects. We illustrate this approach by determining the TS ensembles of five proteins and by showing that the results are similar to those obtained by using as restraints the actual numerical Phi-values measured experimentally. Our results indicate that the simultaneous consideration of a set of low-resolution Phi-values can provide sufficient information for characterising the architecture of a TS for folding of a protein.
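The three-class treatment of Phi-values can be made concrete with a small sketch. It uses the usual definition of a Phi-value as the ratio of the mutation-induced change in transition-state stability to the change in native-state stability. The class boundaries of 0.3 and 0.7 are assumptions chosen only for illustration, since the abstract does not state the cutoffs used.

```python
def phi_value(ddg_transition_state, ddg_native):
    """Phi = ddG(TS - D) / ddG(N - D) for a single mutation.

    Both arguments are free-energy changes upon mutation, in the same units.
    """
    if ddg_native == 0:
        raise ValueError("Phi is undefined when the mutation does not "
                         "change native-state stability")
    return ddg_transition_state / ddg_native


def phi_class(phi, low_cut=0.3, high_cut=0.7):
    """Bin a Phi-value as 'low', 'medium' or 'high'.

    The default cutoffs are illustrative assumptions, not values from
    the study.
    """
    if phi < low_cut:
        return "low"
    if phi > high_cut:
        return "high"
    return "medium"


if __name__ == "__main__":
    # Hypothetical mutation data: (ddG of TS, ddG of native), kcal/mol.
    for ddg_ts, ddg_n in [(0.2, 2.0), (1.0, 2.0), (1.8, 2.0)]:
        phi = phi_value(ddg_ts, ddg_n)
        print(round(phi, 2), phi_class(phi))
```

Each class would then be translated into a restraint on how native-like the local structure must be in the simulated transition-state ensemble, in the way the abstract outlines.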

  19. Characterisation of transition state structures for protein folding using ‘high’, ‘medium’ and ‘low’ Φ-values

    PubMed Central

    Geierhaas, Christian D.; Salvatella, Xavier; Clarke, Jane; Vendruscolo, Michele

    2008-01-01

    It has been suggested that Φ-values, which allow structural information about transition states (TSs) for protein folding to be obtained, are most reliably interpreted when divided into three classes (high, medium and low). High Φ-values indicate almost completely folded regions in the TS, intermediate Φ-values regions with a detectable amount of structure and low Φ-values indicate mostly unstructured regions. To explore the extent to which this classification can be used to characterise in detail the structure of TSs for protein folding, we used Φ-values divided into these classes as restraints in molecular dynamics simulations. This type of procedure is related to that used in NMR spectroscopy to define the structure of native proteins from the measurement of inter-proton distances derived from nuclear Overhauser effects. We illustrate this approach by determining the TS ensembles of five proteins and by showing that the results are similar to those obtained by using as restraints the actual numerical Φ-values measured experimentally. Our results indicate that the simultaneous consideration of a set of low-resolution Φ-values can provide sufficient information for characterising the architecture of a TS for folding of a protein. PMID:18299294

  20. Simulating protein folding initiation sites using an alpha-carbon-only knowledge-based force field

    PubMed Central

    Buck, Patrick M.; Bystroff, Christopher

    2015-01-01

    Protein folding is a hierarchical process where structure forms locally first, then globally. Some short sequence segments initiate folding through strong structural preferences that are independent of their three-dimensional context in proteins. We have constructed a knowledge-based force field in which the energy functions are conditional on local sequence patterns, as expressed in the hidden Markov model for local structure (HMMSTR). Carbon-alpha force field (CALF) builds sequence specific statistical potentials based on database frequencies for α-carbon virtual bond opening and dihedral angles, pairwise contacts and hydrogen bond donor-acceptor pairs, and simulates folding via Brownian dynamics. We introduce hydrogen bond donor and acceptor potentials as α-carbon probability fields that are conditional on the predicted local sequence. Constant temperature simulations were carried out using 27 peptides selected as putative folding initiation sites, each 12 residues in length, representing several different local structure motifs. Each 0.6 μs trajectory was clustered based on structure. Simulation convergence or representativeness was assessed by subdividing trajectories and comparing clusters. For 21 of the 27 sequences, the largest cluster made up more than half of the total trajectory. Of these 21 sequences, 14 had cluster centers that were at most 2.6 Å root mean square deviation (RMSD) from their native structure in the corresponding full-length protein. To assess the adequacy of the energy function on nonlocal interactions, 11 full length native structures were relaxed using Brownian dynamics simulations. Equilibrated structures deviated from their native states but retained their overall topology and compactness. A simple potential that folds proteins locally and stabilizes proteins globally may enable a more realistic understanding of hierarchical folding pathways. PMID:19137613
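The 2.6 Å criterion in this abstract compares a cluster center with the native structure by α-carbon RMSD. A minimal sketch of that comparison follows. It assumes the two coordinate sets are already optimally superposed and matched residue by residue, so no fitting step is performed; the function names and the toy coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD in the input units between two pre-superposed coordinate lists.

    Each argument is a list of (x, y, z) tuples, one per alpha carbon.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def is_near_native(cluster_center, native, cutoff=2.6):
    """True if the cluster center lies within the cutoff (in Å) of native."""
    return ca_rmsd(cluster_center, native) <= cutoff


if __name__ == "__main__":
    # Toy two-residue example: every atom displaced by 1 Å along z.
    native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    model = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
    print(ca_rmsd(model, native), is_near_native(model, native))
```

A real analysis would first superpose the structures, for example with a least-squares fit, before computing the deviation.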

  1. A galaxy of folds.

    PubMed

    Alva, Vikram; Remmert, Michael; Biegert, Andreas; Lupas, Andrei N; Söding, Johannes

    2010-01-01

    Many protein classification systems capture homologous relationships by grouping domains into families and superfamilies on the basis of sequence similarity. Superfamilies with similar 3D structures are further grouped into folds. In the absence of discernable sequence similarity, these structural similarities were long thought to have originated independently, by convergent evolution. However, the growth of databases and advances in sequence comparison methods have led to the discovery of many distant evolutionary relationships that transcend the boundaries of superfamilies and folds. To investigate the contributions of convergent versus divergent evolution in the origin of protein folds, we clustered representative domains of known structure by their sequence similarity, treating them as point masses in a virtual 2D space which attract or repel each other depending on their pairwise sequence similarities. As expected, families in the same superfamily form tight clusters. But often, superfamilies of the same fold are linked with each other, suggesting that the entire fold evolved from an ancient prototype. Strikingly, some links connect superfamilies with different folds. They arise from modular peptide fragments of between 20 and 40 residues that co-occur in the connected folds in disparate structural contexts. These may be descendants of an ancestral pool of peptide modules that evolved as cofactors in the RNA world and from which the first folded proteins arose by amplification and recombination. Our galaxy of folds summarizes, in a single image, most known and many yet undescribed homologous relationships between protein superfamilies, providing new insights into the evolution of protein domains.

  2. Probing Protein Fold Space with a Simplified Model

    PubMed Central

    Minary, Peter; Levitt, Michael

    2008-01-01

    We probe the stability and near-native energy landscape of protein fold space using powerful conformational sampling methods together with simple reduced models and statistical potentials. Fold space is represented by a set of 280 protein domains spanning all topological classes and having a wide range of lengths (0-300 residues), amino acid composition, and number of secondary structural elements. The degrees of freedom are taken as the loop torsion angles. This choice preserves the native secondary structure but allows the tertiary structure to change. The proteins are represented by three-point per residue, three-dimensional models with statistical potentials derived from a knowledge-based study of known protein structures. When this space is sampled by a combination of Parallel Tempering and Equi-Energy Monte Carlo, we find that the three-point model captures the known stability of protein native structures with stable energy basins that are near-native (all-α: 4.77 Å, all-β: 2.93 Å, α/β: 3.09 Å, α+β: 4.89 Å on average and within 6 Å for 71.41 %, 92.85 %, 94.29 % and 64.28 % for all-α, all-β, α/β and α+β, classes respectively). Denatured structures also occur and these have interesting structural properties that shed light on the different landscape characteristics of α and β folds. We find that α/β proteins with alternating α and β segments (such as the beta-barrel) are more stable than proteins in other fold classes. PMID:18054792

  3. An all-atom structure-based potential for proteins: bridging minimal models with all-atom empirical forcefields.

    PubMed

    Whitford, Paul C; Noel, Jeffrey K; Gosavi, Shachi; Schug, Alexander; Sanbonmatsu, Kevin Y; Onuchic, José N

    2009-05-01

    Protein dynamics take place on many time and length scales. Coarse-grained structure-based (Go) models utilize the funneled energy landscape theory of protein folding to provide an understanding of both long time and long length scale dynamics. All-atom empirical forcefields with explicit solvent can elucidate our understanding of short time dynamics with high energetic and structural resolution. Thus, structure-based models with atomic details included can be used to bridge our understanding between these two approaches. We report on the robustness of folding mechanisms in one such all-atom model. Results for the B domain of Protein A, the SH3 domain of C-Src Kinase, and Chymotrypsin Inhibitor 2 are reported. The interplay between side chain packing and backbone folding is explored. We also compare this model to a C(alpha) structure-based model and an all-atom empirical forcefield. Key findings include: (1) backbone collapse is accompanied by partial side chain packing in a cooperative transition and residual side chain packing occurs gradually with decreasing temperature, (2) folding mechanisms are robust to variations of the energetic parameters, (3) protein folding free-energy barriers can be manipulated through parametric modifications, (4) the global folding mechanisms in a C(alpha) model and the all-atom model agree, although differences can be attributed to energetic heterogeneity in the all-atom model, and (5) proline residues have significant effects on folding mechanisms, independent of isomerization effects. Because this structure-based model has atomic resolution, this work lays the foundation for future studies to probe the contributions of specific energetic factors on protein folding and function.

  4. An All-atom Structure-Based Potential for Proteins: Bridging Minimal Models with All-atom Empirical Forcefields

    PubMed Central

    Whitford, Paul C.; Noel, Jeffrey K.; Gosavi, Shachi; Schug, Alexander; Sanbonmatsu, Kevin Y.; Onuchic, José N.

    2012-01-01

    Protein dynamics take place on many time and length scales. Coarse-grained structure-based (Gō) models utilize the funneled energy landscape theory of protein folding to provide an understanding of both long time and long length scale dynamics. All-atom empirical forcefields with explicit solvent can elucidate our understanding of short time dynamics with high energetic and structural resolution. Thus, structure-based models with atomic details included can be used to bridge our understanding between these two approaches. We report on the robustness of folding mechanisms in one such all-atom model. Results for the B domain of Protein A, the SH3 domain of C-Src Kinase and Chymotrypsin Inhibitor 2 are reported. The interplay between side chain packing and backbone folding is explored. We also compare this model to a Cα structure-based model and an all-atom empirical forcefield. Key findings include 1) backbone collapse is accompanied by partial side chain packing in a cooperative transition and residual side chain packing occurs gradually with decreasing temperature 2) folding mechanisms are robust to variations of the energetic parameters 3) protein folding free energy barriers can be manipulated through parametric modifications 4) the global folding mechanisms in a Cα model and the all-atom model agree, although differences can be attributed to energetic heterogeneity in the all-atom model 5) proline residues have significant effects on folding mechanisms, independent of isomerization effects. Since this structure-based model has atomic resolution, this work lays the foundation for future studies to probe the contributions of specific energetic factors on protein folding and function. PMID:18837035

  5. Universality and diversity of folding mechanics for three-helix bundle proteins.

    PubMed

    Yang, Jae Shick; Wallin, Stefan; Shakhnovich, Eugene I

    2008-01-22

    In this study we evaluate, at full atomic detail, the folding processes of two small helical proteins, the B domain of protein A and the Villin headpiece. Folding kinetics are studied by performing a large number of ab initio Monte Carlo folding simulations using a single transferable all-atom potential. Using these trajectories, we examine the relaxation behavior, secondary structure formation, and transition-state ensembles (TSEs) of the two proteins and compare our results with experimental data and previous computational studies. To obtain detailed structural information on the folding dynamics viewed as an ensemble process, we perform a clustering analysis procedure based on graph theory. Moreover, rigorous p(fold) analysis is used to obtain representative samples of the TSEs and a good quantitative agreement between experimental and simulated Phi values is obtained for protein A. Phi values for Villin also are obtained and left as predictions to be tested by future experiments. Our analysis shows that the two-helix hairpin is a common partially stable structural motif that gets formed before entering the TSE in the studied proteins. These results together with our earlier study of Engrailed Homeodomain and recent experimental studies provide a comprehensive, atomic-level picture of folding mechanics of three-helix bundle proteins.
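The p(fold) analysis mentioned here estimates, for a given conformation, the probability that a trajectory started from it reaches the folded state before the unfolded state; conformations with p(fold) near 0.5 are taken as members of the transition-state ensemble. The sketch below illustrates the estimator on a toy model, an unbiased one-dimensional random walk along a hypothetical reaction coordinate, which is an assumption standing in for the all-atom Monte Carlo trajectories used in the study.

```python
import random


def reaches_folded_first(start, n_sites, rng):
    """Run one unbiased walk until it hits 0 (unfolded) or n_sites (folded)."""
    position = start
    while 0 < position < n_sites:
        position += 1 if rng.random() < 0.5 else -1
    return position == n_sites


def estimate_pfold(start, n_sites, n_trials, rng):
    """Fraction of independent trial runs that fold before unfolding."""
    folded = sum(
        reaches_folded_first(start, n_sites, rng) for _ in range(n_trials)
    )
    return folded / n_trials


if __name__ == "__main__":
    rng = random.Random(0)
    # For this toy walk the exact answer is start / n_sites.
    for start in (2, 5, 8):
        print(start, estimate_pfold(start, 10, 2000, rng))
```

For the toy walk the midpoint of the coordinate has p(fold) of one half; in a real protein the corresponding conformations have to be found by running many short simulations from candidate structures.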

  6. Analysis of the Free-Energy Surface of Proteins from Reversible Folding Simulations

    PubMed Central

    Allen, Lucy R.; Krivov, Sergei V.; Paci, Emanuele

    2009-01-01

    Computer generated trajectories can, in principle, reveal the folding pathways of a protein at atomic resolution and possibly suggest general and simple rules for predicting the folded structure of a given sequence. While such reversible folding trajectories can only be determined ab initio using all-atom transferable force-fields for a few small proteins, they can be determined for a large number of proteins using coarse-grained and structure-based force-fields, in which a known folded structure is by construction the absolute energy and free-energy minimum. Here we use a model of the fast folding helical λ-repressor protein to generate trajectories in which native and non-native states are in equilibrium and transitions are accurately sampled. Yet, representation of the free-energy surface, which underlies the thermodynamic and dynamic properties of the protein model, from such a trajectory remains a challenge. Projections over one or a small number of arbitrarily chosen progress variables often hide the most important features of such surfaces. The results unequivocally show that an unprojected representation of the free-energy surface provides important and unbiased information and allows a simple and meaningful description of many-dimensional, heterogeneous trajectories, providing new insight into the possible mechanisms of fast-folding proteins. PMID:19593364

  7. Analysis of the free-energy surface of proteins from reversible folding simulations.

    PubMed

    Allen, Lucy R; Krivov, Sergei V; Paci, Emanuele

    2009-07-01

    Computer generated trajectories can, in principle, reveal the folding pathways of a protein at atomic resolution and possibly suggest general and simple rules for predicting the folded structure of a given sequence. While such reversible folding trajectories can only be determined ab initio using all-atom transferable force-fields for a few small proteins, they can be determined for a large number of proteins using coarse-grained and structure-based force-fields, in which a known folded structure is by construction the absolute energy and free-energy minimum. Here we use a model of the fast folding helical lambda-repressor protein to generate trajectories in which native and non-native states are in equilibrium and transitions are accurately sampled. Yet, representation of the free-energy surface, which underlies the thermodynamic and dynamic properties of the protein model, from such a trajectory remains a challenge. Projections over one or a small number of arbitrarily chosen progress variables often hide the most important features of such surfaces. The results unequivocally show that an unprojected representation of the free-energy surface provides important and unbiased information and allows a simple and meaningful description of many-dimensional, heterogeneous trajectories, providing new insight into the possible mechanisms of fast-folding proteins.

  8. Evolution, Energy Landscapes and the Paradoxes of Protein Folding

    PubMed Central

    Wolynes, Peter G.

    2014-01-01

    Protein folding has been viewed as a difficult problem of molecular self-organization. The search problem involved in folding however has been simplified through the evolution of folding energy landscapes that are funneled. The funnel hypothesis can be quantified using energy landscape theory based on the minimal frustration principle. Strong quantitative predictions that follow from energy landscape theory have been widely confirmed both through laboratory folding experiments and from detailed simulations. Energy landscape ideas also have allowed successful protein structure prediction algorithms to be developed. The selection constraint of having funneled folding landscapes has left its imprint on the sequences of existing protein structural families. Quantitative analysis of co-evolution patterns allows us to infer the statistical characteristics of the folding landscape. These turn out to be consistent with what has been obtained from laboratory physicochemical folding experiments signalling a beautiful confluence of genomics and chemical physics. PMID:25530262

  9. How Many Protein Sequences Fold to a Given Structure? A Coevolutionary Analysis.

    PubMed

    Tian, Pengfei; Best, Robert B

    2017-10-17

    Quantifying the relationship between protein sequence and structure is key to understanding the protein universe. A fundamental measure of this relationship is the total number of amino acid sequences that can fold to a target protein structure, known as the "sequence capacity," which has been suggested as a proxy for how designable a given protein fold is. Although sequence capacity has been extensively studied using lattice models and theory, numerical estimates for real protein structures are currently lacking. In this work, we have quantitatively estimated the sequence capacity of 10 proteins with a variety of different structures using a statistical model based on residue-residue co-evolution to capture the variation of sequences from the same protein family. Remarkably, we find that even for the smallest protein folds, such as the WW domain, the number of foldable sequences is extremely large, exceeding the Avogadro constant. In agreement with earlier theoretical work, the calculated sequence capacity is positively correlated with the size of the protein, or better, the density of contacts. This allows the absolute sequence capacity of a given protein to be approximately predicted from its structure. On the other hand, the relative sequence capacity, i.e., normalized by the total number of possible sequences, is an extremely tiny number and is strongly anti-correlated with the protein length. Thus, although there may be more foldable sequences for larger proteins, it will be much harder to find them. Lastly, we have correlated the evolutionary age of proteins in the CATH database with their sequence capacity as predicted by our model. The results suggest a trade-off between the opposing requirements of high designability and the likelihood of a novel fold emerging by chance. Published by Elsevier Inc.
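
    The distinction between absolute and relative sequence capacity in the abstract above can be illustrated with simple arithmetic. The following is a minimal sketch, not the authors' statistical model: it only assumes a 20-letter amino acid alphabet, so a chain of length L has 20^L possible sequences. The chain length of 35 residues and the capacity value used in the example are illustrative assumptions, not figures taken from the paper.

```python
import math

AVOGADRO = 6.02214076e23   # Avogadro constant, 1/mol
N_AMINO_ACIDS = 20         # assumed alphabet size


def log10_total_sequences(length):
    """log10 of the number of possible sequences of a given length (20**length)."""
    return length * math.log10(N_AMINO_ACIDS)


def log10_relative_capacity(log10_capacity, length):
    """log10 of (foldable sequences / all possible sequences).

    Working in log10 avoids overflow for long chains.
    """
    return log10_capacity - log10_total_sequences(length)


if __name__ == "__main__":
    length = 35                              # hypothetical small-domain length
    log10_capacity = math.log10(AVOGADRO)    # hypothetical capacity equal to Avogadro's number
    print("log10(total sequences):  ", round(log10_total_sequences(length), 2))
    print("log10(relative capacity):", round(log10_relative_capacity(log10_capacity, length), 2))
```

    Even a capacity as large as the Avogadro constant is a vanishing fraction of the 20^L possibilities, which is the sense in which the abstract says foldable sequences are abundant in absolute terms but hard to find.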

  10. Using linear algebra for protein structural comparison and classification

    PubMed Central

    2009-01-01

    In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in. PMID:21637532
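
    The "proteins as documents, contacts as terms" idea described above can be sketched with a truncated SVD. This is a toy illustration under stated assumptions, not the authors' tool: the contact-by-protein matrix is invented, the function names are hypothetical, and the only library call relied on is NumPy's `numpy.linalg.svd`.

```python
import numpy as np


def lsi_signatures(contact_by_protein, rank):
    """Return one reduced-rank signature vector per protein (column).

    contact_by_protein: 2-D array, rows = contacts (terms), columns = proteins (documents).
    """
    _, s, vt = np.linalg.svd(contact_by_protein, full_matrices=False)
    # Scale the leading right singular vectors by their singular values.
    return (s[:rank, None] * vt[:rank]).T


def cosine_similarity(a, b):
    """Cosine of the angle between two signature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Invented toy data: proteins 0 and 1 share contacts, proteins 2 and 3 share others.
toy_contacts = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
], dtype=float)

if __name__ == "__main__":
    sig = lsi_signatures(toy_contacts, rank=2)
    print("similarity(protein 0, protein 1):", cosine_similarity(sig[0], sig[1]))
    print("similarity(protein 0, protein 2):", cosine_similarity(sig[0], sig[2]))
```

    Proteins that share a contact pattern end up with nearly parallel signatures, while proteins with disjoint patterns are close to orthogonal, which is what makes the signatures usable for retrieval.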

  11. Using linear algebra for protein structural comparison and classification.

    PubMed

    Gomide, Janaína; Melo-Minardi, Raquel; Dos Santos, Marcos Augusto; Neshich, Goran; Meira, Wagner; Lopes, Júlio César; Santoro, Marcelo

    2009-07-01

    In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in.

  12. Structural classification of small, disulfide-rich protein domains.

    PubMed

    Cheek, Sara; Krishna, S Sri; Grishin, Nick V

    2006-05-26

    Disulfide-rich domains are small protein domains whose global folds are stabilized primarily by the formation of disulfide bonds and, to a much lesser extent, by secondary structure and hydrophobic interactions. Disulfide-rich domains perform a wide variety of roles functioning as growth factors, toxins, enzyme inhibitors, hormones, pheromones, allergens, etc. These domains are commonly found both as independent (single-domain) proteins and as domains within larger polypeptides. Here, we present a comprehensive structural classification of approximately 3000 small, disulfide-rich protein domains. We find that these domains can be arranged into 41 fold groups on the basis of structural similarity. Our fold groups, which describe broader structural relationships than existing groupings of these domains, bring together representatives with previously unacknowledged similarities; 18 of the 41 fold groups include domains from several SCOP folds. Within the fold groups, the domains are assembled into families of homologs. We define 98 families of disulfide-rich domains, some of which include newly detected homologs, particularly among knottin-like domains. On the basis of this classification, we have examined cases of convergent and divergent evolution of functions performed by disulfide-rich proteins. Disulfide bonding patterns in these domains are also evaluated. Reducible disulfide bonding patterns are much less frequent, while symmetric disulfide bonding patterns are more common than expected from random considerations. Examples of variations in disulfide bonding patterns found within families and fold groups are discussed.

  13. A strategy for detecting the conservation of folding-nucleus residues in protein superfamilies.

    PubMed

    Michnick, S W; Shakhnovich, E

    1998-01-01

    Nucleation-growth theory predicts that fast-folding peptide sequences fold to their native structure via structures in a transition-state ensemble that share a small number of native contacts (the folding nucleus). Experimental and theoretical studies of proteins suggest that residues participating in folding nuclei are conserved among homologs. We attempted to determine if this is true in proteins with highly diverged sequences but identical folds (superfamilies). We describe a strategy based on comparisons of residue conservation in natural superfamily sequences with simulated sequences (generated with a Monte-Carlo sequence design strategy) for the same proteins. The basic assumptions of the strategy were that natural sequences will conserve residues needed for folding and stability plus function, the simulated sequences contain no functional conservation, and nucleus residues make native contacts with each other. Based on these assumptions, we identified seven potential nucleus residues in ubiquitin superfamily members. Non-nucleus conserved residues were also identified; these are proposed to be involved in stabilizing native interactions. We found that all superfamily members conserved the same potential nucleus residue positions, except those for which the structural topology is significantly different. Our results suggest that the conservation of the nucleus of a specific fold can be predicted by comparing designed simulated sequences with natural highly diverged sequences that fold to the same structure. We suggest that such a strategy could be used to help plan protein folding and design experiments, to identify new superfamily members, and to subdivide superfamilies further into classes having a similar folding mechanism.

  14. SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition

    PubMed Central

    Melvin, Iain; Ie, Eugene; Kuang, Rui; Weston, Jason; Stafford, William Noble; Leslie, Christina

    2007-01-01

    Background: Predicting a protein's structural class from its amino acid sequence is a fundamental problem in computational biology. Much recent work has focused on developing new representations for protein sequences, called string kernels, for use with support vector machine (SVM) classifiers. However, while some of these approaches exhibit state-of-the-art performance at the binary protein classification problem, i.e. discriminating between a particular protein class and all other classes, few of these studies have addressed the real problem of multi-class superfamily or fold recognition. Moreover, there are only limited software tools and systems for SVM-based protein classification available to the bioinformatics community. Results: We present a new multi-class SVM-based protein fold and superfamily recognition system and web server called SVM-Fold, which can be found at . Our system uses an efficient implementation of a state-of-the-art string kernel for sequence profiles, called the profile kernel, where the underlying feature representation is a histogram of inexact matching k-mer frequencies. We also employ a novel machine learning approach to solve the difficult multi-class problem of classifying a sequence of amino acids into one of many known protein structural classes. Binary one-vs-the-rest SVM classifiers that are trained to recognize individual structural classes yield prediction scores that are not comparable, so that standard "one-vs-all" classification fails to perform well. Moreover, SVMs for classes at different levels of the protein structural hierarchy may make useful predictions, but one-vs-all does not try to combine these multiple predictions. To deal with these problems, our method learns relative weights between one-vs-the-rest classifiers and encodes information about the protein structural hierarchy for multi-class prediction.
    In large-scale benchmark results based on the SCOP database, our code weighting approach significantly improves on the standard one-vs-all method for both the superfamily and fold prediction in the remote homology setting and on the fold recognition problem. Moreover, our code weight learning algorithm strongly outperforms nearest-neighbor methods based on PSI-BLAST in terms of prediction accuracy on every structure classification problem we consider. Conclusion: By combining state-of-the-art SVM kernel methods with a novel multi-class algorithm, the SVM-Fold system delivers efficient and accurate protein fold and superfamily recognition. PMID:17570145
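
    The k-mer histogram representation mentioned above can be sketched in a much simplified form. The profile kernel used by SVM-Fold counts inexact k-mer matches against sequence profiles; the sketch below is a plain exact-match spectrum kernel, swapped in here only to show the histogram-and-dot-product idea. The function names and the example strings are hypothetical.

```python
from collections import Counter


def kmer_histogram(sequence, k):
    """Count every overlapping substring of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def spectrum_kernel(seq_a, seq_b, k):
    """Dot product of the two k-mer histograms (exact matches only)."""
    hist_a = kmer_histogram(seq_a, k)
    hist_b = kmer_histogram(seq_b, k)
    return sum(count * hist_b[kmer] for kmer, count in hist_a.items())


if __name__ == "__main__":
    # Invented toy strings, not real protein sequences.
    print(spectrum_kernel("ABAB", "BABA", 2))
```

    A kernel of this kind can be passed to an SVM as a precomputed similarity matrix; the weighting of one-vs-the-rest classifiers described in the abstract is a separate step that this sketch does not cover.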

  15. A semi-analytical description of protein folding that incorporates detailed geometrical information

    PubMed Central

    Suzuki, Yoko; Noel, Jeffrey K.; Onuchic, José N.

    2011-01-01

    Much has been done to study the interplay between geometric and energetic effects on the protein folding energy landscape. Numerical techniques such as molecular dynamics simulations are able to maintain a precise geometrical representation of the protein. Analytical approaches, however, often focus on the energetic aspects of folding, including geometrical information only in an average way. Here, we investigate a semi-analytical expression of folding that explicitly includes geometrical effects. We consider a Hamiltonian corresponding to a Gaussian filament with structure-based interactions. The model captures local features of protein folding often averaged over by mean-field theories, for example, loop contact formation and excluded volume. We explore the thermodynamics and folding mechanisms of beta-hairpin and alpha-helical structures as functions of temperature and Q, the fraction of native contacts formed. Excluded volume is shown to be an important component of a protein Hamiltonian, since it both dominates the cooperativity of the folding transition and alters folding mechanisms. Understanding geometrical effects in analytical formulae will help illuminate the consequences of the approximations required for the study of larger proteins. PMID:21721664
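
    The reaction coordinate Q used above, the fraction of native contacts formed, can be computed from a conformation as in the following minimal sketch. The cutoff rule (a contact counts as formed when its distance is within 1.2 times the native distance) is a common convention assumed here for illustration; the abstract does not state which criterion the authors use. The coordinates and contact list are invented.

```python
import math


def fraction_native_contacts(coords, native_contacts, tolerance=1.2):
    """Return Q for one conformation.

    coords: list of (x, y, z) positions, one per residue.
    native_contacts: list of (i, j, native_distance) tuples.
    tolerance: assumed multiplicative cutoff on the native distance.
    """
    formed = 0
    for i, j, native_distance in native_contacts:
        if math.dist(coords[i], coords[j]) <= tolerance * native_distance:
            formed += 1
    return formed / len(native_contacts)


if __name__ == "__main__":
    # Invented toy conformation: four residues on a line.
    coords = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0)]
    native = [(0, 2, 2.0), (0, 3, 3.0), (1, 3, 2.0), (1, 2, 1.0)]
    print("Q =", fraction_native_contacts(coords, native))
```

    Q runs from 0 (no native contacts) to 1 (all native contacts formed), so thermodynamic quantities can be expressed as functions of temperature and Q, as the abstract describes.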

  16. Folding of the four-helix bundle FF domain from a compact on-pathway intermediate state is governed predominantly by water motion.

    PubMed

    Sekhar, Ashok; Vallurupalli, Pramodh; Kay, Lewis E

    2012-11-20

    Friction plays a critical role in protein folding. Frictional forces originating from random solvent and protein fluctuations both retard motion along the folding pathway and activate protein molecules to cross free energy barriers. Studies of friction thus may provide insights into the driving forces underlying protein conformational dynamics. However, the molecular origin of friction in protein folding remains poorly understood because, with the exception of the native conformer, there generally is little detailed structural information on the other states participating in the folding process. Here, we study the folding of the four-helix bundle FF domain that proceeds via a transiently formed, sparsely populated compact on-pathway folding intermediate whose structure was elucidated previously. Because the intermediate is stabilized by both native and nonnative interactions, friction in the folding transition between intermediate and folded states is expected to arise from intrachain reorganization in the protein. However, the viscosity dependencies of rates of folding from or unfolding to the intermediate, as established by relaxation dispersion NMR spectroscopy, clearly indicate that contributions from internal friction are small relative to those from solvent, so solvent frictional forces drive the folding process. Our results emphasize the importance of solvent dynamics in mediating the interconversion between protein configurations, even those that are highly compact, and in equilibrium folding/unfolding fluctuations in general.

  17. Characterization of the amino acid contribution to the folding degree of proteins.

    PubMed

    Estrada, Ernesto

    2004-03-01

    The folding degree index (Estrada, Bioinformatics 2002;18:697-704) is extended to account for the contribution of amino acids to folding. First, the mathematical formalism for extending the folding degree index is presented. Then, the amino acid contributions to folding degree of several proteins are used to analyze its relation to secondary structure. The possibilities of using these contributions in helping or checking the assignment of secondary structure to amino acids are also introduced. The influence of external factors on the amino acid contributions to folding degree is studied through the temperature effect on ribonuclease A. Finally, the analysis of 3D protein similarity through the use of amino acid contributions to folding degree is studied by selecting a series of lysozymes. These results are compared to those obtained by sequence alignment (2D similarity) and 3D superposition of the structures, showing the uniqueness of the current approach. Copyright 2004 Wiley-Liss, Inc.

  18. Flexibility damps macromolecular crowding effects on protein folding dynamics: Application to the murine prion protein (121-231)

    NASA Astrophysics Data System (ADS)

    Bergasa-Caceres, Fernando; Rabitz, Herschel A.

    2014-01-01

    A model of protein folding kinetics is applied to study the combined effects of protein flexibility and macromolecular crowding on protein folding rate and stability. It is found that the increase in stability and folding rate promoted by macromolecular crowding is damped for proteins with highly flexible native structures. The model is applied to the folding dynamics of the murine prion protein (121-231). It is found that the high flexibility of the native isoform of the murine prion protein (121-231) reduces the effects of macromolecular crowding on its folding dynamics. The relevance of these findings for the pathogenic mechanism is discussed.

  19. There and back again: Two views on the protein folding puzzle.

    PubMed

    Finkelstein, Alexei V; Badretdin, Azat J; Galzitskaya, Oxana V; Ivankov, Dmitry N; Bogatyreva, Natalya S; Garbuzynskiy, Sergiy O

    2017-07-01

    The ability of protein chains to spontaneously form their spatial structures is a long-standing puzzle in molecular biology. Experimentally measured folding times of single-domain globular proteins range from microseconds to hours: the difference (10-11 orders of magnitude) is the same as that between the life span of a mosquito and the age of the universe. This review describes physical theories of rates of overcoming the free-energy barrier separating the natively folded (N) and unfolded (U) states of protein chains in both directions: "U-to-N" and "N-to-U". In the theory of protein folding rates a special role is played by the point of thermodynamic (and kinetic) equilibrium between the native and unfolded state of the chain; here, the theory obtains the simplest form. Paradoxically, a theoretical estimate of the folding time is easier to get from consideration of protein unfolding (the "N-to-U" transition) rather than folding, because it is easier to outline a good unfolding pathway of any structure than a good folding pathway that leads to the stable fold, which is yet unknown to the folding protein chain. And since the rates of direct and reverse reactions are equal at the equilibrium point (as follows from the physical "detailed balance" principle), the estimated folding time can be derived from the estimated unfolding time. Theoretical analysis of the "N-to-U" transition outlines the range of protein folding rates in a good agreement with experiment. Theoretical analysis of folding (the "U-to-N" transition), performed at the level of formation and assembly of protein secondary structures, outlines the upper limit of protein folding times (i.e., of the time of search for the most stable fold). 
    Both theories come to essentially the same results; this is not a surprise, because they describe overcoming one and the same free-energy barrier, although the way to the top of this barrier from the side of the unfolded state is very different from the way from the side of the native state; and both theories agree with experiment. In addition, they predict the maximal size of protein domains that fold under solely thermodynamic (rather than kinetic) control and explain the observed maximal size of the "foldable" protein domains. Copyright © 2017 Elsevier B.V. All rights reserved.
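
    The detailed-balance argument in the abstract above (folding and unfolding rates are equal at the equilibrium point, so a folding time can be derived from an unfolding time) can be written out for a two-state model. This is a minimal sketch assuming simple two-state kinetics, where the ratio of folding to unfolding rates equals exp(-ΔG/RT) with ΔG = G(native) - G(unfolded); the function name and the numerical values are illustrative assumptions, not results from the review.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ/(mol*K)


def folding_rate_from_unfolding(k_unfold, delta_g_fold, temperature):
    """Two-state detailed balance: k_fold / k_unfold = exp(-dG / RT).

    k_unfold: unfolding rate (1/s).
    delta_g_fold: G(native) - G(unfolded) in kJ/mol; zero at the midpoint.
    temperature: absolute temperature in K.
    """
    return k_unfold * math.exp(-delta_g_fold / (GAS_CONSTANT * temperature))


if __name__ == "__main__":
    # At the equilibrium point (dG = 0) the two rates coincide.
    print(folding_rate_from_unfolding(5.0, 0.0, 298.0))
    # With a stable native state (dG < 0) folding is faster than unfolding.
    print(folding_rate_from_unfolding(5.0, -10.0, 298.0))
```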

  20. There and back again: Two views on the protein folding puzzle

    NASA Astrophysics Data System (ADS)

    Finkelstein, Alexei V.; Badretdin, Azat J.; Galzitskaya, Oxana V.; Ivankov, Dmitry N.; Bogatyreva, Natalya S.; Garbuzynskiy, Sergiy O.

    2017-07-01

    The ability of protein chains to spontaneously form their spatial structures is a long-standing puzzle in molecular biology. Experimentally measured folding times of single-domain globular proteins range from microseconds to hours: the difference (10-11 orders of magnitude) is the same as that between the life span of a mosquito and the age of the universe. This review describes physical theories of rates of overcoming the free-energy barrier separating the natively folded (N) and unfolded (U) states of protein chains in both directions: "U-to-N" and "N-to-U". In the theory of protein folding rates a special role is played by the point of thermodynamic (and kinetic) equilibrium between the native and unfolded state of the chain; here, the theory obtains the simplest form. Paradoxically, a theoretical estimate of the folding time is easier to get from consideration of protein unfolding (the "N-to-U" transition) rather than folding, because it is easier to outline a good unfolding pathway of any structure than a good folding pathway that leads to the stable fold, which is yet unknown to the folding protein chain. And since the rates of direct and reverse reactions are equal at the equilibrium point (as follows from the physical "detailed balance" principle), the estimated folding time can be derived from the estimated unfolding time. Theoretical analysis of the "N-to-U" transition outlines the range of protein folding rates in a good agreement with experiment. Theoretical analysis of folding (the "U-to-N" transition), performed at the level of formation and assembly of protein secondary structures, outlines the upper limit of protein folding times (i.e., of the time of search for the most stable fold).
    Both theories come to essentially the same results; this is not a surprise, because they describe overcoming one and the same free-energy barrier, although the way to the top of this barrier from the side of the unfolded state is very different from the way from the side of the native state; and both theories agree with experiment. In addition, they predict the maximal size of protein domains that fold under solely thermodynamic (rather than kinetic) control and explain the observed maximal size of the "foldable" protein domains.

  1. Confinement in nanopores can destabilize α-helix folding proteins and stabilize the β structures

    NASA Astrophysics Data System (ADS)

    Javidpour, Leili; Sahimi, Muhammad

    2011-09-01

    Protein folding in confined media has attracted wide attention over the past decade due to its importance in both in vivo and in vitro applications. Currently, it is generally believed that protein stability increases by decreasing the size of the confining medium, if its interaction with the confining walls is repulsive, and that the maximum folding temperature in confinement occurs for a pore size only slightly larger than the smallest dimension of the folded state of a protein. Protein stability in pore sizes, very close to the size of the folded state, has not however received the attention that it deserves. Using detailed, 0.3-ms-long molecular dynamics simulations, we show that proteins with an α-helix native state can have an optimal folding temperature in pore sizes that do not affect the folded-state structure. In contradiction to the current theoretical explanations, we find that the maximum folding temperature occurs in larger pores for smaller α-helices. In highly confined pores the free energy surface becomes rough, and a new barrier for protein folding may appear close to the unfolded state. In addition, in small nanopores the protein states that contain the β structures are entropically stabilized, in contrast to the bulk. As a consequence, folding rates decrease notably and the free energy surface becomes rougher. The results shed light on many recent experimental observations that cannot be explained by the current theories, and demonstrate the importance of entropic effects on proteins' misfolded states in highly confined environments. They also support the concept of passive effect of chaperonin GroEL on protein folding by preventing it from aggregation in crowded environment of biological cells, and provide deeper clues to the α → β conformational transition, believed to contribute to Alzheimer's and Parkinson's diseases. 
    The strategy of protein and enzyme stabilization in confined media may also have to be revisited in the case of tight confinement. For in silico studies of protein folding in confined media, use of non-Go potentials may be more appropriate.

  2. When a domain isn’t a domain, and why it’s important to properly filter proteins in databases

    PubMed Central

    Towse, Clare-Louise; Daggett, Valerie

    2013-01-01

    Summary: Membership in a protein domain database does not a domain make; a feature we realized when generating a consensus view of protein fold space with our Consensus Domain Dictionary (CDD). This dictionary was used to select representative structures for characterization of the protein dynameome: the Dynameomics initiative. Through this endeavor we rejected a surprising 40% of the 1695 folds in the CDD as being non-autonomous folding units. Although some of this was due to the challenges of grouping similar fold topologies, the dissonance between the cataloguing and structural qualification of protein domains remains surprising. Another potential factor is previously overlooked intrinsic disorder; predicted estimates suggest 40% of proteins to have either local or global disorder. One thing is clear, filtering a structural database and ensuring a consistent definition for protein domains is crucial, and caution is prescribed when generalizations of globular domains are drawn from unfiltered protein domain datasets. PMID:23108912

  3. Circuit topology of proteins and nucleic acids.

    PubMed

    Mashaghi, Alireza; van Wijk, Roeland J; Tans, Sander J

    2014-09-02

    Folded biomolecules display a bewildering structural complexity and diversity. They have therefore been analyzed in terms of generic topological features. For instance, folded proteins may be knotted, have beta-strands arranged into a Greek-key motif, or display high contact order. In this perspective, we present a method to formally describe the topology of all folded linear chains and hence provide a general classification and analysis framework for a range of biomolecules. Moreover, by identifying the fundamental rules that intrachain contacts must obey, the method establishes the topological constraints of folded linear chains. We also briefly illustrate how this circuit topology notion can be applied to study the equivalence of folded chains, the engineering of artificial RNA structures and DNA origami, the topological structure of genomes, and the role of topology in protein folding. Copyright © 2014 Elsevier Ltd. All rights reserved.
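
    The rules that pairs of intrachain contacts must obey can be illustrated with a small classifier. In the circuit-topology literature, two contacts on a linear chain are commonly described as being in series, in parallel, or crossed; the abstract above does not spell these relations out, so the definitions below are stated as an assumption. The function name and residue indices are hypothetical.

```python
def contact_relation(contact_a, contact_b):
    """Classify the arrangement of two contacts on a linear chain.

    Each contact is a pair of residue indices. Assumed definitions:
      series:   the two contacts do not overlap along the chain (i < j < k < l)
      parallel: one contact is nested inside the other       (i < k < l < j)
      cross:    the contacts interleave                       (i < k < j < l)
    Contacts that share a residue are reported separately.
    """
    (i, j), (k, l) = sorted([tuple(sorted(contact_a)), tuple(sorted(contact_b))])
    if len({i, j, k, l}) < 4:
        return "shared site"
    if j < k:
        return "series"
    if l < j:
        return "parallel"
    return "cross"


if __name__ == "__main__":
    print(contact_relation((1, 5), (7, 10)))   # non-overlapping
    print(contact_relation((1, 10), (3, 6)))   # nested
    print(contact_relation((1, 6), (3, 10)))   # interleaved
```

    Applying such a classifier to every pair of contacts gives a relation matrix that can be compared across chains, which is one way to approach the equivalence question raised in the abstract.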

  4. The PYRIN domain: A member of the death domain-fold superfamily

    PubMed Central

    Fairbrother, Wayne J.; Gordon, Nathaniel C.; Humke, Eric W.; O'Rourke, Karen M.; Starovasnik, Melissa A.; Yin, Jian-Ping; Dixit, Vishva M.

    2001-01-01

    PYRIN domains were identified recently as putative protein–protein interaction domains at the N-termini of several proteins thought to function in apoptotic and inflammatory signaling pathways. The ∼95 residue PYRIN domains have no statistically significant sequence homology to proteins with known three-dimensional structure. Using secondary structure prediction and potential-based fold recognition methods, however, the PYRIN domain is predicted to be a member of the six-helix bundle death domain-fold superfamily that includes death domains (DDs), death effector domains (DEDs), and caspase recruitment domains (CARDs). Members of the death domain-fold superfamily are well established mediators of protein–protein interactions found in many proteins involved in apoptosis and inflammation, indicating further that the PYRIN domains serve a similar function. An homology model of the PYRIN domain of CARD7/DEFCAP/NAC/NALP1, a member of the Apaf-1/Ced-4 family of proteins, was constructed using the three-dimensional structures of the FADD and p75 neurotrophin receptor DDs, and of the Apaf-1 and caspase-9 CARDs, as templates. Validation of the model using a variety of computational techniques indicates that the fold prediction is consistent with the sequence. Comparison of a circular dichroism spectrum of the PYRIN domain of CARD7/DEFCAP/NAC/NALP1 with spectra of several proteins known to adopt the death domain-fold provides experimental support for the structure prediction. PMID:11514682

  5. A consensus view of fold space: Combining SCOP, CATH, and the Dali Domain Dictionary

    PubMed Central

    Day, Ryan; Beck, David A.C.; Armen, Roger S.; Daggett, Valerie

    2003-01-01

    We have determined consensus protein-fold classifications on the basis of three classification methods, SCOP, CATH, and Dali. These classifications make use of different methods of defining and categorizing protein folds that lead to different views of protein-fold space. Pairwise comparisons of domains on the basis of their fold classifications show that much of the disagreement between the classification systems is due to differing domain definitions rather than assigning the same domain to different folds. However, there are significant differences in the fold assignments between the three systems. These remaining differences can be explained primarily in terms of the breadth of the fold classifications. Many structures may be defined as having one fold in one system, whereas far fewer are defined as having the analogous fold in another system. By comparing these folds for a nonredundant set of proteins, the consensus method breaks up broad fold classifications and combines restrictive fold classifications into metafolds, creating, in effect, an averaged view of fold space. This averaged view requires that the structural similarities between proteins having the same metafold be recognized by multiple classification systems. Thus, the consensus map is useful for researchers looking for fold similarities that are relatively independent of the method used to compare proteins. The 30 most populated metafolds, representing the folds of about half of a nonredundant subset of the PDB, are presented here. The full list of metafolds is presented on the Web. PMID:14500873

  6. A consensus view of fold space: combining SCOP, CATH, and the Dali Domain Dictionary.

    PubMed

    Day, Ryan; Beck, David A C; Armen, Roger S; Daggett, Valerie

    2003-10-01

    We have determined consensus protein-fold classifications on the basis of three classification methods, SCOP, CATH, and Dali. These classifications make use of different methods of defining and categorizing protein folds that lead to different views of protein-fold space. Pairwise comparisons of domains on the basis of their fold classifications show that much of the disagreement between the classification systems is due to differing domain definitions rather than assigning the same domain to different folds. However, there are significant differences in the fold assignments between the three systems. These remaining differences can be explained primarily in terms of the breadth of the fold classifications. Many structures may be defined as having one fold in one system, whereas far fewer are defined as having the analogous fold in another system. By comparing these folds for a nonredundant set of proteins, the consensus method breaks up broad fold classifications and combines restrictive fold classifications into metafolds, creating, in effect, an averaged view of fold space. This averaged view requires that the structural similarities between proteins having the same metafold be recognized by multiple classification systems. Thus, the consensus map is useful for researchers looking for fold similarities that are relatively independent of the method used to compare proteins. The 30 most populated metafolds, representing the folds of about half of a nonredundant subset of the PDB, are presented here. The full list of metafolds is presented on the Web.

  7. Conservative mutation Met8 --> Leu affects the folding process and structural stability of squash trypsin inhibitor CMTI-I.

    PubMed Central

    Zhukov, I.; Jaroszewski, L.; Bierzyński, A.

    2000-01-01

    Protein molecules can accommodate a large number of mutations without noticeable effects on their stability and folding kinetics. On the other hand, some mutations can have quite strong effects on protein conformational properties. Such mutations either destabilize secondary structures, e.g., alpha-helices, are incompatible with close packing of protein hydrophobic cores, or lead to disruption of some specific interactions such as disulfide cross links, salt bridges, hydrogen bonds, or aromatic-aromatic contacts. The Met8 --> Leu mutation in CMTI-I results in significant destabilization of the protein structure. This effect could hardly be expected since the mutation is highly conservative, and the side chain of residue 8 is situated on the protein surface. We show that the protein destabilization is caused by rearrangement of a hydrophobic cluster formed by side chains of residues 8, Ile6, and Leu17 that leads to partial breaking of a hydrogen bond formed by the amide group of Leu17 with water and to a reduction of a hydrophobic surface buried within the cluster. The mutation also perturbs protein folding. Under aerobic conditions the reduced wild-type protein folds effectively into its native structure, whereas more than 75% of the mutant molecules are trapped in various misfolded species. The main conclusion of this work is that conservative mutations of hydrophobic residues can destabilize a protein structure even if these residues are situated on the protein surface and partially accessible to water. Structural rearrangement of small hydrophobic clusters formed by such residues can lead to local changes in protein hydration and, consequently, can considerably affect protein stability and the folding process. PMID:10716179

  8. Structural Transitions of Confined Model Proteins: Molecular Dynamics Simulation and Experimental Validation

    PubMed Central

    Lu, Diannan; Liu, Zheng; Wu, Jianzhong

    2006-01-01

    Proteins fold in a confined space not only in vivo, i.e., folding assisted by molecular chaperons and chaperonins in a crowded cellular medium, but also in vitro as in production of recombinant proteins. Despite extensive work on protein folding in bulk, little is known about how and to what extent the thermodynamics and kinetics of protein folding are altered by confinement. In this work, we use a Gō-like off-lattice model to investigate the folding and stability of an all β-sheet protein in spherical cages of different sizes and surface hydrophobicity. We find whereas extreme confinement inhibits correct folding, a hydrophilic cage stabilizes the protein due to restriction of the unfolded configurations. In a hydrophobic cage, however, strong attraction from the cage surface destabilizes the confined protein because of competition between self-aggregation and adsorption of hydrophobic residues. We show that the kinetics of protein collapse and folding is strongly correlated with both the cage size and the surface hydrophobicity. It is demonstrated that a cage of moderate size and hydrophobicity optimizes both the folding yield and kinetics of structural transitions. To support the simulation results, we have also investigated the refolding of hen-egg lysozyme in the presence of cetyltrimethylammoniumbromide (CTAB) surfactants that provide an effective confinement of the proteins by micellization. The influence of the surfactant hydrophobicity on the structural and biological activity of the protein is determined with circular dichroism spectrum, fluorescence emission spectrum, and biological activity assay. It is shown that, as predicted by coarse-grained simulations, CTAB micelles facilitate the collapse of denatured lysozyme, whereas the addition of β-cyclodextrin-grafted-PNIPAAm, a weakly hydrophobic stripper, dissociates CTAB micelles and promotes the conformational rearrangement and thereby gives an improved recovery of lysozyme activity. PMID:16461405

  9. Isolation, folding and structural investigations of the amino acid transporter OEP16

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ni, Da Qun; Zook, James; Klewer, Douglas A.

    2011-12-01

    Membrane proteins compose more than 30% of all proteins in the living cell. However, many membrane proteins have low abundance in the cell and cannot be isolated from natural sources in concentrations suitable for structure analysis. The overexpression, reconstitution, and stabilization of membrane proteins are complex and remain a formidable challenge in membrane protein characterization. Here we describe a novel, in vitro folding procedure for a cation-selective channel protein, the outer envelope membrane protein 16 (OEP16) of pea chloroplast, overexpressed in Escherichia coli in the form of inclusion bodies. The protein is purified and then folded with detergent on a Ni-NTA affinity column. Final concentrations of reconstituted OEP16 of up to 24 mg/ml have been achieved, which provides samples that are sufficient for structural studies by NMR and crystallography. Reconstitution of OEP16 in detergent micelles was monitored by circular dichroism, fluorescence, and NMR spectroscopy. Tryptophan fluorescence spectra of heterologous expressed OEP16 in micelles are similar to spectra of functionally active OEP16 in liposomes, which indicates folding of the membrane protein in detergent micelles. CD spectroscopy studies demonstrate a folded protein consisting primarily of α-helices. ¹⁵N-HSQC NMR spectra also provide evidence for a folded protein. We present here a convenient, effective and quantitative method to screen large numbers of conditions for optimal protein stability by using microdialysis chambers in combination with fluorescence spectroscopy. Recent collection of multidimensional NMR data at 500, 600 and 800 MHz demonstrated that the protein is suitable for structure determination by NMR and stable for weeks during data collection.

  10. Isolation, folding and structural investigations of the amino acid transporter OEP16.

    PubMed

    Ni, Da Qun; Zook, James; Klewer, Douglas A; Nieman, Ronald A; Soll, J; Fromme, Petra

    2011-12-01

    Membrane proteins compose more than 30% of all proteins in the living cell. However, many membrane proteins have low abundance in the cell and cannot be isolated from natural sources in concentrations suitable for structure analysis. The overexpression, reconstitution, and stabilization of membrane proteins are complex and remain a formidable challenge in membrane protein characterization. Here we describe a novel, in vitro folding procedure for a cation-selective channel protein, the outer envelope membrane protein 16 (OEP16) of pea chloroplast, overexpressed in Escherichia coli in the form of inclusion bodies. The protein is purified and then folded with detergent on a Ni-NTA affinity column. Final concentrations of reconstituted OEP16 of up to 24 mg/ml have been achieved, which provides samples that are sufficient for structural studies by NMR and crystallography. Reconstitution of OEP16 in detergent micelles was monitored by circular dichroism, fluorescence, and NMR spectroscopy. Tryptophan fluorescence spectra of heterologous expressed OEP16 in micelles are similar to spectra of functionally active OEP16 in liposomes, which indicates folding of the membrane protein in detergent micelles. CD spectroscopy studies demonstrate a folded protein consisting primarily of α-helices. ¹⁵N-HSQC NMR spectra also provide evidence for a folded protein. We present here a convenient, effective and quantitative method to screen large numbers of conditions for optimal protein stability by using microdialysis chambers in combination with fluorescence spectroscopy. Recent collection of multidimensional NMR data at 500, 600 and 800 MHz demonstrated that the protein is suitable for structure determination by NMR and stable for weeks during data collection. Copyright © 2011. Published by Elsevier Inc.

  11. The complex folding pathways of protein A suggest a multiple-funnelled energy landscape

    NASA Astrophysics Data System (ADS)

    St-Pierre, Jean-Francois; Mousseau, Normand; Derreumaux, Philippe

    2008-01-01

    Folding proteins into their native states requires the formation of both secondary and tertiary structures. Many questions remain, however, as to whether these form into a precise order, and various pictures have been proposed that place the emphasis on the first or the second level of structure in describing folding. One of the favorite test models for studying this question is the B domain of protein A, which has been characterized by numerous experiments and simulations. Using the activation-relaxation technique coupled with a generic energy model (optimized potential for efficient peptide structure prediction), we generate more than 50 folding trajectories for this 60-residue protein. While the folding pathways to the native state are fully consistent with the funnel-like description of the free energy landscape, we find a wide range of mechanisms in which secondary and tertiary structures form in various orders. Our nonbiased simulations also reveal the presence of a significant number of non-native β and α conformations both on and off pathway, including the visit, for a non-negligible fraction of trajectories, of fully ordered structures resembling the native state of nonhomologous proteins.

  12. The porous borders of the protein world.

    PubMed

    Cordes, Matthew H J; Stewart, Katie L

    2012-02-08

    Fold switching may play a role in the evolution of new protein folds and functions. He et al., in this issue of Structure, use protein design to illustrate that the same drastic change in a protein fold can occur via multiple different mutational pathways. Copyright © 2012 Elsevier Ltd. All rights reserved.

  13. A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3

    PubMed Central

    Dietmann, Sabine; Park, Jong; Notredame, Cedric; Heger, Andreas; Lappe, Michael; Holm, Liisa

    2001-01-01

    The Dali Domain Dictionary (http://www.ebi.ac.uk/dali/domain) is a numerical taxonomy of all known structures in the Protein Data Bank (PDB). The taxonomy is derived fully automatically from measurements of structural, functional and sequence similarities. Here, we report the extension of the classification to match the traditional four hierarchical levels corresponding to: (i) supersecondary structural motifs (attractors in fold space), (ii) the topology of globular domains (fold types), (iii) remote homologues (functional families) and (iv) homologues with sequence identity above 25% (sequence families). The computational definitions of attractors and functional families are new. In September 2000, the Dali classification contained 10 531 PDB entries comprising 17 101 chains, which were partitioned into five attractor regions, 1375 fold types, 2582 functional families and 3724 domain sequence families. Sequence families were further associated with 99 582 unique homologous sequences in the HSSP database, which increases the number of effectively known structures several-fold. The resulting database contains the description of protein domain architecture, the definition of structural neighbours around each known structure, the definition of structurally conserved cores and a comprehensive library of explicit multiple alignments of distantly related protein families. PMID:11125048

  14. A novel member of the split betaalphabeta fold: Solution structure of the hypothetical protein YML108W from Saccharomyces cerevisiae.

    PubMed

    Pineda-Lucena, Antonio; Liao, Jack C C; Cort, John R; Yee, Adelinda; Kennedy, Michael A; Edwards, Aled M; Arrowsmith, Cheryl H

    2003-05-01

    As part of the Northeast Structural Genomics Consortium pilot project focused on small eukaryotic proteins and protein domains, we have determined the NMR structure of the protein encoded by ORF YML108W from Saccharomyces cerevisiae. YML108W belongs to one of the numerous structural proteomics targets whose biological function is unknown. Moreover, this protein does not have sequence similarity to any other protein. The NMR structure of YML108W consists of a four-stranded beta-sheet with strand order 2143 and two alpha-helices, with an overall topology of betabetaalphabetabetaalpha. Strand beta1 runs parallel to beta4, and beta2:beta1 and beta4:beta3 pairs are arranged in an antiparallel fashion. Although this fold belongs to the split betaalphabeta family, it appears to be unique among this family; it is a novel arrangement of secondary structure, thereby expanding the universe of protein folds.

  15. Revealing the global map of protein folding space by large-scale simulations

    NASA Astrophysics Data System (ADS)

    Sinner, Claude; Lutz, Benjamin; Verma, Abhinav; Schug, Alexander

    2015-12-01

    The full characterization of protein folding is a remarkable long-standing challenge both for experiment and simulation. Working towards a complete understanding of this process, one needs to cover the full diversity of existing folds and identify the general principles driving the process. Here, we want to understand and quantify the diversity in folding routes for a large and representative set of protein topologies covering the full range from all alpha helical topologies towards beta barrels guided by the key question: Does the majority of the observed routes contribute to the folding process or only a particular route? We identified a set of two-state folders among non-homologous proteins with a sequence length of 40-120 residues. For each of these proteins, we ran native-structure based simulations both with homogeneous and heterogeneous contact potentials. For each protein, we simulated dozens of folding transitions in continuous uninterrupted simulations and constructed a large database of kinetic parameters. We investigate folding routes by tracking the formation of tertiary structure interfaces and discuss whether a single specific route exists for a topology or if all routes are equiprobable. These results permit us to characterize the complete folding space for small proteins in terms of folding barrier ΔG‡, number of routes, and the route specificity RT.

  16. Aromatic claw: A new fold with high aromatic content that evades structural prediction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sachleben, Joseph R.; Adhikari, Aashish N.; Gawlak, Grzegorz

    2016-11-10

    We determined the NMR structure of a highly aromatic (13%) protein of unknown function, Aq1974 from Aquifex aeolicus (PDB ID: 5SYQ). The unusual sequence of this protein has a tryptophan content five times the normal (six tryptophan residues of 114 or 5.2% while the average tryptophan content is 1.0%) with the tryptophans occurring in a WXW motif. It has no detectable sequence homology with known protein structures. Although its NMR spectrum suggested that the protein was rich in β-sheet, upon resonance assignment and solution structure determination, the protein was found to be primarily α-helical with a small two-stranded β-sheet with a novel fold that we have termed an Aromatic Claw. As this fold was previously unknown and the sequence unique, we submitted the sequence to CASP10 as a target for blind structural prediction. At the end of the competition, the sequence was classified a hard template based model; the structural relationship between the template and the experimental structure was small and the predictions all failed to predict the structure. CSRosetta was found to predict the secondary structure and its packing; however, it was found that there was little correlation between CSRosetta score and the RMSD between the CSRosetta structure and the NMR determined one. This work demonstrates that even in relatively small proteins, we do not yet have the capacity to accurately predict the fold for all primary sequences. The experimental discovery of new folds helps guide the improvement of structural prediction methods.

  17. Direct protein photoinduced conformational changes using porphyrins.

    NASA Astrophysics Data System (ADS)

    Brancaleon, Lorenzo; Silva, Ivan; Fernandez, Nicholas; Johnson, Eric; Sansone, Samuel

    2008-03-01

    Most protein functions depend on the interaction with other ligands. These interactions depend on uniquely structured binding sites formed by the folding of the proteins. Ligands can often prompt intended as well as "accidental" protein structural changes. One can foresee that the ability to prompt and control post-translational protein folding could be a powerful tool to investigate protein folding mechanisms but also to inhibit certain proteins or induce new properties in proteins. One possible way to produce such structural disruption is the combination of light and photoactive ligands. This option has been investigated in recent years by exploiting photoisomerization and other properties of non-physiological dyes. We used an alternative approach which uses porphyrins as the "triggers" of structural changes. The advantage of porphyrins is that they can be found naturally in living cells. The photophysical properties of porphyrins can induce local as well as long range effects on the structure of the bound protein. Porphyrins are known to produce structural changes in porphyrin-specific proteins; however, the novelty of our results is that we demonstrated that these dyes can also produce structural changes in non-porphyrin-specific globular proteins. We will present an overview of our research to date in this field and its potential applications.

  18. Antibody Epitope Analysis to Investigate Folded Structure, Allosteric Conformation, and Evolutionary Lineage of Proteins.

    PubMed

    Wong, Sienna; Jin, J-P

    2017-01-01

    Study of the folded structure of proteins provides insights into their biological functions, conformational dynamics and molecular evolution. Current methods of elucidating the folded structure of proteins are laborious, low-throughput, and constrained by various limitations. Arising from these methods is the need for a sensitive, quantitative, rapid and high-throughput method not only to analyse the folded structure of proteins, but also to monitor dynamic changes under physiological or experimental conditions. In this focused review, we outline the foundation and limitations of current protein structure-determination methods prior to discussing the advantages of an emerging antibody epitope analysis for applications in structural, conformational and evolutionary studies of proteins. We discuss the application of this method using representative examples in monitoring allosteric conformation of regulatory proteins and the determination of the evolutionary lineage of related proteins and protein isoforms. The versatility of the method described herein is validated by the ability to modulate a variety of assay parameters to meet the needs of the user in order to monitor protein conformation. Furthermore, the assay has been used to clarify the lineage of troponin isoforms beyond what has been depicted by sequence homology alone, demonstrating the nonlinear evolutionary relationship between primary structure and tertiary structure of proteins. The antibody epitope analysis method is a highly adaptable technique of protein conformation elucidation, which can be easily applied without the need for specialized equipment or technical expertise. When applied in a systematic and strategic manner, this method has the potential to reveal novel and biomedically meaningful information for structure-function relationship and evolutionary lineage of proteins. Copyright © Bentham Science Publishers.

  19. Investigation of protein folding by coarse-grained molecular dynamics with the UNRES force field.

    PubMed

    Maisuradze, Gia G; Senet, Patrick; Czaplewski, Cezary; Liwo, Adam; Scheraga, Harold A

    2010-04-08

    Coarse-grained molecular dynamics simulations offer a dramatic extension of the time-scale of simulations compared to all-atom approaches. In this article, we describe the use of the physics-based united-residue (UNRES) force field, developed in our laboratory, in protein-structure simulations. We demonstrate that this force field offers about a 4000-times extension of the simulation time scale; this feature arises both from averaging out the fast-moving degrees of freedom and reduction of the cost of energy and force calculations compared to all-atom approaches with explicit solvent. With massively parallel computers, microsecond folding simulation times of proteins containing about 1000 residues can be obtained in days. A straightforward application of canonical UNRES/MD simulations, demonstrated with the example of the N-terminal part of the B-domain of staphylococcal protein A (PDB code: 1BDD, a three-alpha-helix bundle), discerns the folding mechanism and determines kinetic parameters by parallel simulations of several hundred or more trajectories. Use of generalized-ensemble techniques, of which the multiplexed replica exchange method proved to be the most effective, enables us to compute thermodynamics of folding and carry out fully physics-based prediction of protein structure, in which the predicted structure is determined as a mean over the most populated ensemble below the folding-transition temperature. By using principal component analysis of the UNRES folding trajectories of the formin-binding protein WW domain (PDB code: 1E0L; a three-stranded antiparallel beta-sheet) and 1BDD, we identified representative structures along the folding pathways and demonstrated that only a few (low-indexed) principal components can capture the main structural features of a protein-folding trajectory; the potentials of mean force calculated along these essential modes exhibit multiple minima, as opposed to those along the remaining modes that are unimodal. 
    In addition, a comparison between the structures that are representative of the minima in the free-energy profile along the essential collective coordinates of protein folding (computed by principal component analysis) and the free-energy profile projected along the virtual-bond dihedral angles gamma of the backbone revealed the key residues involved in the transitions between the different basins of the folding free-energy profile, in agreement with existing experimental data for 1E0L.

  20. Decoding Structural Properties of a Partially Unfolded Protein Substrate: En Route to Chaperone Binding.

    PubMed

    Nagpal, Suhani; Tiwari, Satyam; Mapa, Koyeli; Thukral, Lipi

    2015-01-01

    Many proteins with complex topologies require molecular chaperones to achieve their unique three-dimensional folded structure. The E. coli chaperone GroEL binds a large number of unfolded and partially folded proteins to facilitate proper folding and prevent misfolding and aggregation. Although the major structural components of GroEL are well defined, scaffolds of the non-native substrates that determine chaperone-mediated folding have been difficult to recognize. Here we performed all-atomistic and replica-exchange molecular dynamics simulations to dissect the non-native ensemble of an obligate GroEL folder, DapA. Thermodynamics analyses of unfolding simulations revealed populated intermediates with distinct structural characteristics. We found that surface exposed hydrophobic patches are significantly increased, primarily contributed from native and non-native β-sheet elements. We validate the structural properties of these conformers using experimental data, including circular dichroism (CD), 1-anilinonaphthalene-8-sulfonic acid (ANS) binding measurements and previously reported hydrogen-deuterium exchange coupled to mass spectrometry (HDX-MS). Further, we constructed network graphs to elucidate long-range intra-protein connectivity of native and intermediate topologies, demonstrating regions that serve as central "hubs". Overall, our results imply that genomic variations (or mutations) in the distinct regions of protein structures might disrupt these topological signatures, disabling chaperone-mediated folding and leading to formation of aggregates.
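
    Replica-exchange simulations of the kind mentioned above periodically attempt to swap configurations between replicas held at different temperatures. The sketch below shows the standard Metropolis acceptance test for such a swap. The function name `swap_probability`, the choice of kcal/mol and kelvin as units, and the example energies and temperatures are assumptions for illustration and are not parameters from the study.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def swap_probability(e_i, t_i, e_j, t_j):
    """Metropolis probability of exchanging replicas i and j.

    e_i, e_j: potential energies (kcal/mol); t_i, t_j: temperatures (K).
    """
    beta_i = 1.0 / (KB * t_i)
    beta_j = 1.0 / (KB * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    # Accept unconditionally when delta >= 0, else with probability exp(delta).
    return 1.0 if delta >= 0 else math.exp(delta)


if __name__ == "__main__":
    # Hypothetical neighbouring replicas at 300 K and 310 K.
    print(swap_probability(-120.0, 300.0, -100.0, 310.0))
```

    A swap is always accepted when the colder replica currently holds the higher-energy configuration; otherwise it is accepted with a probability that shrinks as the temperature gap or the energy gap grows.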

  1. TOUCHSTONE II: a new approach to ab initio protein structure prediction.

    PubMed

    Zhang, Yang; Kolinski, Andrzej; Skolnick, Jeffrey

    2003-08-01

    We have developed a new combined approach for ab initio protein structure prediction. The protein conformation is described as a lattice chain connecting C(alpha) atoms, with attached C(beta) atoms and side-chain centers of mass. The model force field includes various short-range and long-range knowledge-based potentials derived from a statistical analysis of the regularities of protein structures. The combination of these energy terms is optimized through the maximization of correlation for 30 × 60,000 decoys between the root mean square deviation (RMSD) to native and energies, as well as the energy gap between native and the decoy ensemble. To accelerate the conformational search, a newly developed parallel hyperbolic sampling algorithm with a composite movement set is used in the Monte Carlo simulation processes. We exploit this strategy to successfully fold 41/100 small proteins (36-120 residues) with predicted structures having a RMSD from native below 6.5 Å in the top five cluster centroids. To fold larger-size proteins as well as to improve the folding yield of small proteins, we incorporate into the basic force field side-chain contact predictions from our threading program PROSPECTOR where homologous proteins were excluded from the data base. With these threading-based restraints, the program can fold 83/125 test proteins (36-174 residues) with structures having a RMSD to native below 6.5 Å in the top five cluster centroids. This shows the significant improvement of folding by using predicted tertiary restraints, especially when the accuracy of side-chain contact prediction is >20%. For native fold selection, we introduce quantities dependent on the cluster density and the combination of energy and free energy, which show a higher discriminative power to select the native structure than the previously used cluster energy or cluster size, and which can be used in native structure identification in blind simulations. These procedures are readily automated and are being implemented on a genomic scale.
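
    The abstract describes tuning the force field by maximizing the correlation between decoy energies and RMSD to native. As a minimal sketch of the quantity involved, the code below computes a Pearson correlation coefficient for a set of decoys. The function name `pearson` and the decoy values are hypothetical, and the abstract does not state which correlation measure the authors used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical decoys: RMSD to native (angstroms) and model energy (arbitrary units).
    rmsd = [2.1, 3.5, 4.8, 6.0, 7.9, 9.4]
    energy = [-310.0, -295.0, -288.0, -270.0, -262.0, -240.0]
    print(pearson(rmsd, energy))
```

    A well-behaved energy function gives a strongly positive value here, meaning that decoys closer to the native structure tend to have lower energy.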

  2. Single-Molecule FRET Spectroscopy and the Polymer Physics of Unfolded and Intrinsically Disordered Proteins.

    PubMed

    Schuler, Benjamin; Soranno, Andrea; Hofmann, Hagen; Nettels, Daniel

    2016-07-05

    The properties of unfolded proteins have long been of interest because of their importance to the protein folding process. Recently, the surprising prevalence of unstructured regions or entirely disordered proteins under physiological conditions has led to the realization that such intrinsically disordered proteins can be functional even in the absence of a folded structure. However, owing to their broad conformational distributions, many of the properties of unstructured proteins are difficult to describe with the established concepts of structural biology. We have thus seen a reemergence of polymer physics as a versatile framework for understanding their structure and dynamics. An important driving force for these developments has been single-molecule spectroscopy, as it allows structural heterogeneity, intramolecular distance distributions, and dynamics to be quantified over a wide range of timescales and solution conditions. Polymer concepts provide an important basis for relating the physical properties of unstructured proteins to folding and function.

  3. Absolute comparison of simulated and experimental protein-folding dynamics

    NASA Astrophysics Data System (ADS)

    Snow, Christopher D.; Nguyen, Houbi; Pande, Vijay S.; Gruebele, Martin

    2002-11-01

    Protein folding is difficult to simulate with classical molecular dynamics. Secondary structure motifs such as α-helices and β-hairpins can form in 0.1-10µs (ref. 1), whereas small proteins have been shown to fold completely in tens of microseconds. The longest folding simulation to date is a single 1-µs simulation of the villin headpiece; however, such single runs may miss many features of the folding process as it is a heterogeneous reaction involving an ensemble of transition states. Here, we have used a distributed computing implementation to produce tens of thousands of 5-20-ns trajectories (700µs) to simulate mutants of the designed mini-protein BBA5. The fast relaxation dynamics these predict were compared with the results of laser temperature-jump experiments. Our computational predictions are in excellent agreement with the experimentally determined mean folding times and equilibrium constants. The rapid folding of BBA5 is due to the swift formation of secondary structure. The convergence of experimentally and computationally accessible timescales will allow the comparison of absolute quantities characterizing in vitro and in silico (computed) protein folding.
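
    One way in which many short trajectories can yield a folding rate, even though each run is far shorter than the mean folding time, is to assume single-exponential (two-state) kinetics, so that the fraction folded after time t is 1 - exp(-kt). The sketch below inverts that relation. Whether the authors used exactly this estimator is not stated in the abstract; the function name `rate_from_short_runs` and all numbers are hypothetical.

```python
import math


def rate_from_short_runs(n_folded, n_total, t_ns, exact=True):
    """Estimate a folding rate (per ns) from the fraction of runs that folded by t_ns.

    Assumes two-state kinetics: P_folded(t) = 1 - exp(-k * t).
    """
    if not (0 <= n_folded < n_total) or t_ns <= 0:
        raise ValueError("need 0 <= n_folded < n_total and t_ns > 0")
    fraction = n_folded / n_total
    if exact:
        return -math.log(1.0 - fraction) / t_ns
    # Linear approximation, valid when k * t is much smaller than 1.
    return fraction / t_ns


if __name__ == "__main__":
    # Hypothetical ensemble: 10 of 10,000 runs fold within 20 ns.
    k = rate_from_short_runs(10, 10_000, 20.0)
    print("rate per ns:", k, "mean folding time (us):", 1.0 / k / 1000.0)
```

    Under these assumptions, a small fraction of folding events among thousands of short runs is enough to estimate a mean folding time on the order of tens of microseconds.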

  4. Learning To Fold Proteins Using Energy Landscape Theory

    PubMed Central

    Schafer, N.P.; Kim, B.L.; Zheng, W.; Wolynes, P.G.

    2014-01-01

    This review is a tutorial for scientists interested in the problem of protein structure prediction, particularly those interested in using coarse-grained molecular dynamics models that are optimized using lessons learned from the energy landscape theory of protein folding. We also present a review of the results of the AMH/AMC/AMW/AWSEM family of coarse-grained molecular dynamics protein folding models to illustrate the points covered in the first part of the article. Accurate coarse-grained structure prediction models can be used to investigate a wide range of conceptual and mechanistic issues outside of protein structure prediction; specifically, the paper concludes by reviewing how AWSEM has in recent years been able to elucidate questions related to the unusual kinetic behavior of artificially designed proteins, multidomain protein misfolding, and the initial stages of protein aggregation. PMID:25308991

  5. Integrated Structural Biology for α-Helical Membrane Protein Structure Determination.

    PubMed

    Xia, Yan; Fischer, Axel W; Teixeira, Pedro; Weiner, Brian; Meiler, Jens

    2018-04-03

    While great progress has been made, only 10% of the nearly 1,000 integral, α-helical, multi-span membrane protein families are represented by at least one experimentally determined structure in the PDB. Previously, we developed the algorithm BCL::MP-Fold, which samples the large conformational space of membrane proteins de novo by assembling predicted secondary structure elements guided by knowledge-based potentials. Here, we present a case study of rhodopsin fold determination by integrating sparse and/or low-resolution restraints from multiple experimental techniques including electron microscopy, electron paramagnetic resonance spectroscopy, and nuclear magnetic resonance spectroscopy. Simultaneous incorporation of orthogonal experimental restraints not only significantly improved the sampling accuracy but also allowed identification of the correct fold, which is demonstrated by a protein size-normalized transmembrane root-mean-square deviation as low as 1.2 Å. The protocol developed in this case study can be used for the determination of unknown membrane protein folds when limited experimental restraints are available. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Machinery of protein folding and unfolding.

    PubMed

    Zhang, Xiaodong; Beuron, Fabienne; Freemont, Paul S

    2002-04-01

    During the past two years, a large amount of biochemical, biophysical and low- to high-resolution structural data have provided mechanistic insights into the machinery of protein folding and unfolding. It has emerged that dual functionality in terms of folding and unfolding might exist for some systems. The majority of folding/unfolding machines adopt oligomeric ring structures in a cooperative fashion and utilise the conformational changes induced by ATP binding/hydrolysis for their specific functions.

  7. New insights into structural determinants of prion protein folding and stability.

    PubMed

    Benetti, Federico; Legname, Giuseppe

    2015-01-01

    Prions are the etiological agents of fatal neurodegenerative diseases called prion diseases or transmissible spongiform encephalopathies. These maladies can be sporadic, genetic or infectious disorders. Prions are due to post-translational modifications of the cellular prion protein leading to the formation of a β-sheet enriched conformer with altered biochemical properties. The molecular events causing prion formation in sporadic prion diseases are still elusive. Recently, we published a study elucidating the contribution of major structural determinants and environmental factors to prion protein folding and stability. Our study highlighted the crucial role of octarepeats in stabilizing the prion protein; the presence of a highly enthalpically stable intermediate state in prion-susceptible species; and the role of the disulfide bridge in preserving the native fold, thus avoiding misfolding to a β-sheet enriched isoform. Building on these findings, in this work we present new insights into structural determinants of prion protein folding and stability.

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lammert, Heiko; Noel, Jeffrey K.; Haglund, Ellinor

    The diversity in a set of protein nuclear magnetic resonance (NMR) structures provides an estimate of native state fluctuations that can be used to refine and enrich structure-based protein models (SBMs). Dynamics are an essential part of a protein’s functional native state. The dynamics in the native state are controlled by the same funneled energy landscape that guides the entire folding process. SBMs apply the principle of minimal frustration, drawn from energy landscape theory, to construct a funneled folding landscape for a given protein using only information from the native structure. On an energy landscape smoothed by evolution towards minimal frustration, geometrical constraints, imposed by the native structure, control the folding mechanism and shape the native dynamics revealed by the model. Native-state fluctuations can alternatively be estimated directly from the diversity in the set of NMR structures for a protein. Based on this information, we identify a highly flexible loop in the ribosomal protein S6 and modify the contact map in a SBM to accommodate the inferred dynamics. By taking into account the probable native state dynamics, the experimental transition state is recovered in the model, and the correct order of folding events is restored. Our study highlights how the shared energy landscape connects folding and function by showing that a better description of the native basin improves the prediction of the folding mechanism.

  9. Tackling force-field bias in protein folding simulations: folding of Villin HP35 and Pin WW domains in explicit water.

    PubMed

    Mittal, Jeetain; Best, Robert B

    2010-08-04

    The ability to fold proteins on a computer has highlighted the fact that existing force fields tend to be biased toward a particular type of secondary structure. Consequently, force fields for folding simulations are often chosen according to the native structure, implying that they are not truly "transferable." Here we show that, while the AMBER ff03 potential is known to favor helical structures, a simple correction to the backbone potential (ff03*) results in an unbiased energy function. We take as examples the 35-residue α-helical Villin HP35 and 37-residue β-sheet Pin WW domains, which had not previously been folded with the same force field. Starting from unfolded configurations, simulations of both proteins in Amber ff03* in explicit solvent fold to within 2.0 Å RMSD of the experimental structures. This demonstrates that a simple backbone correction results in a more transferable force field, an important requirement if simulations are to be used to interpret folding mechanism. 2010 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  10. Application of Time-Resolved Tryptophan Phosphorescence Spectroscopy to Protein Folding Studies.

    NASA Astrophysics Data System (ADS)

    Subramaniam, Vinod

    This thesis presents studies of the protein folding problem, one of the most significant questions in contemporary biophysics. Sensitive biophysical techniques, including room temperature tryptophan phosphorescence, which reports on the local environment of the residue, and the lability of proteins to denaturation, a global parameter, were used to assess the validity of the traditional assumption that the biologically active state of a protein is the 'native' state, and to determine whether the pathways of folding in vitro lead to the folded state achieved in vivo. Phosphorescence techniques have also been extended to study, for the first time, emission from tryptophan residues engineered into specific positions as reporters of protein structure. During in vitro refolding of E. coli alkaline phosphatase and bovine β-lactoglobulin, significant differences were found between the refolded proteins and the native conformations, which have no apparent effect on the biological functions. Slow conformational transitions, termed 'annealing,' that occur long after the return of enzyme activity of alkaline phosphatase are manifested in the retarded recovery of phosphorescence intensity, lifetime, and protein lability. While 'annealing' is not observed for β-lactoglobulin, both phosphorescence and lability experiments reveal changes in the structure of the refolded protein, even though its biological activity, retinol binding, is fully recovered. This result suggests that the pathways of folding in vitro need not lead to the structure formed in vivo. We have used phosphorescence techniques to study the refolding of ribonuclease T1, which exhibits slow kinetics characteristic of proline isomerization. Furthermore, the ability to extract structural information from phosphorescent tryptophan probes engineered into selected regions represents an important advance in studying protein structure; we have reported the first such results from a mutant staphylococcal nuclease.
    The refolding data have been interpreted in the context of recent theoretical work on rugged energy landscape models of protein folding. Our results suggest that the barriers to folding can be as large as ~20 kcal mol^{-1}, and imply that the conventional definition of the 'native' state as the biologically active conformation may need revision to acknowledge that the active state may represent a long-lived intermediate on the pathway to the native structure.

  11. Comparative analysis of the folding dynamics and kinetics of an engineered knotted protein and its variants derived from HP0242 of Helicobacter pylori

    NASA Astrophysics Data System (ADS)

    Wang, Liang-Wei; Liu, Yu-Nan; Lyu, Ping-Chiang; Jackson, Sophie E.; Hsu, Shang-Te Danny

    2015-09-01

    Understanding the mechanism by which a polypeptide chain threads itself spontaneously to attain a knotted conformation has been a major challenge in the field of protein folding. HP0242 is a homodimeric protein from Helicobacter pylori with intertwined helices that form a unique pseudo-knotted folding topology. A tandem HP0242 repeat has been constructed to become the first engineered trefoil-knotted protein. Its small size renders it a model system for computational analyses to examine its folding and knotting pathways. Here we report a multi-parametric study on the folding stability and kinetics of a library of HP0242 variants, including the trefoil-knotted tandem HP0242 repeat, using far-UV circular dichroism and fluorescence spectroscopy. Equilibrium chemical denaturation of HP0242 variants shows the presence of highly populated dimeric and structurally heterogeneous folding intermediates. Such equilibrium folding intermediates retain a significant amount of helical structure, except at the N- and C-terminal regions of the native structure. Stopped-flow fluorescence measurements of HP0242 variants show that spontaneous refolding into knotted structures can be achieved within seconds, which is several orders of magnitude faster than previously observed for other knotted proteins. Nevertheless, the complex chevron plots indicate that HP0242 variants are prone to misfold into kinetic traps, leading to severely rolled-over refolding arms. The experimental observations are in general agreement with the previously reported molecular dynamics simulations. Based on our results, kinetic folding pathways are proposed to qualitatively describe the complex folding processes of HP0242 variants.

  12. Competing Pathways and Multiple Folding Nuclei in a Large Multidomain Protein, Luciferase.

    PubMed

    Scholl, Zackary N; Yang, Weitao; Marszalek, Piotr E

    2017-05-09

    Proteins obtain their final functional configuration through incremental folding with many intermediate steps in the folding pathway. If known, these intermediate steps could be valuable new targets for designing therapeutics and the sequence of events could elucidate the mechanism of refolding. However, determining these intermediate steps is hardly an easy feat, and has been elusive for most proteins, especially large, multidomain proteins. Here, we effectively map part of the folding pathway for the model large multidomain protein, Luciferase, by combining single-molecule force-spectroscopy experiments and coarse-grained simulation. Single-molecule refolding experiments reveal the initial nucleation of folding while simulations corroborate these stable core structures of Luciferase, and indicate the relative propensities for each to propagate to the final folded native state. Both experimental refolding and Monte Carlo simulations of Markov state models generated from simulation reveal that Luciferase most often folds along a pathway originating from the nucleation of the N-terminal domain, and that this pathway is the least likely to form nonnative structures. We then engineer truncated variants of Luciferase whose sequences corresponded to the putative structure from simulation and we use atomic force spectroscopy to determine their unfolding and stability. These experimental results corroborate the structures predicted from the folding simulation and strongly suggest that they are intermediates along the folding pathway. Taken together, our results suggest that initial Luciferase refolding occurs along a vectorial pathway and also suggest a mechanism that chaperones may exploit to prevent misfolding. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.
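    The phrase "Monte Carlo simulations of Markov state models" can be illustrated with a toy example: sample many discrete-time paths from a transition matrix and count which intermediate each path passed through just before reaching the native state. The four states and all transition probabilities below are hypothetical values chosen for illustration; they are not the authors' model of Luciferase, and the function names are likewise assumptions.

```python
import random

# Hypothetical 4-state model; names and probabilities are made up.
STATES = ["unfolded", "N_nucleus", "C_nucleus", "native"]
T = [
    [0.90, 0.07, 0.03, 0.00],  # unfolded
    [0.02, 0.88, 0.00, 0.10],  # N-terminal nucleus
    [0.05, 0.00, 0.93, 0.02],  # C-terminal nucleus
    [0.00, 0.00, 0.00, 1.00],  # native (absorbing)
]


def sample_path(T, start, absorbing, rng, max_steps=100_000):
    """Sample one path from the transition matrix T, stopping when the
    absorbing state is reached or after max_steps transitions."""
    path = [start]
    state = start
    for _ in range(max_steps):
        if state == absorbing:
            break
        # Draw the next state with probabilities given by row T[state].
        state = rng.choices(range(len(T)), weights=T[state])[0]
        path.append(state)
    return path


def pathway_counts(T, n_paths, seed=0):
    """Count, over n_paths sampled paths, which state was visited
    immediately before the native state was reached."""
    rng = random.Random(seed)
    native = len(T) - 1
    counts = {}
    for _ in range(n_paths):
        path = sample_path(T, 0, native, rng)
        if path[-1] != native:
            continue  # path did not fold within max_steps; skip it
        key = STATES[path[-2]]
        counts[key] = counts.get(key, 0) + 1
    return counts


counts = pathway_counts(T, 200, seed=1)
```

    In this toy matrix the N-terminal nucleus forms more often and commits to the native state more readily than the alternative nucleus, so most sampled paths fold through it. A real analysis would estimate the transition matrix from simulation data rather than writing it by hand.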

  13. Coarse-Grained Simulations of Membrane Insertion and Folding of Small Helical Proteins Using the CABS Model.

    PubMed

    Pulawski, Wojciech; Jamroz, Michal; Kolinski, Michal; Kolinski, Andrzej; Kmiecik, Sebastian

    2016-11-28

    The CABS coarse-grained model is a well-established tool for modeling globular proteins (predicting their structure, dynamics, and interactions). Here we introduce an extension of the CABS representation and force field (CABS-membrane) to the modeling of the effect of the biological membrane environment on the structure of membrane proteins. We validate the CABS-membrane model in folding simulations of 10 short helical membrane proteins not using any knowledge about their structure. The simulations start from random protein conformations placed outside the membrane environment and allow for full flexibility of the modeled proteins during their spontaneous insertion into the membrane. In the resulting trajectories, we have found models close to the experimental membrane structures. We also attempted to select the correctly folded models using simple filtering followed by structural clustering combined with reconstruction to the all-atom representation and all-atom scoring. The CABS-membrane model is a promising approach for further development toward modeling of large protein-membrane systems.

  14. Meta-structure correlation in protein space unveils different selection rules for folded and intrinsically disordered proteins.

    PubMed

    Naranjo, Yandi; Pons, Miquel; Konrat, Robert

    2012-01-01

    The number of existing protein sequences spans a very small fraction of sequence space. Natural proteins have overcome a strong negative selective pressure to avoid the formation of insoluble aggregates. Stably folded globular proteins and intrinsically disordered proteins (IDPs) use alternative solutions to the aggregation problem. While in globular proteins folding minimizes the access to aggregation prone regions, IDPs on average display large exposed contact areas. Here, we introduce the concept of average meta-structure correlation maps to analyze sequence space. Using this novel conceptual view we show that representative ensembles of folded and ID proteins show distinct characteristics and respond differently to sequence randomization. By studying the way evolutionary constraints act on IDPs to disable a negative function (aggregation) we might gain insight into the mechanisms by which function-enabling information is encoded in IDPs.

  15. Complete Reversible Refolding of a G-Protein Coupled Receptor on a Solid Support

    PubMed Central

    Di Bartolo, Natalie; Compton, Emma L. R.; Warne, Tony; Edwards, Patricia C.; Tate, Christopher G.; Schertler, Gebhard F. X.; Booth, Paula J.

    2016-01-01

    The factors defining the correct folding and stability of integral membrane proteins are poorly understood. Folding of only a few select membrane proteins has been scrutinised, leaving considerable deficiencies in knowledge for large protein families, such as G protein coupled receptors (GPCRs). Complete reversible folding, which is problematic for any membrane protein, has eluded this dominant receptor family. Moreover, attempts to recover receptors from denatured states are inefficient, yielding at best 40–70% functional protein. We present a method for the reversible unfolding of an archetypal family member, the β1-adrenergic receptor, and attain 100% recovery of the folded, functional state, in terms of ligand binding, compared to receptor which has not been subject to any unfolding and retains its original, folded structure. We exploit refolding on a solid support, which could avoid unwanted interactions and aggregation that occur in bulk solution. We determine the changes in structure and function upon unfolding and refolding. Additionally, we employ a method that is relatively new to membrane protein folding; pulse proteolysis. Complete refolding of β1-adrenergic receptor occurs in n-decyl-β-D-maltoside (DM) micelles from a urea-denatured state, as shown by regain of its original helical structure, ligand binding and protein fluorescence. The successful refolding strategy on a solid support offers a defined method for the controlled refolding and recovery of functional GPCRs and other membrane proteins that suffer from instability and irreversible denaturation once isolated from their native membranes. PMID:26982879

  16. Alternative modes of client binding enable functional plasticity of Hsp70

    NASA Astrophysics Data System (ADS)

    Mashaghi, Alireza; Bezrukavnikov, Sergey; Minde, David P.; Wentink, Anne S.; Kityk, Roman; Zachmann-Brand, Beate; Mayer, Matthias P.; Kramer, Günter; Bukau, Bernd; Tans, Sander J.

    2016-11-01

    The Hsp70 system is a central hub of chaperone activity in all domains of life. Hsp70 performs a plethora of tasks, including folding assistance, protection against aggregation, protein trafficking, and enzyme activity regulation, and interacts with non-folded chains, as well as near-native, misfolded, and aggregated proteins. Hsp70 is thought to achieve its many physiological roles by binding peptide segments that extend from these different protein conformers within a groove that can be covered by an ATP-driven helical lid. However, it has been difficult to test directly how Hsp70 interacts with protein substrates in different stages of folding and how it affects their structure. Moreover, recent indications of diverse lid conformations in Hsp70-substrate complexes raise the possibility of additional interaction mechanisms. Addressing these issues is technically challenging, given the conformational dynamics of both chaperone and client, the transient nature of their interaction, and the involvement of co-chaperones and the ATP hydrolysis cycle. Here, using optical tweezers, we show that the bacterial Hsp70 homologue (DnaK) binds and stabilizes not only extended peptide segments, but also partially folded and near-native protein structures. The Hsp70 lid and groove act synergistically when stabilizing folded structures: stabilization is abolished when the lid is truncated and less efficient when the groove is mutated. The diversity of binding modes has important consequences: Hsp70 can both stabilize and destabilize folded structures, in a nucleotide-regulated manner; like Hsp90 and GroEL, Hsp70 can affect the late stages of protein folding; and Hsp70 can suppress aggregation by protecting partially folded structures as well as unfolded protein chains. Overall, these findings in the DnaK system indicate an extension of the Hsp70 canonical model that potentially affects a wide range of physiological roles of the Hsp70 system.

  17. Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets.

    PubMed

    Deiana, Antonio; Giansanti, Andrea

    2010-04-21

    Natively unfolded proteins lack a well defined three dimensional structure but have important biological functions, suggesting a re-assignment of the structure-function paradigm. Establishing that a given protein is natively unfolded requires laborious experimental investigations, so reliable sequence-only methods for predicting whether a sequence corresponds to a folded or to an unfolded protein are of interest in fundamental and applied studies. Many proteins have amino acid compositions compatible with both the folded and the unfolded status, and belong to a twilight zone between order and disorder. This makes a dichotomic classification of protein sequences into folded and natively unfolded ones difficult. In this work we propose an operational method to identify proteins belonging to the twilight zone by combining well-performing single predictors of folding into a consensus score. In this methodological paper dichotomic folding indexes are considered: hydrophobicity-charge, mean packing, mean pairwise energy, Poodle-W and a new global index, called here gVSL2, based on the local disorder predictor VSL2. The performance of these indexes is evaluated on different datasets, in particular on a new dataset composed of 2369 folded and 81 natively unfolded proteins. Poodle-W, gVSL2 and mean pairwise energy have good performance and stability in all the datasets considered and are combined into a strictly unanimous combination score SSU, which leaves proteins unclassified when the consensus of all combined indexes is not reached. The unclassified proteins: i) belong to an overlap region in the vector space of amino acid compositions occupied by both folded and unfolded proteins; ii) are composed of approximately the same number of order-promoting and disorder-promoting amino acids; iii) have a mean flexibility intermediate between that of folded and that of unfolded proteins. Our results show that proteins unclassified by SSU belong to a twilight zone. Proteins left unclassified by the consensus score SSU have physical properties intermediate between those of folded and those of natively unfolded proteins, and their structural properties and evolutionary history are worth investigating.
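    The "strictly unanimous" rule described above can be sketched in a few lines: a protein is classified only when every combined predictor agrees, and is otherwise left unclassified (the twilight zone). This is a minimal sketch of the rule as the abstract states it. The function name, the boolean encoding, and the example calls are assumptions for illustration and do not reproduce the authors' implementation or the actual predictor outputs.

```python
from typing import Optional, Sequence


def unanimous_consensus(calls: Sequence[bool]) -> Optional[bool]:
    """Combine binary predictor calls (True = folded, False = natively
    unfolded). Return True or False only when all predictors agree;
    return None (unclassified) when they disagree."""
    if not calls:
        raise ValueError("need at least one predictor call")
    if all(calls):
        return True
    if not any(calls):
        return False
    return None


# Hypothetical calls from three predictors, in the order
# (Poodle-W, gVSL2, mean pairwise energy); values are made up.
examples = {
    "protein_A": (True, True, True),     # unanimous: folded
    "protein_B": (False, False, False),  # unanimous: natively unfolded
    "protein_C": (True, False, True),    # disagreement: twilight zone
}
labels = {name: unanimous_consensus(calls) for name, calls in examples.items()}
```

    Requiring unanimity trades coverage for reliability: fewer proteins receive a label, but the ones that do are supported by every predictor, and the unlabeled remainder is exactly the set the abstract calls the twilight zone.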

  18. Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets

    PubMed Central

    2010-01-01

    Background: Natively unfolded proteins lack a well defined three dimensional structure but have important biological functions, suggesting a re-assignment of the structure-function paradigm. Establishing that a given protein is natively unfolded requires laborious experimental investigations, so reliable sequence-only methods for predicting whether a sequence corresponds to a folded or to an unfolded protein are of interest in fundamental and applied studies. Many proteins have amino acid compositions compatible with both the folded and the unfolded status, and belong to a twilight zone between order and disorder. This makes a dichotomic classification of protein sequences into folded and natively unfolded ones difficult. In this work we propose an operational method to identify proteins belonging to the twilight zone by combining well-performing single predictors of folding into a consensus score. Results: In this methodological paper dichotomic folding indexes are considered: hydrophobicity-charge, mean packing, mean pairwise energy, Poodle-W and a new global index, called here gVSL2, based on the local disorder predictor VSL2. The performance of these indexes is evaluated on different datasets, in particular on a new dataset composed of 2369 folded and 81 natively unfolded proteins. Poodle-W, gVSL2 and mean pairwise energy have good performance and stability in all the datasets considered and are combined into a strictly unanimous combination score SSU, which leaves proteins unclassified when the consensus of all combined indexes is not reached. The unclassified proteins: i) belong to an overlap region in the vector space of amino acid compositions occupied by both folded and unfolded proteins; ii) are composed of approximately the same number of order-promoting and disorder-promoting amino acids; iii) have a mean flexibility intermediate between that of folded and that of unfolded proteins. Conclusions: Our results show that proteins unclassified by SSU belong to a twilight zone. Proteins left unclassified by the consensus score SSU have physical properties intermediate between those of folded and those of natively unfolded proteins, and their structural properties and evolutionary history are worth investigating. PMID:20409339

  19. Prelude and Fugue, predicting local protein structure, early folding regions and structural weaknesses.

    PubMed

    Kwasigroch, Jean Marc; Rooman, Marianne

    2006-07-15

    Prelude&Fugue are bioinformatics tools aiming at predicting the local 3D structure of a protein from its amino acid sequence in terms of seven backbone torsion angle domains, using database-derived potentials. Prelude(&Fugue) computes all lowest free energy conformations of a protein or protein region, ranked by increasing energy, and possibly satisfying some interresidue distance constraints specified by the user. (Prelude&)Fugue detects sequence regions whose predicted structure is significantly preferred relative to other conformations in the absence of tertiary interactions. These programs can be used for predicting secondary structure, tertiary structure of short peptides, flickering early folding sequences and peptides that adopt a preferred conformation in solution. They can also be used for detecting structural weaknesses, i.e. sequence regions that are not optimal with respect to the tertiary fold. http://babylone.ulb.ac.be/Prelude_and_Fugue.

  20. Hydrogen bonds are a primary driving force for de novo protein folding

    DOE PAGES

    Lee, Schuyler; Wang, Chao; Liu, Haolin; ...

    2017-11-10

    The protein-folding mechanism remains a major puzzle in life science. Purified soluble activation-induced cytidine deaminase (AID) is one of the most difficult proteins to obtain. Starting from inclusion bodies containing a C-terminally truncated version of AID (residues 1–153; AID 153), an optimized in vitro folding procedure was derived to obtain large amounts of AID 153, which led to crystals with good quality and to final structural determination. Interestingly, it was found that the final refolding yield of the protein is proline residue-dependent. The difference in the distribution of cis and trans configurations of proline residues in the protein after complete denaturation is a major determining factor of the final yield. A point mutation of one of four proline residues to an asparagine led to a near-doubling of the yield of refolded protein after complete denaturation. It was concluded that the driving force behind protein folding could not overcome the cis-to-trans proline isomerization, or vice versa, during the protein-folding process. Furthermore, it was found that successful refolding of proteins optimally occurs at high pH values, which may mimic protein folding in vivo. It was found that high pH values could induce the polarization of peptide bonds, which may trigger the formation of protein secondary structures through hydrogen bonds. It is proposed that a hydrophobic environment coupled with negative charges is essential for protein folding. Combined with our earlier discoveries on protein-unfolding mechanisms, it is proposed that hydrogen bonds are a primary driving force for de novo protein folding.

  1. Hydrogen bonds are a primary driving force for de novo protein folding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Schuyler; Wang, Chao; Liu, Haolin

    The protein-folding mechanism remains a major puzzle in life science. Purified soluble activation-induced cytidine deaminase (AID) is one of the most difficult proteins to obtain. Starting from inclusion bodies containing a C-terminally truncated version of AID (residues 1–153; AID 153), an optimized in vitro folding procedure was derived to obtain large amounts of AID 153, which led to crystals with good quality and to final structural determination. Interestingly, it was found that the final refolding yield of the protein is proline residue-dependent. The difference in the distribution of cis and trans configurations of proline residues in the protein after complete denaturation is a major determining factor of the final yield. A point mutation of one of four proline residues to an asparagine led to a near-doubling of the yield of refolded protein after complete denaturation. It was concluded that the driving force behind protein folding could not overcome the cis-to-trans proline isomerization, or vice versa, during the protein-folding process. Furthermore, it was found that successful refolding of proteins optimally occurs at high pH values, which may mimic protein folding in vivo. It was found that high pH values could induce the polarization of peptide bonds, which may trigger the formation of protein secondary structures through hydrogen bonds. It is proposed that a hydrophobic environment coupled with negative charges is essential for protein folding. Combined with our earlier discoveries on protein-unfolding mechanisms, it is proposed that hydrogen bonds are a primary driving force for de novo protein folding.

  2. Classification of proteins: available structural space for molecular modeling.

    PubMed

    Andreeva, Antonina

    2012-01-01

    The wealth of available protein structural data provides unprecedented opportunity to study and better understand the underlying principles of protein folding and protein structure evolution. A key to achieving this lies in the ability to analyse these data and to organize them in a coherent classification scheme. Over the past years several protein classifications have been developed that aim to group proteins based on their structural relationships. Some of these classification schemes explore the concept of structural neighbourhood (structural continuum), whereas other utilize the notion of protein evolution and thus provide a discrete rather than continuum view of protein structure space. This chapter presents a strategy for classification of proteins with known three-dimensional structure. Steps in the classification process along with basic definitions are introduced. Examples illustrating some fundamental concepts of protein folding and evolution with a special focus on the exceptions to them are presented.

  3. Solitons and protein folding: An In Silico experiment

    NASA Astrophysics Data System (ADS)

    Ilieva, N.; Dai, J.; Sieradzan, A.; Niemi, A.

    2015-10-01

    Protein folding [1] is the process of formation of a functional 3D structure from a random coil, the shape in which amino-acid chains leave the ribosome. Anfinsen's dogma states that the native 3D shape of a protein is completely determined by the protein's amino acid sequence. Despite the progress in understanding the process rate and the success in folding prediction for some small proteins, with presently available physics-based methods it is not yet possible to reliably deduce the shape of a biologically active protein from its amino acid sequence. The protein-folding problem endures as one of the most important unresolved problems in science; it addresses the origin of life itself. Furthermore, a wrong fold is a common cause for a protein to lose its function or even endanger the living organism. Soliton solutions of a generalized discrete non-linear Schrödinger equation (GDNLSE) obtained from the energy function in terms of bond and torsion angles κ and τ provide a constructive theoretical framework for describing protein folds and folding patterns [2]. Here we study the dynamics of this process by means of molecular-dynamics simulations. The soliton manifestation is the helix-loop-helix pattern in the secondary structure of the protein, which explains the importance of understanding loop formation in helical proteins. We performed in silico experiments for unfolding one subunit of the core structure of gp41 from the HIV envelope glycoprotein (PDB ID: 1AIK [3]) by molecular-dynamics simulations with the MD package GROMACS. We analyzed 80 ns trajectories, obtained with one united-atom and two different all-atom force fields, to justify the side-chain orientation quantification scheme adopted in the studies and to eliminate force-field based artifacts. Our results are compatible with the soliton model of protein folding and provide first insight into soliton-formation dynamics.

  4. SARS-unique fold in the Rousettus bat coronavirus HKU9.

    PubMed

    Hammond, Robert G; Tan, Xuan; Johnson, Margaret A

    2017-09-01

    The coronavirus nonstructural protein 3 (nsp3) is a multifunctional protein that comprises multiple structural domains. This protein assists viral polyprotein cleavage, host immune interference, and may play other roles in genome replication or transcription. Here, we report the solution NMR structure of a protein from the "SARS-unique region" of the bat coronavirus HKU9. The protein contains a frataxin fold or double-wing motif, which is an α + β fold that is associated with protein/protein interactions, DNA binding, and metal ion binding. High structural similarity to the human severe acute respiratory syndrome (SARS) coronavirus nsp3 is present. A possible functional site that is conserved among some betacoronaviruses has been identified using bioinformatics and biochemical analyses. This structure provides strong experimental support for the recent proposal advanced by us and others that the "SARS-unique" region is not unique to the human SARS virus, but is conserved among several different phylogenetic groups of coronaviruses and provides essential functions. © 2017 The Protein Society.

  5. Classification of the treble clef zinc finger: noteworthy lessons for structure and function evolution.

    PubMed

    Kaur, Gurmeet; Subramanian, Srikrishna

    2016-08-26

    Treble clef (TC) zinc fingers constitute a large fold-group of structural zinc-binding protein domains that mediate numerous cellular functions. We have analysed the sequence, structure, and function relationships among all TCs in the Protein Data Bank. This led to the identification of novel TCs, such as lsr2, YggX and TFIIIC τ 60 kDa subunit, and prediction of a nuclease-like function for the DUF1364 family. The structural malleability of TCs is evident from the many examples with variations to the core structural elements of the fold. We observe domains wherein the structural core of the TC fold is circularly permuted, and also some examples where the overall fold resembles both the TC motif and another unrelated fold. All extant TC families do not share a monophyletic origin, as several TC proteins are known to have been present in the last universal common ancestor and the last eukaryotic common ancestor. We identify several TCs where the zinc-chelating site and residues are not merely responsible for structure stabilization but also perform other functions, such as being redox active in C1B domain of protein kinase C, a nucleophilic acceptor in Ada and catalytic in organomercurial lyase, MerB.

  6. Classification of the treble clef zinc finger: noteworthy lessons for structure and function evolution

    NASA Astrophysics Data System (ADS)

    Kaur, Gurmeet; Subramanian, Srikrishna

    2016-08-01

    Treble clef (TC) zinc fingers constitute a large fold-group of structural zinc-binding protein domains that mediate numerous cellular functions. We have analysed the sequence, structure, and function relationships among all TCs in the Protein Data Bank. This led to the identification of novel TCs, such as lsr2, YggX and TFIIIC τ 60 kDa subunit, and prediction of a nuclease-like function for the DUF1364 family. The structural malleability of TCs is evident from the many examples with variations to the core structural elements of the fold. We observe domains wherein the structural core of the TC fold is circularly permuted, and also some examples where the overall fold resembles both the TC motif and another unrelated fold. All extant TC families do not share a monophyletic origin, as several TC proteins are known to have been present in the last universal common ancestor and the last eukaryotic common ancestor. We identify several TCs where the zinc-chelating site and residues are not merely responsible for structure stabilization but also perform other functions, such as being redox active in C1B domain of protein kinase C, a nucleophilic acceptor in Ada and catalytic in organomercurial lyase, MerB.

  7. Modulation of the multistate folding of designed TPR proteins through intrinsic and extrinsic factors

    PubMed Central

    Phillips, J J; Javadi, Y; Millership, C; Main, E R G

    2012-01-01

    Tetratricopeptide repeats (TPRs) are a class of all alpha-helical repeat proteins that are comprised of 34-aa helix-turn-helix motifs. These stack together to form nonglobular structures that are stabilized by short-range interactions from residues close in primary sequence. Unlike globular proteins, they have few, if any, long-range nonlocal stabilizing interactions. Several studies on designed TPR proteins have shown that this modular structure is reflected in their folding, that is, modular multistate folding is observed as opposed to two-state folding. Here we show that TPR multistate folding can be suppressed to approximate two-state folding through modulation of intrinsic stability or extrinsic environmental variables. This modulation was investigated by comparing the thermodynamic unfolding under differing buffer regimes of two distinct series of consensus-designed TPR proteins, which possess different intrinsic stabilities. A total of nine proteins of differing sizes and differing consensus TPR motifs were each thermally and chemically denatured and their unfolding monitored using differential scanning calorimetry (DSC) and CD/fluorescence, respectively. Analyses of both the DSC and chemical denaturation data show that reducing the total stability of each protein and repeat units leads to observable two-state unfolding. These data highlight the intimate link between global and intrinsic repeat stability that governs whether folding proceeds by an observably two-state mechanism, or whether partial unfolding yields stable intermediate structures which retain sufficient stability to be populated at equilibrium. PMID:22170589

  8. A novel structural tree for wrap-proteins, a subclass of (α+β)-proteins.

    PubMed

    Boshkova, Eugenia A; Gordeev, Alexey B; Efimov, Alexander V

    2014-01-01

    In this paper, a novel structural subclass of (α+β)-proteins is presented. A characteristic feature of these proteins and domains is that they consist of strongly twisted and coiled β-sheets wrapped around one or two α-helices, so they are referred to here as wrap-proteins. It is shown that overall folds of the wrap-proteins can be obtained by stepwise addition of α-helices and/or β-strands to the strongly twisted and coiled β-hairpin taken as the starting structure in modeling. As a result of modeling, a structural tree for the wrap-proteins was constructed that includes 201 folds of which 49 occur in known nonhomologous proteins.

  9. Deconvoluting Protein (Un)folding Structural Ensembles Using X-Ray Scattering, Nuclear Magnetic Resonance Spectroscopy and Molecular Dynamics Simulation

    PubMed Central

    Nasedkin, Alexandr; Marcellini, Moreno; Religa, Tomasz L.; Freund, Stefan M.; Menzel, Andreas; Fersht, Alan R.; Jemth, Per; van der Spoel, David; Davidsson, Jan

    2015-01-01

    The folding and unfolding of protein domains is an apparently cooperative process, but transient intermediates have been detected in some cases. Such (un)folding intermediates are challenging to investigate structurally as they are typically not long-lived and their role in the (un)folding reaction has often been questioned. One of the most well studied (un)folding pathways is that of Drosophila melanogaster Engrailed homeodomain (EnHD): this 61-residue protein forms a three helix bundle in the native state and folds via a helical intermediate. Here we used molecular dynamics simulations to derive sample conformations of EnHD in the native, intermediate, and unfolded states and selected the relevant structural clusters by comparing to small/wide angle X-ray scattering data at four different temperatures. The results are corroborated using residual dipolar couplings determined by NMR spectroscopy. Our results agree well with the previously proposed (un)folding pathway. However, they also suggest that the fully unfolded state is present at a low fraction throughout the investigated temperature interval, and that the (un)folding intermediate is highly populated at the thermal midpoint in line with the view that this intermediate can be regarded to be the denatured state under physiological conditions. Further, the combination of ensemble structural techniques with MD allows for determination of structures and populations of multiple interconverting structures in solution. PMID:25946337

  10. Deconvoluting Protein (Un)folding Structural Ensembles Using X-Ray Scattering, Nuclear Magnetic Resonance Spectroscopy and Molecular Dynamics Simulation.

    PubMed

    Nasedkin, Alexandr; Marcellini, Moreno; Religa, Tomasz L; Freund, Stefan M; Menzel, Andreas; Fersht, Alan R; Jemth, Per; van der Spoel, David; Davidsson, Jan

    2015-01-01

    The folding and unfolding of protein domains is an apparently cooperative process, but transient intermediates have been detected in some cases. Such (un)folding intermediates are challenging to investigate structurally as they are typically not long-lived and their role in the (un)folding reaction has often been questioned. One of the most well studied (un)folding pathways is that of Drosophila melanogaster Engrailed homeodomain (EnHD): this 61-residue protein forms a three helix bundle in the native state and folds via a helical intermediate. Here we used molecular dynamics simulations to derive sample conformations of EnHD in the native, intermediate, and unfolded states and selected the relevant structural clusters by comparing to small/wide angle X-ray scattering data at four different temperatures. The results are corroborated using residual dipolar couplings determined by NMR spectroscopy. Our results agree well with the previously proposed (un)folding pathway. However, they also suggest that the fully unfolded state is present at a low fraction throughout the investigated temperature interval, and that the (un)folding intermediate is highly populated at the thermal midpoint in line with the view that this intermediate can be regarded to be the denatured state under physiological conditions. Further, the combination of ensemble structural techniques with MD allows for determination of structures and populations of multiple interconverting structures in solution.

  11. Insights into the fold organization of TIM barrel from interaction energy based structure networks.

    PubMed

    Vijayabaskar, M S; Vishveshwara, Saraswathi

    2012-01-01

    There are many well-known examples of proteins with low sequence similarity adopting the same structural fold. This aspect of the sequence-structure relationship has been extensively studied both experimentally and theoretically, but with limited success. Most of the studies consider remote homology or "sequence conservation" as the basis for their understanding. Recently, an "interaction energy" based network formalism (Protein Energy Networks (PENs)) was developed to understand the determinants of protein structures. In this paper we have used these PENs to investigate the common non-covalent interactions and their collective features which stabilize the TIM barrel fold. We have also developed a method of aligning PENs in order to understand the spatial conservation of interactions in the fold. We have identified key common interactions responsible for the conservation of the TIM fold, despite high sequence dissimilarity. For instance, the central beta barrel of the TIM fold is stabilized by long-range high energy electrostatic interactions and low-energy contiguous vdW interactions in certain families. The other interfaces like the helix-sheet or the helix-helix seem to be devoid of any high energy conserved interactions. Conserved interactions in the loop regions around the catalytic site of the TIM fold have also been identified, pointing out their significance in both structural and functional evolution. Based on these investigations, we have developed a novel network based phylogenetic analysis for remote homologues, which can perform better than sequence based phylogeny. Such an analysis is more meaningful from both structural and functional evolutionary perspectives. We believe that the information obtained through the "interaction conservation" viewpoint and the subsequently developed method of structure network alignment can shed new light on the fields of fold organization and de novo computational protein design.

  12. Topological switching between an alpha-beta parallel protein and a remarkably helical molten globule.

    PubMed

    Nabuurs, Sanne M; Westphal, Adrie H; aan den Toorn, Marije; Lindhoud, Simon; van Mierlo, Carlo P M

    2009-06-17

    Partially folded protein species transiently exist during folding of most proteins. Often these species are molten globules, which may be on- or off-pathway to native protein. Molten globules have a substantial amount of secondary structure but lack virtually all the tertiary side-chain packing characteristic of natively folded proteins. These ensembles of interconverting conformers are prone to aggregation and potentially play a role in numerous devastating pathologies, and thus attract considerable attention. The molten globule that is observed during folding of apoflavodoxin from Azotobacter vinelandii is off-pathway, as it has to unfold before native protein can be formed. Here we report that this species can be trapped under nativelike conditions by substituting amino acid residue F44 by Y44, allowing spectroscopic characterization of its conformation. Whereas native apoflavodoxin contains a parallel beta-sheet surrounded by alpha-helices (i.e., the flavodoxin-like or alpha-beta parallel topology), it is shown that the molten globule has a totally different topology: it is helical and contains no beta-sheet. The presence of this remarkably nonnative species shows that single polypeptide sequences can code for distinct folds that swap upon changing conditions. Topological switching between unrelated protein structures is likely a general phenomenon in the protein structure universe.

  13. Blind test of physics-based prediction of protein structures.

    PubMed

    Shell, M Scott; Ozkan, S Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A

    2009-02-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences.
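
    The replica-exchange step named in this abstract can be sketched with the standard Metropolis swap criterion. This is a generic illustration, not the authors' ZAM code; the function name and the example energies and temperatures are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def swap_probability(e_i, e_j, t_i, t_j):
    """Metropolis probability of exchanging two replicas.

    e_i, e_j: potential energies (kcal/mol) of the configurations currently
    held at temperatures t_i and t_j (K). Standard criterion:
    p = min(1, exp[(beta_i - beta_j) * (e_i - e_j)]).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

# The colder replica holds the higher energy: the swap is always accepted.
p_downhill = swap_probability(-100.0, -110.0, 300.0, 310.0)
# The colder replica already holds the lower energy: accepted only sometimes.
p_uphill = swap_probability(-110.0, -100.0, 300.0, 310.0)
```

    Closely spaced temperatures keep the uphill probability appreciable, which is why replica ladders are usually chosen so that neighbouring energy distributions overlap.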

  14. Blind Test of Physics-Based Prediction of Protein Structures

    PubMed Central

    Shell, M. Scott; Ozkan, S. Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A.

    2009-01-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences. PMID:19186130

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Sang Beom; Dsilva, Carmeline J.; Debenedetti, Pablo G., E-mail: pdebene@princeton.edu

    Understanding the mechanisms by which proteins fold from disordered amino-acid chains to spatially ordered structures remains an area of active inquiry. Molecular simulations can provide atomistic details of the folding dynamics which complement experimental findings. Conventional order parameters, such as root-mean-square deviation and radius of gyration, provide structural information but fail to capture the underlying dynamics of the protein folding process. It is therefore advantageous to adopt a method that can systematically analyze simulation data to extract relevant structural as well as dynamical information. The nonlinear dimensionality reduction technique known as diffusion maps automatically embeds the high-dimensional folding trajectories in a lower-dimensional space from which one can more easily visualize folding pathways, assuming the data lie approximately on a lower-dimensional manifold. The eigenvectors that parametrize the low-dimensional space, furthermore, are determined systematically, rather than chosen heuristically, as is done with phenomenological order parameters. We demonstrate that diffusion maps can effectively characterize the folding process of a Trp-cage miniprotein. By embedding molecular dynamics simulation trajectories of Trp-cage folding in diffusion maps space, we identify two folding pathways and intermediate structures that are consistent with the previous studies, demonstrating that this technique can be employed as an effective way of analyzing and constructing protein folding pathways from molecular simulations.
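
    The diffusion-map embedding described in this abstract can be sketched in a few lines. The version below is a basic textbook variant under stated assumptions: a Gaussian kernel on plain Euclidean distances between feature vectors (a trajectory analysis would more likely use a structural distance such as RMSD), a hand-picked bandwidth `epsilon`, and toy points on an arc in place of simulation frames. The function name is hypothetical.

```python
import numpy as np

def diffusion_map(x, epsilon, n_coords=2):
    """Return (eigenvalues, coordinates) of a basic diffusion map.

    x: (n_samples, n_features) array; epsilon: Gaussian kernel bandwidth.
    The trivial constant eigenvector (eigenvalue 1) is skipped.
    """
    x = np.asarray(x, dtype=float)
    sq = np.sum(x ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    k = np.exp(-d2 / epsilon)                 # Gaussian kernel matrix K
    d = k.sum(axis=1)                         # row sums (degrees)

    # Symmetric matrix S = D^-1/2 K D^-1/2 shares eigenvalues with P = D^-1 K.
    s = k / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(s)            # ascending eigenvalues
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    psi = vecs / np.sqrt(d)[:, None]          # right eigenvectors of P
    lam = vals[1:n_coords + 1]
    return lam, psi[:, 1:n_coords + 1] * lam  # eigenvalue-scaled coordinates

# Toy data: 50 points along a half circle, a one-dimensional manifold in 2D.
theta = np.linspace(0.0, np.pi, 50)
points = np.column_stack([np.cos(theta), np.sin(theta)])
eigenvalues, coords = diffusion_map(points, epsilon=0.05)
```

    On this toy arc the first diffusion coordinate orders the points along the curve, which is the sense in which the leading eigenvectors act as systematically determined order parameters.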

  16. PREFACE: Protein folding: lessons learned and new frontiers

    NASA Astrophysics Data System (ADS)

    Pappu, Rohit V.; Nussinov, Ruth

    2009-03-01

    In appropriate physiological milieux proteins spontaneously fold into their functional three-dimensional structures. The amino acid sequences of functional proteins contain all the information necessary to specify the folds. This remarkable observation has spawned research aimed at answering two major questions. (1) Of all the conceivable structures that a protein can adopt, why is the ensemble of native-like structures the most favorable? (2) What are the paths by which proteins manage to robustly and reproducibly fold into their native structures? Anfinsen's thermodynamic hypothesis has guided the pursuit of answers to the first question whereas Levinthal's paradox has influenced the development of models for protein folding dynamics. Decades of work have led to significant advances in the folding problem. Mean-field models have been developed to capture our current, coarse grain understanding of the driving forces for protein folding. These models are being used to predict three-dimensional protein structures from sequence and stability profiles as a function of thermodynamic and chemical perturbations. Impressive strides have also been made in the field of protein design, also known as the inverse folding problem, thereby testing our understanding of the determinants of the fold specificities of different sequences. Early work on protein folding pathways focused on the specific sequence of events that could lead to a simplification of the search process. However, unifying principles proved to be elusive. Proteins that show reversible two-state folding-unfolding transitions turned out to be a gift of natural selection. Focusing on these simple systems helped researchers to uncover general principles regarding the origins of cooperativity in protein folding thermodynamics and kinetics. On the theoretical front, concepts borrowed from polymer physics and the physics of spin glasses led to the development of a framework based on energy landscape theories. 
These theories predict that evolved sequences (functional proteins as opposed to random sequences) find their native folds by minimizing geometric (topological) frustration (i.e. avoiding entropic bottlenecks/kinetic traps). In some cases, following a dominant pathway is the optimal way to minimize frustration, whereas in extreme cases, proteins may fold without encountering bottlenecks. Experimental studies of two-state proteins led in turn to the development of quantitative descriptors that have allowed specific testing of theoretical predictions. These include methods such as phi value analysis to characterize transition state ensembles and descriptors that measure the effects of geometry/topology on folding rates. Interestingly, there exists a striking inverse correlation between the relative contact order (the distance in sequence space between spatially proximal contacts made in the native state) and the folding rates of several two-state proteins. The relative contact order provides a rough estimate of the net entropic cost associated with realizing the folded state, and theories have been developed to explain the observed correlation between the contact order and folding rates. Despite its maturity as a field, there are several areas that come under the rubric of protein folding that are just beginning to receive attention. For example, how do complications in vivo such as macromolecular crowding, confinement, the presence of cosolutes, membrane anchoring, and tethering to surfaces influence protein stabilities and folding dynamics? While we are accustomed to studying proteins at concentrations that are amenable to investigation via probes whose signal intensities grow with protein concentration, this does not make these readouts relevant to the in vivo setting. In cells, protein concentrations are tightly regulated and are likely to be orders of magnitude lower than what we are accustomed to using within in vitro experimental setups. 
Protein folding in vivo is a complex multi-scale dynamical problem when one considers the synergies between protein expression, spontaneous folding, chaperonin-assisted folding, protein targeting, the kinetics of post-translational modifications, protein degradation, and of course the drive to avoid aggregation. Further, there is growing recognition that cells not only tolerate but select for proteins that are intrinsically disordered. These proteins are essential for many crucial activities, and yet their inability to fold in isolation makes them prone to proteolytic processing and aggregation. In the series of papers that make up this special focus on protein folding in physical biology, leading researchers provide insights into diverse cross-sections of problems in protein folding. Barrick provides a concise review of what we have learned from the study of two-state folders and draws attention to how several unanswered questions are being approached using studies on large repeat proteins. Dissecting the contribution of hydration-mediated interactions to driving forces for protein folding and assembly has been extremely challenging. There is renewed interest in using hydrostatic pressure as a tool to access folding intermediates and decipher the role of partially hydrated states in folding, misfolding, and aggregation. Silva and Foguel review many of the nuances that have been uncovered by perturbing hydrostatic pressure as a thermodynamic parameter. As noted above, protein folding in vivo is expected to be considerably more complex than the folding of two-state proteins in dilute solutions. Lucent et al review the state-of-the-art in the development of quantitative theories to explain chaperonin-assisted folding in vivo. Additionally, they highlight unanswered questions pertaining to the processing of unfolded/misfolded proteins by the chaperone machinery. 
Zhuang et al present results that focus on the effects of surface tethering on transition state ensembles and folding mechanisms of a model two-state protein. Their results are important because several proteins in vivo fold while being anchored to membranes. Finally, several neurodegenerative and systemic diseases are associated with the aggregation of intrinsically disordered polypeptides. The search for cures in these debilitating and fatal diseases has focused attention on shared attributes in aggregation mechanisms of different proteins and the possibility of identifying druggable targets from mechanistic studies. Abedini and Raleigh review common features gleaned from mechanistic studies of the aggregation of several intrinsically disordered proteins. They propose that the population of helical intermediates and their stabilization via interactions with membranes might be an important route by which the process of aggregation leads to toxicity. The five papers that form this protein folding focus cover specific sub-topics within the larger field of protein folding. They address current questions and emphasize the importance of the growing and productive interface between the physical sciences and biology. We hope that these papers will stimulate much discussion and more importantly advances in the areas highlighted by the contributors.
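
    The relative contact order mentioned in this preface can be computed as the mean sequence separation of native contacts divided by the chain length. The sketch below assumes contacts defined by a Cα-Cα distance cutoff, which is a simplification chosen for illustration; published definitions typically use heavy-atom contacts and their own cutoffs. The function name and toy chain are hypothetical.

```python
import numpy as np

def relative_contact_order(coords, cutoff=8.0, min_sep=1):
    """Relative contact order from an (N, 3) array of residue positions.

    RCO = (1 / (L * N_c)) * sum over contacts of |i - j|, with L the chain
    length and N_c the number of contacts. Contacts here are residue pairs
    with sequence separation >= min_sep and distance below `cutoff`
    (same length units as `coords`); this contact rule is an assumption.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    i, j = np.triu_indices(n, k=min_sep)      # all pairs with j - i >= min_sep
    in_contact = dist[i, j] < cutoff
    n_contacts = int(in_contact.sum())
    if n_contacts == 0:
        return 0.0
    return float((j - i)[in_contact].sum()) / (n * n_contacts)

# Toy chain: 10 residues on a straight line, 3.8 length units apart.
line = np.column_stack([3.8 * np.arange(10), np.zeros(10), np.zeros(10)])
# With a 4.0 cutoff only sequence neighbours touch, so RCO = 1 / L.
rco_line = relative_contact_order(line, cutoff=4.0)
```

    A fully extended chain gives the minimum value because every contact is local; folds with many contacts between residues far apart in sequence give larger values, which is the quantity the preface relates inversely to two-state folding rates.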

  17. Dynamics of protein folding: probing the kinetic network of folding-unfolding transitions with experiment and theory.

    PubMed

    Buchner, Ginka S; Murphy, Ronan D; Buchete, Nicolae-Viorel; Kubelka, Jan

    2011-08-01

    The problem of spontaneous folding of amino acid chains into highly organized, biologically functional three-dimensional protein structures continues to challenge modern science. Understanding how proteins fold requires characterization of the underlying energy landscapes as well as the dynamics of the polypeptide chains in all stages of the folding process. In recent years, important advances toward these goals have been achieved owing to the rapidly growing interdisciplinary interest and significant progress in both experimental techniques and theoretical methods. Improvements in the experimental time resolution led to determination of the timescales of the important elementary events in folding, such as formation of secondary structure and tertiary contacts. Sensitive single molecule methods made it possible to probe the distributions of the unfolded and folded states and to follow the folding reaction of individual protein molecules. Discovery of proteins that fold in microseconds opened the possibility of atomic-level theoretical simulations of folding and their direct comparisons with experimental data, as well as of direct experimental observation of the barrier-less folding transition. The ultra-fast folding also brought new questions, concerning the intrinsic limits of the folding rates and experimental signatures of barrier-less "downhill" folding. These problems will require novel approaches for even more detailed experimental investigations of the folding dynamics as well as for the analysis of the folding kinetic data. For theoretical simulations of folding, a main challenge is how to extract the relevant information from overwhelmingly detailed atomistic trajectories. 
New theoretical methods have been devised to allow a systematic approach towards a quantitative analysis of the kinetic network of folding-unfolding transitions between various configuration states of a protein, revealing the transition states and the associated folding pathways at multiple levels, from atomistic to coarse-grained representations. This article is part of a Special Issue entitled: Protein Dynamics: Experimental and Computational Approaches. Copyright © 2010 Elsevier B.V. All rights reserved.

  18. Global analysis of protein folding using massively parallel design, synthesis and testing

    PubMed Central

    Rocklin, Gabriel J.; Chidyausiku, Tamuka M.; Goreshnik, Inna; Ford, Alex; Houliston, Scott; Lemak, Alexander; Carter, Lauren; Ravichandran, Rashmi; Mulligan, Vikram K.; Chevalier, Aaron; Arrowsmith, Cheryl H.; Baker, David

    2017-01-01

    Proteins fold into unique native structures stabilized by thousands of weak interactions that collectively overcome the entropic cost of folding. Though these forces are “encoded” in the thousands of known protein structures, “decoding” them is challenging due to the complexity of natural proteins that have evolved for function, not stability. Here we combine computational protein design, next-generation gene synthesis, and a high-throughput protease susceptibility assay to measure folding and stability for over 15,000 de novo designed miniproteins, 1,000 natural proteins, 10,000 point-mutants, and 30,000 negative control sequences, identifying over 2,500 new stable designed proteins in four basic folds. This scale—three orders of magnitude greater than that of previous studies of design or folding—enabled us to systematically examine how sequence determines folding and stability in uncharted protein space. Iteration between design and experiment increased the design success rate from 6% to 47%, produced stable proteins unlike those found in nature for topologies where design was initially unsuccessful, and revealed subtle contributions to stability as designs became increasingly optimized. Our approach achieves the long-standing goal of a tight feedback cycle between computation and experiment, and promises to transform computational protein design into a data-driven science. PMID:28706065

  19. Decoding Structural Properties of a Partially Unfolded Protein Substrate: En Route to Chaperone Binding

    PubMed Central

    Nagpal, Suhani; Tiwari, Satyam; Mapa, Koyeli; Thukral, Lipi

    2015-01-01

    Many proteins with complex topologies require molecular chaperones to achieve their unique three-dimensional folded structure. The E. coli chaperone GroEL binds a large number of unfolded and partially folded proteins to facilitate proper folding and prevent misfolding and aggregation. Although the major structural components of GroEL are well defined, scaffolds of the non-native substrates that determine chaperone-mediated folding have been difficult to recognize. Here we performed all-atom and replica-exchange molecular dynamics simulations to dissect the non-native ensemble of an obligate GroEL folder, DapA. Thermodynamic analyses of unfolding simulations revealed populated intermediates with distinct structural characteristics. We found that surface-exposed hydrophobic patches are significantly increased, primarily contributed by native and non-native β-sheet elements. We validate the structural properties of these conformers using experimental data, including circular dichroism (CD), 1-anilinonaphthalene-8-sulfonic acid (ANS) binding measurements and previously reported hydrogen-deuterium exchange coupled to mass spectrometry (HDX-MS). Further, we constructed network graphs to elucidate long-range intra-protein connectivity of native and intermediate topologies, demonstrating regions that serve as central “hubs”. Overall, our results imply that genomic variations (or mutations) in the distinct regions of protein structures might disrupt these topological signatures, disabling chaperone-mediated folding and leading to formation of aggregates. PMID:26394388

  20. Aromatic Cluster Sensor of Protein Folding: Near-UV Electronic Circular Dichroism Bands Assigned to Fold Compactness.

    PubMed

    Farkas, Viktor; Jákli, Imre; Tóth, Gábor K; Perczel, András

    2016-09-19

    Both far- and near-UV electronic circular dichroism (ECD) spectra have bands sensitive to thermal unfolding of proteins containing Trp and Tyr residues. Besides spectral changes at 222 nm reporting secondary structural variations (far-UV range), Lb bands (near-UV range) are applicable as 3D-fold sensors of a protein's core structure. In this study we show that both Lb(Tyr) and Lb(Trp) ECD bands could be used as sensors of fold compactness. ECD is a relative method and thus requires NMR referencing and cross-validation, also provided here. The ensemble of 204 ECD spectra of Trp-cage miniproteins is analysed as a training set for "calibrating" Trp↔Tyr folded systems of known NMR structure. While in the far-UV ECD spectra changes are linear as a function of the temperature, near-UV ECD data indicate a non-linear and thus cooperative unfolding mechanism of these proteins. Deconvolution of the ensemble of ECD spectra gives both conformational weights and insight into the protein folding↔unfolding mechanism. We found that the Lb 293 band reports on the compactness of the 3D structure. In addition, the pure near-UV ECD spectrum of the unfolded state is described here for the first time. Thus, ECD folding information, now validated, can be applied with confidence in a large thermal window (5≤T≤85 °C) compared to NMR for studying the unfolding of Trp↔Tyr residue pairs. In conclusion, folding propensities of important proteins (RNA polymerase II, ubiquitin protein ligase, tryptase-inhibitor etc.) can now be analysed with higher confidence. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. Protein structure recognition: From eigenvector analysis to structural threading method

    NASA Astrophysics Data System (ADS)

    Cao, Haibo

    In this work, we try to understand the protein folding problem using pair-wise hydrophobic interaction as the dominant interaction for the protein folding process. We found a strong correlation between amino acid sequence and the corresponding native structure of the protein. Some applications of this correlation discussed in this dissertation include domain partition and a new structural threading method, as well as the performance of this method in the CASP5 competition. In the first part, we give a brief introduction to the protein folding problem. Some essential knowledge and progress from other research groups are discussed. This part includes discussions of interactions among amino acid residues, the lattice HP model, and the designability principle. In the second part, we try to establish the correlation between amino acid sequence and the corresponding native structure of the protein. This correlation was observed in our eigenvector study of the protein contact matrix. We believe the correlation is universal, thus it can be used in automatic partition of protein structures into folding domains. In the third part, we discuss a threading method based on the correlation between amino acid sequence and the dominant eigenvector of the structure contact matrix. A mathematically straightforward iteration scheme provides a self-consistent optimum global sequence-structure alignment. The computational efficiency of this method makes it possible to search whole protein structure databases for structural homology without relying on sequence similarity. The sensitivity and specificity of this method are discussed, along with a case of blind test prediction. In the appendix, we list the overall performance of this threading method in the CASP5 blind test in comparison with other existing approaches.
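
    To make the phrase "dominant eigenvector of the structure contact matrix" concrete, the following sketch builds a binary contact matrix from coordinates and extracts its leading eigenvector by power iteration. It is only an illustration and not the dissertation's threading method; the 6.5 Å cutoff, the function names and the toy coordinates are assumptions.

```python
import math


def contact_matrix(coords, cutoff=6.5):
    """Binary contact matrix: 1 where two residues lie within `cutoff`.

    `coords` is a list of (x, y, z) tuples, one per residue. The cutoff
    value is an illustrative assumption.
    """
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                matrix[i][j] = matrix[j][i] = 1.0
    return matrix


def dominant_eigenvector(matrix, iterations=500):
    """Leading eigenvector of a symmetric non-negative matrix.

    Power iteration is applied to (M + I); the identity shift leaves the
    eigenvectors unchanged and avoids oscillation on bipartite contact graphs.
    """
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


if __name__ == "__main__":
    # Toy structure: residue 0 is buried and touches four others that do
    # not touch each other.
    toy = [(0, 0, 0), (5, 0, 0), (-5, 0, 0), (0, 5, 0), (0, -5, 0)]
    print(dominant_eigenvector(contact_matrix(toy)))
```

    In the toy example the buried residue carries the largest component, which is the kind of burial signal that can then be compared against a hydrophobicity profile of the sequence.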

  2. Comparison of successive transition states for folding reveals alternative early folding pathways of two homologous proteins

    PubMed Central

    Calosci, Nicoletta; Chi, Celestine N.; Richter, Barbara; Camilloni, Carlo; Engström, Åke; Eklund, Lars; Travaglini-Allocatelli, Carlo; Gianni, Stefano; Vendruscolo, Michele; Jemth, Per

    2008-01-01

    The energy landscape theory provides a general framework for describing protein folding reactions. Because a large number of studies, however, have focused on two-state proteins with single well-defined folding pathways and without detectable intermediates, the extent to which free energy landscapes are shaped up by the native topology at the early stages of the folding process has not been fully characterized experimentally. To this end, we have investigated the folding mechanisms of two homologous three-state proteins, PTP-BL PDZ2 and PSD-95 PDZ3, and compared the early and late transition states on their folding pathways. Through a combination of Φ value analysis and molecular dynamics simulations we obtained atomic-level structures of the transition states of these homologous three-state proteins and found that the late transition states are much more structurally similar than the early ones. Our findings thus reveal that, while the native state topology defines essentially in a unique way the late stages of folding, it leaves significant freedom to the early events, a result that reflects the funneling of the free energy landscape toward the native state. PMID:19033470

  3. Examining a Thermodynamic Order Parameter of Protein Folding.

    PubMed

    Chong, Song-Ho; Ham, Sihyun

    2018-05-08

    Dimensionality reduction with a suitable choice of order parameters or reaction coordinates is commonly used for analyzing high-dimensional time-series data generated by atomistic biomolecular simulations. So far, geometric order parameters, such as the root mean square deviation, fraction of native amino acid contacts, and collective coordinates that best characterize rare or large conformational transitions, have been prevailing in protein folding studies. Here, we show that the solvent-averaged effective energy, which is a thermodynamic quantity but unambiguously defined for individual protein conformations, serves as a good order parameter of protein folding. This is illustrated through the application to the folding-unfolding simulation trajectory of villin headpiece subdomain. We rationalize the suitability of the effective energy as an order parameter by the funneledness of the underlying protein free energy landscape. We also demonstrate that an improved conformational space discretization is achieved by incorporating the effective energy. The most distinctive feature of this thermodynamic order parameter is that it works in pointing to near-native folded structures even when the knowledge of the native structure is lacking, and the use of the effective energy will also find applications in combination with methods of protein structure prediction.
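
    For comparison with the thermodynamic order parameter proposed here, one of the geometric order parameters named in the abstract, the fraction of native contacts, can be sketched as follows. The 8 Å cutoff, the minimum sequence separation and the toy coordinates are assumptions; the solvent-averaged effective energy itself requires a solvation model and is not reproduced.

```python
import math


def native_contact_pairs(native_coords, cutoff=8.0, min_separation=3):
    """Residue pairs (i, j) in contact in the native structure.

    Pairs closer than `min_separation` along the chain are skipped because
    they are in contact in almost any conformation.
    """
    n = len(native_coords)
    pairs = []
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(native_coords[i], native_coords[j]) <= cutoff:
                pairs.append((i, j))
    return pairs


def fraction_native_contacts(coords, pairs, cutoff=8.0):
    """Fraction Q of the native contact pairs that are formed in `coords`."""
    if not pairs:
        return 0.0
    formed = sum(1 for i, j in pairs if math.dist(coords[i], coords[j]) <= cutoff)
    return formed / len(pairs)


if __name__ == "__main__":
    # Hypothetical six-residue hairpin and a fully extended chain.
    hairpin = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0),
               (7.6, 5, 0), (3.8, 5, 0), (0, 5, 0)]
    extended = [(3.8 * k, 0, 0) for k in range(6)]
    pairs = native_contact_pairs(hairpin)
    print(fraction_native_contacts(hairpin, pairs))
    print(fraction_native_contacts(extended, pairs))
```

    Note that this quantity needs the native structure as input, which is exactly the limitation the abstract says the effective energy avoids.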

  4. Efficient molecular mechanics simulations of the folding, orientation, and assembly of peptides in lipid bilayers using an implicit atomic solvation model

    NASA Astrophysics Data System (ADS)

    Bordner, Andrew J.; Zorman, Barry; Abagyan, Ruben

    2011-10-01

    Membrane proteins comprise a significant fraction of the proteomes of sequenced organisms and are the targets of approximately half of marketed drugs. However, in spite of their prevalence and biomedical importance, relatively few experimental structures are available due to technical challenges. Computational simulations can potentially address this deficit by providing structural models of membrane proteins. Solvation within the spatially heterogeneous membrane/solvent environment provides a major component of the energetics driving protein folding and association within the membrane. We have developed an implicit solvation model for membranes that is both computationally efficient and accurate enough to enable molecular mechanics predictions for the folding and association of peptides within the membrane. We derived the new atomic solvation model parameters using an unbiased fitting procedure to experimental data and have applied it to diverse problems in order to test its accuracy and to gain insight into membrane protein folding. First, we predicted the positions and orientations of peptides and complexes within the lipid bilayer and compared the simulation results with solid-state NMR structures. Additionally, we performed folding simulations for a series of host-guest peptides with varying propensities to form alpha helices in a hydrophobic environment and compared the structures with experimental measurements. We were also able to successfully predict the structures of amphipathic peptides as well as the structures for dimeric complexes of short hexapeptides that have experimentally characterized propensities to form beta sheets within the membrane. Finally, we compared calculated relative transfer energies with data from experiments measuring the effects of mutations on the free energies of translocon-mediated insertion of proteins into lipid bilayers and of combined folding and membrane insertion of a beta barrel protein.

  5. Cooperativity and modularity in protein folding

    PubMed Central

    Sasai, Masaki; Chikenji, George; Terada, Tomoki P.

    2016-01-01

    A simple statistical mechanical model proposed by Wako and Saitô has explained the aspects of protein folding surprisingly well. This model was systematically applied to multiple proteins by Muñoz and Eaton and has since been referred to as the Wako-Saitô-Muñoz-Eaton (WSME) model. The success of the WSME model in explaining the folding of many proteins has verified the hypothesis that the folding is dominated by native interactions, which makes the energy landscape globally biased toward native conformation. Using the WSME and other related models, Saitô emphasized the importance of the hierarchical pathway in protein folding; folding starts with the creation of contiguous segments having a native-like configuration and proceeds as growth and coalescence of these segments. The Φ-values calculated for barnase with the WSME model suggested that segments contributing to the folding nucleus are similar to the structural modules defined by the pattern of native atomic contacts. The WSME model was extended to explain folding of multi-domain proteins having a complex topology, which opened the way to comprehensively understanding the folding process of multi-domain proteins. The WSME model was also extended to describe allosteric transitions, indicating that the allosteric structural movement does not occur as a deterministic sequential change between two conformations but as a stochastic diffusive motion over the dynamically changing energy landscape. Statistical mechanical viewpoint on folding, as highlighted by the WSME model, has been renovated in the context of modern methods and ideas, and will continue to provide insights on equilibrium and dynamical features of proteins. PMID:28409080
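
    The WSME model described above assigns each residue a binary native/non-native state and counts a native contact only when every residue between the two partners is native. A minimal sketch by exact enumeration, assuming a uniform contact energy and a uniform entropy cost per native residue (both hypothetical values, as are the function names), is:

```python
import math
from itertools import product


def wsme_energy(state, contacts, epsilon=-1.0):
    """WSME energy of one state.

    `state` is a tuple of 0/1 flags (1 = native). A native contact (i, j),
    i < j, contributes `epsilon` only if residues i..j are all native.
    """
    return sum(epsilon for i, j in contacts if all(state[i:j + 1]))


def wsme_free_energy_profile(n_res, contacts, epsilon=-1.0,
                             entropy_cost=0.8, kT=1.0):
    """Free energy F(n) as a function of the number n of native residues.

    Each native residue pays `entropy_cost` (in units of k_B). All 2**n_res
    states are enumerated, so this is only usable for short chains.
    """
    z = [0.0] * (n_res + 1)
    for state in product((0, 1), repeat=n_res):
        n_native = sum(state)
        z[n_native] += math.exp(-wsme_energy(state, contacts, epsilon) / kT
                                - entropy_cost * n_native)
    return [-kT * math.log(zn) for zn in z]


if __name__ == "__main__":
    # Hypothetical 8-residue chain with helix-like (i, i+3) native contacts.
    contacts = [(i, i + 3) for i in range(5)]
    for n, f in enumerate(wsme_free_energy_profile(8, contacts)):
        print(n, round(f, 3))
```

    Real applications replace brute-force enumeration with a transfer-matrix calculation and take the contact map from the native structure; the enumeration is kept here only because it makes the contiguity rule explicit.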

  6. Oxidative Folding and N-terminal Cyclization of Onconase

    PubMed Central

    Welker, Ervin; Hathaway, Laura; Xu, Guoqiang; Narayan, Mahesh; Pradeep, Lovy; Shin, Hang-Cheol; Scheraga, Harold A.

    2008-01-01

    Cyclization of the N-terminal glutamine residue to pyroglutamic acid in onconase, an anti-cancer chemotherapeutic agent, increases the activity and stability of the protein. Here, we examine the correlated effects of the folding/unfolding process and the formation of this N-terminal pyroglutamic acid. The results in this study indicate that cyclization of the N-terminal glutamine has no significant effect on the rate of either reductive unfolding or oxidative folding of the protein. Both the cyclized and uncyclized proteins seem to follow the same oxidative folding pathways; however, cyclization altered the relative flux of the protein in these two pathways by increasing the rate of formation of a kinetically trapped intermediate. Glutaminyl cyclase (QC) catalyzed the cyclization of the unfolded, reduced protein, but had no effect on the disulfide-intact, uncyclized, folded protein. The structured intermediates of uncyclized onconase were also resistant to QC-catalysis, consistent with their having a native-like fold. These observations suggest that, in vivo, cyclization takes place during the initial stages of oxidative folding, specifically, before the formation of structured intermediates. The competition between oxidative folding and QC-mediated cyclization suggests that QC-catalyzed cyclization of the N-terminal glutamine in onconase occurs in the endoplasmic reticulum, probably co-translationally. PMID:17439243

  7. Effects of Polymer Hydrophobicity on Protein Structure and Aggregation Kinetics in Crowded Milieu.

    PubMed

    Breydo, Leonid; Sales, Amanda E; Frege, Telma; Howell, Mark C; Zaslavsky, Boris Y; Uversky, Vladimir N

    2015-05-19

    We examined the effects of water-soluble polymers of various degrees of hydrophobicity on the folding and aggregation of proteins. The polymers we chose were polyethylene glycol (PEG) and UCON (1:1 copolymer of ethylene glycol and propylene glycol). The presence of additional methyl groups in UCON makes it more hydrophobic than PEG. Our earlier analysis revealed that similarly sized PEG and UCON produced different changes in the solvent properties of water in their solutions and induced morphologically different α-synuclein aggregates [Ferreira, L. A., et al. (2015) Role of solvent properties of aqueous media in macromolecular crowding effects. J. Biomol. Struct. Dyn., in press]. To improve our understanding of the molecular mechanisms defining the behavior of proteins in a crowded environment, we tested the effects of these polymers on the secondary and tertiary structure and aromatic residue solvent accessibility of 10 proteins [five folded proteins, two hybrid proteins (i.e., proteins containing ordered and disordered domains), and three intrinsically disordered proteins (IDPs)] and on the aggregation kinetics of insulin and α-synuclein. We found that the effects of both polymers on the secondary and tertiary structures of folded and hybrid proteins were rather limited, with slight unfolding observed in some cases. Solvent accessibility of aromatic residues was significantly increased for the majority of the studied proteins in the presence of UCON but not PEG. PEG also accelerated the aggregation of protein into amyloid fibrils, whereas UCON promoted aggregation to amyloid oligomers instead. These results indicate that even a relatively small change in polymer structure leads to a significant change in the effect of this polymer on protein folding and aggregation. This is an indication that protein folding and especially aggregation are highly sensitive to the presence of other macromolecules, and an excluded volume effect is insufficient to describe their effect.

  8. Transient intermediates are populated in the folding pathways of single-domain two-state folding protein L

    NASA Astrophysics Data System (ADS)

    Maity, Hiranmay; Reddy, Govardhan

    2018-04-01

    Small single-domain globular proteins, which are believed to be dominantly two-state folders, played an important role in elucidating various aspects of the protein folding mechanism. However, recent single molecule fluorescence resonance energy transfer experiments [H. Y. Aviram et al. J. Chem. Phys. 148, 123303 (2018)] on a single-domain two-state folding protein L showed evidence for the population of an intermediate state and it was suggested that in this state, a β-hairpin present near the C-terminal of the native protein state is unfolded. We performed molecular dynamics simulations using a coarse-grained self-organized-polymer model with side chains to study the folding pathways of protein L. In agreement with the experiments, an intermediate is populated in the simulation folding pathways where the C-terminal β-hairpin detaches from the rest of the protein structure. The lifetime of this intermediate structure increased with the decrease in temperature. In low temperature conditions, we also observed a second intermediate state, which is globular with a significant fraction of the native-like tertiary contacts satisfying the features of a dry molten globule.

  9. Cooperative Subunit Refolding of a Light-Harvesting Protein through a Self-Chaperone Mechanism.

    PubMed

    Laos, Alistair J; Dean, Jacob C; Toa, Zi S D; Wilk, Krystyna E; Scholes, Gregory D; Curmi, Paul M G; Thordarson, Pall

    2017-07-10

    The fold of a protein is encoded by its amino acid sequence, but how complex multimeric proteins fold and assemble into functional quaternary structures remains unclear. Here we show that two structurally different phycobiliproteins refold and reassemble in a cooperative manner from their unfolded polypeptide subunits, without biological chaperones. Refolding was confirmed by ultrafast broadband transient absorption and two-dimensional electronic spectroscopy to probe internal chromophores as a marker of quaternary structure. Our results demonstrate a cooperative, self-chaperone refolding mechanism, whereby the β-subunits independently refold, thereby templating the folding of the α-subunits, which then chaperone the assembly of the native complex, quantitatively returning all coherences. Our results indicate that subunit self-chaperoning is a robust mechanism for heteromeric protein folding and assembly that could also be applied in self-assembled synthetic hierarchical systems. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Structured and Unstructured Binding of an Intrinsically Disordered Protein as Revealed by Atomistic Simulations.

    PubMed

    Ithuralde, Raúl Esteban; Roitberg, Adrián Enrique; Turjanski, Adrián Gustavo

    2016-07-20

    Intrinsically disordered proteins (IDPs) are a set of proteins that lack a definite secondary structure in solution. IDPs can acquire tertiary structure when bound to their partners; therefore, the recognition process must also involve protein folding. The nature of the transition state (TS), structured or unstructured, determines the binding mechanism. The characterization of the TS has become a major challenge for experimental techniques and molecular simulation approaches since diffusion, recognition, and binding are coupled to folding. In this work we present atomistic molecular dynamics (MD) simulations that sample the free energy surface of the coupled folding and binding of the transcription factor c-myb to the cotranscription factor CREB binding protein (CBP). This process has been recently studied and has become a model for studying IDPs. Despite the plethora of available information, we still do not know how c-myb binds to CBP. We performed a set of atomistic biased MD simulations running a total of 15.6 μs. Our results show that c-myb folds very fast upon binding to CBP with no unique pathway for binding. The process can proceed through either structured or unstructured TSs with similar probabilities. This finding reconciles previous seemingly different experimental results. We also performed Go-type coarse-grained MD of several structured and unstructured models that indicate that coupled folding and binding follows a native contact mechanism. To the best of our knowledge, this is the first atomistic MD simulation that samples the free energy surface of the coupled folding and binding processes of IDPs.

  11. Assessing the Potential of Folded Globular Polyproteins As Hydrogel Building Blocks

    PubMed Central

    2016-01-01

    The native states of proteins generally have stable well-defined folded structures endowing these biomolecules with specific functionality and molecular recognition abilities. Here we explore the potential of using folded globular polyproteins as building blocks for hydrogels. Photochemically cross-linked hydrogels were produced from polyproteins containing either five domains of I27 ((I27)5), protein L ((pL)5), or a 1:1 blend of these proteins. SAXS analysis showed that (I27)5 exists as a single rod-like structure, while (pL)5 shows signatures of self-aggregation in solution. SANS measurements showed that both polyprotein hydrogels have a similar nanoscopic structure, with protein L hydrogels being formed from smaller and more compact clusters. The polyprotein hydrogels showed small energy dissipation in a load/unload cycle, which significantly increased when the hydrogels were formed in the unfolded state. This study demonstrates the use of folded proteins as building blocks in hydrogels, and highlights the potential versatility that can be offered in tuning the mechanical, structural, and functional properties of polyproteins. PMID:28006103

  12. NIAS-Server: Neighbors Influence of Amino acids and Secondary Structures in Proteins.

    PubMed

    Borguesan, Bruno; Inostroza-Ponta, Mario; Dorn, Márcio

    2017-03-01

    The exponential growth in the number of experimentally determined three-dimensional protein structures provides new and relevant knowledge about the conformation of amino acids in proteins. Only a few probability densities of amino acids are publicly available for use in structure validation and prediction methods. NIAS (Neighbors Influence of Amino acids and Secondary structures) is a web-based tool used to extract information about conformational preferences of amino acid residues and secondary structures in experimentally determined protein templates. This information is useful, for example, to characterize folds and local motifs in proteins and molecular folding, and can help solve complex problems such as protein structure prediction and protein design, among others. The NIAS-Server and supplementary data are available at http://sbcb.inf.ufrgs.br/nias .

  13. Evolutionary Strategies for Protein Folding

    NASA Astrophysics Data System (ADS)

    Murthy Gopal, Srinivasa; Wenzel, Wolfgang

    2006-03-01

    The free energy approach for predicting protein tertiary structure describes the native state of a protein as the global minimum of an appropriate free-energy force field. The low-energy region of the free-energy landscape of a protein is extremely rugged. Efficient optimization methods must therefore speed up the search for the global optimum by avoiding high-energy transition states, adapting large-scale moves, or accepting unphysical intermediates. Here we investigate an evolutionary strategy (ES) for optimizing a protein conformation in our all-atom free-energy force field [1,2]. A set of random conformations is evolved using an ES to obtain a diverse population containing low-energy structures. The ES is shown to balance energy improvement while maintaining diversity in structures. The ES is implemented as a master-client model for distributed computing. Starting from random structures and using this optimization technique, we were able to fold a 20-amino-acid helical protein and a 16-amino-acid beta hairpin [3]. We compare the ES to the basin hopping method. [1] T. Herges and W. Wenzel, Biophys. J. 87, 3100 (2004). [2] A. Verma and W. Wenzel, Stabilization and folding of beta-sheet and alpha-helical proteins in an all-atom free energy model (submitted) (2005). [3] S. M. Gopal and W. Wenzel, Evolutionary Strategies for Protein Folding (in preparation).
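
    The general idea of evolving a population of conformations toward low energy can be sketched with a plain (mu + lambda) evolution strategy. This is a generic textbook variant and not the authors' algorithm or force field: the rugged toy energy over torsion-like angles, the Gaussian mutation width and all names are assumptions, and no explicit diversity control is included.

```python
import math
import random


def toy_energy(angles):
    """Rugged toy landscape over torsion-like angles (global minimum at 0)."""
    return sum(1.0 - math.cos(a) + 0.3 * (1.0 - math.cos(5.0 * a))
               for a in angles)


def evolve(energy, n_angles, mu=10, lam=40, sigma=0.3, generations=200, seed=0):
    """(mu + lambda) evolution strategy; returns the final population.

    Each generation creates `lam` offspring by Gaussian mutation of randomly
    chosen parents, then keeps the `mu` lowest-energy individuals among
    parents and offspring (elitist selection).
    """
    rng = random.Random(seed)
    population = [[rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
                  for _ in range(mu)]
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            parent = rng.choice(population)
            offspring.append([a + rng.gauss(0.0, sigma) for a in parent])
        population = sorted(population + offspring, key=energy)[:mu]
    return population


if __name__ == "__main__":
    final = evolve(toy_energy, n_angles=6)
    print(toy_energy(final[0]))
```

    Because selection is elitist, the best energy never increases from one generation to the next; a production scheme would additionally need some measure of structural distance to keep the population diverse, as the abstract emphasizes.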

  14. Benchmarking Inverse Statistical Approaches for Protein Structure and Design with Exactly Solvable Models.

    PubMed

    Jacquin, Hugo; Gilson, Amy; Shakhnovich, Eugene; Cocco, Simona; Monasson, Rémi

    2016-05-01

    Inverse statistical approaches to determine protein structure and function from Multiple Sequence Alignments (MSA) are emerging as powerful tools in computational biology. However, the underlying assumptions about the relationship between the inferred effective Potts Hamiltonian and real protein structure and energetics remain untested so far. Here we use a lattice protein (LP) model to benchmark those inverse statistical approaches. We build MSA of highly stable sequences in target LP structures, and infer the effective pairwise Potts Hamiltonians from those MSA. We find that inferred Potts Hamiltonians reproduce many important aspects of 'true' LP structures and energetics. Careful analysis reveals that effective pairwise couplings in inferred Potts Hamiltonians depend not only on the energetics of the native structure but also on competing folds; in particular, the coupling values reflect both positive design (stabilization of native conformation) and negative design (destabilization of competing folds). In addition to providing detailed structural information, the inferred Potts models used as protein Hamiltonians for the design of new sequences are able to generate with high probability completely new sequences with the desired folds, which is not possible using independent-site models. Those are remarkable results as the effective LP Hamiltonians used to generate MSA are not simple pairwise models due to the competition between the folds. Our findings elucidate the reasons for the success of inverse approaches to the modelling of proteins from sequence data, and their limitations.
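
    A pairwise Potts Hamiltonian of the kind inferred from alignments scores a sequence by summing single-site fields and pairwise couplings. The sketch below uses the common sign convention E = -sum h_i(a_i) - sum J_ij(a_i, a_j); the dictionary layout, the two-letter alphabet and the parameter values are invented for illustration and are not taken from the paper.

```python
def potts_energy(sequence, fields, couplings):
    """Potts energy of a sequence.

    `fields[i]` maps a residue letter to the local field h_i(a).
    `couplings[(i, j)]` maps a letter pair (a, b) to J_ij(a, b) for i < j.
    Missing entries are treated as zero.
    """
    energy = 0.0
    for i, letter in enumerate(sequence):
        energy -= fields[i].get(letter, 0.0)
    for (i, j), table in couplings.items():
        energy -= table.get((sequence[i], sequence[j]), 0.0)
    return energy


if __name__ == "__main__":
    # Hypothetical three-site model over an H/P alphabet.
    fields = [{"H": 1.0}, {"P": 0.5}, {"H": 0.2}]
    couplings = {(0, 2): {("H", "H"): 2.0}}
    for seq in ("HPH", "PPH", "HPP"):
        print(seq, potts_energy(seq, fields, couplings))
```

    An independent-site model corresponds to leaving `couplings` empty, which makes clear why it cannot capture the pairwise correlations the abstract refers to.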

  15. Folding propensity of intrinsically disordered proteins by osmotic stress

    DOE PAGES

    Mansouri, Amanda L.; Grese, Laura N.; Rowe, Erica L.; ...

    2016-10-11

    Proteins imparted with intrinsic disorder conduct a range of essential cellular functions. To better understand the folding and hydration properties of intrinsically disordered proteins (IDPs), we used osmotic stress to induce conformational changes in nuclear co-activator binding domain (NCBD) and activator for thyroid hormone and retinoid receptor (ACTR). Osmotic stress was applied by the addition of small and polymeric osmolytes, where we discovered that water contributions to NCBD folding always exceeded those for ACTR. Both NCBD and ACTR were found to gain α-helical structure with increasing osmotic stress, consistent with their folding upon NCBD/ACTR complex formation. Using small-angle neutron scattering (SANS), we further characterized NCBD structural changes with the osmolyte ethylene glycol. Here a large reduction in overall size initially occurred before substantial secondary structural change. In conclusion, by focusing on folding propensity, and linked hydration changes, we uncover new insights that may be important for how IDP folding contributes to binding.

  16. Folding propensity of intrinsically disordered proteins by osmotic stress

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mansouri, Amanda L.; Grese, Laura N.; Rowe, Erica L.

    Proteins imparted with intrinsic disorder conduct a range of essential cellular functions. To better understand the folding and hydration properties of intrinsically disordered proteins (IDPs), we used osmotic stress to induce conformational changes in nuclear co-activator binding domain (NCBD) and activator for thyroid hormone and retinoid receptor (ACTR). Osmotic stress was applied by the addition of small and polymeric osmolytes, where we discovered that water contributions to NCBD folding always exceeded those for ACTR. Both NCBD and ACTR were found to gain α-helical structure with increasing osmotic stress, consistent with their folding upon NCBD/ACTR complex formation. Using small-angle neutron scattering (SANS), we further characterized NCBD structural changes with the osmolyte ethylene glycol. Here a large reduction in overall size initially occurred before substantial secondary structural change. In conclusion, by focusing on folding propensity, and linked hydration changes, we uncover new insights that may be important for how IDP folding contributes to binding.

  17. Natural triple beta-stranded fibrous folds.

    PubMed

    Mitraki, Anna; Papanikolopoulou, Katerina; Van Raaij, Mark J

    2006-01-01

    A distinctive family of beta-structured folds has recently been described for fibrous proteins from viruses. Virus fibers are usually involved in specific host-cell recognition. They are asymmetric homotrimeric proteins consisting of an N-terminal virus-binding tail, a central shaft or stalk domain, and a C-terminal globular receptor-binding domain. Often they are entirely or nearly entirely composed of beta-structure. Apart from their biological relevance and possible gene therapy applications, their shape, stability, and rigidity suggest they may be useful as blueprints for biomechanical design. Folding and unfolding studies suggest their globular C-terminal domain may fold first, followed by a "zipping-up" of the shaft domains. The C-terminal domains appear to be important for registration because peptides corresponding to shaft domains alone aggregate into nonnative fibers and/or amyloid structures. C-terminal domains can be exchanged between different fibers and the resulting chimeric proteins are useful as a way to solve structures of unknown parts of the shaft domains. The following natural triple beta-stranded fibrous folds have been discovered by X-ray crystallography: the triple beta-spiral, triple beta-helix, and T4 short tail fiber fold. All have a central longitudinal hydrophobic core and extensive intermonomer polar and nonpolar interactions. Now that a reasonable body of structural and folding knowledge has been assembled about these fibrous proteins, the next challenge and opportunity is to start using this information in medical and industrial applications such as gene therapy and nanotechnology.

  18. Identification of kinetically hot residues in proteins.

    PubMed Central

    Demirel, M. C.; Atilgan, A. R.; Jernigan, R. L.; Erman, B.; Bahar, I.

    1998-01-01

    A number of recent studies called attention to the presence of kinetically important residues underlying the formation and stabilization of folding nuclei in proteins, and to the possible existence of a correlation between conserved residues and those participating in the folding nuclei. Here, we use the Gaussian network model (GNM), which recently proved useful in describing the dynamic characteristics of proteins, to identify the kinetically hot residues in folded structures. These are the residues involved in the highest frequency fluctuations near the native state coordinates. Their high frequency is a manifestation of the steepness of the energy landscape near their native state positions. The theory is applied to a series of proteins whose kinetically important residues have been extensively explored: chymotrypsin inhibitor 2, cytochrome c, and related C2 proteins. Most of the residues previously pointed out to underlie the folding process of these proteins, and to be critically important for the stabilization of the tertiary fold, are correctly identified, indicating a correlation between the kinetic hot spots and the early forming structural elements in proteins. Additionally, a strong correlation between kinetically hot residues and loci of conserved residues is observed. Finally, residues that may be important for the stability of the tertiary structure of CheY are proposed. PMID:9865946
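
    In the Gaussian network model the structure is reduced to a Kirchhoff (connectivity) matrix, and the highest-frequency mode is the eigenvector with the largest eigenvalue. The sketch below finds that single mode by power iteration and ranks residues by their squared amplitude in it. The 7 Å cutoff, the use of only one mode, the random start vector and the toy coordinates are assumptions, so this is an illustration rather than the published procedure.

```python
import math
import random


def kirchhoff_matrix(coords, cutoff=7.0):
    """GNM Kirchhoff matrix: -1 for residue pairs in contact, degree on the diagonal."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma


def fastest_mode(gamma, iterations=1000, seed=0):
    """Eigenvector of the largest eigenvalue, found by power iteration.

    The Kirchhoff matrix is positive semidefinite, so power iteration
    converges to the highest-frequency mode when it is non-degenerate.
    A random start is used because the uniform vector is the zero mode.
    """
    rng = random.Random(seed)
    n = len(gamma)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(gamma[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def hot_residues(gamma, top=1):
    """Indices of the residues with the largest squared amplitude in the fastest mode."""
    mode = fastest_mode(gamma)
    ranked = sorted(range(len(mode)), key=lambda i: mode[i] ** 2, reverse=True)
    return ranked[:top]


if __name__ == "__main__":
    # Toy structure: residue 0 is densely packed, the others are peripheral.
    toy = [(0, 0, 0), (6, 0, 0), (-6, 0, 0), (0, 6, 0), (0, -6, 0)]
    print(hot_residues(kirchhoff_matrix(toy)))
```

    The most tightly packed residue dominates the fastest mode in the toy case, matching the abstract's description of hot residues as sitting in steep regions of the energy landscape.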

  19. Folding Free Energy Landscape of the Decapeptide Chignolin

    NASA Astrophysics Data System (ADS)

    Dou, Xianghua; Wang, Jihua

    Chignolin is an artificially designed ten-residue (GYDPETGTWG) folded peptide, which is the smallest protein and provides a good template for protein folding. In this work, we completed four explicit-water molecular dynamics simulations of Chignolin folding using the GROMOS and OPLS-AA force fields, starting from extended initial states without any experimental information. Four folding free energy landscapes of the peptide have been drawn. The folded state of Chignolin has been successfully predicted based on the free energy landscapes. The four independent simulations gave similar results. (i) The four free energy landscapes have common characteristics. They are fairly smooth, barrierless, funnel-like and downhill without an intermediate state, which is consistent with experiment. (ii) The different extended initial structures converge to similar folded structures with the lowest free energy under the GROMOS and OPLS-AA force fields. In the GROMOS force field, the backbone RMSD of the folded structures from the NMR native structure of Chignolin is only 0.114 nm, and this is a stable structure in this force field. In the OPLS-AA force field, similar results were obtained. In addition, the smallest-RMSD structure is in better agreement with the NMR native structure but is unlikely to be stable in the force field.
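
    A free energy landscape of the kind described here is commonly estimated from simulation samples as F(x) = -kT ln P(x), where P(x) is the histogram of an order parameter such as backbone RMSD. The sketch below shows the one-dimensional case; the bin width, the temperature, the kJ/mol unit and the sample values are assumptions and do not come from the paper.

```python
import math
from collections import Counter

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol*K); unit choice is an assumption


def free_energy_from_samples(samples, bin_width, temperature=300.0):
    """One-dimensional free energy profile from order-parameter samples.

    Returns a list of (bin_index, F) pairs sorted by bin index, with F in
    kJ/mol shifted so that the most populated bin has F = 0. Bin k covers
    the interval [k * bin_width, (k + 1) * bin_width).
    """
    counts = Counter(math.floor(x / bin_width) for x in samples)
    total = len(samples)
    kT = K_B * temperature
    raw = {b: -kT * math.log(c / total) for b, c in counts.items()}
    f_min = min(raw.values())
    return [(b, raw[b] - f_min) for b in sorted(raw)]


if __name__ == "__main__":
    # Hypothetical RMSD samples in nm: mostly folded, occasionally unfolded.
    rmsd = [0.11] * 8 + [0.35] * 2
    for bin_index, f in free_energy_from_samples(rmsd, bin_width=0.1):
        print(bin_index, round(f, 3))
```

    Bins that are never visited have no estimate at all, so a smooth, barrierless profile like the one reported can only be claimed over the range that the trajectories actually sample.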

  20. Design and structure of an equilibrium protein folding intermediate: a hint into dynamical regions of proteins.

    PubMed

    Ayuso-Tejedor, Sara; Angarica, Vladimir Espinosa; Bueno, Marta; Campos, Luis A; Abián, Olga; Bernadó, Pau; Sancho, Javier; Jiménez, M Angeles

    2010-07-23

    Partly unfolded protein conformations close to the native state may play important roles in protein function and in protein misfolding. Structural analyses of such conformations, which are essential for their full physicochemical understanding, are complicated by their characteristically low populations at equilibrium. We stabilize here with a single mutation the equilibrium intermediate of apoflavodoxin thermal unfolding and determine its solution structure by NMR. It consists of a large native region identical with that observed in the X-ray structure of the wild-type protein plus an unfolded region. Small-angle X-ray scattering analysis indicates that the calculated ensemble of structures is consistent with the actual degree of expansion of the intermediate. The unfolded region encompasses discontinuous sequence segments that cluster in the 3D structure of the native protein forming the FMN cofactor binding loops and the binding site of a variety of partner proteins. Analysis of the apoflavodoxin inner interfaces reveals that those becoming destabilized in the intermediate are more polar than other inner interfaces of the protein. Natively folded proteins contain hydrophobic cores formed by the packing of hydrophobic surfaces, while natively unfolded proteins are rich in polar residues. The structure of the apoflavodoxin thermal intermediate suggests that the regions of natively folded proteins that are easily responsive to thermal activation may contain cores of intermediate hydrophobicity. Copyright (c) 2010 Elsevier Ltd. All rights reserved.

  1. Energy landscape of knotted protein folding

    PubMed Central

    Sułkowska, Joanna I.; Noel, Jeffrey K.; Onuchic, Jose N.

    2012-01-01

    Recent experiments have conclusively shown that proteins are able to fold from an unknotted, denatured polypeptide to the knotted, native state without the aid of chaperones. These experiments are consistent with a growing body of theoretical work showing that a funneled, minimally frustrated energy landscape is sufficient to fold small proteins with complex topologies. Here, we present a theoretical investigation of the folding of a knotted protein, 2ouf, engineered in the laboratory by a domain fusion that mimics an evolutionary pathway for knotted proteins. Unlike a previously studied knotted protein of similar length, we see reversible folding/knotting and a surprising lack of deep topological traps with a coarse-grained structure-based model. Our main interest is to investigate how evolution might further select the geometry and stiffness of the threading region of the newly fused protein. We compare the folding of the wild-type protein to several mutants. Similarly to the wild-type protein, all mutants show robust and reversible folding, and knotting coincides with the transition state ensemble. As observed experimentally, our simulations show that the knotted protein folds about ten times slower than an unknotted construct with an identical contact map. Simulated folding kinetics reflect the experimentally observed rollover in the folding limbs of chevron plots. Successful folding of the knotted protein is restricted to a narrow range of temperature as compared to the unknotted protein and fits of the kinetic folding data below folding temperature suggest slow, nondiffusive dynamics for the knotted protein. PMID:22891304

  2. Structure of human POFUT2: insights into thrombospondin type 1 repeat fold and O-fucosylation

    PubMed Central

    Chen, Chun-I; Keusch, Jeremy J; Klein, Dominique; Hess, Daniel; Hofsteenge, Jan; Gut, Heinz

    2012-01-01

    Protein O-fucosylation is a post-translational modification found on serine/threonine residues of thrombospondin type 1 repeats (TSR). The fucose transfer is catalysed by the enzyme protein O-fucosyltransferase 2 (POFUT2) and >40 human proteins contain the TSR consensus sequence for POFUT2-dependent fucosylation. To better understand O-fucosylation on TSR, we carried out a structural and functional analysis of human POFUT2 and its TSR substrate. Crystal structures of POFUT2 reveal a variation of the classical GT-B fold and identify sugar donor and TSR acceptor binding sites. Structural findings are correlated with steady-state kinetic measurements of wild-type and mutant POFUT2 and TSR and give insight into the catalytic mechanism and substrate specificity. By using an artificial mini-TSR substrate, we show that specificity is not primarily encoded in the TSR protein sequence but rather in the unusual 3D structure of a small part of the TSR. Our findings uncover that recognition of distinct conserved 3D fold motifs can be used as a mechanism to achieve substrate specificity by enzymes modifying completely folded proteins of very wide sequence diversity and biological function. PMID:22588082

  3. The mechanism of folding robustness revealed by the crystal structure of extra-superfolder GFP.

    PubMed

    Choi, Jae Young; Jang, Tae-Ho; Park, Hyun Ho

    2017-01-01

    Stability of green fluorescent protein (GFP) is sometimes important for a proper practical application of this protein. Random mutagenesis and targeted mutagenesis have been used to create better-folded variants of GFP, including recently reported extra-superfolder GFP. Our aim was to determine the crystal structure of extra-superfolder GFP, which is more robustly folded and stable than GFP and superfolder GFP. The structural and structure-based mutagenesis analyses revealed that some of the mutations that created extra-superfolder GFP (F46L, E126K, N149K, and S208L) contribute to folding robustness by stabilizing extra-superfolder GFP with various noncovalent bonds. © 2016 Federation of European Biochemical Societies.

  4. Use of designed sequences in protein structure recognition.

    PubMed

    Kumar, Gayatri; Mudgal, Richa; Srinivasan, Narayanaswamy; Sandhya, Sankaran

    2018-05-09

    Knowledge of the protein structure is a pre-requisite for improved understanding of molecular function. The gap in the sequence-structure space has increased in the post-genomic era. Grouping related protein sequences into families can aid in narrowing the gap. In the Pfam database, structure description is provided for part or full-length proteins of 7726 families. For the remaining 52% of the families, information on 3-D structure is not yet available. We use the computationally designed sequences that are intermediately related to two protein domain families, which are already known to share the same fold. These strategically designed sequences enable detection of distant relationships and here, we have employed them for the purpose of structure recognition of protein families of yet unknown structure. We first measured the success rate of our approach using a dataset of protein families of known fold and achieved a success rate of 88%. Next, for 1392 families of yet unknown structure, we made structural assignments for part/full length of the proteins. Fold association for 423 domains of unknown function (DUFs) are provided as a step towards functional annotation. The results indicate that knowledge-based filling of gaps in protein sequence space is a lucrative approach for structure recognition. Such sequences assist in traversal through protein sequence space and effectively function as 'linkers', where natural linkers between distant proteins are unavailable. This article was reviewed by Oliviero Carugo, Christine Orengo and Srikrishna Subramanian.

  5. Characterization of protein folding by a Φ-value calculation with a statistical-mechanical model.

    PubMed

    Wako, Hiroshi; Abe, Haruo

    2016-01-01

    The Φ-value analysis approach provides information about transition-state structures along the folding pathway of a protein by measuring the effects of an amino acid mutation on folding kinetics. Here we compared the theoretically calculated Φ values of 27 proteins with their experimentally observed Φ values; the theoretical values were calculated using a simple statistical-mechanical model of protein folding. The theoretically calculated Φ values reflected the corresponding experimentally observed Φ values with reasonable accuracy for many of the proteins, but not for all. The correlation between the theoretically calculated and experimentally observed Φ values strongly depends on whether the protein-folding mechanism assumed in the model holds true in real proteins. In other words, the correlation coefficient can be expected to illuminate the folding mechanisms of proteins, providing the answer to the question of which model more accurately describes protein folding: the framework model or the nucleation-condensation model. In addition, we tried to characterize protein folding with respect to various properties of each protein apart from the size and fold class, such as the free-energy profile, contact-order profile, and sensitivity to the parameters used in the Φ-value calculation. The results showed that any one of these properties alone was not enough to explain protein folding, although each one played a significant role in it. We have confirmed the importance of characterizing protein folding from various perspectives. Our findings have also highlighted that protein folding is highly variable and unique across different proteins, and this should be considered while pursuing a unified theory of protein folding.
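
    For orientation, an experimental Φ value is conventionally obtained as the ratio of the mutation-induced change in the folding activation free energy to the change in equilibrium stability. The sketch below assumes that conventional definition and is unrelated to the statistical-mechanical model used by the authors; the function and parameter names are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def phi_value(kf_wt, kf_mut, dg_unfold_wt, dg_unfold_mut, temperature=298.15):
    """Conventional experimental Phi value for a single mutation.

    kf_wt, kf_mut: folding rate constants of wild type and mutant (same units).
    dg_unfold_wt, dg_unfold_mut: unfolding free energies G_D - G_N in kJ/mol.
    """
    # Change in the barrier height (transition state relative to denatured state).
    ddg_ts = R * temperature * math.log(kf_wt / kf_mut)
    # Change in native-state stability caused by the mutation.
    ddg_eq = dg_unfold_wt - dg_unfold_mut
    if abs(ddg_eq) < 1e-9:
        raise ValueError("stability change too small for a meaningful Phi value")
    return ddg_ts / ddg_eq
```

    A value near 1 suggests the mutated site is as structured in the transition state as in the native state, and a value near 0 suggests it is as unstructured as in the denatured state.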

  6. Characterization of protein folding by a Φ-value calculation with a statistical-mechanical model

    PubMed Central

    Wako, Hiroshi; Abe, Haruo

    2016-01-01

    The Φ-value analysis approach provides information about transition-state structures along the folding pathway of a protein by measuring the effects of an amino acid mutation on folding kinetics. Here we compared the theoretically calculated Φ values of 27 proteins with their experimentally observed Φ values; the theoretical values were calculated using a simple statistical-mechanical model of protein folding. The theoretically calculated Φ values reflected the corresponding experimentally observed Φ values with reasonable accuracy for many of the proteins, but not for all. The correlation between the theoretically calculated and experimentally observed Φ values strongly depends on whether the protein-folding mechanism assumed in the model holds true in real proteins. In other words, the correlation coefficient can be expected to illuminate the folding mechanisms of proteins, providing the answer to the question of which model more accurately describes protein folding: the framework model or the nucleation-condensation model. In addition, we tried to characterize protein folding with respect to various properties of each protein apart from the size and fold class, such as the free-energy profile, contact-order profile, and sensitivity to the parameters used in the Φ-value calculation. The results showed that any one of these properties alone was not enough to explain protein folding, although each one played a significant role in it. We have confirmed the importance of characterizing protein folding from various perspectives. Our findings have also highlighted that protein folding is highly variable and unique across different proteins, and this should be considered while pursuing a unified theory of protein folding. PMID:28409079

  7. Large scale ab initio modeling of structurally uncharacterized antimicrobial peptides reveals known and novel folds.

    PubMed

    Kozic, Mara; Fox, Stephen J; Thomas, Jens M; Verma, Chandra S; Rigden, Daniel J

    2018-05-01

    Antimicrobial resistance within a wide range of infectious agents is a severe and growing public health threat. Antimicrobial peptides (AMPs) are among the leading alternatives to current antibiotics, exhibiting broad spectrum activity. Their activity is determined by numerous properties such as cationic charge, amphipathicity, size, and amino acid composition. Currently, only around 10% of known AMP sequences have experimentally solved structures. To improve our understanding of the AMP structural universe we have carried out large scale ab initio 3D modeling of structurally uncharacterized AMPs that revealed similarities between predicted folds of the modeled sequences and structures of characterized AMPs. Two of the peptides whose models matched known folds are Lebocin Peptide 1A (LP1A) and Odorranain M, predicted to form β-hairpins but, interestingly, to lack the intramolecular disulfide bonds, cation-π or aromatic interactions that generally stabilize such AMP structures. Other examples include Ponericin Q42, Latarcin 4a, Kassinatuerin 1, Ceratotoxin D, and CPF-B1 peptide, which have α-helical folds, as well as mixed αβ folds of human Histatin 2 peptide and Garvicin A which are, to the best of our knowledge, the first linear αββ fold AMPs lacking intramolecular disulfide bonds. In addition to fold matches to experimentally derived structures, unique folds were also obtained, namely for Microcin M and Ipomicin. These results help in understanding the range of protein scaffolds that naturally bear antimicrobial activity and may facilitate protein design efforts towards better AMPs. © 2018 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.

  8. [Artificial Cysteine Bridges on the Surface of Green Fluorescent Protein Affect Hydration of Its Transition and Intermediate States].

    PubMed

    Melnik, T N; Nagibina, G S; Surin, A K; Glukhova, K A; Melnik, B S

    2018-01-01

    Studying the effect of cysteine bridges on different energy levels of multistage folding proteins will enable a better understanding of the process of folding and functioning of globular proteins. In particular, it will create prospects for directed change in the stability and rate of protein folding. In this work, using the method of differential scanning microcalorimetry, we have studied the effect of three cysteine bridges introduced in different structural elements of the green fluorescent protein on the denaturation enthalpies, activation energies, and heat-capacity increments when this protein passes from native to intermediate and transition states. The studies have allowed us to confirm that, with this protein denaturation, the process hardly damages the structure initially, but then changes occur in the protein structure in the region of 4-6 beta sheets. The cysteine bridge introduced in this region decreases the hydration of the second transition state and increases the hydration of the second intermediate state during the thermal denaturation of the green fluorescent protein.

  9. Protein collapse is encoded in the folded state architecture.

    PubMed

    Samanta, Himadri S; Zhuravlev, Pavel I; Hinczewski, Michael; Hori, Naoto; Chakrabarti, Shaon; Thirumalai, D

    2017-05-21

    Folded states of single domain globular proteins are compact with high packing density. The radius of gyration, R_g, of both the folded and unfolded states increases as N^ν, where N is the number of amino acids in the protein. The values of the Flory exponent ν are, respectively, ≈⅓ and ≈0.6 in the folded and unfolded states, coinciding with those for homopolymers. However, the extent of compaction of the unfolded state of a protein under low denaturant concentration (collapsibility), conditions favoring the formation of the folded state, is unknown. We develop a theory that uses the contact map of proteins as input to quantitatively assess collapsibility of proteins. Although collapsibility is universal, the propensity to be compact depends on the protein architecture. Application of the theory to over two thousand proteins shows that collapsibility depends not only on N but also on the contact map reflecting the native structure. A major prediction of the theory is that β-sheet proteins are far more collapsible than structures dominated by α-helices. The theory and the accompanying simulations, validating the theoretical predictions, provide insights into the differing conclusions reached using different experimental probes assessing the extent of compaction of proteins. By calculating the criterion for collapsibility as a function of protein length we provide quantitative insights into the reasons why single domain proteins are small and the physical reasons for the origin of multi-domain proteins. Collapsibility of non-coding RNA molecules is similar to that of β-sheet protein structures, adding support to the "Compactness Selection Hypothesis".
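
    The scaling law quoted at the start of this abstract, with the radius of gyration growing as a power ν of the chain length N, can be checked on any set of (N, R_g) pairs by a straight-line fit in log-log space. The sketch below shows only that generic fit; it is not the collapsibility theory developed by the authors, and the function name is hypothetical.

```python
import numpy as np


def fit_flory_exponent(n_residues, rg):
    """Fit R_g = a * N**nu by linear regression of log(R_g) on log(N).

    n_residues: chain lengths; rg: matching radii of gyration (any length unit).
    Returns (nu, a).
    """
    log_n = np.log(np.asarray(n_residues, dtype=float))
    log_rg = np.log(np.asarray(rg, dtype=float))
    nu, log_a = np.polyfit(log_n, log_rg, 1)   # slope is the exponent
    return nu, np.exp(log_a)
```

    Applied to folded-state data, the fitted exponent would be expected near 1/3, and near 0.6 for strongly denatured chains, as stated in the abstract.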

  10. Universal partitioning of the hierarchical fold network of 50-residue segments in proteins

    PubMed Central

    Ito, Jun-ichi; Sonobe, Yuki; Ikeda, Kazuyoshi; Tomii, Kentaro; Higo, Junichi

    2009-01-01

    Background: Several studies have demonstrated that protein fold space is structured hierarchically and that power-law statistics are satisfied in relation between the numbers of protein families and protein folds (or superfamilies). We examined the internal structure and statistics in the fold space of 50 amino-acid residue segments taken from various protein folds. We used inter-residue contact patterns to measure the tertiary structural similarity among segments. Using this similarity measure, the segments were classified into a number (Kc) of clusters. We examined various Kc values for the clustering. The special resolution to differentiate the segment tertiary structures increases with increasing Kc. Furthermore, we constructed networks by linking structurally similar clusters. Results: The network was partitioned persistently into four regions for Kc ≥ 1000. This main partitioning is consistent with results of earlier studies, where similar partitioning was reported in classifying protein domain structures. Furthermore, the network was partitioned naturally into several dozens of sub-networks (i.e., communities). Therefore, intra-sub-network clusters were mutually connected with numerous links, whereas inter-sub-network clusters were connected by only a few links. For Kc ≥ 1000, there were about 40 major sub-networks; the contents of the major sub-networks were conserved. This sub-partitioning is a novel finding, suggesting that the network is structured hierarchically: Segments construct a cluster, clusters form a sub-network, and sub-networks constitute a region. Additionally, the network was characterized by non-power-law statistics, which is also a novel finding. Conclusion: The main findings are: (1) The universe of 50-residue segments found here was characterized by non-power-law statistics. Therefore, the universe differs from those previously reported for the protein domains. (2) The 50-residue segments were partitioned persistently and universally into some dozens (ca. 40) of major sub-networks, irrespective of the number of clusters. (3) These major sub-networks encompassed 90% of all segments. Consequently, the protein tertiary structure is constructed using the dozens of elements (sub-networks). PMID:19454039

  11. WeFold: A Coopetition for Protein Structure Prediction

    PubMed Central

    Khoury, George A.; Liwo, Adam; Khatib, Firas; Zhou, Hongyi; Chopra, Gaurav; Bacardit, Jaume; Bortot, Leandro O.; Faccioli, Rodrigo A.; Deng, Xin; He, Yi; Krupa, Pawel; Li, Jilong; Mozolewska, Magdalena A.; Sieradzan, Adam K.; Smadbeck, James; Wirecki, Tomasz; Cooper, Seth; Flatten, Jeff; Xu, Kefan; Baker, David; Cheng, Jianlin; Delbem, Alexandre C. B.; Floudas, Christodoulos A.; Keasar, Chen; Levitt, Michael; Popović, Zoran; Scheraga, Harold A.; Skolnick, Jeffrey; Crivelli, Silvia N.; Players, Foldit

    2014-01-01

    The protein structure prediction problem continues to elude scientists. Despite the introduction of many methods, only modest gains were made over the last decade for certain classes of prediction targets. To address this challenge, a social-media based worldwide collaborative effort, named WeFold, was undertaken by thirteen labs. During the collaboration, the labs were simultaneously competing with each other. Here, we present the first attempt at “coopetition” in scientific research applied to the protein structure prediction and refinement problems. The coopetition was possible by allowing the participating labs to contribute different components of their protein structure prediction pipelines and create new hybrid pipelines that they tested during CASP10. This manuscript describes both successes and areas needing improvement as identified throughout the first WeFold experiment and discusses the efforts that are underway to advance this initiative. A footprint of all contributions and structures are publicly accessible at http://www.wefold.org. PMID:24677212

  12. My 65 years in protein chemistry.

    PubMed

    Scheraga, Harold A

    2015-05-01

    This is a tour of a physical chemist through 65 years of protein chemistry from the time when emphasis was placed on the determination of the size and shape of the protein molecule as a colloidal particle, with an early breakthrough by James Sumner, followed by Linus Pauling and Fred Sanger, that a protein was a real molecule, albeit a macromolecule. It deals with the recognition of the nature and importance of hydrogen bonds and hydrophobic interactions in determining the structure, properties, and biological function of proteins until the present acquisition of an understanding of the structure, thermodynamics, and folding pathways from a linear array of amino acids to a biological entity. Along the way, with a combination of experiment and theoretical interpretation, a mechanism was elucidated for the thrombin-induced conversion of fibrinogen to a fibrin blood clot and for the oxidative-folding pathways of ribonuclease A. Before the atomic structure of a protein molecule was determined by x-ray diffraction or nuclear magnetic resonance spectroscopy, experimental studies of the fundamental interactions underlying protein structure led to several distance constraints which motivated the theoretical approach to determine protein structure, and culminated in the Empirical Conformational Energy Program for Peptides (ECEPP), an all-atom force field, with which the structures of fibrous collagen-like proteins and the 46-residue globular staphylococcal protein A were determined. To undertake the study of larger globular proteins, a physics-based coarse-grained UNited-RESidue (UNRES) force field was developed, and applied to the protein-folding problem in terms of structure, thermodynamics, dynamics, and folding pathways. Initially, single-chain and, ultimately, multiple-chain proteins were examined, and the methodology was extended to protein-protein interactions and to nucleic acids and to protein-nucleic acid interactions. 
    The ultimate results led to an understanding of a variety of biological processes underlying natural and disease phenomena.

  13. Folding pathway of a multidomain protein depends on its topology of domain connectivity

    PubMed Central

    Inanami, Takashi; Terada, Tomoki P.; Sasai, Masaki

    2014-01-01

    How do the folding mechanisms of multidomain proteins depend on protein topology? We addressed this question by developing an Ising-like structure-based model and applying it for the analysis of free-energy landscapes and folding kinetics of an example protein, Escherichia coli dihydrofolate reductase (DHFR). DHFR has two domains, one comprising discontinuous N- and C-terminal parts and the other comprising a continuous middle part of the chain. The simulated folding pathway of DHFR is a sequential process during which the continuous domain folds first, followed by the discontinuous domain, thereby avoiding the rapid decrease in conformation entropy caused by the association of the N- and C-terminal parts during the early phase of folding. Our simulated results consistently explain the observed experimental data on folding kinetics and predict an off-pathway structural fluctuation at equilibrium. For a circular permutant for which the topological complexity of wild-type DHFR is resolved, the balance between energy and entropy is modulated, resulting in the coexistence of the two folding pathways. This coexistence of pathways should account for the experimentally observed complex folding behavior of the circular permutant. PMID:25267632

  14. Only Five of 10 Strictly Conserved Disulfide Bonds Are Essential for Folding and Eight for Function of the HIV-1 Envelope Glycoprotein

    PubMed Central

    van Anken, Eelco; Sanders, Rogier W.; Liscaljet, I. Marije; Land, Aafke; Bontjer, Ilja; Tillemans, Sonja; Nabatov, Alexey A.; Paxton, William A.; Berkhout, Ben

    2008-01-01

    Protein folding in the endoplasmic reticulum goes hand in hand with disulfide bond formation, and disulfide bonds are considered key structural elements for a protein's folding and function. We used the HIV-1 Envelope glycoprotein to examine in detail the importance of its 10 completely conserved disulfide bonds. We systematically mutated the cysteines in its ectodomain, assayed the mutants for oxidative folding, transport, and incorporation into the virus, and tested fitness of mutant viruses. We found that the protein was remarkably tolerant toward manipulation of its disulfide-bonded structure. Five of 10 disulfide bonds were dispensable for folding. Two of these were even expendable for viral replication in cell culture, indicating that the relevance of these disulfide bonds becomes manifest only during natural infection. Our findings refine old paradigms on the importance of disulfide bonds for proteins. PMID:18653472

  15. Influence of the native topology on the folding barrier for small proteins

    NASA Astrophysics Data System (ADS)

    Prieto, Lidia; Rey, Antonio

    2007-11-01

    The possibility of downhill instead of two-state folding for proteins has been a very controversial topic which arose from recent experimental studies. From the theoretical side, this question has also been addressed in different ways. Given the experimental observation that a relationship exists between the native structure topology of a protein and the kinetic and thermodynamic properties of its folding process, Gō-type potentials are an appropriate way to approach this problem. In this work, we employ an interaction potential from this family to get a better insight into the topological characteristics of the native state that may somehow determine the presence of a thermodynamic barrier in the folding pathway. The results presented here show that, indeed, the native topology of a small protein has a great influence on its folding behavior, mostly depending on the proportion of local and long range contacts the protein has in its native structure. Furthermore, when all the interactions present contribute in a balanced way, the transition turns out to be cooperative. Otherwise, the tendency toward downhill folding behavior increases.
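
    The "proportion of local and long range contacts" referred to above can be computed from a native contact list once a sequence-separation threshold is chosen. The sketch below is illustrative only: the abstract does not give the threshold or the contact definition, so the default of 12 residues and the function name are assumptions.

```python
def long_range_fraction(contacts, min_separation=12):
    """Fraction of native contacts whose sequence separation is long range.

    contacts: iterable of (i, j) residue-index pairs in contact in the native state.
    min_separation: |i - j| at or above which a contact counts as long range
                    (assumed threshold, not taken from the abstract).
    """
    contacts = list(contacts)
    if not contacts:
        raise ValueError("contact list is empty")
    n_long = sum(1 for i, j in contacts if abs(i - j) >= min_separation)
    return n_long / len(contacts)
```

    For example, a contact list with two local and two long-range pairs gives a fraction of 0.5; the complementary value, one minus this fraction, is the proportion of local contacts.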

  16. Structural Bridges through Fold Space.

    PubMed

    Edwards, Hannah; Deane, Charlotte M

    2015-09-01

    Several protein structure classification schemes exist that partition the protein universe into structural units called folds. Yet these schemes do not discuss how these units sit relative to each other in a global structure space. In this paper we construct networks that describe such global relationships between folds in the form of structural bridges. We generate these networks using four different structural alignment methods across multiple score thresholds. The networks constructed using the different methods remain a similar distance apart regardless of the probability threshold defining a structural bridge. This suggests that at least some structural bridges are method specific and that any attempt to build a picture of structural space should not be reliant on a single structural superposition method. Despite these differences all representations agree on an organisation of fold space into five principal community structures: all-α, all-β sandwiches, all-β barrels, α/β and α + β. We project estimated fold ages onto the networks and find that not only are the pairings of unconnected folds associated with higher age differences than bridged folds, but this difference increases with the number of networks displaying an edge. We also examine different centrality measures for folds within the networks and how these relate to fold age. While these measures interpret the central core of fold space in varied ways they all identify the disposition of ancestral folds to fall within this core and that of the more recently evolved structures to provide the peripheral landscape. These findings suggest that evolutionary information is encoded along these structural bridges. Finally, we identify four highly central pivotal folds representing dominant topological features which act as key attractors within our landscapes.

  17. Structure-based barcoding of proteins.

    PubMed

    Metri, Rahul; Jerath, Gaurav; Kailas, Govind; Gacche, Nitin; Pal, Adityabarna; Ramakrishnan, Vibin

    2014-01-01

    A reduced representation in the format of a barcode has been developed to provide an overview of the topological nature of a given protein structure from a 3D coordinate file. The molecular structure of a protein coordinate file from the Protein Data Bank is first expressed in terms of an alpha-numero code and further converted to a barcode image. The barcode representation can be used to compare and contrast different proteins based on their structure. The utility of this method has been exemplified by comparing structural barcodes of proteins that belong to the same fold family, and across different folds. In addition to this, we have attempted to provide an illustration of (i) the structural changes often seen in a given protein molecule upon interaction with ligands and (ii) modifications in the overall topology of a given protein during evolution. The program is fully downloadable from the website http://www.iitg.ac.in/probar/. © 2013 The Protein Society.

  18. Detecting Selection on Protein Stability through Statistical Mechanical Models of Folding and Evolution

    PubMed Central

    Bastolla, Ugo

    2014-01-01

    The properties of biomolecules depend both on physics and on the evolutionary process that formed them. These two points of view produce a powerful synergism. Physics sets the stage and the constraints that molecular evolution has to obey, and evolutionary theory helps in rationalizing the physical properties of biomolecules, including protein folding thermodynamics. To complete the parallelism, protein thermodynamics is founded on the statistical mechanics in the space of protein structures, and molecular evolution can be viewed as statistical mechanics in the space of protein sequences. In this review, we will integrate both points of view, applying them to detecting selection on the stability of the folded state of proteins. We will start by discussing positive design, which strengthens the stability of the folded against the unfolded state of proteins. Positive design justifies why statistical potentials for protein folding can be obtained from the frequencies of structural motifs. Stability against unfolding is easier to achieve for longer proteins. On the contrary, negative design, which consists in destabilizing frequently formed misfolded conformations, is more difficult to achieve for longer proteins. The folding rate can be enhanced by strengthening short-range native interactions, but this requirement contrasts with negative design, and evolution has to trade off between them. Finally, selection can accelerate functional movements by favoring low frequency normal modes of the dynamics of the native state that strongly correlate with the functional conformation change. PMID:24970217

  19. An Evolutionarily Structured Universe of Protein Architecture

    PubMed Central

    Caetano-Anollés, Gustavo; Caetano-Anollés, Derek

    2003-01-01

    Protein structural diversity encompasses a finite set of architectural designs. Embedded in these topologies are evolutionary histories that we here uncover using cladistic principles and measurements of protein-fold usage and sharing. The reconstructed phylogenies are inherently rooted and depict histories of protein and proteome diversification. Proteome phylogenies showed two monophyletic sister-groups delimiting Bacteria and Archaea, and a topology rooted in Eucarya. This suggests three dramatic evolutionary events and a common ancestor with a eukaryotic-like, gene-rich, and relatively modern organization. Conversely, a general phylogeny of protein architectures showed that structural classes of globular proteins appeared early in evolution and in defined order, the α/β class being the first. Although most ancestral folds shared a common architecture of barrels or interleaved β-sheets and α-helices, many were clearly derived, such as polyhedral folds in the all-α class and β-sandwiches, β-propellers, and β-prisms in all-β proteins. We also describe transformation pathways of architectures that are prevalently used in nature. For example, β-barrels with increased curl and stagger were favored evolutionary outcomes in the all-β class. Interestingly, we found cases where structural change followed the α-to-β tendency uncovered in the tree of architectures. Lastly, we traced the total number of enzymatic functions associated with folds in the trees and show that there is a general link between structure and enzymatic function. PMID:12840035

  20. Dynamical Coupling of Intrinsically Disordered Proteins and Their Hydration Water: Comparison with Folded Soluble and Membrane Proteins

    PubMed Central

    Gallat, F.-X.; Laganowsky, A.; Wood, K.; Gabel, F.; van Eijck, L.; Wuttke, J.; Moulin, M.; Härtlein, M.; Eisenberg, D.; Colletier, J.-P.; Zaccai, G.; Weik, M.

    2012-01-01

    Hydration water is vital for various macromolecular biological activities, such as specific ligand recognition, enzyme activity, response to receptor binding, and energy transduction. Without hydration water, proteins would not fold correctly and would lack the conformational flexibility that animates their three-dimensional structures. Motions in globular, soluble proteins are thought to be governed to a certain extent by hydration-water dynamics, yet it is not known whether this relationship holds true for other protein classes in general and whether, in turn, the structural nature of a protein also influences water motions. Here, we provide insight into the coupling between hydration-water dynamics and atomic motions in intrinsically disordered proteins (IDP), a largely unexplored class of proteins that, in contrast to folded proteins, lack a well-defined three-dimensional structure. We investigated the human IDP tau, which is involved in the pathogenic processes accompanying Alzheimer disease. Combining neutron scattering and protein perdeuteration, we found similar atomic mean-square displacements over a large temperature range for the tau protein and its hydration water, indicating intimate coupling between them. This is in contrast to the behavior of folded proteins of similar molecular weight, such as the globular, soluble maltose-binding protein and the membrane protein bacteriorhodopsin, which display moderate to weak coupling, respectively. The extracted mean square displacements also reveal a greater motional flexibility of IDP compared with globular, folded proteins and more restricted water motions on the IDP surface. The results provide evidence that protein and hydration-water motions mutually affect and shape each other, and that there is a gradient of coupling across different protein classes that may play a functional role in macromolecular activity in a cellular context. PMID:22828339

  1. Protein backbone engineering as a strategy to advance foldamers toward the frontier of protein-like tertiary structure.

    PubMed

    Reinert, Zachary E; Horne, W Seth

    2014-11-28

    A variety of non-biological structural motifs have been incorporated into the backbone of natural protein sequences. In parallel work, diverse unnatural oligomers of de novo design (termed "foldamers") have been developed that fold in defined ways. In this Perspective article, we survey foundational studies on protein backbone engineering, with a focus on alterations made in the context of complex tertiary folds. We go on to summarize recent work illustrating the potential promise of these methods to provide a general framework for the construction of foldamer mimics of protein tertiary structures.

  2. Analysis of periplasmic sensor domains from Anaeromyxobacter dehalogenans 2CP-C: Structure of one sensor domain from a histidine kinase and another from a chemotaxis protein

    PubMed Central

    Pokkuluri, P Raj; Dwulit-Smith, Jeff; Duke, Norma E; Wilton, Rosemarie; Mack, Jamey C; Bearden, Jessica; Rakowski, Ella; Babnigg, Gyorgy; Szurmant, Hendrik; Joachimiak, Andrzej; Schiffer, Marianne

    2013-01-01

    Anaeromyxobacter dehalogenans is a δ-proteobacterium found in diverse soils and sediments. It is of interest in bioremediation efforts due to its dechlorination and metal-reducing capabilities. To gain an understanding on A. dehalogenans' abilities to adapt to diverse environments we analyzed its signal transduction proteins. The A. dehalogenans genome codes for a large number of sensor histidine kinases (HK) and methyl-accepting chemotaxis proteins (MCP); among these 23 HK and 11 MCP proteins have a sensor domain in the periplasm. These proteins most likely contribute to adaptation to the organism's surroundings. We predicted their three-dimensional folds and determined the structures of two of the periplasmic sensor domains by X-ray diffraction. Most of the domains are predicted to have either PAS-like or helical bundle structures, with two predicted to have solute-binding protein fold, and another predicted to have a 6-phosphogluconolactonase like fold. Atomic structures of two sensor domains confirmed the respective fold predictions. The Adeh_2942 sensor (HK) was found to have a helical bundle structure, and the Adeh_3718 sensor (MCP) has a PAS-like structure. Interestingly, the Adeh_3718 sensor has an acetate moiety bound in a binding site typical for PAS-like domains. Future work is needed to determine whether Adeh_3718 is involved in acetate sensing by A. dehalogenans. PMID:23897711

  3. Stabilities and Dynamics of Protein Folding Nuclei by Molecular Dynamics Simulation

    NASA Astrophysics Data System (ADS)

    Song, Yong-Shun; Zhou, Xin; Zheng, Wei-Mou; Wang, Yan-Ting

    2017-07-01

    To understand how the stabilities of key nuclei fragments affect protein folding dynamics, we use molecular dynamics (MD) simulation in aqueous solution to simulate four fragments cut out of protein G, including one α-helix (seqB: KVFKQYAN), two β-turns (seqA: LNGKTLKG and seqC: YDDATKTF), and one β-strand (seqD: DGEWTYDD). The Markov State Model clustering method combined with the coarse-grained conformation letters method is employed to analyze the data sampled from 2-μs equilibrium MD simulation trajectories. We find that seqA and seqB have more stable structures than their native structures, which become metastable when cut out of the protein structure. As expected, seqD alone is flexible and does not have a stable structure. Throughout our simulations, the native structure of seqC is stable but cannot be reached if starting from a structure other than the native one, implying a funnel-shaped free energy landscape of seqC in aqueous solution. All the above results suggest that different nuclei have different formation dynamics during protein folding, which may contribute substantially to the hierarchy of protein folding dynamics. Supported by the National Basic Research Program of China under Grant No. 2013CB932804, the National Natural Science Foundation of China under Grant No. 11421063, and the CAS Biophysics Interdisciplinary Innovation Team Project.
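
    As a rough illustration of the Markov-state idea mentioned in this abstract, the sketch below estimates a transition matrix by counting jumps between discrete states along a trajectory. The trajectory string, the state letters and the function name are hypothetical; the abstract does not describe the authors' actual clustering or estimation procedure.

```python
from collections import Counter, defaultdict

def transition_matrix(states, lag=1):
    """Row-normalised transition probabilities estimated by simple counting.

    `states` is any sequence of hashable state labels (here: one letter per
    saved frame); `lag` is the lag time measured in frames.
    """
    counts = defaultdict(Counter)
    for a, b in zip(states, states[lag:]):
        counts[a][b] += 1
    matrix = {}
    for a, row in counts.items():
        total = sum(row.values())
        matrix[a] = {b: n / total for b, n in row.items()}
    return matrix

# Hypothetical trajectory: one coarse-grained conformation letter per frame.
trajectory = "AAAABBBAAACCCCCCAAAB"
T = transition_matrix(trajectory, lag=1)
for state in sorted(T):
    print(state, T[state])
```

    A large diagonal entry (here the self-transition probability of state C) indicates a state that tends to persist from one frame to the next, which is one simple way to read metastability off such a matrix.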

  4. Elucidating Peptide and Protein Structure and Dynamics: UV Resonance Raman Spectroscopy

    PubMed Central

    Oladepo, Sulayman A.; Xiong, Kan; Hong, Zhenmin; Asher, Sanford A.

    2011-01-01

    UV resonance Raman spectroscopy (UVRR) is a powerful method that has the requisite selectivity and sensitivity to incisively monitor biomolecular structure and dynamics in solution. In this perspective, we highlight applications of UVRR for studying peptide and protein structure and the dynamics of protein and peptide folding. UVRR spectral monitors of protein secondary structure, such as the Amide III₃ band and the Cα-H band frequencies and intensities, can be used to determine Ramachandran Ψ angle distributions for peptide bonds. These incisive, quantitative glimpses into conformation can be combined with kinetic T-jump methodologies to monitor the dynamics of biomolecular conformational transitions. The resulting UVRR structural insight is impressive in that it can distinguish, for example, different α-helix-like states, separating π- and 3₁₀-helix states from pure α-helices. These approaches can be used to determine the Gibbs free energy landscape of individual peptide bonds along the most important protein (un)folding coordinate. Future work will find spectral monitors that probe peptide bond activation barriers that control protein (un)folding mechanisms. In addition, UVRR studies of sidechain vibrations will probe the role of side chains in determining protein secondary, tertiary and quaternary structures. PMID:21379371

  5. Investigating homology between proteins using energetic profiles.

    PubMed

    Wrabl, James O; Hilser, Vincent J

    2010-03-26

    Accumulated experimental observations demonstrate that protein stability is often preserved upon conservative point mutation. In contrast, less is known about the effects of large sequence or structure changes on the stability of a particular fold. Almost completely unknown is the degree to which stability of different regions of a protein is generally preserved throughout evolution. In this work, these questions are addressed through thermodynamic analysis of a large representative sample of protein fold space based on remote, yet accepted, homology. More than 3,000 proteins were computationally analyzed using the structural-thermodynamic algorithm COREX/BEST. Estimated position-specific stability (i.e., local Gibbs free energy of folding) and its component enthalpy and entropy were quantitatively compared between all proteins in the sample according to all-vs.-all pairwise structural alignment. It was discovered that the local stabilities of homologous pairs were significantly more correlated than those of non-homologous pairs, indicating that local stability was indeed generally conserved throughout evolution. However, the position-specific enthalpy and entropy underlying stability were less correlated, suggesting that the overall regional stability of a protein was more important than the thermodynamic mechanism utilized to achieve that stability. Finally, two different types of statistically exceptional evolutionary structure-thermodynamic relationships were noted. First, many homologous proteins contained regions of similar thermodynamics despite localized structure change, suggesting a thermodynamic mechanism enabling evolutionary fold change. Second, some homologous proteins with extremely similar structures nonetheless exhibited different local stabilities, a phenomenon previously observed experimentally in this laboratory. 
    These two observations, in conjunction with the principal conclusion that homologous proteins generally conserved local stability, may provide guidance for a future thermodynamically informed classification of protein homology.

  6. Picosecond to nanosecond dynamics provide a source of conformational entropy for protein folding.

    PubMed

    Stadler, Andreas M; Demmel, Franz; Ollivier, Jacques; Seydel, Tilo

    2016-08-03

    Myoglobin can be trapped in fully folded structures, partially folded molten globules, and unfolded states under stable equilibrium conditions. Here, we report an experimental study on the conformational dynamics of different folded conformational states of apo- and holomyoglobin in solution. Global protein diffusion and internal molecular motions were probed by neutron time-of-flight and neutron backscattering spectroscopy on the picosecond and nanosecond time scales. Global protein diffusion was found to depend on the α-helical content of the protein suggesting that charges on the macromolecule increase the short-time diffusion of protein. With regard to the molten globules, a gel-like phase due to protein entanglement and interactions with neighbouring macromolecules was visible due to a reduction of the global diffusion coefficients on the nanosecond time scale. Diffusion coefficients, residence and relaxation times of internal protein dynamics and root mean square displacements of localised internal motions were determined for the investigated structural states. The difference in conformational entropy ΔSconf of the protein between the unfolded and the partially or fully folded conformations was extracted from the measured root mean square displacements. Using thermodynamic parameters from the literature and the experimentally determined ΔSconf values we could identify the entropic contribution of the hydration shell ΔShydr of the different folded states. Our results point out the relevance of conformational entropy of the protein and the hydration shell for stability and folding of myoglobin.

  7. Effective Potentials for Folding Proteins

    NASA Astrophysics Data System (ADS)

    Chen, Nan-Yow; Su, Zheng-Yao; Mou, Chung-Yu

    2006-02-01

    A coarse-grained off-lattice model that is not biased in any way to the native state is proposed to fold proteins. To predict the native structure in a reasonable time, the model has included the essential effects of water in an effective potential. Two new ingredients, the dipole-dipole interaction and the local hydrophobic interaction, are introduced and are shown to be as crucial as the hydrogen bonding. The model allows successful folding of the wild-type sequence of protein G and may have provided important hints to the study of protein folding.

  8. Protein folding: understanding the role of water and the low Reynolds number environment as the peptide chain emerges from the ribosome and folds.

    PubMed

    Sen, Siddhartha; Voorheis, H Paul

    2014-12-21

    The mechanism of protein folding during early stages of the process has three determinants. First, moving water molecules obey the rules of low Reynolds number physics without an inertial component. Molecular movement is instantaneous and size insensitive. Proteins emerging from the ribosome move and rotate without an external force if they change shape, forming and propagating helical structures that increase translocational efficiency. Forward motion ceases when the shape change or propelling force ceases. Second, application of quantum field theory to water structure predicts the spontaneous formation of low density coherent units of fixed size that expel dissolved atmospheric gases. Structured water layers with both coherent and non-coherent domains form a sheath around the new protein. The surface of exposed hydrophobic amino acids is protected from water contact by small nanobubbles of dissolved atmospheric gases, 5 or 6 molecules on average, that vibrate, attracting even widely separated resonating nanobubbles. This force results from quantum effects, appearing only when the system is within and interacts with an oscillating electromagnetic field. The newly recognized quantum force sharply bends the peptide and is part of a dynamic field determining the pathway of protein folding. Third, the force initiating the tertiary folding of proteins arises from twists at the position of each hydrophobic amino acid that minimize surface exposure of the hydrophobic amino acids and propagate along the protein. When the total bend reaches 360°, the leading segment of water sheath intersects the trailing segment. This steric self-intersection expels water from overlapping segments of the sheath and by Newton's second law moves the polypeptide chain in an opposite direction.
    Consequently, with very few exceptions that we enumerate and discuss, tertiary structures are absent from proteins without hydrophobic amino acids, which control the early stages of protein folding and the overall shape of protein. Consequently, proteins only adopt a limited number of forms. The formation of quaternary structures is not necessarily prevented by the absence of hydrophobic amino acids. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. A similarity measure for partially folded proteins: application to unfolded and native-like conformational fluctuations

    NASA Astrophysics Data System (ADS)

    Larios, Edgar; Yang, Wei Y.; Schulten, K.; Gruebele, M.

    2004-12-01

    Computing the root-mean-square deviation (RMSD) of a partially folded protein structure from the folded state requires the two structures to be translationally and rotationally aligned. We examine the constraint matrix L that preserves orthogonality of the rotation matrix during minimization of the RMSD. L is proportional to the sensitivity of the RMSD to the rotational alignment matrix. Its trace yields an isotropic reaction coordinate, while its off-diagonal matrix elements are related to the moment of inertia derivative tensor that encodes anisotropic information about the structure. We use L to compare λ-repressor fragment 6-85 (λ 6-85) to several partially folded structures obtained from molecular dynamics simulation (MD), and find that L as a reaction coordinate indeed encodes some information about protein topology. We also apply Cα RMSD, L and tryptophan sidechain mobility as criteria for native state structural fluctuations of several λ 6-85 mutants. The mutants' denaturation curves and fluorescence quenching are measured experimentally for comparison. The results are in accord with a recent proposal that structural fluctuations near the chromophore can induce increased native state fluorescence or hyperfluorescence during unfolding of proteins.
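
    The alignment step described in this abstract can be sketched as follows. This is a minimal pure-Python illustration of the minimal RMSD over translations and rotations, using Horn's quaternion formulation with a shifted power iteration; it is an assumed stand-in and not the authors' procedure, and it does not compute their constraint matrix L. All function names and the test coordinates are hypothetical.

```python
import math

def centered(points):
    """Translate a list of (x, y, z) points so that their centroid is the origin."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [[p[k] - c[k] for k in range(3)] for p in points]

def min_rmsd(a, b, iterations=2000):
    """Minimal RMSD between two equally ordered point sets over all rigid motions."""
    a, b = centered(a), centered(b)
    n = len(a)
    # Cross-covariance matrix: s[j][k] = sum_i a_i[j] * b_i[k]
    s = [[sum(p[j] * q[k] for p, q in zip(a, b)) for k in range(3)] for j in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Horn's symmetric 4x4 matrix; its largest eigenvalue equals the maximal
    # overlap sum_i (R a_i) . b_i over all proper rotations R.
    m = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Gershgorin shift makes the iterated matrix positive semidefinite, so the
    # power iteration converges towards the largest eigenvalue of m.
    shift = max(sum(abs(x) for x in row) for row in m)
    v = [0.8, 0.4, 0.4, 0.2]  # unit-length start; a pathological input could be orthogonal to it
    for _ in range(iterations):
        w = [sum(m[r][c] * v[c] for c in range(4)) + shift * v[r] for r in range(4)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient of the (unshifted) matrix with the converged unit vector.
    lam = sum(v[r] * sum(m[r][c] * v[c] for c in range(4)) for r in range(4))
    g = sum(x * x for p in a for x in p) + sum(x * x for q in b for x in q)
    return math.sqrt(max((g - 2.0 * lam) / n, 0.0))

# Hypothetical coordinates: b is a rotated (90 degrees about z) and shifted copy of a.
a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 1.0), (0.5, 0.2, 2.0)]
b = [(-y + 3.0, x - 1.0, z + 2.0) for x, y, z in a]
print(min_rmsd(a, b))  # close to zero, since b is a rigid-body copy of a
```

    The power iteration is chosen only to keep the sketch dependency-free; with a linear-algebra library, a singular value decomposition (the Kabsch approach) would be the more usual route.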

  10. My 65 years in protein chemistry

    PubMed Central

    Scheraga, Harold A.

    2015-01-01

    This is a tour of a physical chemist through 65 years of protein chemistry from the time when emphasis was placed on the determination of the size and shape of the protein molecule as a colloidal particle, with an early breakthrough by James Sumner, followed by Linus Pauling and Fred Sanger, that a protein was a real molecule, albeit a macromolecule. It deals with the recognition of the nature and importance of hydrogen bonds and hydrophobic interactions in determining the structure, properties, and biological function of proteins until the present acquisition of an understanding of the structure, thermodynamics, and folding pathways from a linear array of amino acids to a biological entity. Along the way, with a combination of experiment and theoretical interpretation, a mechanism was elucidated for the thrombin-induced conversion of fibrinogen to a fibrin blood clot and for the oxidative-folding pathways of ribonuclease A. Before the atomic structure of a protein molecule was determined by x-ray diffraction or nuclear magnetic resonance spectroscopy, experimental studies of the fundamental interactions underlying protein structure led to several distance constraints which motivated the theoretical approach to determine protein structure, and culminated in the Empirical Conformational Energy Program for Peptides (ECEPP), an all-atom force field, with which the structures of fibrous collagen-like proteins and the 46-residue globular staphylococcal protein A were determined. To undertake the study of larger globular proteins, a physics-based coarse-grained UNited-RESidue (UNRES) force field was developed, and applied to the protein-folding problem in terms of structure, thermodynamics, dynamics, and folding pathways. Initially, single-chain and, ultimately, multiple-chain proteins were examined, and the methodology was extended to protein–protein interactions and to nucleic acids and to protein–nucleic acid interactions. 
    The ultimate results led to an understanding of a variety of biological processes underlying natural and disease phenomena. PMID:25850343

  11. Hidden Structural Codes in Protein Intrinsic Disorder.

    PubMed

    Borkosky, Silvia S; Camporeale, Gabriela; Chemes, Lucía B; Risso, Marikena; Noval, María Gabriela; Sánchez, Ignacio E; Alonso, Leonardo G; de Prat Gay, Gonzalo

    2017-10-17

    Intrinsic disorder is a major structural category in biology, accounting for more than 30% of coding regions across the domains of life, yet consists of conformational ensembles in equilibrium, a major challenge in protein chemistry. Anciently evolved papillomavirus genomes constitute an unparalleled case for sequence to structure-function correlation in cases in which there are no folded structures. E7, the major transforming oncoprotein of human papillomaviruses, is a paradigmatic example among the intrinsically disordered proteins. Analysis of a large number of sequences of the same viral protein allowed for the identification of a handful of residues with absolute conservation, scattered along the sequence of its N-terminal intrinsically disordered domain, which intriguingly are mostly leucine residues. Mutation of these led to a pronounced increase in both α-helix and β-sheet structural content, reflected by drastic effects on equilibrium propensities and oligomerization kinetics, and uncovered the existence of local structural elements that oppose canonical folding. These folding relays suggest the existence of yet undefined hidden structural codes behind intrinsic disorder in this model protein. Thus, evolution pinpoints conformational hot spots that could not have been identified by direct experimental methods for analyzing or perturbing the equilibrium of an intrinsically disordered protein ensemble.

  12. Protein structure prediction with local adjust tabu search algorithm

    PubMed Central

    2014-01-01

    Background: Protein folding structure prediction is one of the most challenging problems in bioinformatics. Because realistic protein structures are complex, research has to rely on simplified structure models and computational methods. The AB off-lattice model is one such simplified model; it considers only two classes of amino acids, hydrophobic (A) residues and hydrophilic (B) residues. Results: The main work of this paper is to find the lowest-energy configurations in the 2D and 3D off-lattice models, using Fibonacci sequences and real protein sequences. To avoid becoming trapped in local minima and to converge faster to the global minimum, we introduce a novel method (SATS) for the protein structure problem that combines the simulated annealing algorithm with the tabu search algorithm. Several strategies, such as a new encoding strategy, an adaptive neighborhood generation strategy and a local adjustment strategy, are adopted for fast searching of the optimal conformation, which corresponds to the lowest energy of the protein sequence. Experimental results show that some of the results obtained by the improved SATS are better than those reported in the previous literature, and we can be sure that the lowest-energy folding states for short Fibonacci sequences have been found. Conclusions: Although the off-lattice models are not very realistic, they reflect some important characteristics of real proteins. The 3D off-lattice model resembles the native folding structure of real proteins more closely than the 2D off-lattice model does. In addition, compared with previous studies, the proposed hybrid algorithm searches the spatial folding structure of a protein chain more effectively and more quickly. PMID:25474708
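
    To make the AB off-lattice setting more concrete, the sketch below builds Fibonacci sequences, evaluates a 2D AB energy, and runs a plain simulated annealing search over bend angles. Several things here are assumptions not stated in the abstract: the energy is the commonly used Stillinger-type form (a bending term plus a species-dependent Lennard-Jones term with unit bond lengths), the search is ordinary simulated annealing rather than the paper's hybrid SATS with tabu search, and all function names and parameter values are illustrative.

```python
import math
import random

def fibonacci_sequence(k):
    """S0 = 'A', S1 = 'B', S(i+1) = S(i-1) + S(i)."""
    prev, cur = "A", "B"
    if k == 0:
        return prev
    for _ in range(k - 1):
        prev, cur = cur, prev + cur
    return cur

def coupling(a, b):
    """Assumed AB-model coefficients: AA attracts strongly, BB weakly, AB repels weakly."""
    if a == "A" and b == "A":
        return 1.0
    if a == "B" and b == "B":
        return 0.5
    return -0.5

def coordinates(bend_angles):
    """2D chain with unit bonds; a bend angle of 0 continues straight ahead."""
    pts = [(0.0, 0.0), (1.0, 0.0)]
    heading = 0.0
    for theta in bend_angles:
        heading += theta
        x, y = pts[-1]
        pts.append((x + math.cos(heading), y + math.sin(heading)))
    return pts

def ab_energy(sequence, bend_angles):
    """Bending energy plus Lennard-Jones-like energy over non-bonded pairs."""
    assert len(bend_angles) == len(sequence) - 2
    pts = coordinates(bend_angles)
    bend = sum(0.25 * (1.0 - math.cos(t)) for t in bend_angles)
    lj = 0.0
    for i in range(len(sequence) - 2):
        for j in range(i + 2, len(sequence)):
            r2 = (pts[i][0] - pts[j][0]) ** 2 + (pts[i][1] - pts[j][1]) ** 2
            inv6 = 1.0 / (r2 * r2 * r2)
            lj += 4.0 * (inv6 * inv6 - coupling(sequence[i], sequence[j]) * inv6)
    return bend + lj

def anneal(sequence, steps=5000, t_start=1.0, t_end=0.01, seed=0):
    """Plain simulated annealing over bend angles with a geometric cooling schedule."""
    rng = random.Random(seed)
    angles = [0.0] * (len(sequence) - 2)
    energy = ab_energy(sequence, angles)
    best_angles, best_energy = angles[:], energy
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / steps)
        trial = angles[:]
        trial[rng.randrange(len(trial))] += rng.gauss(0.0, 0.3)
        trial_energy = ab_energy(sequence, trial)
        delta = trial_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            angles, energy = trial, trial_energy
            if energy < best_energy:
                best_angles, best_energy = angles[:], energy
    return best_angles, best_energy

SEQ13 = fibonacci_sequence(6)  # 13-residue Fibonacci sequence
best_angles, best_energy = anneal(SEQ13, steps=2000)
print(SEQ13, best_energy)
```

    Adding a tabu list of recently visited angle regions, as the abstract describes for SATS, would be a separate extension on top of this loop.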

  13. Resolution of the unfolded state.

    NASA Astrophysics Data System (ADS)

    Beaucage, Gregory

    2008-03-01

    The unfolded states in proteins and nucleic acids remain weakly understood despite their importance to protein folding; misfolding diseases (Parkinson's & Alzheimer's); natively unfolded proteins (~30% of eukaryotic proteins); and to understanding ribozymes. Research has been hindered by the inability to quantify the residual (native) structure present in an unfolded protein or nucleic acid. Here, a scaling model is proposed to quantify the degree of folding and the unfolded state (Beaucage, 2004, 2007). The model takes a global view of protein structure and can be applied to a number of analytic methods and to simulations. Three examples are given of application to small-angle scattering from pressure induced unfolding of SNase (Panick, 1998), from acid unfolded Cyt c (Kataoka, 1993) and from folding of Azoarcus ribozyme (Perez-Salas, 2004). These examples quantitatively show 3 characteristic unfolded states for proteins, the statistical nature of a folding pathway and the relationship between extent of folding and chain size during folding for charge driven folding in RNA. Beaucage, G., Biophys. J., in press (2007). Beaucage, G., Phys. Rev. E. 70, 031401 (2004). Kataoka, M., Y. Hagihara, K. Mihara, Y. Goto J. Mol. Biol. 229, 591 (1993). Panick, G., R. Malessa, R. Winter, G. Rapp, K. J. Frye, C. A. Royer J. Mol. Biol. 275, 389 (1998). Perez-Salas U. A., P. Rangan, S. Krueger, R. M. Briber, D. Thirumalai, S. A. Woodson, Biochemistry 43 1746 (2004).

  14. Conservation of protein structure over four billion years.

    PubMed

    Ingles-Prieto, Alvaro; Ibarra-Molero, Beatriz; Delgado-Delgado, Asuncion; Perez-Jimenez, Raul; Fernandez, Julio M; Gaucher, Eric A; Sanchez-Ruiz, Jose M; Gavira, Jose A

    2013-09-03

    Little is known about the evolution of protein structures and the degree of protein structure conservation over planetary time scales. Here, we report the X-ray crystal structures of seven laboratory resurrections of Precambrian thioredoxins dating up to approximately four billion years ago. Despite considerable sequence differences compared with extant enzymes, the ancestral proteins display the canonical thioredoxin fold, whereas only small structural changes have occurred over four billion years. This remarkable degree of structure conservation since a time near the last common ancestor of life supports a punctuated-equilibrium model of structure evolution in which the generation of new folds occurs over comparatively short periods and is followed by long periods of structural stasis. Copyright © 2013 Elsevier Ltd. All rights reserved.

  15. Mining sequential patterns for protein fold recognition.

    PubMed

    Exarchos, Themis P; Papaloukas, Costas; Lampros, Christos; Fotiadis, Dimitrios I

    2008-02-01

    Protein data contain discriminative patterns that can be used in many beneficial applications if they are defined correctly. In this work sequential pattern mining (SPM) is utilized for sequence-based fold recognition. Protein classification in terms of fold recognition plays an important role in computational protein analysis, since it can contribute to the determination of the function of a protein whose structure is unknown. Specifically, one of the most efficient SPM algorithms, cSPADE, is employed for the analysis of protein sequence. A classifier uses the extracted sequential patterns to classify proteins in the appropriate fold category. For training and evaluating the proposed method we used the protein sequences from the Protein Data Bank and the annotation of the SCOP database. The method exhibited an overall accuracy of 25% in a classification problem with 36 candidate categories. The classification performance reaches up to 56% when the five most probable protein folds are considered.
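
    The two accuracy figures in this abstract (the single best prediction versus the five most probable folds) correspond to what is usually called top-1 and top-5 accuracy. The sketch below shows how such a score can be computed from ranked predictions; the fold labels and rankings are made-up placeholders, not data from the study.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of samples whose true label is among the k highest-ranked predictions."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)

# Hypothetical ranked fold predictions (most probable first) for four proteins.
ranked = [
    ["a.1", "b.1", "c.2"],
    ["c.2", "a.1", "b.1"],
    ["b.1", "c.2", "a.1"],
    ["a.1", "c.2", "b.1"],
]
truth = ["a.1", "a.1", "a.1", "b.1"]

print(top_k_accuracy(ranked, truth, 1))
print(top_k_accuracy(ranked, truth, 2))

# With 36 equally likely categories, random guessing would give about 1/36 top-1 accuracy.
print(1 / 36)
```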

  16. Peppytides: Interactive Models of Polypeptide Chains

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zuckermann, Ron; Chakraborty, Promita; Derisi, Joe

    2014-01-21

    Peppytides are scaled, 3D-printed models of polypeptide chains that can be folded into accurate protein structures. Designed and created by Berkeley Lab Researcher, Promita Chakraborty, and Berkeley Lab Senior Scientist, Dr. Ron Zuckermann, Peppytides are accurate physical models of polypeptide chains that anyone can interact with and fold into various protein structures - proving to be a great educational tool, resulting in a deeper understanding of these fascinating structures and how they function. Build your own Peppytide model and learn about how nature's machines fold into their intricate architectures!

  17. Peppytides: Interactive Models of Polypeptide Chains

    ScienceCinema

    Zuckermann, Ron; Chakraborty, Promita; Derisi, Joe

    2018-06-08

    Peppytides are scaled, 3D-printed models of polypeptide chains that can be folded into accurate protein structures. Designed and created by Berkeley Lab Researcher, Promita Chakraborty, and Berkeley Lab Senior Scientist, Dr. Ron Zuckermann, Peppytides are accurate physical models of polypeptide chains that anyone can interact with and fold into various protein structures - proving to be a great educational tool, resulting in a deeper understanding of these fascinating structures and how they function. Build your own Peppytide model and learn about how nature's machines fold into their intricate architectures!

  18. Structure of GroEL in Complex with an Early Folding Intermediate of Alanine Glyoxylate Aminotransferase

    PubMed Central

    Albert, Armando; Yunta, Cristina; Arranz, Rocío; Peña, Álvaro; Salido, Eduardo; Valpuesta, José María; Martín-Benito, Jaime

    2010-01-01

    Primary hyperoxaluria type 1 is a rare autosomal recessive disease caused by mutations in the alanine glyoxylate aminotransferase gene (AGXT). We have previously shown that P11L and I340M polymorphisms together with I244T mutation (AGXT-LTM) represent a conformational disease that could be amenable to pharmacological intervention. Thus, the study of the folding mechanism of AGXT is crucial to understanding the molecular basis of the disease. Here, we provide biochemical and structural data showing that AGXT-LTM is able to form non-native folding intermediates. The three-dimensional structure of a complex between the bacterial chaperonin GroEL and a folding intermediate of the AGXT-LTM mutant has been solved by cryoelectron microscopy. The electron density map shows the protein substrate in a non-native extended conformation that crosses the GroEL central cavity. Addition of ATP to the complex induces conformational changes in the chaperonin and the internalization of the protein substrate into the folding cavity. The structure provides a three-dimensional picture of an in vivo early ATP-dependent step of the folding reaction cycle of the chaperonin and supports a GroEL functional model in which the chaperonin promotes folding of the AGXT-LTM mutant protein through a forced unfolding mechanism. PMID:20056599

  19. Structure of GroEL in complex with an early folding intermediate of alanine glyoxylate aminotransferase.

    PubMed

    Albert, Armando; Yunta, Cristina; Arranz, Rocío; Peña, Alvaro; Salido, Eduardo; Valpuesta, José María; Martín-Benito, Jaime

    2010-02-26

    Primary hyperoxaluria type 1 is a rare autosomal recessive disease caused by mutations in the alanine glyoxylate aminotransferase gene (AGXT). We have previously shown that P11L and I340M polymorphisms together with I244T mutation (AGXT-LTM) represent a conformational disease that could be amenable to pharmacological intervention. Thus, the study of the folding mechanism of AGXT is crucial to understanding the molecular basis of the disease. Here, we provide biochemical and structural data showing that AGXT-LTM is able to form non-native folding intermediates. The three-dimensional structure of a complex between the bacterial chaperonin GroEL and a folding intermediate of the AGXT-LTM mutant has been solved by cryoelectron microscopy. The electron density map shows the protein substrate in a non-native extended conformation that crosses the GroEL central cavity. Addition of ATP to the complex induces conformational changes in the chaperonin and the internalization of the protein substrate into the folding cavity. The structure provides a three-dimensional picture of an in vivo early ATP-dependent step of the folding reaction cycle of the chaperonin and supports a GroEL functional model in which the chaperonin promotes folding of the AGXT-LTM mutant protein through a forced unfolding mechanism.

  20. Exploration of the relationship between topology and designability of conformations

    NASA Astrophysics Data System (ADS)

    Leelananda, Sumudu P.; Towfic, Fadi; Jernigan, Robert L.; Kloczkowski, Andrzej

    2011-06-01

    Protein structures are evolutionarily more conserved than sequences, and sequences with very low sequence identity frequently share the same fold. This leads to the concept of protein designability. Some folds are more designable and lots of sequences can assume that fold. Elucidating the relationship between protein sequence and the three-dimensional (3D) structure that the sequence folds into is an important problem in computational structural biology. Lattice models have been utilized in numerous studies to model protein folds and predict the designability of certain folds. In this study, all possible compact conformations within a set of two-dimensional and 3D lattice spaces are explored. Complementary interaction graphs are then generated for each conformation and are described using a set of graph features. The full HP sequence space for each lattice model is generated and contact energies are calculated by threading each sequence onto all the possible conformations. Unique conformation giving minimum energy is identified for each sequence and the number of sequences folding to each conformation (designability) is obtained. Machine learning algorithms are used to predict the designability of each conformation. We find that the highly designable structures can be distinguished from other non-designable conformations based on certain graphical geometric features of the interactions. This finding confirms the fact that the topology of a conformation is an important determinant of the extent of its designability and suggests that the interactions themselves are important for determining the designability.
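
The designability count described in this abstract can be illustrated on the smallest non-trivial case. The following is a minimal sketch, not the authors' pipeline: it assumes a 3x3 square lattice, an energy of -1 per non-bonded H-H contact (the abstract does not state the contact energies), and it identifies each structure by its contact map so that rotated or mirrored copies collapse into one. The function names are hypothetical.

```python
from collections import Counter
from itertools import product


def compact_contact_maps(n=3):
    """Enumerate all chain paths that fill an n x n lattice and return
    the distinct contact maps (sets of non-bonded neighbour pairs)."""
    maps = set()

    def extend(path, visited):
        if len(path) == n * n:
            index = {site: i for i, site in enumerate(path)}
            contacts = set()
            for (x, y), i in index.items():
                # Look right and up only, so each lattice edge is seen once.
                for nb in ((x + 1, y), (x, y + 1)):
                    j = index.get(nb)
                    if j is not None and abs(i - j) > 1:
                        contacts.add((min(i, j), max(i, j)))
            maps.add(frozenset(contacts))
            return
        x, y = path[-1]
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nb[0] < n and 0 <= nb[1] < n and nb not in visited:
                visited.add(nb)
                path.append(nb)
                extend(path, visited)
                path.pop()
                visited.remove(nb)

    for start in product(range(n), repeat=2):
        extend([start], {start})
    return sorted(maps, key=sorted)


def designability(n=3):
    """Thread every HP sequence onto every contact map and count, per map,
    the sequences for which it is the unique minimum-energy structure."""
    maps = compact_contact_maps(n)
    counts = Counter()
    for seq in product("HP", repeat=n * n):
        # Assumed energy model: -1 for each H-H contact, 0 otherwise.
        energies = [
            -sum(seq[i] == "H" and seq[j] == "H" for i, j in m) for m in maps
        ]
        best = min(energies)
        if energies.count(best) == 1:  # degenerate ground states are skipped
            counts[energies.index(best)] += 1
    return maps, counts


maps, counts = designability(3)
for idx, n_seq in counts.most_common(3):
    print(sorted(maps[idx]), n_seq)
```

Treating a path and its reversed labelling as different structures is a modelling choice here; larger lattices and the graph features mentioned in the abstract would need a more efficient enumeration.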

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ilieva, N., E-mail: nevena.ilieva@parallel.bas.bg; Dai, J., E-mail: daijing491@gmail.com; Sieradzan, A., E-mail: adams86@wp.pl

    Protein folding [1] is the process of formation of a functional 3D structure from a random coil, the shape in which amino-acid chains leave the ribosome. Anfinsen’s dogma states that the native 3D shape of a protein is completely determined by the protein’s amino acid sequence. Despite the progress in understanding the process rate and the success in folding prediction for some small proteins, with presently available physics-based methods it is not yet possible to reliably deduce the shape of a biologically active protein from its amino acid sequence. The protein-folding problem endures as one of the most important unresolved problems in science; it addresses the origin of life itself. Furthermore, a wrong fold is a common cause for a protein to lose its function or even endanger the living organism. Soliton solutions of a generalized discrete non-linear Schrödinger equation (GDNLSE) obtained from the energy function in terms of bond and torsion angles κ and τ provide a constructive theoretical framework for describing protein folds and folding patterns [2]. Here we study the dynamics of this process by means of molecular-dynamics simulations. The soliton manifestation is the pattern helix–loop–helix in the secondary structure of the protein, which explains the importance of understanding loop formation in helical proteins. We performed in silico experiments for unfolding one subunit of the core structure of gp41 from the HIV envelope glycoprotein (PDB ID: 1AIK [3]) by molecular-dynamics simulations with the MD package GROMACS. We analyzed 80 ns trajectories, obtained with one united-atom and two different all-atom force fields, to justify the side-chain orientation quantification scheme adopted in the studies and to eliminate force-field based artifacts. Our results are compatible with the soliton model of protein folding and provide first insight into soliton-formation dynamics.

  2. Analysis of Translocation-Competent Secretory Proteins by HDX-MS.

    PubMed

    Tsirigotaki, A; Papanastasiou, M; Trelle, M B; Jørgensen, T J D; Economou, A

    2017-01-01

    Protein folding is an intricate and precise process in living cells. Most exported proteins evade cytoplasmic folding, become targeted to the membrane, and then trafficked into/across membranes. Their targeting and translocation-competent states are nonnatively folded. However, once they reach the appropriate cellular compartment, they can fold to their native states. The nonnative states of preproteins remain structurally poorly characterized since increased disorder, protein sizes, aggregation propensity, and the observation timescale are often limiting factors for typical structural approaches such as X-ray crystallography and NMR. Here, we present an alternative approach for the in vitro analysis of nonfolded translocation-competent protein states and their comparison with their native states. We make use of hydrogen/deuterium exchange coupled with mass spectrometry (HDX-MS), a method based on differentiated isotope exchange rates in structured vs unstructured protein states/regions, and highly dynamic vs more rigid regions. We present a complete structural characterization pipeline, starting from the preparation of the polypeptides to data analysis and interpretation. Proteolysis and mass spectrometric conditions for the analysis of the labeled proteins are discussed, followed by the analysis and interpretation of HDX-MS data. We highlight the suitability of HDX-MS for identifying short structured regions within otherwise highly flexible protein states, as illustrated by an exported protein example, experimentally tested in our lab. Finally, we discuss statistical analysis in comparative HDX-MS. The protocol is applicable to any protein and protein size, exhibiting slow or fast loss of translocation competence. It could be easily adapted to more complex assemblies, such as the interaction of chaperones with nonnative protein states. © 2017 Elsevier Inc. All rights reserved.

  3. New Protein Mimetics: The Zinc Finger Motif as a Locked-In Tertiary Fold.

    PubMed

    Tuchscherer, Gabriele; Lehmann, Christian; Mathieu, Marc

    1998-11-16

    The principle of a molecular kit is used for the covalent assembly of secondary structure forming peptide blocks to predetermined packing topologies. The resulting locked-in folds (LIFs; depicted schematically) are readily accessible and bypass the intriguing folding problem of linear peptide chains. This strategy allows, for example, mimicking of the essential structural and functional features of zinc finger proteins. © 1998 WILEY-VCH Verlag GmbH, Weinheim, Fed. Rep. of Germany.

  4. Single-molecule investigation of G-quadruplex folds of the human telomere sequence in a protein nanocavity

    PubMed Central

    An, Na; Fleming, Aaron M.; Middleton, Eric G.; Burrows, Cynthia J.

    2014-01-01

    Human telomeric DNA consists of tandem repeats of the sequence 5′-TTAGGG-3′ that can fold into various G-quadruplexes, including the hybrid, basket, and propeller folds. In this report, we demonstrate use of the α-hemolysin ion channel to analyze these subtle topological changes at a nanometer scale by providing structure-dependent electrical signatures through DNA–protein interactions. Whereas the dimensions of hybrid and basket folds allowed them to enter the protein vestibule, the propeller fold exceeds the size of the latch region, producing only brief collisions. After attaching a 25-mer poly-2′-deoxyadenosine extension to these structures, unraveling kinetics also were evaluated. Both the locations where the unfolding processes occur and the molecular shapes of the G-quadruplexes play important roles in determining their unfolding profiles. These results provide insights into the application of α-hemolysin as a molecular sieve to differentiate nanostructures as well as the potential technical hurdles DNA secondary structures may present to nanopore technology. PMID:25225404

  5. Multiple functional roles of the accessory I-domain of bacteriophage P22 coat protein revealed by NMR structure and cryoEM modeling

    PubMed Central

    Rizzo, Alessandro A.; Suhanovsky, Margaret M.; Baker, Matthew L.; Fraser, LaTasha C.R.; Jones, Lisa M.; Rempel, Don L.; Gross, Michael L.; Chiu, Wah; Alexandrescu, Andrei T.; Teschke, Carolyn M.

    2014-01-01

    Some capsid proteins built on the ubiquitous HK97-fold have accessory domains that impart specific functions. Bacteriophage P22 coat protein has a unique inserted I-domain. Two prior I-domain models from sub-nanometer cryoEM reconstructions differed substantially. Therefore, the NMR structure of the I-domain was determined, which also was used to improve cryoEM models of coat protein. The I-domain has an anti-parallel 6-stranded β-barrel fold, previously not observed in HK97-fold accessory domains. The D-loop, which is dynamic both in the isolated I-domain and intact monomeric coat protein, forms stabilizing salt bridges between adjacent capsomers in procapsids. A newly described S-loop is important for capsid size determination, likely through intra-subunit interactions. Ten of eighteen coat protein temperature-sensitive-folding substitutions are in the I-domain, indicating its importance in folding and stability. Several are found on a positively charged face of the β-barrel that anchors the I-domain to a negatively charged surface of the coat protein HK97-core. PMID:24836025

  6. Multiple functional roles of the accessory I-domain of bacteriophage P22 coat protein revealed by NMR structure and CryoEM modeling.

    PubMed

    Rizzo, Alessandro A; Suhanovsky, Margaret M; Baker, Matthew L; Fraser, LaTasha C R; Jones, Lisa M; Rempel, Don L; Gross, Michael L; Chiu, Wah; Alexandrescu, Andrei T; Teschke, Carolyn M

    2014-06-10

    Some capsid proteins built on the ubiquitous HK97-fold have accessory domains imparting specific functions. Bacteriophage P22 coat protein has a unique insertion domain (I-domain). Two prior I-domain models from subnanometer cryoelectron microscopy (cryoEM) reconstructions differed substantially. Therefore, the I-domain's nuclear magnetic resonance structure was determined and also used to improve cryoEM models of coat protein. The I-domain has an antiparallel six-stranded β-barrel fold, not previously observed in HK97-fold accessory domains. The D-loop, which is dynamic in the isolated I-domain and intact monomeric coat protein, forms stabilizing salt bridges between adjacent capsomers in procapsids. The S-loop is important for capsid size determination, likely through intrasubunit interactions. Ten of 18 coat protein temperature-sensitive-folding substitutions are in the I-domain, indicating its importance in folding and stability. Several are found on a positively charged face of the β-barrel that anchors the I-domain to a negatively charged surface of the coat protein HK97-core. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Protein Folding—How and Why: By Hydrogen Exchange, Fragment Separation, and Mass Spectrometry

    PubMed Central

    Englander, S. Walter; Mayne, Leland; Kan, Zhong-Yuan; Hu, Wenbing

    2017-01-01

    Advanced hydrogen exchange (HX) methodology can now determine the structure of protein folding intermediates and their progression in folding pathways. Key developments over time include the HX pulse labeling method with nuclear magnetic resonance analysis, development of the fragment separation method, the addition to it of mass spectrometric (MS) analysis, and recent improvements in the HX MS technique and data analysis. Also, the discovery of protein foldons and their role supplies an essential interpretive link. Recent work using HX pulse labeling with HX MS analysis finds that a number of proteins fold by stepping through a reproducible sequence of native-like intermediates in an ordered pathway. The stepwise nature of the pathway is dictated by the cooperative foldon unit construction of the protein. The pathway order is determined by a sequential stabilization principle; prior native-like structure guides the formation of adjacent native-like structure. This view does not match the funneled energy landscape paradigm of a very large number of folding tracks, which was framed before foldons were known. PMID:27145881

  8. Accurate secondary structure prediction and fold recognition for circular dichroism spectroscopy

    PubMed Central

    Micsonai, András; Wien, Frank; Kernya, Linda; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József

    2015-01-01

    Circular dichroism (CD) spectroscopy is a widely used technique for the study of protein structure. Numerous algorithms have been developed for the estimation of the secondary structure composition from the CD spectra. These methods often fail to provide acceptable results on α/β-mixed or β-structure–rich proteins. The problem arises from the spectral diversity of β-structures, which has hitherto been considered as an intrinsic limitation of the technique. The predictions are less reliable for proteins of unusual β-structures such as membrane proteins, protein aggregates, and amyloid fibrils. Here, we show that the parallel/antiparallel orientation and the twisting of the β-sheets account for the observed spectral diversity. We have developed a method called β-structure selection (BeStSel) for the secondary structure estimation that takes into account the twist of β-structures. This method can reliably distinguish parallel and antiparallel β-sheets and accurately estimates the secondary structure for a broad range of proteins. Moreover, the secondary structure components applied by the method are characteristic to the protein fold, and thus the fold can be predicted to the level of topology in the CATH classification from a single CD spectrum. By constructing a web server, we offer a general tool for a quick and reliable structure analysis using conventional CD or synchrotron radiation CD (SRCD) spectroscopy for the protein science research community. The method is especially useful when X-ray or NMR techniques fail. Using BeStSel on data collected by SRCD spectroscopy, we investigated the structure of amyloid fibrils of various disease-related proteins and peptides. PMID:26038575

  9. Evaluating Protein Structure and Dynamics Using Co-Solvents, Photochemical Triggers, and Site-Specific Spectroscopic Probes

    NASA Astrophysics Data System (ADS)

    Abaskharon, Rachel M.

    As ubiquitous and diverse biopolymers, proteins are dynamic molecules that are constantly engaging in inter- and intramolecular interactions responsible for their structure, fold, and function. Because of this, gaining a comprehensive understanding of the factors that control protein conformation and dynamics remains elusive as current experimental techniques often lack the ability to initiate and probe a specific interaction or conformational transition. For this reason, this thesis aims to develop methods to control and monitor protein conformations, conformational transitions, and dynamics in a site-specific manner, as well as to understand how specific and non-specific interactions affect the protein folding energy landscape. First, by using the co-solvent, trifluoroethanol (TFE), we show that the rate at which a peptide folds can be greatly impacted and thus controlled by the excluded volume effect. Secondly, we demonstrate the utility of several light-responsive molecules and reactions as methods to manipulate and investigate protein-folding processes. Using an azobenzene linker as a photo-initiator, we are able to increase the folding rate of a protein system by an order of magnitude by channeling a sub-population through a parallel, faster folding pathway. Additionally, we utilize a tryptophan-mediated electron transfer process to a nearby disulfide bond to strategically unfold a protein molecule with ultraviolet light. We also demonstrate the potential of two ruthenium polypyridyl complexes as ultrafast phototriggers of protein reactions. Finally, we develop several site-specific spectroscopic probes of protein structure and environment. Specifically, we demonstrate that a 13C-labeled aspartic acid residue constitutes a useful site-specific infrared probe for investigating salt-bridges and hydration dynamics of proteins, particularly in proteins containing several acidic amino acids. We also show that a proline-derivative, 4-oxoproline, possesses novel infrared properties that can be exploited to monitor the cis-trans isomerization process of individual proline residues in proteins.

  10. Foldability of a Natural De Novo Evolved Protein.

    PubMed

    Bungard, Dixie; Copple, Jacob S; Yan, Jing; Chhun, Jimmy J; Kumirov, Vlad K; Foy, Scott G; Masel, Joanna; Wysocki, Vicki H; Cordes, Matthew H J

    2017-11-07

    The de novo evolution of protein-coding genes from noncoding DNA is emerging as a source of molecular innovation in biology. Studies of random sequence libraries, however, suggest that young de novo proteins will not fold into compact, specific structures typical of native globular proteins. Here we show that Bsc4, a functional, natural de novo protein encoded by a gene that evolved recently from noncoding DNA in the yeast S. cerevisiae, folds to a partially specific three-dimensional structure. Bsc4 forms soluble, compact oligomers with high β sheet content and a hydrophobic core, and undergoes cooperative, reversible denaturation. Bsc4 lacks a specific quaternary state, however, existing instead as a continuous distribution of oligomer sizes, and binds dyes indicative of amyloid oligomers or molten globules. The combination of native-like and non-native-like properties suggests a rudimentary fold that could potentially act as a functional intermediate in the emergence of new folded proteins de novo. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Observing a late folding intermediate of Ubiquitin at atomic resolution by NMR

    PubMed Central

    Surana, Parag

    2016-01-01

    The study of intermediates in the protein folding pathway provides a wealth of information about the energy landscape. The intermediates also frequently initiate pathogenic fibril formations. While observing the intermediates is difficult due to their transient nature, extreme conditions can partially unfold the proteins and provide a glimpse of the intermediate states. Here, we observe the high resolution structure of a hydrophobic core mutant of Ubiquitin at an extremely acidic pH by nuclear magnetic resonance (NMR) spectroscopy. In the structure, the native secondary and tertiary structure is conserved for a major part of the protein. However, a long loop between the beta strands β3 and β5 is partially unfolded. The altered structure is supported by fluorescence data and the difference in free energies between the native state and the intermediate is reflected in the denaturant induced melting curves. The unfolded region includes amino acids that are critical for interaction with cofactors as well as for assembly of poly-Ubiquitin chains. The structure at acidic pH resembles a late folding intermediate of Ubiquitin and indicates that upon stabilization of the protein's core, the long loop converges on the core in the final step of the folding process. PMID:27111887

  12. Navigating ligand protein binding free energy landscapes: universality and diversity of protein folding and molecular recognition mechanisms

    NASA Astrophysics Data System (ADS)

    Verkhivker, Gennady M.; Rejto, Paul A.; Bouzida, Djamal; Arthurs, Sandra; Colson, Anthony B.; Freer, Stephan T.; Gehlhaar, Daniel K.; Larson, Veda; Luty, Brock A.; Marrone, Tami; Rose, Peter W.

    2001-03-01

    Thermodynamic and kinetic aspects of ligand-protein binding are studied for the methotrexate-dihydrofolate reductase system from the binding free energy profile constructed as a function of the order parameter. Thermodynamic stability of the native complex and a cooperative transition to the unique native structure suggest the nucleation kinetic mechanism at the equilibrium transition temperature. Structural properties of the transition state ensemble and the ensemble of nucleation conformations are determined by kinetic simulations of the transmission coefficient and ligand-protein association pathways. Structural analysis of the transition states and the nucleation conformations reconciles different views on the nucleation mechanism in protein folding.

  13. Coordinating Subdomains of Ferritin Protein Cages with Catalysis and Biomineralization viewed from the C4 Cage Axes

    PubMed Central

    Theil, Elizabeth C.; Turano, Paola; Ghini, Veronica; Allegrozzi, Marco; Bernacchioni, Caterina

    2014-01-01

    Integrated ferritin protein cage function is the reversible synthesis of protein-caged, solid Fe2O3•H2O minerals from Fe2+, for metabolic iron concentrates and oxidant protection; biomineral order varies in different ferritin proteins. The conserved 4, 3, 2 geometric symmetry of ferritin protein cages parallels subunit dimer, trimer and tetramer interfaces, and coincides with function at several cage axes. Multiple subdomains distributed in the self-assembling ferritin nanocages have functional relationships to cage symmetry such as Fe2+ transport through ion channels (3-fold symmetry), biomineral nucleation/order (4-fold symmetry) and mineral dissolution (3-fold symmetry) studied in ferritin variants. Cage subunit dimers (2-fold symmetry) influence iron oxidation and mineral dissolution, based on effects of natural or synthetic subunit dimer crosslinks. 2Fe2+/O2 catalysis in ferritin occurs in single subunits, but with cooperativity (n=3) that is possibly related to the structure/function of the ion channels, which are constructed from segments of 3 subunits. Here, we study 2Fe2+ + O2 protein catalysis (diferric peroxo formation) and dissolution of ferritin Fe2O3•H2O biominerals in variants with altered subunit interfaces for trimers (ion channels), E130I, and external dimer surfaces (E88A) as controls, and altered tetramer subunit interfaces (L165I and H169F). The results extend observations on the functional importance of structure at ferritin protein 2-fold and 3-fold cage axes to show function at ferritin 4-fold cage axes. Here, conserved amino acids facilitate dissolution of ferritin protein-caged iron biominerals. Biological and nanotechnological uses of ferritin protein cage 4-fold symmetry and solid state mineral properties remain largely unexplored. PMID:24504941

  14. Mechanical Modeling and Computer Simulation of Protein Folding

    ERIC Educational Resources Information Center

    Prigozhin, Maxim B.; Scott, Gregory E.; Denos, Sharlene

    2014-01-01

    In this activity, science education and modern technology are bridged to teach students at the high school and undergraduate levels about protein folding and to strengthen their model building skills. Students are guided from a textbook picture of a protein as a rigid crystal structure to a more realistic view: proteins are highly dynamic…

  15. Algorithm for selection of optimized EPR distance restraints for de novo protein structure determination

    PubMed Central

    Kazmier, Kelli; Alexander, Nathan S.; Meiler, Jens; Mchaourab, Hassane S.

    2010-01-01

    A hybrid protein structure determination approach combining sparse Electron Paramagnetic Resonance (EPR) distance restraints and Rosetta de novo protein folding has been previously demonstrated to yield high quality models (Alexander et al., 2008). However, widespread application of this methodology to proteins of unknown structures is hindered by the lack of a general strategy to place spin label pairs in the primary sequence. In this work, we report the development of an algorithm that optimally selects spin labeling positions for the purpose of distance measurements by EPR. For the α-helical subdomain of T4 lysozyme (T4L), simulated restraints that maximize sequence separation between the two spin labels while simultaneously ensuring pairwise connectivity of secondary structure elements yielded vastly improved models by Rosetta folding. 50% of all these models have the correct fold compared to only 21% and 8% correctly folded models when randomly placed restraints or no restraints are used, respectively. Moreover, the improvements in model quality require a limited number of optimized restraints, the number of which is determined by the pairwise connectivities of T4L α-helices. The predicted improvement in Rosetta model quality was verified by experimental determination of distances between spin labels pairs selected by the algorithm. Overall, our results reinforce the rationale for the combined use of sparse EPR distance restraints and de novo folding. By alleviating the experimental bottleneck associated with restraint selection, this algorithm sets the stage for extending computational structure determination to larger, traditionally elusive protein topologies of critical structural and biochemical importance. PMID:21074624

  16. Conformational dynamics of a protein in the folded and the unfolded state

    NASA Astrophysics Data System (ADS)

    Fitter, Jörg

    2003-08-01

    In a quasielastic neutron scattering experiment, the picosecond dynamics of α-amylase was investigated for the folded and the unfolded state of the protein. In order to ensure a reasonable interpretation of the internal protein dynamics, the protein was measured in D2O buffer solution. The much higher structural flexibility of the pH-induced unfolded state as compared to the native folded state was quantified using a simple analytical model, describing a local diffusion inside a sphere. In terms of this model the conformational volume, which is explored mainly by confined protein side-chain movements, is parameterized by the radius of a sphere (folded state, r=1.2 Å; unfolded state, 1.8 Å). Differences in conformational dynamics between the folded and the unfolded state of a protein are of fundamental interest in the field of protein science, because they are assumed to play an important role for the thermodynamics of folding/unfolding transition and for protein stability.
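
For diffusion confined to a sphere, the quantity usually fitted to quasielastic data is the elastic incoherent structure factor (EISF). The following is a minimal sketch, assuming the standard diffusion-in-a-sphere expression A0(Q) = [3 j1(Qr)/(Qr)]^2, which the abstract does not name explicitly. Only the two radii come from the abstract; the momentum transfer Q = 1.0 Å^-1 and the function name are illustrative assumptions.

```python
import math


def eisf_sphere(q, r):
    """EISF for diffusion inside an impermeable sphere of radius r.

    Assumed form: A0(Q) = [3 * j1(Q r) / (Q r)]**2, where j1 is the
    first-order spherical Bessel function.
    q in 1/angstrom, r in angstrom.
    """
    x = q * r
    if x < 1e-6:
        return 1.0  # limit of the expression as Q*r -> 0
    j1 = math.sin(x) / x**2 - math.cos(x) / x
    return (3.0 * j1 / x) ** 2


Q = 1.0  # 1/angstrom, illustrative value only
for label, radius in (("folded", 1.2), ("unfolded", 1.8)):
    print(f"{label}: r = {radius} A, EISF = {eisf_sphere(Q, radius):.3f}")
```

At a fixed Q, the larger radius gives the smaller EISF, meaning a larger share of the scattering is quasielastic, which is how the model expresses the higher flexibility of the unfolded state.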

  17. Single-molecule chemo-mechanical unfolding reveals multiple transition state barriers in a small single-domain protein

    NASA Astrophysics Data System (ADS)

    Guinn, Emily J.; Jagannathan, Bharat; Marqusee, Susan

    2015-04-01

    A fundamental question in protein folding is whether proteins fold through one or multiple trajectories. While most experiments indicate a single pathway, simulations suggest proteins can fold through many parallel pathways. Here, we use a combination of chemical denaturant, mechanical force and site-directed mutations to demonstrate the presence of multiple unfolding pathways in a simple, two-state folding protein. We show that these multiple pathways have structurally different transition states, and that seemingly small changes in protein sequence and environment can strongly modulate the flux between the pathways. These results suggest that in vivo, the crowded cellular environment could strongly influence the mechanisms of protein folding and unfolding. Our study resolves the apparent dichotomy between experimental and theoretical studies, and highlights the advantage of using a multipronged approach to reveal the complexities of a protein's free-energy landscape.

  18. Identification of a key structural element for protein folding within beta-hairpin turns.

    PubMed

    Kim, Jaewon; Brych, Stephen R; Lee, Jihun; Logan, Timothy M; Blaber, Michael

    2003-05-09

    Specific residues in a polypeptide may be key contributors to the stability and foldability of the unique native structure. Identification and prediction of such residues is, therefore, an important area of investigation in solving the protein folding problem. Atypical main-chain conformations can help identify strains within a folded protein, and by inference, positions where unique amino acids may have a naturally high frequency of occurrence due to favorable contributions to stability and folding. Non-Gly residues located near the left-handed alpha-helical region (L-alpha) of the Ramachandran plot are a potential indicator of structural strain. Although many investigators have studied mutations at such positions, no consistent energetic or kinetic contributions to stability or folding have been elucidated. Here we report a study of the effects of Gly, Ala and Asn substitutions found within the L-alpha region at a characteristic position in defined beta-hairpin turns within human acidic fibroblast growth factor, and demonstrate consistent effects upon stability and folding kinetics. The thermodynamic and kinetic data are compared to available data for similar mutations in other proteins, with excellent agreement. The results have identified that Gly at the i+3 position within a subset of beta-hairpin turns is a key contributor towards increasing the rate of folding to the native state of the polypeptide while leaving the rate of unfolding largely unchanged.

  19. Exploring the sequence-structure protein landscape in the glycosyltransferase family

    PubMed Central

    Zhang, Ziding; Kochhar, Sunil; Grigorov, Martin

    2003-01-01

    To understand the molecular basis of glycosyltransferases’ (GTFs) catalytic mechanism, extensive structural information is required. Here, fold recognition methods were employed to assign 3D protein shapes (folds) to the currently known GTF sequences, available in public databases such as GenBank and Swissprot. First, GTF sequences were retrieved and classified into clusters, based on sequence similarity only. Intracluster sequence similarity was chosen sufficiently high to ensure that the same fold is found within a given cluster. Then, a representative sequence from each cluster was selected to compose a subset of GTF sequences. The members of this reduced set were processed by three different fold recognition methods: 3D-PSSM, FUGUE, and GeneFold. Finally, the results from different fold recognition methods were analyzed and compared to sequence-similarity search methods (i.e., BLAST and PSI-BLAST). It was established that the folds of about 70% of all currently known GTF sequences can be confidently assigned by fold recognition methods, a value which is higher than the fold identification rate based on sequence comparison alone (48% for BLAST and 64% for PSI-BLAST). The identified folds were submitted to 3D clustering, and we found that most of the GTF sequences adopt the typical GTF A or GTF B folds. Our results indicate a lack of evidence that new GTF folds (i.e., folds other than GTF A and B) exist. Based on cases where fold identification was not possible, we suggest several sequences as the most promising targets for a structural genomics initiative focused on the GTF protein family. PMID:14500887

  20. RNA folding: structure prediction, folding kinetics and ion electrostatics.

    PubMed

    Tan, Zhijie; Zhang, Wenbing; Shi, Yazhou; Wang, Fenghua

    2015-01-01

    Beyond the "traditional" functions such as gene storage, transport and protein synthesis, recent discoveries reveal that RNAs have important "new" biological functions, including RNA silencing and gene regulation by riboswitches. Such functions of noncoding RNAs are strongly coupled to the RNA structures and proper structure change, which naturally leads to the RNA folding problem, including structure prediction and folding kinetics. Due to the polyanionic nature of RNAs, RNA folding structure, stability and kinetics are strongly coupled to the ion condition of the solution. The main focus of this chapter is to review the recent progress in the three major aspects of the RNA folding problem: structure prediction, folding kinetics and ion electrostatics. This chapter introduces both the recent experimental and theoretical progress, while emphasizing the theoretical modelling of the three aspects of RNA folding.

  1. Experimental support for the foldability-function tradeoff hypothesis: segregation of the folding nucleus and functional regions in fibroblast growth factor-1.

    PubMed

    Longo, Liam; Lee, Jihun; Blaber, Michael

    2012-12-01

    The acquisition of function is often associated with destabilizing mutations, giving rise to the stability-function tradeoff hypothesis. To test whether function is also accommodated at the expense of foldability, fibroblast growth factor-1 (FGF-1) was subjected to a comprehensive φ-value analysis at each of the 11 turn regions. FGF-1, a β-trefoil fold, represents an excellent model system with which to evaluate the influence of function on foldability: because of its threefold symmetric structure, analysis of FGF-1 allows for direct comparisons between symmetry-related regions of the protein that are associated with function to those that are not; thus, a structural basis for regions of foldability can potentially be identified. The resulting φ-value distribution of FGF-1 is highly polarized, with the majority of positions described as either folded-like or denatured-like in the folding transition state. Regions important for folding are shown to be asymmetrically distributed within the protein architecture; furthermore, regions associated with function (i.e., heparin-binding affinity and receptor-binding affinity) are localized to regions of the protein that fold after barrier crossing (late in the folding pathway). These results provide experimental support for the foldability-function tradeoff hypothesis in the evolution of FGF-1. Notably, the results identify the potential for folding redundancy in symmetric protein architecture with important implications for protein evolution and design. Copyright © 2012 The Protein Society.
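
    The φ-value classification mentioned above can be illustrated with a minimal sketch. It assumes the standard definition, φ = ΔΔG(‡-D) / ΔΔG(N-D), where both terms are the mutation-induced changes in free energy of the transition state and of the native state relative to the denatured state. The function names, the example numbers and the 0.3/0.7 cutoffs are illustrative assumptions, not values taken from the study.

```python
def phi_value(ddg_ts_d: float, ddg_n_d: float) -> float:
    """Return phi = ddG(transition state - denatured) / ddG(native - denatured).

    Both inputs must use the same energy unit (e.g. kJ/mol).
    """
    if ddg_n_d == 0:
        raise ValueError("ddG(N-D) is zero; phi is undefined for this mutation")
    return ddg_ts_d / ddg_n_d


def classify_phi(phi: float, low: float = 0.3, high: float = 0.7) -> str:
    """Label a position using hypothetical cutoffs (assumed, not from the paper)."""
    if phi >= high:
        return "folded-like"      # native-like structure in the transition state
    if phi <= low:
        return "denatured-like"   # still unstructured in the transition state
    return "intermediate"


if __name__ == "__main__":
    # Hypothetical mutations: (ddG_ts_d, ddG_n_d) in kJ/mol
    for name, (ts, eq) in {"turn A": (4.5, 5.0), "turn B": (0.5, 5.0)}.items():
        phi = phi_value(ts, eq)
        print(name, round(phi, 2), classify_phi(phi))
```

    A polarized distribution of the kind the abstract describes would show most positions falling into the first or second label, with few in between.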

  2. Balancing Force Field Protein–Lipid Interactions To Capture Transmembrane Helix–Helix Association

    PubMed Central

    2018-01-01

    Atomistic simulations have recently been shown to be sufficiently accurate to reversibly fold globular proteins and have provided insights into folding mechanisms. Gaining similar understanding from simulations of membrane protein folding and association would be of great medical interest. All-atom simulations of the folding and assembly of transmembrane protein domains are much more challenging, not least due to very slow diffusion within the lipid bilayer membrane. Here, we focus on a simple and well-characterized prototype of membrane protein folding and assembly, namely the dimerization of glycophorin A, a homodimer of single transmembrane helices. We have determined the free energy landscape for association of the dimer using the CHARMM36 force field. We find that the native structure is a metastable state, but not stable as expected from experimental estimates of the dissociation constant and numerous experimental structures obtained under a variety of conditions. We explore two straightforward approaches to address this problem and demonstrate that they result in stable dimers with dissociation constants consistent with experimental data. PMID:29424543

  3. Crystal structure of YHI9, the yeast member of the phenazine biosynthesis PhzF enzyme superfamily.

    PubMed

    Liger, Dominique; Quevillon-Cheruel, Sophie; Sorel, Isabelle; Bremang, Michael; Blondeau, Karine; Aboulfath, Ilham; Janin, Joël; van Tilbeurgh, Herman; Leulliot, Nicolas

    2005-09-01

    In the Pseudomonas bacterial genomes, the PhzF proteins are involved in the production of phenazine derivative antibiotic and antifungal compounds. The PhzF superfamily, however, also encompasses proteins in all genomes from bacteria to eukaryotes, for which no function has been assigned. We have determined the three-dimensional crystal structure at 2.05 Å resolution of YHI9, the yeast member of the PhzF family. YHI9 has a fold similar to bacterial diaminopimelate epimerase, revealing a bimodular structure with an internal symmetry. Residue conservation identifies a putative active site at the interface between the two domains. Evolution of this protein by gene duplication, gene fusion and domain swapping from an ancestral gene containing the "hot dog" fold identifies the protein as a "kinked double hot dog" fold. Copyright 2005 Wiley-Liss, Inc.

  4. Random single amino acid deletion sampling unveils structural tolerance and the benefits of helical registry shift on GFP folding and structure.

    PubMed

    Arpino, James A J; Reddington, Samuel C; Halliwell, Lisa M; Rizkallah, Pierre J; Jones, D Dafydd

    2014-06-10

    Altering a protein's backbone through amino acid deletion is a common evolutionary mutational mechanism, but is generally ignored during protein engineering primarily because its effect on the folding-structure-function relationship is difficult to predict. Using directed evolution, enhanced green fluorescent protein (EGFP) was observed to tolerate residue deletion across the breadth of the protein, particularly within short and long loops, helical elements, and at the termini of strands. A variant with G4 removed from a helix (EGFP(G4Δ)) conferred significantly higher cellular fluorescence. Folding analysis revealed that EGFP(G4Δ) retained more structure upon unfolding and refolded with almost 100% efficiency but at the expense of thermodynamic stability. The EGFP(G4Δ) structure revealed that G4 deletion caused a beneficial helical registry shift resulting in a new polar interaction network, which potentially stabilizes a cis proline peptide bond and links secondary structure elements. Thus, deletion mutations and registry shifts can enhance proteins through structural rearrangements not possible by substitution mutations alone. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  5. Sequentially distant but structurally similar proteins exhibit fold specific patterns based on their biophysical properties.

    PubMed

    Rajendran, Senthilnathan; Jothi, Arunachalam

    2018-05-16

    The three-dimensional structure of a protein depends on the interactions between its amino acid residues. These interactions are in turn influenced by various biophysical properties of the amino acids. There are several examples of proteins that share the same fold but are very dissimilar at the sequence level. For proteins to share a common fold, some crucial interactions should be maintained despite insignificant sequence similarity. Since the interactions arise from the biophysical properties of the amino acids, we should be able to detect descriptive patterns for folds at such a property level. In this line, the main focus of our research is to analyze such proteins and to characterize them in terms of their biophysical properties. Protein structures with sequence similarity less than 40% were selected for ten different subfolds from three different mainfolds (according to CATH classification) and were used for this analysis. We used the normalized values of the 49 physicochemical, energetic and conformational properties of amino acids. We characterize the folds based on the average biophysical property values. We also observed a fold-specific correlational behavior of biophysical properties despite a very low sequence similarity in our data. We further trained three different binary classification models (Naive Bayes-NB, Support Vector Machines-SVM and Bayesian Generalized Linear Model-BGLM) which could discriminate mainfold based on the biophysical properties. We also show that among the three generated models, the BGLM classifier model was able to discriminate protein sequences coming under all beta category with 81.43% accuracy and all alpha, alpha-beta proteins with 83.37% accuracy. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Insights into the folding pathway of the Engrailed Homeodomain protein using replica exchange molecular dynamics simulations.

    PubMed

    Koulgi, Shruti; Sonavane, Uddhavesh; Joshi, Rajendra

    2010-11-01

    Protein folding studies were carried out by performing microsecond time scale simulations on the ultrafast/fast folding protein Engrailed Homeodomain (EnHD) from Drosophila melanogaster. It is a three-helix bundle protein consisting of 54 residues (PDB ID: 1ENH). The positions of the helices are 8-20 (Helix I), 26-36 (Helix II) and 40-53 (Helix III). The second and third helices together form a Helix-Turn-Helix (HTH) motif which belongs to the family of DNA binding proteins. The molecular dynamics (MD) simulations were performed using replica exchange molecular dynamics (REMD). REMD is a method that involves simulating a protein at different temperatures and performing exchanges at regular time intervals. These exchanges were accepted or rejected based on the Metropolis criterion. REMD was performed using the AMBER FF03 force field with the generalised Born solvation model for the temperature range 286-373 K involving 30 replicas. The extended conformation of the protein was used as the starting structure. A simulation of 600 ns per replica was performed resulting in an overall simulation time of 18 μs. The protein was seen to fold close to the native state with backbone root mean square deviation (RMSD) of 3.16 Å. In this low RMSD structure, the Helix I was partially formed with a backbone RMSD of 3.37 Å while HTH motif had an RMSD of 1.81 Å. Analysis suggests that EnHD folds to its native structure via an intermediate in which the HTH motif is formed. The secondary structure development occurs first followed by tertiary packing. The results were in good agreement with the experimental findings. Copyright © 2010 Elsevier Inc. All rights reserved.
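
    The exchange step described above can be sketched as follows. This is a minimal illustration of the Metropolis criterion for swapping two replicas, P = min(1, exp[(β_i - β_j)(E_i - E_j)]) with β = 1/(k_B T), and not the authors' AMBER setup. The geometric temperature spacing, the function names and the example energies are assumptions; the abstract gives only the range (286-373 K), the replica count (30) and the length per replica (600 ns).

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def temperature_ladder(t_min: float, t_max: float, n: int) -> list:
    """Geometrically spaced temperatures (an assumed, commonly used spacing)."""
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio ** i for i in range(n)]


def exchange_probability(e_i: float, e_j: float, t_i: float, t_j: float) -> float:
    """Metropolis acceptance probability for swapping replicas i and j.

    e_i, e_j are potential energies in kcal/mol; t_i, t_j are temperatures in K.
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_exchange(e_i, e_j, t_i, t_j, rng=random) -> bool:
    """Accept or reject a swap by comparing a uniform random number with P."""
    return rng.random() < exchange_probability(e_i, e_j, t_i, t_j)


if __name__ == "__main__":
    ladder = temperature_ladder(286.0, 373.0, 30)
    print([round(t, 1) for t in ladder[:3]], "...", round(ladder[-1], 1))
    # Aggregate sampling: 30 replicas x 600 ns = 18000 ns = 18 microseconds
    print(30 * 600 / 1000, "microseconds in total")
    # Hypothetical energies for two neighbouring replicas
    print(exchange_probability(-120.0, -100.0, ladder[0], ladder[1]))
```

    When the colder replica holds the higher energy, the swap is always accepted; otherwise it is accepted with a probability that shrinks as the temperature gap or the energy gap grows, which is why neighbouring temperatures are usually paired.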

  7. Small Scaffolds, Big Potential: Developing Miniature Proteins as Therapeutic Agents.

    PubMed

    Holub, Justin M

    2017-09-01

    Preclinical Research Miniature proteins are a class of oligopeptide characterized by their short sequence lengths and ability to adopt well-folded, three-dimensional structures. Because of their biomimetic nature and synthetic tractability, miniature proteins have been used to study a range of biochemical processes including fast protein folding, signal transduction, catalysis and molecular transport. Recently, miniature proteins have been gaining traction as potential therapeutic agents because their small size and ability to fold into defined tertiary structures facilitate their development as protein-based drugs. This research overview discusses emerging developments involving the use of miniature proteins as scaffolds to design novel therapeutics for the treatment and study of human disease. Specifically, this review will explore strategies to: (i) stabilize miniature protein tertiary structure; (ii) optimize biomolecular recognition by grafting functional epitopes onto miniature protein scaffolds; and (iii) enhance cytosolic delivery of miniature proteins through the use of cationic motifs that facilitate endosomal escape. These objectives are discussed not only to address challenges in developing effective miniature protein-based drugs, but also to highlight the tremendous potential miniature proteins hold for combating and understanding human disease. Drug Dev Res 78 : 268-282, 2017. © 2017 Wiley Periodicals, Inc.

  8. On the origins of the weak folding cooperativity of a designed ββα ultrafast protein FSD-1.

    PubMed

    Wu, Chun; Shea, Joan-Emma

    2010-11-18

    FSD-1, a designed small ultrafast folder with a ββα fold, has been actively studied in the last few years as a model system for studying protein folding mechanisms and for testing the accuracy of computational models. The suitability of this protein to describe the folding of naturally occurring α/β proteins has recently been challenged based on the observation that the melting transition is very broad, with ill-resolved baselines. Using molecular dynamics simulations with the AMBER protein force field (ff96) coupled with the implicit solvent model (IGB = 5), we shed new light on the nature of this transition and resolve the experimental controversies. We show that the melting transition corresponds to the melting of the protein as a whole, and not solely to the helix-coil transition. The breadth of the folding transition arises from the spread in the melting temperatures (from ∼325 K to ∼302 K) of the individual transitions: formation of the hydrophobic core, β-hairpin and tertiary fold, with the helix formed earlier. Our simulations initiated from an extended chain accurately predict the native structure, provide a reasonable estimate of the transition barrier height, and explicitly demonstrate the existence of multiple pathways and multiple transition states for folding. Our exhaustive sampling enables us to assess the quality of the Amber ff96/igb5 combination and reveals that while this force field can predict the correct native fold, it nonetheless overstabilizes the α-helix portion of the protein (Tm = ∼387K) as well as the denatured structures.

  9. Accelerating large-scale protein structure alignments with graphics processing units

    PubMed Central

    2012-01-01

    Background: Large-scale protein structure alignment, an indispensable tool in structural bioinformatics, poses a tremendous challenge for computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings: We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions: ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using the massively parallel computing power of GPUs. PMID:22357132

  10. Topological frustration in βα-repeat proteins: sequence diversity modulates the conserved folding mechanisms of α/β/α sandwich proteins

    PubMed Central

    Hills, Ronald D.; Kathuria, Sagar V.; Wallace, Louise A.; Day, Iain J.; Brooks, Charles L.; Matthews, C. Robert

    2010-01-01

    The thermodynamic hypothesis of Anfinsen postulates that structures and stabilities of globular proteins are determined by their amino acid sequences. Chain topology, however, is known to influence the folding reaction, in that motifs with a preponderance of local interactions typically fold more rapidly than those with a larger fraction of non-local interactions. Together, the topology and sequence can modulate the energy landscape and influence the rate at which the protein folds to the native conformation. To explore the relationship of sequence and topology in the folding of βα-repeat proteins, which are dominated by local interactions, a combined experimental and simulation analysis was performed on two members of the flavodoxin-like, α/β/α sandwich fold. Spo0F and the N-terminal receiver domain of NtrC (NT-NtrC) have similar topologies but low sequence identity, enabling a test of the effects of sequence on folding. Experimental results demonstrated that both response-regulator proteins fold via parallel channels through highly structured sub-millisecond intermediates before accessing their cis prolyl peptide bond-containing native conformations. Global analysis of the experimental results preferentially places these intermediates off the productive folding pathway. Sequence-sensitive Gō-model simulations conclude that frustration in the folding in Spo0F, corresponding to the appearance of the off-pathway intermediate, reflects competition for intra-subdomain van der Waals contacts between its N- and C-terminal subdomains. The extent of transient, premature structure appears to correlate with the number of isoleucine, leucine and valine (ILV) side-chains that form a large sequence-local cluster involving the central β-sheet and helices α2, α3 and α4. The failure to detect the off-pathway species in the simulations of NT-NtrC may reflect the reduced number of ILV side-chains in its corresponding hydrophobic cluster. The location of the hydrophobic clusters in the structure may also be related to the differing functional properties of these response regulators. Comparison with the results of previous experimental and simulation analyses on the homologous CheY argues that prematurely-folded unproductive intermediates are a common property of the βα-repeat motif. PMID:20226790

  11. The Crystal Structure of a Maxi/Mini-Ferritin Chimera Reveals Guiding Principles for the Assembly of Protein Cages

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cornell, Thomas A.; Srivastava, Yogesh; Jauch, Ralf

    Cage proteins assemble into nanoscale structures with large central cavities. They play roles, including those as virus capsids and chaperones, and have been applied to drug delivery and nanomaterials. Furthermore, protein cages have been used as model systems to understand and design protein quaternary structure. Ferritins are ubiquitous protein cages that manage iron homeostasis and oxidative damage. Two ferritin subfamilies have strongly similar tertiary structure yet distinct quaternary structure: maxi-ferritins normally assemble into 24-meric, octahedral cages with C-terminal E-helices centered around 4-fold symmetry axes, and mini-ferritins are 12-meric, tetrahedral cages with 3-fold axes defined by C-termini lacking E-domains. To understand the role E-domains play in ferritin quaternary structure, we previously designed a chimera of a maxi-ferritin E-domain fused to the C-terminus of a mini-ferritin. The chimera is a 12-mer cage midway in size between those of the maxi- and mini-ferritin. The research described herein sets out to understand (a) whether the increase in size over a typical mini-ferritin is due to a frozen state where the E-domain is flipped out of the cage and (b) whether the symmetrical preference of the E-domain in the maxi-ferritin (4-fold axis) overrules the C-terminal preference in the mini-ferritin (3-fold axis). With a 1.99 Å resolution crystal structure, we determined that the chimera assembles into a tetrahedral cage that can be nearly superimposed with the parent mini-ferritin, and that the E-domains are flipped external to the cage at the 3-fold symmetry axes.

  12. CABS-fold: Server for the de novo and consensus-based prediction of protein structure.

    PubMed

    Blaszczyk, Maciej; Jamroz, Michal; Kmiecik, Sebastian; Kolinski, Andrzej

    2013-07-01

    The CABS-fold web server provides tools for protein structure prediction from sequence only (de novo modeling) and also using alternative templates (consensus modeling). The web server is based on the CABS modeling procedures ranked in previous Critical Assessment of techniques for protein Structure Prediction competitions as one of the leading approaches for de novo and template-based modeling. In addition to template data, fragmentary distance restraints can also be incorporated into the modeling process. The web server output is a coarse-grained trajectory of generated conformations, its Jmol representation and predicted models in all-atom resolution (together with accompanying analysis). CABS-fold can be freely accessed at http://biocomp.chem.uw.edu.pl/CABSfold.

  13. CABS-fold: server for the de novo and consensus-based prediction of protein structure

    PubMed Central

    Blaszczyk, Maciej; Jamroz, Michal; Kmiecik, Sebastian; Kolinski, Andrzej

    2013-01-01

    The CABS-fold web server provides tools for protein structure prediction from sequence only (de novo modeling) and also using alternative templates (consensus modeling). The web server is based on the CABS modeling procedures ranked in previous Critical Assessment of techniques for protein Structure Prediction competitions as one of the leading approaches for de novo and template-based modeling. In addition to template data, fragmentary distance restraints can also be incorporated into the modeling process. The web server output is a coarse-grained trajectory of generated conformations, its Jmol representation and predicted models in all-atom resolution (together with accompanying analysis). CABS-fold can be freely accessed at http://biocomp.chem.uw.edu.pl/CABSfold. PMID:23748950

  14. Structure and Function of the Sterol Carrier Protein-2 N-Terminal Presequence

    PubMed Central

    Martin, Gregory G.; Hostetler, Heather A.; McIntosh, Avery L.; Tichy, Shane E.; Williams, Brad J.; Russell, David H.; Berg, Jeremy M.; Spencer, Thomas A.; Ball, Judith; Kier, Ann B.; Schroeder, Friedhelm

    2008-01-01

    Although sterol carrier protein-2 (SCP-2) is encoded as a precursor protein (proSCP-2), little is known regarding the structure and function of the 20-amino acid N-terminal presequence. As shown herein, the presequence contains significant secondary structure and alters SCP-2: (i) secondary structure (CD), (ii) tertiary structure (aqueous exposure of Trp shown by UV absorbance, fluorescence, fluorescence quenching), (iii) ligand binding site [Trp response to ligands, peptide cross-linked by photoactivatable free cholesterol (FCBP)], (iv) selectivity for interaction with anionic phospholipid-rich membranes, (v) interaction with a peroxisomal import protein [FRET studies of Pex5p(C) binding], the N-terminal presequence increased SCP-2’s affinity for Pex5p(C) by 10-fold, and (vi) intracellular targeting in living and fixed cells (confocal microscopy). Nearly 5-fold more SCP-2 than proSCP-2 colocalized with plasma membrane lipid rafts/caveolae (AF488-CTB), 2.8-fold more SCP-2 than proSCP-2 colocalized with a mitochondrial marker (Mitotracker), but nearly 2-fold less SCP-2 than proSCP-2 colocalized with peroxisomes (AF488-antibody to PMP70). These data indicate the importance of the N-terminal presequence in regulating SCP-2 structure, cholesterol localization within the ligand binding site, membrane association, and, potentially, intracellular targeting. PMID:18465878

  15. Dodging the crisis of folding proteins with knots

    NASA Astrophysics Data System (ADS)

    Sulkowska, Joanna

    2009-03-01

    Proteins with nontrivial topology, containing knots and slipknots, have the ability to fold to their native states without any additional external forces invoked. A mechanism is suggested for folding of these proteins, such as YibK and YbeA, which involves an intermediate configuration with a slipknot. It elucidates the role of topological barriers and backtracking during the folding event. It also illustrates that native contacts are sufficient to guarantee folding in around 1-2% of the simulations, and how slipknot intermediates are needed to reduce the topological bottlenecks. As expected, simulations of proteins with similar structure but with knot removed fold much more efficiently, clearly demonstrating the origin of these topological barriers. Although these studies are based on a simple coarse-grained model, they are already able to extract some of the underlying principles governing folding in such complex topologies.

  16. The hypothetical protein Atu4866 from Agrobacterium tumefaciens adopts a streptavidin-like fold

    PubMed Central

    Ai, Xuanjun; Semesi, Anthony; Yee, Adelinda; Arrowsmith, Cheryl H.; Choy, Wing-Yiu; Li, Shawn S.C.

    2008-01-01

    Atu4866 is a 79-residue conserved hypothetical protein of unknown function from Agrobacterium tumefaciens. Protein sequence alignments show that it shares ≥60% sequence identity with 20 other hypothetical proteins of bacterial origin. However, the structures and functions of these proteins remain unknown so far. To gain insight into the function of this family of proteins, we have determined the structure of Atu4866 as a target of a structural genomics project using solution NMR spectroscopy. Our results reveal that Atu4866 adopts a streptavidin-like fold featuring a β-barrel/sandwich formed by eight antiparallel β-strands. Further structural analysis identified a continuous patch of conserved residues on the surface of Atu4866 that may constitute a potential ligand-binding site. PMID:18042676

  17. Effect of temperature on the conformation of natively unfolded protein 4E-BP1 in aqueous and mixed solutions containing trifluoroethanol and hexafluoroisopropanol.

    PubMed

    Hackl, Ellen V

    2015-02-01

    Natively unfolded (intrinsically disordered) proteins have attracted growing attention due to their high abundance in nature, involvement in various signalling and regulatory pathways and direct association with many diseases. In the present work the combined effect of temperature and alcohols, trifluoroethanol (TFE) and hexafluoroisopropanol (HFIP), on the natively unfolded 4E-BP1 protein was studied to elucidate the balance between temperature-induced folding and unfolding in intrinsically disordered proteins. It was shown that elevated temperatures induce reversible partial folding of 4E-BP1 both in buffer and in the mixed solutions containing denaturants. In the mixed solutions containing TFE (HFIP) 4E-BP1 adopts a partially folded helical conformation. As the temperature increases, the initial temperature-induced protein folding is replaced by irreversible unfolding/melting only after a certain level of the protein helicity has been reached. Onset unfolding temperature decreases with TFE (HFIP) concentration in solution. It was shown that an increase in the temperature induces two divergent processes in a natively unfolded protein--hydrophobicity-driven folding and unfolding. Balance between these two processes determines thermal behaviour of a protein. The correlation between heat-induced protein unfolding and the amount of helical content in a protein is revealed. Heat-induced secondary structure formation can be a valuable test to characterise minor changes in the conformations of natively unfolded proteins as a result of site-directed mutagenesis. Mutants with an increased propensity to fold into a structured form reveal different temperature behaviour.

  18. Precursory signatures of protein folding/unfolding: From time series correlation analysis to atomistic mechanisms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hsu, P. J.; Lai, S. K., E-mail: sklai@coll.phy.ncu.edu.tw; Molecular Science and Technology Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan

    Folded conformations of proteins in thermodynamically stable states have long lifetimes. Before it folds into a stable conformation, or after unfolding from a stable conformation, the protein will generally stray from one random conformation to another leading thus to rapid fluctuations. Brief structural changes therefore occur before folding and unfolding events. These short-lived movements are easily overlooked in studies of folding/unfolding for they represent momentary excursions of the protein to explore conformations in the neighborhood of the stable conformation. The present study looks for precursory signatures of protein folding/unfolding within these rapid fluctuations through a combination of three techniques: (1) ultrafast shape recognition, (2) time series segmentation, and (3) time series correlation analysis. The first procedure measures the differences between statistical distance distributions of atoms in different conformations by calculating shape similarity indices from molecular dynamics simulation trajectories. The second procedure is used to discover the times at which the protein makes transitions from one conformation to another. Finally, we employ the third technique to exploit spatial fingerprints of the stable conformations; this procedure is to map out the sequences of changes preceding the actual folding and unfolding events, since strongly correlated atoms in different conformations are different due to bond and steric constraints. The aforementioned high-frequency fluctuations are therefore characterized by distinct correlational and structural changes that are associated with rate-limiting precursors that translate into brief segments. Guided by these technical procedures, we choose a model system, a fragment of the protein transthyretin, for identifying in this system not only the precursory signatures of transitions associated with α helix and β hairpin, but also the important role played by weaker correlations in such protein folding dynamics.

  19. Precursory signatures of protein folding/unfolding: From time series correlation analysis to atomistic mechanisms

    NASA Astrophysics Data System (ADS)

    Hsu, P. J.; Cheong, S. A.; Lai, S. K.

    2014-05-01

    Folded conformations of proteins in thermodynamically stable states have long lifetimes. Before it folds into a stable conformation, or after unfolding from a stable conformation, the protein will generally stray from one random conformation to another leading thus to rapid fluctuations. Brief structural changes therefore occur before folding and unfolding events. These short-lived movements are easily overlooked in studies of folding/unfolding for they represent momentary excursions of the protein to explore conformations in the neighborhood of the stable conformation. The present study looks for precursory signatures of protein folding/unfolding within these rapid fluctuations through a combination of three techniques: (1) ultrafast shape recognition, (2) time series segmentation, and (3) time series correlation analysis. The first procedure measures the differences between statistical distance distributions of atoms in different conformations by calculating shape similarity indices from molecular dynamics simulation trajectories. The second procedure is used to discover the times at which the protein makes transitions from one conformation to another. Finally, we employ the third technique to exploit spatial fingerprints of the stable conformations; this procedure is to map out the sequences of changes preceding the actual folding and unfolding events, since strongly correlated atoms in different conformations are different due to bond and steric constraints. The aforementioned high-frequency fluctuations are therefore characterized by distinct correlational and structural changes that are associated with rate-limiting precursors that translate into brief segments. Guided by these technical procedures, we choose a model system, a fragment of the protein transthyretin, for identifying in this system not only the precursory signatures of transitions associated with α helix and β hairpin, but also the important role played by weaker correlations in such protein folding dynamics.
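
    The first of the three techniques, ultrafast shape recognition (USR), can be sketched in a few lines. The version below follows the commonly published formulation: distances from every atom to four reference points are summarized by three moments each, giving a 12-number descriptor, and two conformations are compared with S = 1 / (1 + mean |difference|). The choice of moments (mean, standard deviation, signed cube root of the third central moment), the function names and the toy coordinates are assumptions; the authors' implementation may differ.

```python
import math


def _moments(dists):
    """Mean, standard deviation, and signed cube root of the third central moment."""
    n = len(dists)
    mean = sum(dists) / n
    var = sum((d - mean) ** 2 for d in dists) / n
    third = sum((d - mean) ** 3 for d in dists) / n
    return [mean, math.sqrt(var), math.copysign(abs(third) ** (1.0 / 3.0), third)]


def usr_descriptor(coords):
    """12-component USR descriptor for a list of (x, y, z) atom coordinates."""
    n = len(coords)
    centroid = tuple(sum(c[k] for c in coords) / n for k in range(3))
    d_ctd = [math.dist(c, centroid) for c in coords]
    closest = coords[d_ctd.index(min(d_ctd))]      # atom closest to the centroid
    farthest = coords[d_ctd.index(max(d_ctd))]     # atom farthest from the centroid
    d_fct = [math.dist(c, farthest) for c in coords]
    far_from_far = coords[d_fct.index(max(d_fct))] # atom farthest from that atom
    d_cst = [math.dist(c, closest) for c in coords]
    d_ftf = [math.dist(c, far_from_far) for c in coords]
    descriptor = []
    for dists in (d_ctd, d_cst, d_fct, d_ftf):
        descriptor.extend(_moments(dists))
    return descriptor


def usr_similarity(desc_a, desc_b):
    """Similarity index in (0, 1]; 1 means identical descriptors."""
    mean_abs_diff = sum(abs(a - b) for a, b in zip(desc_a, desc_b)) / len(desc_a)
    return 1.0 / (1.0 + mean_abs_diff)


if __name__ == "__main__":
    frame_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
    frame_b = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in frame_a]  # stretched copy
    print(usr_similarity(usr_descriptor(frame_a), usr_descriptor(frame_b)))
```

    Because the descriptor is built only from interatomic distances, it is unchanged by rigid translation or rotation, so successive trajectory frames can be compared without superposition.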

  20. Polymer Uncrossing and Knotting in Protein Folding, and Their Role in Minimal Folding Pathways

    PubMed Central

    Mohazab, Ali R.; Plotkin, Steven S.

    2013-01-01

    We introduce a method for calculating the extent to which chain non-crossing is important in the most efficient, optimal trajectories or pathways for a protein to fold. This involves recording all unphysical crossing events of a ghost chain, and calculating the minimal uncrossing cost that would have been required to avoid such events. A depth-first tree search algorithm is applied to find minimal transformations to fold , , , and knotted proteins. In all cases, the extra uncrossing/non-crossing distance is a small fraction of the total distance travelled by a ghost chain. Different structural classes may be distinguished by the amount of extra uncrossing distance, and the effectiveness of such discrimination is compared with other order parameters. It was seen that non-crossing distance over chain length provided the best discrimination between structural and kinetic classes. The scaling of non-crossing distance with chain length implies an inevitable crossover to entanglement-dominated folding mechanisms for sufficiently long chains. We further quantify the minimal folding pathways by collecting the sequence of uncrossing moves, which generally involve leg, loop, and elbow-like uncrossing moves, and rendering the collection of these moves over the unfolded ensemble as a multiple-transformation “alignment”. The consensus minimal pathway is constructed and shown schematically for representative cases of an , , and knotted protein. An overlap parameter is defined between pathways; we find that proteins have minimal overlap indicating diverse folding pathways, knotted proteins are highly constrained to follow a dominant pathway, and proteins are somewhere in between. Thus we have shown how topological chain constraints can induce dominant pathway mechanisms in protein folding. PMID:23365638

  1. Supramolecular Architectures and Mimics of Complex Natural Folds Derived from Rationally Designed alpha-Helical Protein Structures

    NASA Astrophysics Data System (ADS)

    Tavenor, Nathan Albert

    Protein-based supramolecular polymers (SMPs) are a class of biomaterials which draw inspiration from and expand upon the many examples of complex protein quaternary structures observed in nature: collagen, microtubules, viral capsids, etc. Designing synthetic supramolecular protein scaffolds both increases our understanding of natural superstructures and allows for the creation of novel materials. Similar to small-molecule SMPs, protein-based SMPs form due to self-assembly driven by intermolecular interactions between monomers, and monomer structure determines the properties of the overall material. Using protein-based monomers takes advantage of the self-assembly and highly specific molecular recognition properties encodable in polypeptide sequences to rationally design SMP architectures. The central hypothesis underlying our work is that alpha-helical coiled coils, a well-studied protein quaternary folding motif, are well-suited to SMP design through the addition of synthetic linkers at solvent-exposed sites. Through small changes in the structures of the cross-links and/or peptide sequence, we have been able to control both the nanoscale organization and the macroscopic properties of the SMPs. Changes to the linker and hydrophobic core of the peptide can be used to control polymer rigidity, stability, and dimensionality. The gaps in knowledge that this thesis sought to fill on this project were 1) the relationship between the molecular structure of the cross-linked polypeptides and the macroscopic properties of the SMPs and 2) a means of creating materials exhibiting multi-dimensional net or framework topologies. Separate from the above efforts on supramolecular architectures was work on improving backbone modification strategies for an alpha-helix in the context of a complex protein tertiary fold. 
Earlier work in our lab had successfully incorporated unnatural building blocks into every major secondary structure (beta-sheet, alpha-helix, loops and beta-turns) of a small protein with a tertiary fold. Although the tertiary fold of the native sequence was mimicked by the resulting artificial protein, the thermodynamic stability was greatly compromised. Most of this energetic penalty derived from the modifications present in the alpha-helix. The contribution within this thesis was direct comparison of several alpha-helical design strategies and establishment of the thermodynamic consequences of each.

  2. Probing the Folding-Unfolding Transition of a Thermophilic Protein, MTH1880

    PubMed Central

    Jung, Youngjin; Han, Jeongmin; Yun, Ji-Hye; Chang, Iksoo; Lee, Weontae

    2016-01-01

    The folding mechanism of typical proteins has been studied widely, while our understanding of the origin of the high stability of thermophilic proteins is still elusive. Of particular interest is how an atypical thermophilic protein with a novel fold maintains its structure and stability under extreme conditions. Folding-unfolding transitions of MTH1880, a thermophilic protein from Methanobacterium thermoautotrophicum, induced by heat, urea, and GdnHCl, were investigated using spectroscopic techniques including circular dichroism, fluorescence, and NMR, combined with molecular dynamics (MD) simulations. Our results suggest that MTH1880 undergoes a two-state N to D transition and that it is extremely stable against temperature and denaturants. The reversibility of refolding was confirmed by spectroscopic methods and size exclusion chromatography. We found that the hyper-stability of the thermophilic MTH1880 protein originates from an extensive network of both electrostatic and hydrophobic interactions coordinated by the central β-sheet. Spectroscopic measurements, in combination with computational simulations, have helped to clarify the thermodynamic and structural basis for hyper-stability of the novel thermophilic protein MTH1880. PMID:26766214

  3. Frustration Sculpts the Early Stages of Protein Folding.

    PubMed

    Di Silvio, Eva; Brunori, Maurizio; Gianni, Stefano

    2015-09-07

    The funneled energy landscape theory implies that protein structures are minimally frustrated. Yet, because of the divergent demands between folding and function, regions of frustrated patterns are present at the active site of proteins. To understand the effects of such local frustration in dictating the energy landscape of proteins, here we compare the folding mechanisms of the two alternative spliced forms of a PDZ domain (PDZ2 and PDZ2as) that share a nearly identical sequence and structure, while displaying different frustration patterns. The analysis, based on the kinetic characterization of a large number of site-directed mutants, reveals that although the late stages for folding are very robust and biased by native topology, the early stages are more malleable and dominated by local frustration. The results are briefly discussed in the context of the energy-landscape theory. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. The Est3 protein associates with yeast telomerase through an OB-fold domain

    PubMed Central

    Lee, Jaesung S.; Mandell, Edward K.; Tucey, Timothy M.; Morris, Danna K.; Lundblad, Victoria

    2009-01-01

    The Est3 protein is a small regulatory subunit of yeast telomerase which is dispensable for enzyme catalysis but essential for telomere replication in vivo. Using structure prediction combined with in vivo characterization, we show here that Est3 consists of a predicted OB (oligo-saccharide/oligo-nucleotide binding) fold. Mutagenesis of predicted surface residues was used to generate a functional map of one surface of Est3, which identified a site that mediates association with the telomerase complex. Surprisingly, the predicted OB-fold of Est3 is structurally similar to the OB-fold of the mammalian TPP1 protein, despite the fact that Est3 and TPP1, as components of telomerase and a telomere capping complex, respectively, perform functionally distinct tasks at chromosome ends. The analysis performed on Est3 may be instructive in generating comparable missense mutations on the surface of the OB-fold domain of TPP1. PMID:19172754

  5. The Folding Energy Landscape and Free Energy Excitations of Cytochrome c

    PubMed Central

    Weinkam, Patrick; Zimmermann, Jörg; Romesberg, Floyd E.

    2014-01-01

    The covalently bound heme cofactor plays a dominant role in the folding of cytochrome c. Due to the complicated inorganic chemistry of the heme, some might consider the folding of cytochrome c to be a special case that follows different principles than those used to describe folding of proteins without cofactors. Recent investigations, however, demonstrate that models which are commonly used to describe folding for many proteins work well for cytochrome c when heme is explicitly introduced and generally provide results that agree with experimental observations. We will first discuss results from simple native structure-based models. These models include attractive interactions between nonadjacent residues only if they are present in the crystal structure at pH 7. Since attractive nonnative contacts are not included in native structure-based models, their energy landscapes can be described as “perfectly funneled.” In other words, native structure-based models are energetically guided towards the native state and contain no energetic traps that would hinder folding. Energetic traps are sources of frustration which cause specific transient intermediates to be populated. Native structure-based models do include repulsion between residues due to excluded volume. Nonenergetic traps can therefore exist if the chain, which cannot cross over itself, must partially unfold in order for folding to proceed. The ability of native structure-based models to capture these types of motions is in part responsible for their successful predictions of folding pathways for many types of proteins. Models without frustration describe well the sequence of folding events for cytochrome c inferred from hydrogen exchange experiments, thereby justifying their use as a starting point. At low pH, the folding sequence of cytochrome c deviates from that at pH 7 and from those predicted from models with perfectly funneled energy landscapes.
Alternate folding pathways are a result of “chemical frustration.” This frustration arises because some regions of the protein are destabilized more than others due to the heterogeneous distribution of titratable residues that are protonated at low pH. We construct more complex models that include chemical frustration, in addition to the native structure-based terms. These more complex models only modestly perturb the energy landscape which remains overall well funneled. These perturbed models can accurately describe how alternative folding pathways are used at low pH. At alkaline pH, cytochrome c populates distinctly different structural ensembles. For instance, lysine residues are deprotonated and compete for the heme ligation site. The same models that can describe folding at low pH also predict well the structures and relative stabilities of intermediates populated at alkaline pH. PMID:20143816

  6. Synchrotron radiation circular dichroism spectroscopy study of recombinant T β4 folding

    NASA Astrophysics Data System (ADS)

    Huang, Yung-Chin; Chu, Hsueh-Liang; Chen, Peng-Jen; Chang, Chia-Ching

    Thymosin beta 4 (T β4) is a small 43-amino-acid peptide that has been shown to promote cardiac repair, wound repair, and tissue protection, and to be involved in the proliferation of blood cell precursor stem cells in bone marrow. Moreover, T β4 has been identified as a multifunctional intrinsically disordered protein, which lacks a stable tertiary structure. Owing to its small size and disordered character, T β4 degrades rapidly, and storage conditions are critical. Therefore, it is not easy to reveal the folding mechanism of native T β4. However, recombinant T β4 (rT β4), which is fused with a 5-kDa peptide at its amino terminus, is stable and functionally identical to T β4. Therefore, rT β4 can be used to study the folding mechanism. Using an over-critical folding process, stable folding intermediates of rT β4 can be obtained. Structural analysis of the folding intermediates by synchrotron radiation circular dichroism (SRCD) and fluorescence spectroscopies indicates that rT β4 is a predominantly random-coil protein and that its hydrophobic region becomes compact gradually. Moreover, rT β4 folding is a two-state transition. Thermal denaturation analysis indicates that rT β4 lacks stable tertiary structure. These results indicate that rT β4, like T β4, is an intrinsically disordered protein. Research is supported by MOST, Taiwan (MOST 103-2112-M-009-011-MY3). Corresponding author: Chia-Ching Chang; ccchang01@faculty.nctu.edu.tw.

  7. An improved method to detect correct protein folds using partial clustering.

    PubMed

    Zhou, Jianjun; Wishart, David S

    2013-01-16

    Structure-based clustering is commonly used to identify correct protein folds among candidate folds (also called decoys) generated by protein structure prediction programs. However, traditional clustering methods exhibit a poor runtime performance on large decoy sets. We hypothesized that a more efficient "partial" clustering approach in combination with an improved scoring scheme could significantly improve both the speed and performance of existing candidate selection methods. We propose a new scheme that performs rapid but incomplete clustering on protein decoys. Our method detects structurally similar decoys (measured using either Cα RMSD or GDT-TS score) and extracts representatives from them without assigning every decoy to a cluster. We integrated our new clustering strategy with several different scoring functions to assess both the performance and speed in identifying correct or near-correct folds. Experimental results on 35 Rosetta decoy sets and 40 I-TASSER decoy sets show that our method can improve the correct fold detection rate as assessed by two different quality criteria. This improvement is significantly better than two recently published clustering methods, Durandal and Calibur-lite. Speed and efficiency testing shows that our method can handle much larger decoy sets and is up to 22 times faster than Durandal and Calibur-lite. The new method, named HS-Forest, avoids the computationally expensive task of clustering every decoy, yet still allows superior correct-fold selection. Its improved speed, efficiency and decoy-selection performance should enable structure prediction researchers to work with larger decoy sets and significantly improve their ab initio structure prediction performance.
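
    The selection idea described above (pick a representative decoy that has many structurally similar neighbours) can be sketched as follows. This is only an illustration and not the HS-Forest algorithm: the function names `rmsd` and `pick_representative` are hypothetical, the coordinates are assumed to be already superposed Cα positions, and every pair is compared, which is exactly the cost a partial clustering scheme tries to avoid.

```python
import math

def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumption: the structures are already superposed; no fitting is done here.
    """
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(total / len(a))

def pick_representative(decoys, cutoff):
    """Return the index of the decoy with the most neighbours within `cutoff`.

    All pairs are compared for clarity; ties go to the lowest index.
    """
    counts = [0] * len(decoys)
    for i in range(len(decoys)):
        for j in range(i + 1, len(decoys)):
            if rmsd(decoys[i], decoys[j]) <= cutoff:
                counts[i] += 1
                counts[j] += 1
    return max(range(len(decoys)), key=counts.__getitem__)
```

    The quadratic pair loop is the part that becomes slow on large decoy sets, which is the runtime problem the abstract refers to.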

  8. An improved method to detect correct protein folds using partial clustering

    PubMed Central

    2013-01-01

    Background: Structure-based clustering is commonly used to identify correct protein folds among candidate folds (also called decoys) generated by protein structure prediction programs. However, traditional clustering methods exhibit a poor runtime performance on large decoy sets. We hypothesized that a more efficient “partial” clustering approach in combination with an improved scoring scheme could significantly improve both the speed and performance of existing candidate selection methods. Results: We propose a new scheme that performs rapid but incomplete clustering on protein decoys. Our method detects structurally similar decoys (measured using either Cα RMSD or GDT-TS score) and extracts representatives from them without assigning every decoy to a cluster. We integrated our new clustering strategy with several different scoring functions to assess both the performance and speed in identifying correct or near-correct folds. Experimental results on 35 Rosetta decoy sets and 40 I-TASSER decoy sets show that our method can improve the correct fold detection rate as assessed by two different quality criteria. This improvement is significantly better than two recently published clustering methods, Durandal and Calibur-lite. Speed and efficiency testing shows that our method can handle much larger decoy sets and is up to 22 times faster than Durandal and Calibur-lite. Conclusions: The new method, named HS-Forest, avoids the computationally expensive task of clustering every decoy, yet still allows superior correct-fold selection. Its improved speed, efficiency and decoy-selection performance should enable structure prediction researchers to work with larger decoy sets and significantly improve their ab initio structure prediction performance. PMID:23323835

  9. Electrodynamic pressure modulation of protein stability in cosolvents.

    PubMed

    Damodaran, Srinivasan

    2013-11-19

    Cosolvents affect structural stability of proteins in aqueous solutions. A clear understanding of the mechanism by which cosolvents impact protein stability is critical to understanding protein folding in a biological milieu. In this study, we investigated the Lifshitz-van der Waals dispersion interaction of seven different solutes with nine globular proteins and report that in an aqueous medium the structure-stabilizing solutes exert a positive electrodynamic pressure, whereas the structure-destabilizing solutes exert a negative electrodynamic pressure on the proteins. The net increase in the thermal denaturation temperature (ΔTd) of a protein in 1 M solution of various solutes was linearly related to the electrodynamic pressure (PvdW) between the solutes and the protein. The slope of the PvdW versus ΔTd plots was protein-dependent. However, we find a positive linear relationship (r² = 0.79) between the slope (i.e., d(ΔTd)/dPvdW) and the adiabatic compressibility (βs) of the proteins. Together, these results clearly indicate that Lifshitz dispersion forces are inextricably involved in solute-induced stabilization/destabilization of globular proteins. The positive and/or negative electrodynamic pressure generated by the solute-protein interaction across the water medium seems to be the fundamental mechanism by which solutes affect protein stability. This is at variance with the existing preferential hydration concept. The implication of these results is significant in the sense that, in addition to the hydrophobic effect that drives protein folding, the electrodynamic forces between the proteins and solutes in the biological milieu also might play a role in the folding process as well as in the stability of the folded state.

  10. Different Members of a Simple Three-Helix Bundle Protein Family Have Very Different Folding Rate Constants and Fold by Different Mechanisms

    PubMed Central

    Wensley, Beth G.; Gärtner, Martina; Choo, Wan Xian; Batey, Sarah; Clarke, Jane

    2009-01-01

    The 15th, 16th, and 17th repeats of chicken brain α-spectrin (R15, R16, and R17, respectively) are very similar in terms of structure and stability. However, R15 folds and unfolds 3 orders of magnitude faster than R16 and R17. This is unexpected. The rate-limiting transition state for R15 folding is investigated using protein engineering methods (Φ-value analysis) and compared with previously completed analyses of R16 and R17. Characterisation of many mutants suggests that all three proteins have similar complexity in the folding landscape. The early rate-limiting transition states of the three domains are similar in terms of overall structure, but there are significant differences in the patterns of Φ-values. R15 apparently folds via a nucleation–condensation mechanism, which involves concomitant folding and packing of the A- and C-helices, establishing the correct topology. R16 and R17 fold via a more framework-like mechanism, which may impede the search to find the correct packing of the helices, providing a possible explanation for the fast folding of R15. PMID:19445951
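
    For readers unfamiliar with Φ-value analysis, the quantity is conventionally the ratio of a mutation's effect on the folding transition state to its effect on overall stability. The sketch below uses that textbook definition; it is not taken from the paper, and the function name `phi_value`, the units (kcal/mol), and the default temperature are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def phi_value(kf_wt, kf_mut, ddg_eq, temperature=298.15):
    """Phi = ddG(TS-D) / ddG(N-D) for a single mutant.

    kf_wt, kf_mut: folding rate constants of wild type and mutant (same units).
    ddg_eq: loss of native-state stability on mutation, in kcal/mol
            (assumed positive for a destabilising mutation).
    """
    if ddg_eq == 0:
        raise ValueError("ddg_eq must be non-zero")
    # Change in the folding activation free energy caused by the mutation.
    ddg_ts = R_KCAL * temperature * math.log(kf_wt / kf_mut)
    return ddg_ts / ddg_eq
```

    A value near 0 suggests the mutated site is as unstructured in the transition state as in the denatured state, and a value near 1 suggests it is as structured as in the native state.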

  11. Modulation of a Pore in the Capsid of JC Polyomavirus Reduces Infectivity and Prevents Exposure of the Minor Capsid Proteins

    PubMed Central

    Nelson, Christian D. S.; Ströh, Luisa J.; Gee, Gretchen V.; O'Hara, Bethany A.; Stehle, Thilo

    2015-01-01

    ABSTRACT: JC polyomavirus (JCPyV) infection of immunocompromised individuals results in the fatal demyelinating disease progressive multifocal leukoencephalopathy (PML). The viral capsid of JCPyV is composed primarily of the major capsid protein virus protein 1 (VP1), and pentameric arrangement of VP1 monomers results in the formation of a pore at the 5-fold axis of symmetry. While the presence of this pore is conserved among polyomaviruses, its functional role in infection or assembly is unknown. Here, we investigate the role of the 5-fold pore in assembly and infection of JCPyV by generating a panel of mutant viruses containing amino acid substitutions of the residues lining this pore. Multicycle growth assays demonstrated that the fitness of all mutants was reduced compared to that of the wild-type virus. Bacterial expression of VP1 pentamers containing substitutions to residues lining the 5-fold pore did not affect pentamer assembly or prevent association with the VP2 minor capsid protein. The X-ray crystal structures of selected pore mutants contained subtle changes to the 5-fold pore, and no other changes to VP1 were observed. Pore mutant pseudoviruses were not deficient in assembly, packaging of the minor capsid proteins, or binding to cells or in transport to the host cell endoplasmic reticulum. Instead, these mutant viruses were unable to expose VP2 upon arrival to the endoplasmic reticulum, a step that is critical for infection. This study demonstrated that the 5-fold pore is an important structural feature of JCPyV and that minor modifications to this structure have significant impacts on infectious entry. IMPORTANCE: JCPyV is an important human pathogen that causes a severe neurological disease in immunocompromised individuals. While the high-resolution X-ray structure of the major capsid protein of JCPyV has been solved, the importance of a major structural feature of the capsid, the 5-fold pore, remains poorly understood.
This pore is conserved across polyomaviruses and suggests either that these viruses have limited structural plasticity in this region or that this pore is important in infection or assembly. Using a structure-guided mutational approach, we showed that modulation of this pore severely inhibits JCPyV infection. These mutants do not appear deficient in assembly or early steps in infectious entry and are instead reduced in their ability to expose a minor capsid protein in the host cell endoplasmic reticulum. Our work demonstrates that the 5-fold pore is an important structural feature for JCPyV. PMID:25609820

  12. PSS-3D1D: an improved 3D1D profile method of protein fold recognition for the annotation of twilight zone sequences.

    PubMed

    Ganesan, K; Parthasarathy, S

    2011-12-01

    Annotation of any newly determined protein sequence depends on the pairwise sequence identity with known sequences. However, for the twilight zone sequences which have only 15-25% identity, the pair-wise comparison methods are inadequate and the annotation becomes a challenging task. Such sequences can be annotated by using methods that recognize their fold. Bowie et al. described a 3D1D profile method in which the amino acid sequences that fold into a known 3D structure are identified by their compatibility to that known 3D structure. We have improved the above method by using the predicted secondary structure information and employ it for fold recognition from the twilight zone sequences. In our Protein Secondary Structure 3D1D (PSS-3D1D) method, a score (w) for the predicted secondary structure of the query sequence is included in finding the compatibility of the query sequence to the known fold 3D structures. In the benchmarks, the PSS-3D1D method shows a maximum of 21% improvement in predicting correctly the α + β class of folds from the sequences with twilight zone level of identity, when compared with the 3D1D profile method. Hence, the PSS-3D1D method could offer more clues than the 3D1D method for the annotation of twilight zone sequences. The web based PSS-3D1D method is freely available in the PredictFold server at http://bioinfo.bdu.ac.in/servers/ .

  13. Electronic polarization stabilizes tertiary structure prediction of HP-36.

    PubMed

    Duan, Li L; Zhu, Tong; Zhang, Qing G; Tang, Bo; Zhang, John Z H

    2014-04-01

    Molecular dynamics (MD) simulations with both implicit and explicit solvent models have been carried out to study the folding dynamics of HP-36 protein. Starting from the extended conformation, the secondary structure of all three helices in HP-36 was formed in about 50 ns and remained stable for the remainder of the simulation. However, the formation of the tertiary structure was difficult. Although some intermediates were close to the native structure, the overall conformation was not stable. Further analysis revealed that the large structural fluctuation of the loop and hydrophobic core regions contributed most to the instability of the structure during MD simulation. The backbone root-mean-square deviation (RMSD) of the loop and hydrophobic core regions showed strong correlation with the backbone RMSD of the whole protein. The free energy landscape indicated that the distribution of main chain torsions in loop and turn regions was far away from the native state. Starting from an intermediate structure extracted from the initial AMBER simulation, HP-36 was found to generally fold to the native state under the dynamically adjusted polarized protein-specific charge (DPPC) simulation, while the peptide did not fold into the native structure when the AMBER force field was used. The two best folded structures were extracted and taken into further simulations in water employing AMBER03 charge and DPPC for 25 ns. The results showed that introducing the polarization effect into the interaction potential could stabilize the near-native protein structure.

  14. A minimalist model protein with multiple folding funnels

    PubMed Central

    Locker, C. Rebecca; Hernandez, Rigoberto

    2001-01-01

    Kinetic and structural studies of wild-type proteins such as prions and amyloidogenic proteins provide suggestive evidence that proteins may adopt multiple long-lived states in addition to the native state. All of these states differ structurally because they lie far apart in configuration space, but their stability is not necessarily caused by cooperative (nucleation) effects. In this study, a minimalist model protein is designed to exhibit multiple long-lived states to explore the dynamics of the corresponding wild-type proteins. The minimalist protein is modeled as a 27-monomer sequence confined to a cubic lattice with three different monomer types. An order parameter—the winding index—is introduced to characterize the extent of folding. The winding index has several advantages over other commonly used order parameters like the number of native contacts. It can distinguish between enantiomers, its calculation requires less computational time than the number of native contacts, and reduced-dimensional landscapes can be developed when the native state structure is not known a priori. The results for the designed model protein prove by existence that the rugged energy landscape picture of protein folding can be generalized to include protein “misfolding” into long-lived states. PMID:11470921
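
    The abstract compares the winding index with the number of native contacts. As a point of reference, the native-contact order parameter for a cubic-lattice chain can be sketched as below. The winding index itself is not implemented because its definition is not given here; the function names and the representation of a conformation as a list of integer (x, y, z) tuples are assumptions made for illustration.

```python
def contacts(conformation):
    """Set of non-bonded nearest-neighbour pairs (i, j), i < j, on a cubic lattice.

    `conformation` is a list of integer (x, y, z) tuples, one per monomer,
    given in chain order.
    """
    site_to_index = {site: i for i, site in enumerate(conformation)}
    if len(site_to_index) != len(conformation):
        raise ValueError("chain occupies the same lattice site twice")
    found = set()
    for i, (x, y, z) in enumerate(conformation):
        # Checking only the three positive directions finds each pair once.
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            j = site_to_index.get((x + dx, y + dy, z + dz))
            if j is not None and abs(i - j) > 1:
                found.add((min(i, j), max(i, j)))
    return found

def native_contact_fraction(conformation, native):
    """Fraction of the native structure's contacts present in `conformation`."""
    native_set = contacts(native)
    if not native_set:
        raise ValueError("native structure has no non-bonded contacts")
    return len(contacts(conformation) & native_set) / len(native_set)
```

    Note that this order parameter needs the native structure as input, which is the limitation the abstract says the winding index avoids.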

  15. An overlapping region between the two terminal folding units of the outer surface protein A (OspA) controls its folding behavior.

    PubMed

    Makabe, Koki; Nakamura, Takashi; Dhar, Debanjan; Ikura, Teikichi; Koide, Shohei; Kuwajima, Kunihiro

    2018-04-27

    Although many naturally occurring proteins consist of multiple domains, most studies on protein folding to date deal with single-domain proteins or isolated domains of multi-domain proteins. Studies of multi-domain protein folding are required for further advancing our understanding of protein folding mechanisms. Borrelia outer surface protein A (OspA) is a β-rich two-domain protein, in which two globular domains are connected by a rigid and stable single-layer β-sheet. Thus, OspA is particularly suited as a model system for studying the interplays of domains in protein folding. Here, we studied the equilibria and kinetics of the urea-induced folding-unfolding reactions of OspA probed with tryptophan fluorescence and ultraviolet circular dichroism. Global analysis of the experimental data revealed compelling lines of evidence for accumulation of an on-pathway intermediate during kinetic refolding and for the identity between the kinetic intermediate and a previously described equilibrium unfolding intermediate. The results suggest that the intermediate has the fully native structure in the N-terminal domain and the single layer β-sheet, with the C-terminal domain still unfolded. The observation of the productive on-pathway folding intermediate clearly indicates substantial interactions between the two domains mediated by the single-layer β-sheet. We propose that a rigid and stable intervening region between two domains creates an overlap between two folding units and can energetically couple their folding reactions. Copyright © 2018. Published by Elsevier Ltd.

  16. Discrete-continuous duality of protein structure space.

    PubMed

    Sadreyev, Ruslan I; Kim, Bong-Hyun; Grishin, Nick V

    2009-06-01

    Recently, the nature of protein structure space has been widely discussed in the literature. The traditional discrete view of protein universe as a set of separate folds has been criticized in the light of growing evidence that almost any arrangement of secondary structures is possible and the whole protein space can be traversed through a path of similar structures. Here we argue that the discrete and continuous descriptions are not mutually exclusive, but complementary: the space is largely discrete in evolutionary sense, but continuous geometrically when purely structural similarities are quantified. Evolutionary connections are mainly confined to separate structural prototypes corresponding to folds as islands of structural stability, with few remaining traceable links between the islands. However, for a geometric similarity measure, it is usually possible to find a reasonable cutoff that yields paths connecting any two structures through intermediates.

  17. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field.

    PubMed

    Xu, Dong; Zhang, Yang

    2012-07-01

    Ab initio protein folding is one of the major unsolved problems in computational biology owing to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1-20 residues where multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced and the particular contributions to enhancing the efficiency of both force field and search engine are analyzed in detail. QUARK prediction procedure is depicted and tested on the structure modeling of 145 nonhomologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in one-third cases of short proteins up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction experiment, QUARK server outperformed the second and third best servers by 18 and 47% based on the cumulative Z-score of global distance test-total scores in the FM category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress toward the solution of the most important problem in the field. Copyright © 2012 Wiley Periodicals, Inc.
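
    Replica-exchange simulations such as the one mentioned above periodically attempt to swap configurations between replicas at different temperatures. A minimal sketch of the standard Metropolis swap criterion follows; it is the textbook rule, not QUARK's implementation, and the function name `swap_probability` and the energy units (kcal/mol) are assumptions.

```python
import math

K_B = 1.987204e-3  # Boltzmann constant in kcal/(mol*K)

def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Acceptance probability for exchanging the configurations of replicas i and j.

    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), with beta = 1 / (k_B * T).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # Branching avoids overflow in exp() when delta is large and positive.
    return 1.0 if delta >= 0 else math.exp(delta)
```

    A swap is always accepted when the colder replica currently holds the higher-energy configuration, which lets low-energy structures migrate toward low temperatures.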

  18. All-Atom Four-Body Knowledge-Based Statistical Potentials to Distinguish Native Protein Structures from Nonnative Folds

    PubMed Central

    2017-01-01

    Recent advances in understanding protein folding have benefitted from coarse-grained representations of protein structures. Empirical energy functions derived from these techniques occasionally succeed in distinguishing native structures from their corresponding ensembles of nonnative folds or decoys which display varying degrees of structural dissimilarity to the native proteins. Here we utilized atomic coordinates of single protein chains, comprising a large diverse training set, to develop and evaluate twelve all-atom four-body statistical potentials obtained by exploring alternative values for a pair of inherent parameters. Delaunay tessellation was performed on the atomic coordinates of each protein to objectively identify all quadruplets of interacting atoms, and atomic potentials were generated via statistical analysis of the data and implementation of the inverted Boltzmann principle. Our potentials were evaluated using benchmarking datasets from Decoys-‘R'-Us, and comparisons were made with twelve other physics- and knowledge-based potentials. Ranking 3rd, our best potential tied CHARMM19 and surpassed AMBER force field potentials. We illustrate how a generalized version of our potential can be used to empirically calculate binding energies for target-ligand complexes, using HIV-1 protease-inhibitor complexes for a practical application. The combined results suggest an accurate and efficient atomic four-body statistical potential for protein structure prediction and assessment. PMID:29119109
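
    The inverted Boltzmann principle mentioned above converts observed frequencies into pseudo-energies. The sketch below shows the general form only; the authors' four-body potentials, atom typing, and reference state are not reproduced, and the function name, the dictionary inputs, and the pseudocount are assumptions made for illustration.

```python
import math

def inverse_boltzmann_scores(observed_counts, reference_freqs, pseudocount=1.0):
    """Score each interaction type as -ln(f_obs / f_ref), in units of kT.

    observed_counts: dict mapping an interaction type to its observed count.
    reference_freqs: dict mapping the same types to expected frequencies.
    pseudocount: added to every count so unseen types do not give log(0).
    """
    types = list(reference_freqs)
    total = sum(observed_counts.get(t, 0) for t in types) + pseudocount * len(types)
    if total <= 0:
        raise ValueError("no observations and no pseudocount")
    scores = {}
    for t in types:
        f_obs = (observed_counts.get(t, 0) + pseudocount) / total
        if f_obs <= 0 or reference_freqs[t] <= 0:
            raise ValueError(f"cannot score type {t!r} with zero frequency")
        scores[t] = -math.log(f_obs / reference_freqs[t])
    return scores
```

    Types seen more often than the reference predicts receive negative (favourable) scores, and a structure is scored by summing the values of the interactions it contains.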

  19. Transitive homology-guided structural studies lead to discovery of Cro proteins with 40% sequence identity but different folds

    PubMed Central

    Roessler, Christian G.; Hall, Branwen M.; Anderson, William J.; Ingram, Wendy M.; Roberts, Sue A.; Montfort, William R.; Cordes, Matthew H. J.

    2008-01-01

    Proteins that share common ancestry may differ in structure and function because of divergent evolution of their amino acid sequences. For a typical diverse protein superfamily, the properties of a few scattered members are known from experiment. A satisfying picture of functional and structural evolution in relation to sequence changes, however, may require characterization of a larger, well chosen subset. Here, we employ a “stepping-stone” method, based on transitive homology, to target sequences intermediate between two related proteins with known divergent properties. We apply the approach to the question of how new protein folds can evolve from preexisting folds and, in particular, to an evolutionary change in secondary structure and oligomeric state in the Cro family of bacteriophage transcription factors, initially identified by sequence-structure comparison of distant homologs from phages P22 and λ. We report crystal structures of two Cro proteins, Xfaso 1 and Pfl 6, with sequences intermediate between those of P22 and λ. The domains show 40% sequence identity but differ by switching of α-helix to β-sheet in a C-terminal region spanning ≈25 residues. Sedimentation analysis also suggests a correlation between helix-to-sheet conversion and strengthened dimerization. PMID:18227506

  20. The crystal structure of a partial mouse Notch-1 ankyrin domain: Repeats 4 through 7 preserve an ankyrin fold

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lubman, Olga Y.; Kopan, Raphael; Waksman, Gabriel

    Folding and stability of proteins containing ankyrin repeats (ARs) are of great interest because they mediate numerous protein-protein interactions involved in a wide range of regulatory cellular processes. Notch, an ankyrin domain-containing protein, signals by converting a transcriptional repression complex into an activation complex. The Notch ANK domain is essential for Notch function and contains seven ARs. Here, we present the 2.2 Å crystal structure of ARs 4-7 from mouse Notch 1 (m1ANK). These C-terminal repeats were resistant to degradation during crystallization, and their secondary and tertiary structures are maintained in the absence of repeats 1-3. The crystallized fragment adopts a typical ankyrin fold including the poorly conserved seventh AR, as seen in the Drosophila Notch ANK domain (dANK). The structural preservation and stability of the C-terminal repeats shed new light on the mechanism of hetero-oligomeric assembly during Notch-mediated transcriptional activation.

  1. PconsFold: improved contact predictions improve protein models.

    PubMed

    Michel, Mirco; Hayat, Sikander; Skwark, Marcin J; Sander, Chris; Marks, Debora S; Elofsson, Arne

    2014-09-01

    Recently it has been shown that the quality of protein contact prediction from evolutionary information can be improved significantly if direct and indirect information is separated. Given sufficiently large protein families, the contact predictions contain sufficient information to predict the structure of many protein families. However, since the first studies, contact prediction methods have improved. Here, we ask how much the final models are improved if improved contact predictions are used. In a small benchmark of 15 proteins, we show that the TM-scores of top-ranked models are improved by on average 33% using PconsFold compared with the original version of EVfold. In a larger benchmark, we find that the quality is improved by 15-30% when using PconsC in comparison with earlier contact prediction methods. Further, using Rosetta instead of CNS does not significantly improve global model accuracy, but the chemistry of models generated with Rosetta is improved. PconsFold is a fully automated pipeline for ab initio protein structure prediction based on evolutionary information. PconsFold is based on PconsC contact prediction and uses the Rosetta folding protocol. Due to its modularity, the contact prediction tool can be easily exchanged. The source code of PconsFold is available on GitHub at https://www.github.com/ElofssonLab/pcons-fold under the MIT license. PconsC is available from http://c.pcons.net/. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  2. BeStSel: a web server for accurate protein secondary structure prediction and fold recognition from the circular dichroism spectra.

    PubMed

    Micsonai, András; Wien, Frank; Bulyáki, Éva; Kun, Judit; Moussong, Éva; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József

    2018-06-11

    Circular dichroism (CD) spectroscopy is a widely used method to study the protein secondary structure. However, for decades, the general opinion was that the correct estimation of β-sheet content is challenging because of the large spectral and structural diversity of β-sheets. Recently, we showed that the orientation and twisting of β-sheets account for the observed spectral diversity, and developed a new method to estimate accurately the secondary structure (PNAS, 112, E3095). BeStSel web server provides the Beta Structure Selection method to analyze the CD spectra recorded by conventional or synchrotron radiation CD equipment. Both normalized and measured data can be uploaded to the server either as a single spectrum or series of spectra. The originality of BeStSel is that it carries out a detailed secondary structure analysis providing information on eight secondary structure components including parallel-β structure and antiparallel β-sheets with three different groups of twist. Based on these, it predicts the protein fold down to the topology/homology level of the CATH protein fold classification. The server also provides a module to analyze the structures deposited in the PDB for BeStSel secondary structure contents in relation to Dictionary of Secondary Structure of Proteins data. The BeStSel server is freely accessible at http://bestsel.elte.hu.

  3. The Safety Dance: Biophysics of Membrane Protein Folding and Misfolding in a Cellular Context

    PubMed Central

    Schlebach, Jonathan P.; Sanders, Charles R.

    2015-01-01

    Most biological processes require the production and degradation of proteins, a task that weighs heavily on the cell. Mutations that compromise the conformational stability of proteins place both specific and general burdens on cellular protein homeostasis (proteostasis) in ways that contribute to numerous diseases. Efforts to elucidate the chain of molecular events responsible for diseases of protein folding address one of the foremost challenges in biomedical science. However, relatively little is known about the processes by which mutations prompt the misfolding of α-helical membrane proteins, which rely on an intricate network of cellular machinery to acquire and maintain their functional structures within cellular membranes. In this review, we summarize the current understanding of the physical principles that guide membrane protein biogenesis and folding in the context of mammalian cells. Additionally, we explore how pathogenic mutations that influence biogenesis may differ from those that disrupt folding and assembly, as well as how this may relate to disease mechanisms and therapeutic intervention. These perspectives indicate an imperative for the use of information from structural, cellular, and biochemical studies of membrane proteins in the design of novel therapeutics and in personalized medicine. PMID:25420508

  4. Entropic (de)stabilization of surface-bound peptides conjugated with polymers

    NASA Astrophysics Data System (ADS)

    Carmichael, Scott P.; Shell, M. Scott

    2015-12-01

    In many emerging biotechnologies, functional proteins must maintain their native structures on or near interfaces (e.g., tethered peptide arrays, protein-coated nanoparticles, and amphiphilic peptide micelles). Because the presence of a surface is known to dramatically alter the thermostability of tethered proteins, strategies to stabilize surface-bound proteins are highly sought. Here, we show that polymer conjugation allows for significant control over the secondary structure and thermostability of a model surface-tethered peptide. We use molecular dynamics simulations to examine the folding behavior of a coarse-grained helical peptide that is conjugated to polymers of various lengths and at various conjugation sites. These polymer variations reveal surprisingly diverse behavior, with some stabilizing and some destabilizing the native helical fold. We show that ideal-chain polymer entropies explain these varied effects and can quantitatively predict shifts in folding temperature. We then develop a generic theoretical model, based on ideal-chain entropies, that predicts critical lengths for conjugated polymers to effect changes in the folding of a surface-bound protein. These results may inform new design strategies for the stabilization of surface-associated proteins important for a range of technological applications.

  5. Entropic (de)stabilization of surface-bound peptides conjugated with polymers.

    PubMed

    Carmichael, Scott P; Shell, M Scott

    2015-12-28

    In many emerging biotechnologies, functional proteins must maintain their native structures on or near interfaces (e.g., tethered peptide arrays, protein-coated nanoparticles, and amphiphilic peptide micelles). Because the presence of a surface is known to dramatically alter the thermostability of tethered proteins, strategies to stabilize surface-bound proteins are highly sought. Here, we show that polymer conjugation allows for significant control over the secondary structure and thermostability of a model surface-tethered peptide. We use molecular dynamics simulations to examine the folding behavior of a coarse-grained helical peptide that is conjugated to polymers of various lengths and at various conjugation sites. These polymer variations reveal surprisingly diverse behavior, with some stabilizing and some destabilizing the native helical fold. We show that ideal-chain polymer entropies explain these varied effects and can quantitatively predict shifts in folding temperature. We then develop a generic theoretical model, based on ideal-chain entropies, that predicts critical lengths for conjugated polymers to effect changes in the folding of a surface-bound protein. These results may inform new design strategies for the stabilization of surface-associated proteins important for a range of technological applications.

  6. Automated design evolution of stereochemically randomized protein foldamers

    NASA Astrophysics Data System (ADS)

    Ranbhor, Ranjit; Kumar, Anil; Patel, Kirti; Ramakrishnan, Vibin; Durani, Susheel

    2018-05-01

    Diversification of chain stereochemistry opens up the possibilities of an ‘in principle’ increase in the design space of proteins. This huge increase in the sequence and consequent structural variation is aimed at the generation of smart materials. To diversify protein structure stereochemically, we introduced L- and D-α-amino acids as the design alphabet. With a sequence design algorithm, we explored the usage of specific variables such as chirality and the sequence of this alphabet in independent steps. With molecular dynamics, we folded stereochemically diverse homopolypeptides and evaluated their ‘fitness’ for possible design as protein-like foldamers. We propose a fitness function to select the optimal fold among 1000 structures simulated with an automated repetitive simulated annealing molecular dynamics (AR-SAMD) approach. The highly scored poly-leucine folds with sequence lengths of 24 and 30 amino acids were later sequence-optimized using a Dead End Elimination cum Monte Carlo based optimization tool. This paper demonstrates a novel approach for the de novo design of protein-like foldamers.

  7. The folding energy landscape and free energy excitations of cytochrome c.

    PubMed

    Weinkam, Patrick; Zimmermann, Jörg; Romesberg, Floyd E; Wolynes, Peter G

    2010-05-18

    The covalently bound heme cofactor plays a dominant role in the folding of cytochrome c. Because of the complicated inorganic chemistry of the heme, some might consider the folding of cytochrome c to be a special case, following principles different from those used to describe the folding of proteins without cofactors. Recent investigations, however, demonstrate that common models describing folding for many proteins work well for cytochrome c when heme is explicitly introduced, generally providing results that agree with experimental observations. In this Account, we first discuss results from simple native structure-based models. These models include attractive interactions between nonadjacent residues only if they are present in the crystal structure at pH 7. Because attractive nonnative contacts are not included in native structure-based models, their energy landscapes can be described as "perfectly funneled". In other words, native structure-based models are energetically guided towards the native state and contain no energetic traps that would hinder folding. Energetic traps are denoted sources of "frustration", which cause specific transient intermediates to be populated. Native structure-based models do, however, include repulsion between residues due to excluded volume. Nonenergetic traps can therefore exist if the chain, which cannot cross over itself, must partially unfold so that folding can proceed. The ability of native structure-based models to capture this kind of motion is partly responsible for their successful predictions of folding pathways for many types of proteins. Models without frustration describe the sequence of folding events for cytochrome c well (as inferred from hydrogen-exchange experiments), thereby justifying their use as a starting point. At low pH, the experimentally observed folding sequence of cytochrome c deviates from that at pH 7 and from models with perfectly funneled energy landscapes. 
    Here, alternate folding pathways are a result of "chemical frustration". This frustration arises because some regions of the protein are destabilized more than others due to the heterogeneous distribution of titratable residues that are protonated at low pH. Beginning with native structure-based terms, we construct more complex models by adding chemical frustration. These more complex models only modestly perturb the energy landscape, which remains, overall, well funneled. These perturbed models can accurately describe how alternative folding pathways are used at low pH. At alkaline pH, cytochrome c populates distinctly different structural ensembles. For instance, lysine residues are deprotonated and compete for the heme ligation site. The same models that can describe folding at low pH also predict well the structures and relative stabilities of intermediates populated at alkaline pH. The success of models based on funneled energy landscapes suggests that cytochrome c folding is driven primarily by native contacts. The presence of heme appears to add chemical complexity to the folding process, but it does not require fundamental modification of the general principles used to describe folding. Moreover, its added complexity provides a valuable means of probing the folding energy landscape in greater detail than is possible with simpler systems.

  8. The turn of the screw: an exercise in protein secondary structure.

    PubMed

    Pikaart, Michael

    2011-01-01

    An exercise using simple paper strips to illustrate protein helical and sheet secondary structures is presented. Drawing on the rich historical context of the use of physical models in protein biochemistry by early practitioners, in particular Linus Pauling, the purpose of this activity is to cultivate in students a hands-on, intuitive sense of protein secondary structure and to complement the common computer-based structural portrayals often used in teaching biochemistry. As students fold these paper strips into model secondary structures, they will better grasp how intramolecular hydrogen bonds form in the folding of a polypeptide into secondary structure, and how these hydrogen bonds direct the overall shape of helical and sheet structures, including the handedness of the α-helix and the difference between right- and the left-handed twist. Copyright © 2010 Wiley Periodicals, Inc.

  9. Transiently disordered tails accelerate folding of globular proteins.

    PubMed

    Mallik, Saurav; Ray, Tanaya; Kundu, Sudip

    2017-07-01

    Numerous biological proteins exhibit intrinsic disorder at their termini, which are associated with multifarious functional roles. Here, we show the surprising result that an increased percentage of terminal short transiently disordered regions with enhanced flexibility (TstDREF) is associated with accelerated folding rates of globular proteins. Evolutionary conservation of predicted disorder at TstDREFs and drastic alteration of folding rates upon point-mutations suggest critical regulatory role(s) of TstDREFs in shaping the folding kinetics. TstDREFs are associated with long-range intramolecular interactions, and the percentage of native secondary structural elements physically contacted by TstDREFs exhibits another surprising positive correlation with folding kinetics. These results allow us to infer probable molecular mechanisms behind the TstDREF-mediated regulation of folding kinetics that challenge protein biochemists to assess by direct experimental testing. © 2017 Federation of European Biochemical Societies.

  10. Protein folding on Biosensor tips: Folding of Maltodextrin glucosidase monitored by its interactions with GroEL

    PubMed Central

    Pastor, Ashutosh; Singh, Amit K.; Fisher, Mark T.; Chaudhuri, Tapan K.

    2016-01-01

    Protein folding has been extensively studied for the past four decades by employing solution-based experiments such as solubility, enzymatic activity, secondary structure analysis, and analytical methods like FRET, NMR and HD exchange. However, for rapid analysis of the folding process, solution-based approaches are often plagued with aggregation side reactions resulting in poor yields. In this work we demonstrate that a Bio-Layer Interferometry (BLI) chaperonin detection system can be potentially applied to identify superior refolding conditions for denatured proteins. The degree of immobilized protein folding as a function of time can be detected by monitoring the binding of the high-affinity nucleotide-free form of the chaperonin GroEL. GroEL preferentially interacts with proteins that have hydrophobic surfaces exposed in their unfolded or partially folded form, so a decrease in GroEL binding can be correlated with burial of hydrophobic surfaces as folding progresses. The magnitude of GroEL binding to the protein immobilized on the BLI biosensor inversely reflects the extent of protein folding and hydrophobic residue burial. We demonstrate conditions where accelerated folding can be observed for the aggregation-prone protein Maltodextrin glucosidase (MalZ). Superior immobilized folding conditions identified on the BLI biosensor surface were reproduced on Ni-NTA sepharose bead surfaces and resulted in significant improvement in folding yields of released MalZ (measured by enzymatic activity) compared to bulk refolding conditions in solution. PMID:27367928

  11. An ensemble approach to protein fold classification by integration of template-based assignment and support vector machine classifier.

    PubMed

    Xia, Jiaqi; Peng, Zhenling; Qi, Dawei; Mu, Hongbo; Yang, Jianyi

    2017-03-15

    Protein fold classification is a critical step in protein structure prediction. There are two possible ways to classify protein folds. One is through template-based fold assignment and the other is ab-initio prediction using machine learning algorithms. Combining both solutions to improve the prediction accuracy has not been explored before. We developed two algorithms, HH-fold and SVM-fold, for protein fold classification. HH-fold is a template-based fold assignment algorithm using the HHsearch program. SVM-fold is a support vector machine-based ab-initio classification algorithm, in which a comprehensive set of features is extracted from three complementary sequence profiles. These two algorithms are then combined, resulting in the ensemble approach TA-fold. We performed a comprehensive assessment of the proposed methods by comparing with ab-initio methods and template-based threading methods on six benchmark datasets. An accuracy of 0.799 was achieved by TA-fold on the DD dataset that consists of proteins from 27 folds. This represents an improvement of 5.4-11.7% over ab-initio methods. After updating this dataset to include more proteins in the same folds, the accuracy increased to 0.971. In addition, TA-fold achieved >0.9 accuracy on a large dataset consisting of 6451 proteins from 184 folds. Experiments on the LE dataset show that TA-fold consistently outperforms other threading methods at the family, superfamily and fold levels. The success of TA-fold is attributed to the combination of template-based fold assignment and ab-initio classification using features from complementary sequence profiles that contain rich evolution information. Availability: http://yanglab.nankai.edu.cn/TA-fold/. Contact: yangjy@nankai.edu.cn or mhb-506@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved.

  12. Fast large-scale clustering of protein structures using Gauss integrals.

    PubMed

    Harder, Tim; Borg, Mikael; Boomsma, Wouter; Røgen, Peter; Hamelryck, Thomas

    2012-02-15

    Clustering protein structures is an important task in structural bioinformatics. De novo structure prediction, for example, often involves a clustering step for finding the best prediction. Other applications include assigning proteins to fold families and analyzing molecular dynamics trajectories. We present Pleiades, a novel approach to clustering protein structures with a rigorous mathematical underpinning. The method approximates clustering based on the root mean square deviation by first mapping structures to Gauss integral vectors--which were introduced by Røgen and co-workers--and subsequently performing K-means clustering. Compared to current methods, Pleiades dramatically improves on the time needed to perform clustering, and can cluster a significantly larger number of structures, while providing state-of-the-art results. The number of low energy structures generated in a typical folding study, which is in the order of 50,000 structures, can be clustered within seconds to minutes.
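
    The second stage described in this abstract, K-means clustering of fixed-length vectors, can be sketched with Lloyd's algorithm in plain Python. The sketch assumes each structure has already been mapped to a numeric vector; computing Gauss integral vectors is not shown, and the function names, random initialisation, and iteration cap are illustrative assumptions rather than details of Pleiades.

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def k_means(vectors, k, iterations=50, seed=0):
    """Cluster vectors into k groups with Lloyd's algorithm.

    Returns (labels, centroids); labels[i] is the cluster index of vectors[i].
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

    Because each pass only compares vectors with k centroids, the cost grows linearly with the number of structures, which is consistent with the abstract's point about avoiding all-against-all RMSD comparisons.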

  13. Use of pressure in reversed-phase liquid chromatography to study protein conformational changes by differential deuterium exchange.

    PubMed

    Makarov, Alexey A; Schafer, Wes A; Helmy, Roy

    2015-02-17

    The market of protein therapeutics is exploding, and characterization methods for proteins are being further developed to understand and explore conformational structures with regards to function and activity. There are several spectroscopic techniques that allow for analyzing protein secondary structure in solution. However, a majority of these techniques need to use purified protein, concentrated enough in the solution to produce a relevant spectrum. In this study, we describe a novel approach which uses ultrahigh pressure liquid chromatography (UHPLC) coupled with mass-spectrometry (MS) to explore compressibility of the secondary structure of proteins under increasing pressure detected by hydrogen-deuterium exchange (HDX). Several model proteins were used for these studies. The studies were conducted with UHPLC in isocratic mode at constant flow rate and temperature. The pressure was modified by a backpressure regulator up to about 1200 bar. It was found that the increase of retention factors upon pressure increase, at constant flow rate and temperature, was based on reduction of the proteins' molecular molar volume. The change in the proteins' molecular molar volume was caused by changes in protein folding, as was revealed by differential deuterium exchange. The degree of protein folding under certain UHPLC conditions can be controlled by pressure, at constant temperature and flow rate. By modifying pressure during UHPLC separation, it was possible to achieve changes in protein folding, which were manifested as changes in the number of labile protons exchanged to deuterons, or vice versa. Moreover, it was demonstrated with bovine insulin that a small difference in the number of protons exchanged to deuterons (based on protein folding under pressure) could be observed between batches obtained from different sources. 
    The use of HDX during UHPLC separation allowed one to examine protein folding by pressure at constant flow rate and temperature in a mixture of sample solution with minimal amounts of sample used for analysis.

  14. The Histone Database: an integrated resource for histones and histone fold-containing proteins

    PubMed Central

    Mariño-Ramírez, Leonardo; Levine, Kevin M.; Morales, Mario; Zhang, Suiyuan; Moreland, R. Travis; Baxevanis, Andreas D.; Landsman, David

    2011-01-01

    Eukaryotic chromatin is composed of DNA and protein components—core histones—that act to compactly pack the DNA into nucleosomes, the fundamental building blocks of chromatin. These nucleosomes are connected to adjacent nucleosomes by linker histones. Nucleosomes are highly dynamic and, through various core histone post-translational modifications and incorporation of diverse histone variants, can serve as epigenetic marks to control processes such as gene expression and recombination. The Histone Sequence Database is a curated collection of sequences and structures of histones and non-histone proteins containing histone folds, assembled from major public databases. Here, we report a substantial increase in the number of sequences and taxonomic coverage for histone and histone fold-containing proteins available in the database. Additionally, the database now contains an expanded dataset that includes archaeal histone sequences. The database also provides comprehensive multiple sequence alignments for each of the four core histones (H2A, H2B, H3 and H4), the linker histones (H1/H5) and the archaeal histones. The database also includes current information on solved histone fold-containing structures. The Histone Sequence Database is an inclusive resource for the analysis of chromatin structure and function focused on histones and histone fold-containing proteins. Database URL: The Histone Sequence Database is freely available and can be accessed at http://research.nhgri.nih.gov/histones/. PMID:22025671

  15. Structure and activity of the Pseudomonas aeruginosa hotdog-fold thioesterases PA5202 and PA2801

    PubMed Central

    Gonzalez, Claudio F.; Tchigvintsev, Anatoli; Brown, Greg; Flick, Robert; Evdokimova, Elena; Xu, Xiaohui; Osipiuk, Jerzy; Cuff, Marianne E.; Lynch, Susan; Joachimiak, Andrzej; Savchenko, Alexei; Yakunin, Alexander F.

    2013-01-01

    The hotdog fold is one of the basic protein folds widely present in bacteria, archaea, and eukaryotes. Many of these proteins exhibit thioesterase activity against fatty acyl-CoAs and play important roles in lipid metabolism, cellular signaling, and degradation of xenobiotics. The genome of the opportunistic pathogen Pseudomonas aeruginosa contains over 20 genes encoding predicted hotdog-fold proteins, none of which have been experimentally characterized. We have found that two P. aeruginosa hotdog proteins display high thioesterase activity against 3-hydroxy-3-methylglutaryl-CoA and glutaryl-CoA (PA5202), and octanoyl-CoA (PA2801). Crystal structures of these proteins were solved (1.70 and 1.75 Å) and revealed a hotdog fold with a potential catalytic carboxylate residue located on the long alpha helix (Asp57 in PA5202 and Glu35 in PA2801). Alanine replacement mutagenesis of PA5202 identified four residues (Asn42, Arg43, Asp57, and Thr76), which are critical for activity and are located in the active site. A P. aeruginosa PA5202 deletion strain showed an increased secretion of the antimicrobial pigment pyocyanine and an increased expression of genes involved in pyocyanin biosynthesis suggesting a functional link between the PA5202 activity and pyocyanin production. Thus, the P. aeruginosa hotdog thioesterases PA5202 and PA2801 have similar structures, but exhibit different substrate preferences and functions. PMID:22439787

  16. Retarded protein folding of deficient human α1-antitrypsin D256V and L41P variants

    PubMed Central

    Jung, Chan-Hun; Na, Yu-Ran; Im, Hana

    2004-01-01

    α1-Antitrypsin is the most abundant protease inhibitor in plasma and is the archetype of the serine protease inhibitor superfamily. Genetic variants of human α1-antitrypsin are associated with early-onset emphysema and liver cirrhosis. However, the detailed molecular mechanism for the pathogenicity of most variant α1-antitrypsin molecules is not known. Here we examined the structural basis of a dozen deficient α1-antitrypsin variants. Unlike most α1-antitrypsin variants, which were unstable, D256V and L41P variants exhibited extremely retarded protein folding as compared with the wild-type molecule. Once folded, however, the stability and inhibitory activity of these variant proteins were comparable to those of the wild-type molecule. Retarded protein folding may promote protein aggregation by allowing the accumulation of aggregation-prone folding intermediates. Repeated observations of retarded protein folding indicate that it is an important mechanism causing α1-antitrypsin deficiency by variant molecules, which have to fold into the metastable native form to be functional. PMID:14767073

  17. Anomalous diffusion in neutral evolution of model proteins.

    PubMed

    Nelson, Erik D; Grishin, Nick V

    2015-06-01

    Protein evolution is frequently explored using minimalist polymer models; however, little attention has been given to the problem of structural drift, or diffusion. Here, we study neutral evolution of small protein motifs using an off-lattice heteropolymer model in which individual monomers interact as low-resolution amino acids. In contrast to most earlier models, both the length and folded structure of the polymers are permitted to change. To describe structural change, we compute the mean-square distance (MSD) between monomers in homologous folds separated by n neutral mutations. We find that structural change is episodic, and, averaged over lineages (for example, those extending from a single sequence), exhibits a power-law dependence on n. We show that this exponent depends on the alignment method used, and we analyze the distribution of waiting times between neutral mutations. The latter are more dispersed than for models required to maintain a specific fold, but exhibit a similar power-law tail.
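
    A power-law dependence such as MSD(n) ≈ A·n^α is commonly estimated by a straight-line fit in log-log space. The sketch below shows that generic procedure only. It is not the authors' analysis, and the function name and the ordinary least-squares choice are assumptions.

```python
import math


def power_law_exponent(ns, msd):
    """Least-squares fit of log(msd) = log(A) + alpha * log(n).

    ns: numbers of neutral mutations (all > 0); msd: matching mean-square distances (> 0).
    Returns (alpha, A).
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(m) for m in msd]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    prefactor = math.exp(mean_y - alpha * mean_x)
    return alpha, prefactor
```

    An exponent of 1 corresponds to ordinary diffusion, while values below or above 1 indicate sub- or super-diffusive drift.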

  18. Anomalous diffusion in neutral evolution of model proteins

    NASA Astrophysics Data System (ADS)

    Nelson, Erik D.; Grishin, Nick V.

    2015-06-01

    Protein evolution is frequently explored using minimalist polymer models; however, little attention has been given to the problem of structural drift, or diffusion. Here, we study neutral evolution of small protein motifs using an off-lattice heteropolymer model in which individual monomers interact as low-resolution amino acids. In contrast to most earlier models, both the length and folded structure of the polymers are permitted to change. To describe structural change, we compute the mean-square distance (MSD) between monomers in homologous folds separated by n neutral mutations. We find that structural change is episodic, and, averaged over lineages (for example, those extending from a single sequence), exhibits a power-law dependence on n. We show that this exponent depends on the alignment method used, and we analyze the distribution of waiting times between neutral mutations. The latter are more dispersed than for models required to maintain a specific fold, but exhibit a similar power-law tail.

  19. In silico insights into protein-protein interactions and folding dynamics of the saposin-like domain of Solanum tuberosum aspartic protease.

    PubMed

    De Moura, Dref C; Bryksa, Brian C; Yada, Rickey Y

    2014-01-01

    The plant-specific insert is an approximately 100-residue domain found exclusively within the C-terminal lobe of some plant aspartic proteases. Structurally, this domain is a member of the saposin-like protein family, and is involved in plant pathogen defense as well as vacuolar targeting of the parent protease molecule. Similar to other members of the saposin-like protein family, most notably saposins A and C, the recently resolved crystal structure of potato (Solanum tuberosum) plant-specific insert has been shown to exist in a substrate-bound open conformation in which the plant-specific insert oligomerizes to form homodimers. In addition to the open structure, a closed conformation also exists having the classic saposin fold of the saposin-like protein family as observed in the crystal structure of barley (Hordeum vulgare L.) plant-specific insert. In the present study, the mechanisms of tertiary and quaternary conformation changes of potato plant-specific insert were investigated in silico as a function of pH. Umbrella sampling and determination of the free energy change of dissociation of the plant-specific insert homodimer revealed that increasing the pH of the system to near physiological levels reduced the free energy barrier to dissociation. Furthermore, principal component analysis was used to characterize conformational changes at both acidic and neutral pH. The results indicated that the plant-specific insert may adopt a tertiary structure similar to the characteristic saposin fold and suggest a potential new structural motif among saposin-like proteins. To our knowledge, this acidified PSI structure presents the first example of an alternative saposin-fold motif for any member of the large and diverse SAPLIP family.

  20. In Silico Insights into Protein-Protein Interactions and Folding Dynamics of the Saposin-Like Domain of Solanum tuberosum Aspartic Protease

    PubMed Central

    De Moura, Dref C.; Bryksa, Brian C.; Yada, Rickey Y.

    2014-01-01

    The plant-specific insert is an approximately 100-residue domain found exclusively within the C-terminal lobe of some plant aspartic proteases. Structurally, this domain is a member of the saposin-like protein family, and is involved in plant pathogen defense as well as vacuolar targeting of the parent protease molecule. Similar to other members of the saposin-like protein family, most notably saposins A and C, the recently resolved crystal structure of potato (Solanum tuberosum) plant-specific insert has been shown to exist in a substrate-bound open conformation in which the plant-specific insert oligomerizes to form homodimers. In addition to the open structure, a closed conformation also exists having the classic saposin fold of the saposin-like protein family as observed in the crystal structure of barley (Hordeum vulgare L.) plant-specific insert. In the present study, the mechanisms of tertiary and quaternary conformation changes of potato plant-specific insert were investigated in silico as a function of pH. Umbrella sampling and determination of the free energy change of dissociation of the plant-specific insert homodimer revealed that increasing the pH of the system to near physiological levels reduced the free energy barrier to dissociation. Furthermore, principal component analysis was used to characterize conformational changes at both acidic and neutral pH. The results indicated that the plant-specific insert may adopt a tertiary structure similar to the characteristic saposin fold and suggest a potential new structural motif among saposin-like proteins. To our knowledge, this acidified PSI structure presents the first example of an alternative saposin-fold motif for any member of the large and diverse SAPLIP family. PMID:25188221

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    FLANAGAN,J.M.; BEWLEY,M.C.

    It is generally accepted that the information necessary to specify the native, functional, three-dimensional structure of a protein is encoded entirely within its amino acid sequence; however, efficient reversible folding and unfolding is observed only with a subset of small single-domain proteins. Refolding experiments often lead to the formation of kinetically-trapped, misfolded species that aggregate, even in dilute solution. In the cellular environment, the barriers to efficient protein folding and maintenance of native structure are even larger due to the nature of this process. First, nascent polypeptides must fold in an extremely crowded environment where the concentration of macromolecules approaches 300-400 mg/mL and on average, each ribosome is within its own diameter of another ribosome (1-3). These conditions of severe molecular crowding, coupled with high concentrations of nascent polypeptide chains, favor nonspecific aggregation over productive folding (3). Second, folding of newly-translated polypeptides occurs in the context of their vectorial synthesis process. Amino acids are added to a growing nascent chain at the rate of ~5 residues per second, which means that for a 300 residue protein its N-terminus will be exposed to the cytosol ~1 min before its C-terminus and be free to begin the folding process. However, because protein folding is highly cooperative, the nascent polypeptide cannot reach its native state until a complete folding domain (50-250 residues) has emerged from the ribosome. Thus, for a single-domain protein, the final steps in folding are only completed post-translationally since ~40 residues of a nascent chain are sequestered within the exit channel of the ribosome and are not available for folding (4).
    A direct consequence of this limitation in cellular folding is that during translation incomplete domains will exist in partially-folded states that tend to expose hydrophobic residues that are prone to aggregation and/or misfolding. Thus it is not surprising that, in cells, the protein folding process is error prone and organisms have evolved "editing" or quality control (QC) systems to assist in the folding, maintenance and, when necessary, selective removal of damaged proteins. In fact, there is growing evidence that failure of these QC-systems contributes to a number of disease states (5-8). This chapter describes our current understanding of the nature and mechanisms of the protein quality control systems in the cytosol of bacteria. Parallel systems are exploited in the cytosol and mitochondria of eukaryotes to prevent the accumulation of misfolded proteins.
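
    The timing argument in this abstract is simple arithmetic, sketched below under the assumption that the garbled rate reads as roughly 5 residues per second. The function names are hypothetical, and the second function is one possible reading of the ~40-residue exit-channel remark rather than a calculation given in the source.

```python
def n_terminus_lead_time(n_residues, rate_per_s=5.0):
    """Seconds between synthesis of the first and the last residue."""
    return n_residues / rate_per_s

def time_until_domain_emerges(domain_len, tunnel_len=40, rate_per_s=5.0):
    """Seconds until a domain of domain_len residues has fully left the
    ribosome, assuming tunnel_len residues stay hidden in the exit channel."""
    return (domain_len + tunnel_len) / rate_per_s

lead = n_terminus_lead_time(300)          # 300 residues at ~5 residues/s
emerge = time_until_domain_emerges(250)   # largest domain size quoted
print(f"N-terminus lead time: {lead:.0f} s")
print(f"250-residue domain fully emerged after: {emerge:.0f} s")
```

    The first value reproduces the "~1 min" figure quoted for a 300 residue protein.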

  2. Protein Aggregation/Folding: The Role of Deterministic Singularities of Sequence Hydrophobicity as Determined by Nonlinear Signal Analysis of Acylphosphatase and Aβ(1–40)

    PubMed Central

    Zbilut, Joseph P.; Colosimo, Alfredo; Conti, Filippo; Colafranceschi, Mauro; Manetti, Cesare; Valerio, MariaCristina; Webber, Charles L.; Giuliani, Alessandro

    2003-01-01

    The problem of protein folding vs. aggregation was investigated in acylphosphatase and the amyloid protein Aβ(1–40) by means of nonlinear signal analysis of their chain hydrophobicity. Numerical descriptors of recurrence patterns provided the basis for statistical evaluation of folding/aggregation distinctive features. Static and dynamic approaches were used to elucidate conditions coincident with folding vs. aggregation using comparisons with known protein secondary structure classifications, site-directed mutagenesis studies of acylphosphatase, and molecular dynamics simulations of amyloid protein, Aβ(1–40). The results suggest that a feature derived from principal component space characterized by the smoothness of singular, deterministic hydrophobicity patches plays a significant role in the conditions governing protein aggregation. PMID:14645049

  3. Neuroligin Trafficking Deficiencies Arising from Mutations in the α/β-Hydrolase Fold Protein Family*

    PubMed Central

    De Jaco, Antonella; Lin, Michael Z.; Dubi, Noga; Comoletti, Davide; Miller, Meghan T.; Camp, Shelley; Ellisman, Mark; Butko, Margaret T.; Tsien, Roger Y.; Taylor, Palmer

    2010-01-01

    Despite great functional diversity, characterization of the α/β-hydrolase fold proteins that encompass a superfamily of hydrolases, heterophilic adhesion proteins, and chaperone domains reveals a common structural motif. By incorporating the R451C mutation found in neuroligin (NLGN) and associated with autism and the thyroglobulin G2320R (G221R in NLGN) mutation responsible for congenital hypothyroidism into NLGN3, we show that mutations in the α/β-hydrolase fold domain influence folding and biosynthetic processing of neuroligin3 as determined by in vitro susceptibility to proteases, glycosylation processing, turnover, and processing rates. We also show altered interactions of the mutant proteins with chaperones in the endoplasmic reticulum and arrest of transport along the secretory pathway with diversion to the proteasome. Time-controlled expression of a fluorescently tagged neuroligin in hippocampal neurons shows that these mutations compromise neuronal trafficking of the protein, with the R451C mutation reducing and the G221R mutation virtually abolishing the export of NLGN3 from the soma to the dendritic spines. Although the R451C mutation causes a local folding defect, the G221R mutation appears responsible for more global misfolding of the protein, reflecting their sequence positions in the structure of the protein. Our results suggest that disease-related mutations in the α/β-hydrolase fold domain share common trafficking deficiencies yet lead to discrete congenital disorders of differing severity in the endocrine and nervous systems. PMID:20615874

  4. Neuroligin trafficking deficiencies arising from mutations in the alpha/beta-hydrolase fold protein family.

    PubMed

    De Jaco, Antonella; Lin, Michael Z; Dubi, Noga; Comoletti, Davide; Miller, Meghan T; Camp, Shelley; Ellisman, Mark; Butko, Margaret T; Tsien, Roger Y; Taylor, Palmer

    2010-09-10

    Despite great functional diversity, characterization of the alpha/beta-hydrolase fold proteins that encompass a superfamily of hydrolases, heterophilic adhesion proteins, and chaperone domains reveals a common structural motif. By incorporating the R451C mutation found in neuroligin (NLGN) and associated with autism and the thyroglobulin G2320R (G221R in NLGN) mutation responsible for congenital hypothyroidism into NLGN3, we show that mutations in the alpha/beta-hydrolase fold domain influence folding and biosynthetic processing of neuroligin3 as determined by in vitro susceptibility to proteases, glycosylation processing, turnover, and processing rates. We also show altered interactions of the mutant proteins with chaperones in the endoplasmic reticulum and arrest of transport along the secretory pathway with diversion to the proteasome. Time-controlled expression of a fluorescently tagged neuroligin in hippocampal neurons shows that these mutations compromise neuronal trafficking of the protein, with the R451C mutation reducing and the G221R mutation virtually abolishing the export of NLGN3 from the soma to the dendritic spines. Although the R451C mutation causes a local folding defect, the G221R mutation appears responsible for more global misfolding of the protein, reflecting their sequence positions in the structure of the protein. Our results suggest that disease-related mutations in the alpha/beta-hydrolase fold domain share common trafficking deficiencies yet lead to discrete congenital disorders of differing severity in the endocrine and nervous systems.

  5. Crystal structure of P58(IPK) TPR fragment reveals the mechanism for its molecular chaperone activity in UPR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tao, Jiahui; Petrova, Kseniya; Ron, David

    2010-05-25

    P58(IPK) might function as an endoplasmic reticulum molecular chaperone to maintain protein folding homeostasis during unfolded protein responses. P58(IPK) contains nine tetratricopeptide repeat (TPR) motifs and a C-terminal J-domain within its primary sequence. To investigate the mechanism by which P58(IPK) functions to promote protein folding within the endoplasmic reticulum, we have determined the crystal structure of P58(IPK) TPR fragment to 2.5 Å resolution by the SAD method. The crystal structure of P58(IPK) revealed three domains (I-III) with similar folds and each domain contains three TPR motifs. An ELISA assay indicated that P58(IPK) acts as a molecular chaperone by interacting with misfolded proteins such as luciferase and rhodanese. The P58(IPK) structure reveals a conserved hydrophobic patch located in domain I that might be involved in binding the misfolded polypeptides. Structure-based mutagenesis for the conserved hydrophobic residues located in domain I significantly reduced the molecular chaperone activity of P58(IPK).

  6. Structural anatomy of telomere OB proteins.

    PubMed

    Horvath, Martin P

    2011-10-01

    Telomere DNA-binding proteins protect the ends of chromosomes in eukaryotes. A subset of these proteins are constructed with one or more OB folds and bind with G+T-rich single-stranded DNA found at the extreme termini. The resulting DNA-OB protein complex interacts with other telomere components to coordinate critical telomere functions of DNA protection and DNA synthesis. While the first crystal and NMR structures readily explained protection of telomere ends, the picture of how single-stranded DNA becomes available to serve as primer and template for synthesis of new telomere DNA is only recently coming into focus. New structures of telomere OB fold proteins alongside insights from genetic and biochemical experiments have made significant contributions towards understanding how protein-binding OB proteins collaborate with DNA-binding OB proteins to recruit telomerase and DNA polymerase for telomere homeostasis. This review surveys telomere OB protein structures alongside highly comparable structures derived from replication protein A (RPA) components, with the goal of providing a molecular context for understanding telomere OB protein evolution and mechanism of action in protection and synthesis of telomere DNA.

  7. Structural anatomy of telomere OB proteins

    PubMed Central

    Horvath, Martin P.

    2015-01-01

    Telomere DNA-binding proteins protect the ends of chromosomes in eukaryotes. A subset of these proteins are constructed with one or more OB folds and bind with G+T-rich single-stranded DNA found at the extreme termini. The resulting DNA-OB protein complex interacts with other telomere components to coordinate critical telomere functions of DNA protection and DNA synthesis. While the first crystal and NMR structures readily explained protection of telomere ends, the picture of how single-stranded DNA becomes available to serve as primer and template for synthesis of new telomere DNA is only recently coming into focus. New structures of telomere OB fold proteins alongside insights from genetic and biochemical experiments have made significant contributions towards understanding how protein-binding OB proteins collaborate with DNA-binding OB proteins to recruit telomerase and DNA polymerase for telomere homeostasis. This review surveys telomere OB protein structures alongside highly comparable structures derived from replication protein A (RPA) components, with the goal of providing a molecular context for understanding telomere OB protein evolution and mechanism of action in protection and synthesis of telomere DNA. PMID:21950380

  8. Using the Computer Game "FoldIt" to Entice Students to Explore External Representations of Protein Structure in a Biochemistry Course for Nonmajors

    ERIC Educational Resources Information Center

    Farley, Peter C.

    2013-01-01

    This article describes a novel approach to teaching novice Biochemistry students visual literacy skills and understanding of some aspects of protein structure using the internet resource FoldIt and a worksheet based on selected Introductory Puzzles from this computer game. In responding to a questionnaire, students indicated that they (94%)…

  9. The parallel universe of RNA folding.

    PubMed

    Batey, R T; Doudna, J A

    1998-05-01

    How do large RNA molecules find their active conformations among a universe of possible structures? Two recent studies reveal that RNA folding is a rapid and ordered process, with surprising similarities to protein folding mechanisms.

  10. Protein Folding and Structure Prediction from the Ground Up: The Atomistic Associative Memory, Water Mediated, Structure and Energy Model.

    PubMed

    Chen, Mingchen; Lin, Xingcheng; Zheng, Weihua; Onuchic, José N; Wolynes, Peter G

    2016-08-25

    The associative memory, water mediated, structure and energy model (AWSEM) is a coarse-grained force field with transferable tertiary interactions that incorporates local in sequence energetic biases using bioinformatically derived structural information about peptide fragments with locally similar sequences that we call memories. The memory information from the protein data bank (PDB) database guides proper protein folding. The structural information about available sequences in the database varies in quality and can sometimes lead to frustrated free energy landscapes locally. One way out of this difficulty is to construct the input fragment memory information from all-atom simulations of portions of the complete polypeptide chain. In this paper, we investigate this approach first put forward by Kwac and Wolynes in a more complete way by studying the structure prediction capabilities of this approach for six α-helical proteins. This scheme which we call the atomistic associative memory, water mediated, structure and energy model (AAWSEM) amounts to an ab initio protein structure prediction method that starts from the ground up without using bioinformatic input. The free energy profiles from AAWSEM show that atomistic fragment memories are sufficient to guide the correct folding when tertiary forces are included. AAWSEM combines the efficiency of coarse-grained simulations on the full protein level with the local structural accuracy achievable from all-atom simulations of only parts of a large protein. The results suggest that a hybrid use of atomistic fragment memory and database memory in structural predictions may well be optimal for many practical applications.

  11. The Folding of de Novo Designed Protein DS119 via Molecular Dynamics Simulations.

    PubMed

    Wang, Moye; Hu, Jie; Zhang, Zhuqing

    2016-04-26

    As they are not subjected to natural selection process, de novo designed proteins usually fold in a manner different from natural proteins. Recently, a de novo designed mini-protein DS119, with a βαβ motif and 36 amino acids, has folded unusually slowly in experiments, and transient dimers have been detected in the folding process. Here, by means of all-atom replica exchange molecular dynamics (REMD) simulations, several comparably stable intermediate states were observed on the folding free-energy landscape of DS119. Conventional molecular dynamics (CMD) simulations showed that when two unfolded DS119 proteins bound together, most binding sites of dimeric aggregates were located at the N-terminal segment, especially residues 5-10, which were supposed to form β-sheet with its own C-terminal segment. Furthermore, a large percentage of individual proteins in the dimeric aggregates adopted conformations similar to those in the intermediate states observed in REMD simulations. These results indicate that, during the folding process, DS119 can easily become trapped in intermediate states. Then, with diffusion, a transient dimer would be formed and stabilized with the binding interface located at N-terminals. This means that it could not quickly fold to the native structure. The complicated folding manner of DS119 implies the important influence of natural selection on protein-folding kinetics, and more improvement should be achieved in rational protein design.

  12. The Folding of de Novo Designed Protein DS119 via Molecular Dynamics Simulations

    PubMed Central

    Wang, Moye; Hu, Jie; Zhang, Zhuqing

    2016-01-01

    As they are not subjected to natural selection process, de novo designed proteins usually fold in a manner different from natural proteins. Recently, a de novo designed mini-protein DS119, with a βαβ motif and 36 amino acids, has folded unusually slowly in experiments, and transient dimers have been detected in the folding process. Here, by means of all-atom replica exchange molecular dynamics (REMD) simulations, several comparably stable intermediate states were observed on the folding free-energy landscape of DS119. Conventional molecular dynamics (CMD) simulations showed that when two unfolded DS119 proteins bound together, most binding sites of dimeric aggregates were located at the N-terminal segment, especially residues 5–10, which were supposed to form β-sheet with its own C-terminal segment. Furthermore, a large percentage of individual proteins in the dimeric aggregates adopted conformations similar to those in the intermediate states observed in REMD simulations. These results indicate that, during the folding process, DS119 can easily become trapped in intermediate states. Then, with diffusion, a transient dimer would be formed and stabilized with the binding interface located at N-terminals. This means that it could not quickly fold to the native structure. The complicated folding manner of DS119 implies the important influence of natural selection on protein-folding kinetics, and more improvement should be achieved in rational protein design. PMID:27128902

  13. Protein domain assignment from the recurrence of locally similar structures

    PubMed Central

    Tai, Chin-Hsien; Sam, Vichetra; Gibrat, Jean-Francois; Garnier, Jean; Munson, Peter J.

    2010-01-01

    Domains are basic units of protein structure and essential for exploring protein fold space and structure evolution. With the structural genomics initiative, the number of protein structures in the Protein Databank (PDB) is increasing dramatically and domain assignments need to be done automatically. Most existing structural domain assignment programs define domains using the compactness of the domains and/or the number and strength of intra-domain versus inter-domain contacts. Here we present a different approach based on the recurrence of locally similar structural pieces (LSSPs) found by one-against-all structure comparisons with a dataset of 6,373 protein chains from the PDB. Residues of the query protein are clustered using LSSPs via three different procedures to define domains. This approach gives results that are comparable to several existing programs that use geometrical and other structural information explicitly. Remarkably, most of the proteins that contribute the LSSPs defining a domain do not themselves contain the domain of interest. This study shows that domains can be defined by a collection of relatively small locally similar structural pieces containing, on average, four secondary structure elements. In addition, it indicates that domains are indeed made of recurrent small structural pieces that are used to build protein structures of many different folds as suggested by recent studies. PMID:21287617

  14. Overlooked Short Toxin-Like Proteins: A Shortcut to Drug Design

    PubMed Central

    Linial, Michal

    2017-01-01

    Short stable peptides have huge potential for novel therapies and biosimilars. Cysteine-rich short proteins are characterized by multiple disulfide bridges in a compact structure. Many of these metazoan proteins are processed, folded, and secreted as soluble stable folds. These properties are shared by both marine and terrestrial animal toxins. These stable short proteins are promising sources for new drug development. We developed ClanTox (classifier of animal toxins) to identify toxin-like proteins (TOLIPs) using machine learning models trained on a large-scale proteomic database. Insect proteomes provide a rich source for protein innovations. Therefore, we seek overlooked toxin-like proteins from insects (coined iTOLIPs). Out of 4180 short (<75 amino acids) secreted proteins, 379 were predicted as iTOLIPs with high confidence, with as many as 30% of the genes marked as uncharacterized. Based on bioinformatics, structure modeling, and data-mining methods, we found that the most significant group of predicted iTOLIPs carry antimicrobial activity. Among the top predicted sequences were 120 termicin genes from termites with antifungal properties. Structural variations of insect antimicrobial peptides illustrate the similarity to a short version of the defensin fold with antifungal specificity. We also identified 9 proteins that strongly resemble ion channel inhibitors from scorpion and conus toxins. Furthermore, we assigned functional fold to numerous uncharacterized iTOLIPs. We conclude that a systematic approach for finding iTOLIPs provides a rich source of peptides for drug design and innovative therapeutic discoveries. PMID:29109389

  15. Structure Prediction and Analysis of Neuraminidase Sequence Variants

    ERIC Educational Resources Information Center

    Thayer, Kelly M.

    2016-01-01

    Analyzing protein structure has become an integral aspect of understanding systems of biochemical import. The laboratory experiment endeavors to introduce protein folding to ascertain structures of proteins for which the structure is unavailable, as well as to critically evaluate the quality of the prediction obtained. The model system used is the…

  16. Mechanisms of protein-folding diseases at a glance.

    PubMed

    Valastyan, Julie S; Lindquist, Susan

    2014-01-01

    For a protein to function appropriately, it must first achieve its proper conformation and location within the crowded environment inside the cell. Multiple chaperone systems are required to fold proteins correctly. In addition, degradation pathways participate by destroying improperly folded proteins. The intricacy of this multisystem process provides many opportunities for error. Furthermore, mutations cause misfolded, nonfunctional forms of proteins to accumulate. As a result, many pathological conditions are fundamentally rooted in the protein-folding problem that all cells must solve to maintain their function and integrity. Here, to illustrate the breadth of this phenomenon, we describe five examples of protein-misfolding events that can lead to disease: improper degradation, mislocalization, dominant-negative mutations, structural alterations that establish novel toxic functions, and amyloid accumulation. In each case, we will highlight current therapeutic options for battling such diseases.

  17. Exploring the protein folding free energy landscape: coupling replica exchange method with P3ME/RESPA algorithm.

    PubMed

    Zhou, Ruhong

    2004-05-01

    A highly parallel replica exchange method (REM) that couples with a newly developed molecular dynamics algorithm particle-particle particle-mesh Ewald (P3ME)/RESPA has been proposed for efficient sampling of protein folding free energy landscape. The algorithm is then applied to two separate protein systems, beta-hairpin and a designed protein Trp-cage. The all-atom OPLSAA force field with an explicit solvent model is used for both protein folding simulations. Up to 64 replicas of solvated protein systems are simulated in parallel over a wide range of temperatures. The combined trajectories in temperature and configurational space allow a replica to overcome free energy barriers present at low temperatures. These large scale simulations reveal detailed results on folding mechanisms, intermediate state structures, thermodynamic properties and the temperature dependences for both protein systems.
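
    The way a replica overcomes low-temperature barriers rests on periodic swap attempts between neighbouring temperatures. The sketch below shows the standard Metropolis criterion for temperature replica exchange; it is a generic illustration, not the paper's P3ME/RESPA-coupled implementation, and the function name, energy units (kcal/mol) and example numbers are assumptions made here.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def swap_acceptance(temp_i, temp_j, energy_i, energy_j):
    """Probability of accepting a configuration swap between two replicas.

    Replica i runs at temp_i with potential energy energy_i, and likewise
    for j. Standard criterion: min(1, exp[(beta_i - beta_j)(E_i - E_j)]).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

# Made-up energies: the colder replica holds the higher-energy structure,
# so handing it the lower-energy one is always accepted.
p_always = swap_acceptance(300.0, 350.0, -100.0, -105.0)
# Reverse case: the swap would raise the colder replica's energy.
p_partial = swap_acceptance(300.0, 350.0, -105.0, -100.0)
print(p_always, p_partial)
```

    In practice the temperature ladder is usually spaced so that acceptance between neighbours stays reasonably high, which is one reason many replicas (up to 64 in this study) are needed for solvated systems.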

  18. Electrostatics, structure prediction, and the energy landscapes for protein folding and binding.

    PubMed

    Tsai, Min-Yeh; Zheng, Weihua; Balamurugan, D; Schafer, Nicholas P; Kim, Bobby L; Cheung, Margaret S; Wolynes, Peter G

    2016-01-01

    While being long in range and therefore weakly specific, electrostatic interactions are able to modulate the stability and folding landscapes of some proteins. The relevance of electrostatic forces for steering the docking of proteins to each other is widely acknowledged, however, the role of electrostatics in establishing specifically funneled landscapes and their relevance for protein structure prediction are still not clear. By introducing Debye-Hückel potentials that mimic long-range electrostatic forces into the Associative memory, Water mediated, Structure, and Energy Model (AWSEM), a transferable protein model capable of predicting tertiary structures, we assess the effects of electrostatics on the landscapes of thirteen monomeric proteins and four dimers. For the monomers, we find that adding electrostatic interactions does not improve structure prediction. Simulations of ribosomal protein S6 show, however, that folding stability depends monotonically on electrostatic strength. The trend in predicted melting temperatures of the S6 variants agrees with experimental observations. Electrostatic effects can play a range of roles in binding. The binding of the protein complex KIX-pKID is largely assisted by electrostatic interactions, which provide direct charge-charge stabilization of the native state and contribute to the funneling of the binding landscape. In contrast, for several other proteins, including the DNA-binding protein FIS, electrostatics causes frustration in the DNA-binding region, which favors its binding with DNA but not with its protein partner. This study highlights the importance of long-range electrostatics in functional responses to problems where proteins interact with their charged partners, such as DNA, RNA, as well as membranes. © 2015 The Protein Society.
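
    A Debye-Hückel potential of the kind mentioned in this abstract is a Coulomb interaction damped by an exponential screening factor. The sketch below uses the textbook form; the function name, the 10 Å screening length and the dielectric constant of 80 are illustrative assumptions and are not the parameters used in AWSEM.

```python
import math

COULOMB_K = 332.0637  # Coulomb constant in kcal*Angstrom/(mol*e^2)

def debye_huckel_energy(q_i, q_j, r, debye_length=10.0, dielectric=80.0):
    """Screened electrostatic energy (kcal/mol) between charges q_i and q_j
    (in units of e) separated by r Angstrom.

    V(r) = K * q_i * q_j / (dielectric * r) * exp(-r / debye_length)
    """
    coulomb = COULOMB_K * q_i * q_j / (dielectric * r)
    return coulomb * math.exp(-r / debye_length)

# Opposite unit charges 8 Angstrom apart (made-up geometry).
attract = debye_huckel_energy(+1.0, -1.0, 8.0)
# Same pair with very weak screening, approaching plain Coulomb.
nearly_bare = debye_huckel_energy(+1.0, -1.0, 8.0, debye_length=1.0e6)
print(f"screened: {attract:.4f} kcal/mol, nearly bare: {nearly_bare:.4f} kcal/mol")
```

    Scaling the prefactor up or down is one simple way to mimic the varying "electrostatic strength" that the abstract relates to folding stability.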

  19. From Sequence and Forces to Structure, Function and Evolution of Intrinsically Disordered Proteins

    PubMed Central

    Forman-Kay, Julie D.; Mittag, Tanja

    2015-01-01

    Intrinsically disordered proteins (IDPs), which lack persistent structure, are a challenge to structural biology due to the inapplicability of standard methods for characterization of folded proteins as well as their deviation from the dominant structure/function paradigm. Their widespread presence and involvement in biological function, however, has spurred the growing acceptance of the importance of IDPs and the development of new tools for studying their structure, dynamics and function. The interplay of folded and disordered domains or regions for function and the existence of a continuum of protein states with respect to conformational energetics, motional timescales and compactness is shaping a unified understanding of structure-dynamics-disorder/function relationships. On the 20th anniversary of this journal, Structure, we provide a historical perspective on the investigation of IDPs and summarize the sequence features and physical forces that underlie their unique structural, functional and evolutionary properties. PMID:24010708

  20. From sequence and forces to structure, function, and evolution of intrinsically disordered proteins.

    PubMed

    Forman-Kay, Julie D; Mittag, Tanja

    2013-09-03

    Intrinsically disordered proteins (IDPs), which lack persistent structure, are a challenge to structural biology due to the inapplicability of standard methods for characterization of folded proteins as well as their deviation from the dominant structure/function paradigm. Their widespread presence and involvement in biological function, however, has spurred the growing acceptance of the importance of IDPs and the development of new tools for studying their structure, dynamics, and function. The interplay of folded and disordered domains or regions for function and the existence of a continuum of protein states with respect to conformational energetics, motional timescales, and compactness are shaping a unified understanding of structure-dynamics-disorder/function relationships. On the 20th anniversary of Structure, we provide a historical perspective on the investigation of IDPs and summarize the sequence features and physical forces that underlie their unique structural, functional, and evolutionary properties. Copyright © 2013 Elsevier Ltd. All rights reserved.

  1. Molecular Dynamics based on a Generalized Born solvation model: application to protein folding

    NASA Astrophysics Data System (ADS)

    Onufriev, Alexey

    2004-03-01

    An accurate description of the aqueous environment is essential for realistic biomolecular simulations, but may become very expensive computationally. We have developed a version of the Generalized Born model suitable for describing large conformational changes in macromolecules. The model represents the solvent implicitly as a continuum with the dielectric properties of water, and includes the charge screening effects of salt. The computational cost associated with the use of this model in Molecular Dynamics simulations is generally considerably smaller than the cost of representing water explicitly. Also, compared to traditional Molecular Dynamics simulations based on explicit water representation, conformational changes occur much faster in an implicit solvation environment due to the absence of viscosity. The combined speed-up allows one to probe conformational changes that occur on much longer effective time-scales. We apply the model to folding of a 46-residue three-helix bundle protein (residues 10-55 of protein A, PDB ID 1BDD). Starting from an unfolded structure at 450 K, the protein folds to the lowest energy state in 6 ns of simulation time, which takes about a day on a 16-processor SGI machine. The predicted structure differs from the native one by 2.4 Å (backbone RMSD). Analysis of the structures seen on the folding pathway reveals details of the folding process unavailable from experiment.

  2. Towards quantitative classification of folded proteins in terms of elementary functions.

    PubMed

    Hu, Shuangwei; Krokhotin, Andrei; Niemi, Antti J; Peng, Xubiao

    2011-04-01

    A comparative classification scheme provides a good basis for several approaches to understand proteins, including prediction of relations between their structure and biological function. But it remains a challenge to combine a classification scheme that describes a protein starting from its well-organized secondary structures and often involves direct human involvement, with an atomary-level physics-based approach where a protein is fundamentally nothing more than an ensemble of mutually interacting carbon, hydrogen, oxygen, and nitrogen atoms. In order to bridge these two complementary approaches to proteins, conceptually novel tools need to be introduced. Here we explain how an approach toward geometric characterization of entire folded proteins can be based on a single explicit elementary function that is familiar from nonlinear physical systems where it is known as the kink soliton. Our approach enables the conversion of hierarchical structural information into a quantitative form that allows for a folded protein to be characterized in terms of a small number of global parameters that are in principle computable from atomary-level considerations. As an example we describe in detail how the native fold of the myoglobin 1M6C emerges from a combination of kink solitons with a very high atomary-level accuracy. We also verify that our approach describes longer loops and loops connecting α helices with β strands, with the same overall accuracy. ©2011 American Physical Society

  3. Folding processes of the B domain of protein A to the native state observed in all-atom ab initio folding simulations

    NASA Astrophysics Data System (ADS)

    Lei, Hongxing; Wu, Chun; Wang, Zhi-Xiang; Zhou, Yaoqi; Duan, Yong

    2008-06-01

    Reaching the native states of small proteins, a necessary step towards a comprehensive understanding of the folding mechanisms, has remained a tremendous challenge to ab initio protein folding simulations despite extensive effort. In this work, the folding process of the B domain of protein A (BdpA) has been simulated by both conventional and replica exchange molecular dynamics using the AMBER FF03 all-atom force field. Starting from an extended chain, a total of 40 conventional (each to 1.0 μs) and two sets of replica exchange (each to 200.0 ns per replica) molecular dynamics simulations were performed with different generalized-Born solvation models and temperature control schemes. The improvements in both the force field and solvent model allowed successful simulations of the folding process to the native state as demonstrated by the 0.80 Å Cα root mean square deviation (RMSD) of the best folded structure. The most populated conformation was the native folded structure with a high population. This was a significant improvement over the 2.8 Å Cα RMSD of the best nativelike structures from previous ab initio folding studies on BdpA. To the best of our knowledge, our results demonstrate, for the first time, that ab initio simulations can reach the native state of BdpA. Consistent with experimental observations, including Φ-value analyses, formation of the helix II/III hairpin was a crucial step that provided a template upon which helix I could form and the folding process could complete. Early formation of helix III was observed, which is consistent with the experimental results of higher residual helical content of isolated helix III among the three helices. The calculated temperature-dependent profile and the melting temperature were in close agreement with the experimental results. The simulations further revealed that phenylalanine 31 may play a critical role in achieving the correct packing of the three helices, which is consistent with the experimental observation. In addition to the mechanistic studies, an ab initio structure prediction was also conducted based on both the physical energy and a statistical potential. Based on the lowest physical energy, the predicted structure was 2.0 Å Cα RMSD away from the experimentally determined structure.
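
    The abstract does not describe how replicas were exchanged. As a minimal sketch, assuming the standard temperature replica exchange rule (a Metropolis test on the energies of two neighbouring replicas), the swap step could look like the following. The function names, the energy unit (kcal/mol) and the example numbers are illustrative assumptions, not details taken from the study.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis acceptance probability for exchanging replicas i and j.

    energy_i is the potential energy of the configuration currently
    held at temperature temp_i (likewise for j).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # A non-negative delta means the swap is always accepted.
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the exchange is accepted in this attempt."""
    return rng.random() < swap_probability(energy_i, temp_i, energy_j, temp_j)


if __name__ == "__main__":
    # Hypothetical energies: the colder replica already holds the lower energy,
    # so the swap is accepted only occasionally.
    print(swap_probability(-120.0, 300.0, -100.0, 320.0))
```

    When the colder replica holds the higher-energy configuration the swap is always accepted, which is what lets low-energy structures migrate towards low temperatures.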

  4. Automated protein structure modeling in CASP9 by I-TASSER pipeline combined with QUARK-based ab initio folding and FG-MD-based structure refinement

    PubMed Central

    Xu, Dong; Zhang, Jian; Roy, Ambrish; Zhang, Yang

    2011-01-01

    I-TASSER is an automated pipeline for protein tertiary structure prediction using multiple threading alignments and iterative structure assembly simulations. In CASP9 experiments, two new algorithms, QUARK and FG-MD, were added to the I-TASSER pipeline for improving the structural modeling accuracy. QUARK is a de novo structure prediction algorithm used for structure modeling of proteins that lack detectable template structures. For distantly homologous targets, QUARK models are found useful as a reference structure for selecting good threading alignments and guiding the I-TASSER structure assembly simulations. FG-MD is an atomic-level structural refinement program that uses structural fragments collected from the PDB structures to guide molecular dynamics simulation and improve the local structure of the predicted model, including hydrogen-bonding networks, torsion angles and steric clashes. Despite considerable progress in both template-based and template-free structure modeling, significant improvements in protein target classification, domain parsing, model selection, and ab initio folding of beta-proteins are still needed to further improve the I-TASSER pipeline. PMID:22069036

  5. A residue in helical conformation in the native state adopts a β-strand conformation in the folding transition state despite its high and canonical Φ-value.

    PubMed

    Zarrine-Afsar, Arash; Dahesh, Samira; Davidson, Alan R

    2012-05-01

    Delineating structures of the transition states in protein folding reactions has provided great insight into the mechanisms by which proteins fold. The most common method for obtaining this information is Φ-value analysis, which is carried out by measuring the changes in the folding and unfolding rates caused by single amino acid substitutions at various positions within a given protein. Canonical Φ-values range between 0 and 1, and residues displaying high values within this range are interpreted to be important in stabilizing the transition state structure, and to elicit this stabilization through native-like interactions. Although very successful in defining the general features of transition state structures, Φ-value analysis can be confounded when non-native interactions stabilize this state. In addition, direct information on backbone conformation within the transition state is not provided. In the work described here, we have investigated structure formation at a conserved β-bulge (with helical conformation) in the Fyn SH3 domain by characterizing the effects of substituting all natural amino acids at one position within this structural motif. By comparing the effects on folding rates of these substitutions with database-derived local structure propensity values, we have determined that this position adopts a non-native backbone conformation in the folding transition state. This result is surprising because this position displays a high and canonical Φ-value of 0.7. This work emphasizes the potential role of non-native conformations in folding pathways and demonstrates that even positions displaying high and canonical Φ-values may, nevertheless, adopt a non-native conformation in the transition state. Copyright © 2012 Wiley Periodicals, Inc.
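
    The Φ-value described above can be written as the ratio of two free-energy changes derived from the measured rates. The sketch below assumes simple two-state kinetics and uses hypothetical rate constants; the function and parameter names are illustrative and do not come from the study.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def phi_value(kf_wt, ku_wt, kf_mut, ku_mut, temperature=298.15):
    """Phi-value from wild-type and mutant folding/unfolding rate constants.

    Assumes two-state folding, so the equilibrium constant is kf / ku.
    """
    rt = R * temperature
    # Change in the barrier between the unfolded state and the transition state.
    ddg_transition = rt * math.log(kf_wt / kf_mut)
    # Change in overall stability of the native state relative to the unfolded state.
    ddg_equilibrium = rt * math.log((kf_wt / ku_wt) / (kf_mut / ku_mut))
    if abs(ddg_equilibrium) < 1e-12:
        raise ValueError("mutation does not change stability; phi is undefined")
    return ddg_transition / ddg_equilibrium


if __name__ == "__main__":
    # Hypothetical mutant that slows folding 10-fold and leaves unfolding unchanged.
    print(phi_value(kf_wt=100.0, ku_wt=1.0, kf_mut=10.0, ku_mut=1.0))
```

    A mutation that changes only the folding rate gives Φ = 1, and one that changes only the unfolding rate gives Φ = 0, which matches the canonical range quoted in the abstract.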

  6. Pushing the size limit of de novo structure ensemble prediction guided by sparse SDSL-EPR restraints to 200 residues: The monomeric and homodimeric forms of BAX

    PubMed Central

    Fischer, Axel W.; Bordignon, Enrica; Bleicken, Stephanie; García-Sáez, Ana J.; Jeschke, Gunnar; Meiler, Jens

    2016-01-01

    Structure determination remains a challenge for many biologically important proteins. In particular, proteins that adopt multiple conformations often evade crystallization in all biologically relevant states. Although computational de novo protein folding approaches often sample biologically relevant conformations, the selection of the most accurate model for different functional states remains a formidable challenge, in particular, for proteins with more than about 150 residues. Electron paramagnetic resonance (EPR) spectroscopy can obtain limited structural information for proteins in well-defined biological states and thereby assist in selecting biologically relevant conformations. The present study demonstrates that de novo folding methods are able to accurately sample the folds of 192-residue long soluble monomeric Bcl-2-associated X protein (BAX). The tertiary structures of the monomeric and homodimeric forms of BAX were predicted using the primary structure as well as 25 and 11 EPR distance restraints, respectively. The predicted models were subsequently compared to respective NMR/X-ray structures of BAX. EPR restraints improve the protein-size normalized root-mean-square-deviation (RMSD100) of the most accurate models with respect to the NMR/crystal structure from 5.9 Å to 3.9 Å and from 5.7 Å to 3.3 Å, respectively. Additionally, the model discrimination is improved, which is demonstrated by an improvement of the enrichment from 5% to 15% and from 13% to 21%, respectively. PMID:27129417
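
    The abstract reports RMSD100 values without giving the normalization. One widely used definition rescales the RMSD to that of a 100-residue protein, RMSD100 = RMSD / (1 + ln sqrt(N/100)). The sketch below assumes that definition; whether the study used exactly this form is not stated in the abstract, and the example RMSD is hypothetical.

```python
import math


def rmsd100(rmsd, n_residues):
    """Size-normalized RMSD, assuming RMSD100 = RMSD / (1 + ln(sqrt(N / 100)))."""
    denominator = 1.0 + math.log(math.sqrt(n_residues / 100.0))
    if denominator <= 0:
        # The normalization is only meaningful for chains longer than about 13 residues.
        raise ValueError("n_residues is too small for this normalization")
    return rmsd / denominator


if __name__ == "__main__":
    # Hypothetical raw RMSD of 5.0 Å for a 192-residue protein such as BAX.
    print(round(rmsd100(5.0, 192), 2))
```

    For a 100-residue protein the value is unchanged, and for longer chains it is scaled down, so models of proteins of different sizes can be compared on one scale.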

  7. Mapping the distribution of packing topologies within protein interiors shows predominant preference for specific packing motifs

    PubMed Central

    2011-01-01

    Background: Mapping protein primary sequences to their three dimensional folds, referred to as the 'second genetic code', remains an unsolved scientific problem. A crucial part of the problem concerns the geometrical specificity in side chain association leading to densely packed protein cores, a hallmark of correctly folded native structures. Thus, any model of packing within proteins should constitute an indispensable component of protein folding and design. Results: In this study an attempt has been made to find, characterize and classify recurring patterns in the packing of side chain atoms within a protein which sustains its native fold. The interaction of side chain atoms within the protein core has been represented as a contact network based on the surface complementarity and overlap between associating side chain surfaces. Some network topologies definitely appear to be preferred and they have been termed 'packing motifs', analogous to super secondary structures in proteins. Study of the distribution of these motifs reveals the ubiquitous presence of typical smaller graphs, which appear to get linked or coalesce to give larger graphs, reminiscent of the nucleation-condensation model in protein folding. One such frequently occurring motif, also envisaged as the unit of clustering, the three residue clique, was invariably found in regions of dense packing. Finally, topological measures based on surface contact networks appeared to be effective in discriminating sequences native to a specific fold amongst a set of decoys. Conclusions: Out of innumerable topological possibilities, only a finite number of specific packing motifs are actually realized in proteins. This small number of motifs could serve as a basis set in the construction of larger networks. Of these, the triplet clique exhibits distinct preference both in terms of composition and geometry. PMID:21605466
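
    The three residue clique mentioned above is a triangle in the contact network: three residues that are all in mutual contact. The sketch below shows one way to enumerate such cliques once a contact list is available. The study defines contacts through surface complementarity and overlap; here the contact pairs are simply taken as given, and the function name and example data are hypothetical.

```python
from itertools import combinations


def three_residue_cliques(contacts):
    """Return all residue triples in which every pair is in contact.

    contacts is an iterable of (residue_a, residue_b) pairs; residue labels
    must be hashable and mutually comparable (for example, integers).
    """
    neighbours = {}
    for a, b in contacts:
        if a == b:
            continue  # ignore self-contacts
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    cliques = set()
    for a, b in combinations(sorted(neighbours), 2):
        if b in neighbours[a]:
            # Any residue that touches both a and b closes a triangle.
            for c in neighbours[a] & neighbours[b]:
                cliques.add(tuple(sorted((a, b, c))))
    return sorted(cliques)


if __name__ == "__main__":
    # Hypothetical contacts between residues numbered 1 to 4.
    print(three_residue_cliques([(1, 2), (2, 3), (1, 3), (3, 4)]))
```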

  8. Lipid-protein nanodiscs for cell-free production of integral membrane proteins in a soluble and folded state: comparison with detergent micelles, bicelles and liposomes.

    PubMed

    Lyukmanova, E N; Shenkarev, Z O; Khabibullina, N F; Kopeina, G S; Shulepko, M A; Paramonov, A S; Mineev, K S; Tikhonov, R V; Shingarova, L N; Petrovskaya, L E; Dolgikh, D A; Arseniev, A S; Kirpichnikov, M P

    2012-03-01

    Production of integral membrane proteins (IMPs) in a folded state is a key prerequisite for their functional and structural studies. In cell-free (CF) expression systems membrane mimicking components could be added to the reaction mixture that promotes IMP production in a soluble form. Here lipid-protein nanodiscs (LPNs) of different lipid compositions (DMPC, DMPG, POPC, POPC/DOPG) have been compared with classical membrane mimicking media such as detergent micelles, lipid/detergent bicelles and liposomes by their ability to support CF synthesis of IMPs in a folded and soluble state. Three model membrane proteins of different topology were used: homodimeric transmembrane (TM) domain of human receptor tyrosine kinase ErbB3 (TM-ErbB3, 1TM); voltage-sensing domain of K(+) channel KvAP (VSD, 4TM); and bacteriorhodopsin from Exiguobacterium sibiricum (ESR, 7TM). Structural and/or functional properties of the synthesized proteins were analyzed. LPNs significantly enhanced synthesis of the IMPs in a soluble form regardless of the lipid composition. A partial disintegration of LPNs composed of unsaturated lipids was observed upon co-translational IMP incorporation. Contrary to detergents the nanodiscs resulted in the synthesis of ~80% active ESR and promoted correct folding of the TM-ErbB3. None of the tested membrane mimetics supported CF synthesis of correctly folded VSD, and the protocol of the domain refolding was developed. The use of LPNs appears to be the most promising approach to CF production of IMPs in a folded state. NMR analysis of (15)N-Ile-TM-ErbB3 co-translationally incorporated into LPNs shows the great prospects of this membrane mimetics for structural studies of IMPs produced by CF systems. Copyright © 2011 Elsevier B.V. All rights reserved.

  9. In silico folding of a three helix protein and characterization of its free-energy landscape in an all-atom force field.

    PubMed

    Herges, T; Wenzel, W

    2005-01-14

    We report the reproducible first-principles folding of the 40 amino-acid, three-helix headpiece of the HIV accessory protein in a recently developed all-atom free-energy force field. Six of 20 simulations using an adapted basin-hopping method converged to better than 3 Å backbone rms deviation to the experimental structure. Using over 60 000 low-energy conformations of this protein, we constructed a decoy tree that completely characterizes its folding funnel.

  10. In Silico Folding of a Three Helix Protein and Characterization of Its Free-Energy Landscape in an All-Atom Force Field

    NASA Astrophysics Data System (ADS)

    Herges, T.; Wenzel, W.

    2005-01-01

    We report the reproducible first-principles folding of the 40 amino-acid, three-helix headpiece of the HIV accessory protein in a recently developed all-atom free-energy force field. Six of 20 simulations using an adapted basin-hopping method converged to better than 3Å backbone rms deviation to the experimental structure. Using over 60 000 low-energy conformations of this protein, we constructed a decoy tree that completely characterizes its folding funnel.

  11. Introducing the Levinthal's Protein Folding Paradox and Its Solution

    ERIC Educational Resources Information Center

    Martínez, Leandro

    2014-01-01

    The protein folding (Levinthal's) paradox states that it would not be possible in a physically meaningful time for a protein to reach the native (functional) conformation by a random search of the enormously large number of possible structures. This paradox has been solved: it was shown that small biases toward the native conformation result…
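
    The size of the random search described above can be illustrated with a back-of-the-envelope estimate. The numbers used here (three conformational states per residue, 10^13 conformations sampled per second, a 100-residue chain) are common textbook assumptions chosen for illustration and are not taken from the article.

```python
SECONDS_PER_YEAR = 3.156e7


def random_search_years(n_residues, states_per_residue=3, rate_per_second=1e13):
    """Years needed to visit every conformation once by unbiased random search."""
    conformations = float(states_per_residue) ** n_residues
    return conformations / rate_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    years = random_search_years(100)
    # The result is many orders of magnitude longer than the age of the universe.
    print(f"{years:.2e} years")
```

    Under these assumptions the exhaustive search takes on the order of 10^27 years, which is why an unbiased search cannot explain folding on the timescales observed.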

  12. Can natural proteins designed with 'inverted' peptide sequences adopt native-like protein folds?

    PubMed

    Sridhar, Settu; Guruprasad, Kunchur

    2014-01-01

    We have carried out a systematic computational analysis on a representative dataset of proteins of known three-dimensional structure, in order to evaluate whether it would be possible to 'swap' certain short peptide sequences in naturally occurring proteins with their corresponding 'inverted' peptides and generate 'artificial' proteins that are predicted to retain a native-like protein fold. The analysis of 3,967 representative proteins from the Protein Data Bank revealed 102,677 unique identical inverted peptide sequence pairs that vary in sequence length between 5-12 and 18 amino acid residues. Our analysis illustrates with examples that such 'artificial' proteins may be generated by identifying peptides with 'similar structural environment' and by using comparative protein modeling and validation studies. Our analysis suggests that natural proteins may be tolerant to accommodating such peptides.
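
    A search for inverted peptide pairs of the kind described above can be sketched as follows, assuming that 'inverted' means the same residues read in reverse order. The function name, the decision to skip palindromic peptides and the toy sequences are illustrative assumptions, not the authors' procedure.

```python
def inverted_peptide_pairs(sequences, length):
    """Find peptides of a given length whose reversed sequence also occurs.

    Returns a set of (peptide, inverted_peptide) tuples, each pair reported once.
    """
    peptides = set()
    for seq in sequences:
        for start in range(len(seq) - length + 1):
            peptides.add(seq[start:start + length])

    pairs = set()
    for peptide in peptides:
        inverted = peptide[::-1]
        # peptide < inverted skips palindromes and avoids listing a pair twice.
        if inverted in peptides and peptide < inverted:
            pairs.add((peptide, inverted))
    return pairs


if __name__ == "__main__":
    # Toy one-letter sequences; the second contains the reverse of the first.
    toy = ["MKTAYIAKQR", "GGRQKAIYATKM"]
    for pair in sorted(inverted_peptide_pairs(toy, 5)):
        print(pair)
```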

  13. Inversion of the Balance between Hydrophobic and Hydrogen Bonding Interactions in Protein Folding and Aggregation

    PubMed Central

    Fitzpatrick, Anthony W.; Knowles, Tuomas P. J.; Waudby, Christopher A.; Vendruscolo, Michele; Dobson, Christopher M.

    2011-01-01

    Identifying the forces that drive proteins to misfold and aggregate, rather than to fold into their functional states, is fundamental to our understanding of living systems and to our ability to combat protein deposition disorders such as Alzheimer's disease and the spongiform encephalopathies. We report here the finding that the balance between hydrophobic and hydrogen bonding interactions is different for proteins in the processes of folding to their native states and misfolding to the alternative amyloid structures. We find that the minima of the protein free energy landscape for folding and misfolding tend to be respectively dominated by hydrophobic and by hydrogen bonding interactions. These results characterise the nature of the interactions that determine the competition between folding and misfolding of proteins by revealing that the stability of native proteins is primarily determined by hydrophobic interactions between side-chains, while the stability of amyloid fibrils depends more on backbone intermolecular hydrogen bonding interactions. PMID:22022239

  14. Ab Initio Protein Structure Assembly Using Continuous Structure Fragments and Optimized Knowledge-based Force Field

    PubMed Central

    Xu, Dong; Zhang, Yang

    2012-01-01

    Ab initio protein folding is one of the major unsolved problems in computational biology due to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1–20 residues where multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced and the particular contributions to enhancing the efficiency of both force field and search engine are analyzed in detail. The QUARK prediction procedure is depicted and tested on the structure modeling of 145 non-homologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score (TM-score) >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in one third of the cases for short proteins up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction (CASP9) experiment, the QUARK server outperformed the second and third best servers by 18% and 47% based on the cumulative Z-score of global distance test-total (GDT-TS) scores in the free modeling (FM) category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress towards the solution of the most important problem in the field. PMID:22411565

  15. Traversing the folding pathway of proteins using temperature-aided cascade molecular dynamics with conformation-dependent charges.

    PubMed

    Jani, Vinod; Sonavane, Uddhavesh; Joshi, Rajendra

    2016-07-01

    Protein folding is a multi-microsecond time scale event and involves many conformational transitions. Crucial conformational transitions responsible for biological functions of biomolecules are difficult to capture using current state-of-the-art molecular dynamics (MD) simulations. Protein folding, being a stochastic process, witnesses these transitions as rare events. Many new methodologies have been proposed for observing these rare events. In this work, a temperature-aided cascade MD is proposed as a technique for studying the conformational transitions. Folding studies for Engrailed homeodomain and Immunoglobulin domain B of protein A have been carried out. Using this methodology, the unfolded structures with RMSD of 20 Å were folded to a structure with RMSD of 2 Å. Three sets of cascade MD runs were carried out using implicit solvation, explicit solvation, and charge updation scheme. In the charge updation scheme, charges based on the conformation obtained are calculated and are updated in the topology file. In all the simulations, the structure of 2 Å was reached within a few nanoseconds using these methods. Umbrella sampling has been performed using snapshots from the temperature-aided cascade MD simulation trajectory to build an entire conformational transition pathway. The advantage of the method is that the possible pathways for a particular reaction can be explored within a short duration of simulation time and the disadvantage is that the knowledge of the start and end state is required. The charge updation scheme adds the polarization effects in the force fields. This improves the electrostatic interaction among the atoms, which may help the protein to fold faster.

  16. Clusters of isoleucine, leucine, and valine side chains define cores of stability in high-energy states of globular proteins: Sequence determinants of structure and stability.

    PubMed

    Kathuria, Sagar V; Chan, Yvonne H; Nobrega, R Paul; Özen, Ayşegül; Matthews, C Robert

    2016-03-01

    Measurements of protection against exchange of main chain amide hydrogens (NH) with solvent hydrogens in globular proteins have provided remarkable insights into the structures of rare high-energy states that populate their folding free-energy surfaces. Lacking, however, has been a unifying theory that rationalizes these high-energy states in terms of the structures and sequences of their resident proteins. The Branched Aliphatic Side Chain (BASiC) hypothesis has been developed to explain the observed patterns of protection in a pair of TIM barrel proteins. This hypothesis supposes that the side chains of isoleucine, leucine, and valine (ILV) residues often form large hydrophobic clusters that very effectively impede the penetration of water to their underlying hydrogen bond networks and, thereby, enhance the protection against solvent exchange. The linkage between the secondary and tertiary structures enables these ILV clusters to serve as cores of stability in high-energy partially folded states. Statistically significant correlations between the locations of large ILV clusters in native conformations and strong protection against exchange for a variety of motifs reported in the literature support the generality of the BASiC hypothesis. The results also illustrate the necessity to elaborate this simple hypothesis to account for the roles of adjacent hydrocarbon moieties in defining stability cores of partially folded states along folding reaction coordinates. © 2015 The Protein Society.

  17. Study of protein folding under native conditions by rapidly switching the hydrostatic pressure inside an NMR sample cell

    PubMed Central

    Charlier, Cyril; Alderson, T. Reid; Courtney, Joseph M.; Ying, Jinfa; Anfinrud, Philip

    2018-01-01

    In general, small proteins rapidly fold on the timescale of milliseconds or less. For proteins with a substantial volume difference between the folded and unfolded states, their thermodynamic equilibrium can be altered by varying the hydrostatic pressure. Using a pressure-sensitized mutant of ubiquitin, we demonstrate that rapidly switching the pressure within an NMR sample cell enables study of the unfolded protein under native conditions and, vice versa, study of the native protein under denaturing conditions. This approach makes it possible to record 2D and 3D NMR spectra of the unfolded protein at atmospheric pressure, providing residue-specific information on the folding process. 15N and 13C chemical shifts measured immediately after dropping the pressure from 2.5 kbar (favoring unfolding) to 1 bar (native) are close to the random-coil chemical shifts observed for a large, disordered peptide fragment of the protein. However, 15N relaxation data show evidence for rapid exchange, on a ∼100-μs timescale, between the unfolded state and unstable, structured states that can be considered as failed folding events. The NMR data also provide direct evidence for parallel folding pathways, with approximately one-half of the protein molecules efficiently folding through an on-pathway kinetic intermediate, whereas the other half fold in a single step. At protein concentrations above ∼300 μM, oligomeric off-pathway intermediates compete with folding of the native state. PMID:29666248
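
    The pressure dependence of the folding equilibrium described above follows from ΔG(P) = ΔG(1 bar) + ΔV·(P − 1 bar). The sketch below assumes a two-state equilibrium and a pressure-independent volume change; the free energy and volume values are hypothetical and were chosen only to show the switch between 1 bar and 2.5 kbar.

```python
import math

R = 8.314462618          # gas constant in J/(mol*K)
J_PER_BAR_ML = 0.1       # 1 bar * 1 mL = 0.1 J


def fraction_unfolded(pressure_bar, dg_unfold_1bar, dv_unfold, temperature=298.15):
    """Unfolded fraction of a two-state protein at a given pressure.

    dg_unfold_1bar: unfolding free energy at 1 bar, in J/mol.
    dv_unfold: volume change on unfolding in mL/mol (negative if the
    unfolded state occupies less volume), assumed independent of pressure.
    """
    dg = dg_unfold_1bar + dv_unfold * (pressure_bar - 1.0) * J_PER_BAR_ML
    k_unfold = math.exp(-dg / (R * temperature))
    return k_unfold / (1.0 + k_unfold)


if __name__ == "__main__":
    # Hypothetical protein: 10 kJ/mol stable at 1 bar, unfolding volume of -80 mL/mol.
    for pressure in (1.0, 2500.0):
        print(pressure, round(fraction_unfolded(pressure, 10_000.0, -80.0), 3))
```

    With these illustrative values the protein is almost entirely folded at 1 bar and almost entirely unfolded at 2.5 kbar, which is the kind of switch the pressure-jump experiment exploits.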

  18. Common fold in helix–hairpin–helix proteins

    PubMed Central

    Shao, Xuguang; Grishin, Nick V.

    2000-01-01

    Helix–hairpin–helix (HhH) is a widespread motif involved in non-sequence-specific DNA binding. The majority of HhH motifs function as DNA-binding modules; however, some of them are used to mediate protein–protein interactions or have acquired enzymatic activity by incorporating catalytic residues (DNA glycosylases). From sequence and structural analysis of HhH-containing proteins we conclude that most HhH motifs are integrated as a part of a five-helical domain, termed (HhH)2 domain here. It typically consists of two consecutive HhH motifs that are linked by a connector helix and displays pseudo-2-fold symmetry. (HhH)2 domains show clear structural integrity and a conserved hydrophobic core composed of seven residues, one residue from each α-helix and each hairpin, and deserve recognition as a distinct protein fold. In addition to known HhH in the structures of RuvA, RadA, MutY and DNA-polymerases, we have detected new HhH motifs in sterile alpha motif and barrier-to-autointegration factor domains, the α-subunit of Escherichia coli RNA-polymerase, DNA-helicase PcrA and DNA glycosylases. Statistically significant sequence similarity of HhH motifs and pronounced structural conservation argue for homology between (HhH)2 domains in different protein families. Our analysis helps to clarify how non-symmetric protein motifs bind to the double helix of DNA through the formation of a pseudo-2-fold symmetric (HhH)2 functional unit. PMID:10908318

  19. Non-detergent sulphobetaines: a new class of molecules that facilitate in vitro protein renaturation.

    PubMed

    Goldberg, M E; Expert-Bezançon, N; Vuillard, L; Rabilloud, T

    1996-01-01

    Attempts to renature proteins often yield aggregates rather than native protein. To minimize aggregation, low protein concentrations and/or solubilizing agents are used. Here, we test new solubilizing molecules, non-detergent sulphobetaines, to improve the renaturation of two very different enzymes, hen egg white lysozyme and bacterial beta-D-galactosidase. The renaturation was conducted in the presence of five different sulphobetaines and the yield of active enzyme was measured. The five sulphobetaines improved the yield of native lysozyme up to 12-fold. Some sulphobetaines improved the yield of galactosidase up to 80-fold, but one reduced it 100-fold. Non-detergent sulphobetaines strongly affect the balance between aggregation and folding. Their effect depends on their structure and on their interactions with folding intermediates. These results should serve as a basis for designing more efficient sulphobetaines; for designing improved renaturation protocols using existing sulphobetaines; and for characterizing folding intermediates that interact with sulphobetaines.

  20. Folding and stability of helical bundle proteins from coarse-grained models.

    PubMed

    Kapoor, Abhijeet; Travesset, Alex

    2013-07-01

    We develop a coarse-grained model where solvent is considered implicitly, electrostatics are included as short-range interactions, and side-chains are coarse-grained to a single bead. The model depends on three main parameters: hydrophobic, electrostatic, and side-chain hydrogen bond strength. The parameters are determined by considering three levels of approximation and characterizing the folding for three selected proteins (training set). Nine additional proteins (containing up to 126 residues) as well as mutated versions (test set) are folded with the given parameters. In all folding simulations, the initial state is a random coil configuration. Besides the native state, some proteins fold into an additional state differing in the topology (structure of the helical bundle). We discuss the stability of the native states, and compare the dynamics of our model to all-atom molecular dynamics simulations as well as some general properties on the interactions governing folding dynamics. Copyright © 2013 Wiley Periodicals, Inc.

  1. GPCR-I-TASSER: A hybrid approach to G protein-coupled receptor structure modeling and the application to the human genome

    PubMed Central

    Zhang, Jian; Yang, Jianyi; Jang, Richard; Zhang, Yang

    2015-01-01

    Experimental structure determination remains very difficult for G protein-coupled receptors (GPCRs). We propose a new hybrid protocol to construct GPCR structure models that integrates experimental mutagenesis data with ab initio transmembrane (TM) helix assembly simulations. The method was tested on 24 known GPCRs where the ab initio TM-helix assembly procedure constructed the correct fold for 20 cases. When combined with weak-homology and sparse mutagenesis restraints, the method generated correct folds for all the tested cases with an average C-alpha RMSD of 2.4 Å in the TM-regions. The new hybrid protocol was applied to model all 1026 GPCRs in the human genome, of which 923 have a high confidence score and are expected to have correct folds; these contain many pharmaceutically important families with no previously solved structures, including Trace amine, Prostanoids, Releasing hormones, Melanocortins, Vasopressin and Neuropeptide Y receptors. The results demonstrate new progress on genome-wide structure modeling of transmembrane proteins. PMID:26190572

  2. The Proteome Folding Project: Proteome-scale prediction of structure and function

    PubMed Central

    Drew, Kevin; Winters, Patrick; Butterfoss, Glenn L.; Berstis, Viktors; Uplinger, Keith; Armstrong, Jonathan; Riffle, Michael; Schweighofer, Erik; Bovermann, Bill; Goodlett, David R.; Davis, Trisha N.; Shasha, Dennis; Malmström, Lars; Bonneau, Richard

    2011-01-01

    The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition, and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and three-dimensional (3D) structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, Escherichia coli, and worm). De novo structure predictions were distributed on a grid of more than 1.5 million CPUs worldwide (World Community Grid). We generated significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions. PMID:21824995

  3. Adenovirus fibre shaft sequences fold into the native triple beta-spiral fold when N-terminally fused to the bacteriophage T4 fibritin foldon trimerisation motif.

    PubMed

    Papanikolopoulou, Katerina; Teixeira, Susana; Belrhali, Hassan; Forsyth, V Trevor; Mitraki, Anna; van Raaij, Mark J

    2004-09-03

    Adenovirus fibres are trimeric proteins that consist of a globular C-terminal domain, a central fibrous shaft and an N-terminal part that attaches to the viral capsid. In the presence of the globular C-terminal domain, which is necessary for correct trimerisation, the shaft segment adopts a triple beta-spiral conformation. We have replaced the head of the fibre by the trimerisation domain of the bacteriophage T4 fibritin, the foldon. Two different fusion constructs were made and crystallised, one with an eight amino acid residue linker and one with a linker of only two residues. X-ray crystallographic studies of both fusion proteins show that residues 319-391 of the adenovirus type 2 fibre shaft fold into a triple beta-spiral fold indistinguishable from the native structure, although this is now resolved at a higher resolution of 1.9 Å. The foldon residues 458-483 also adopt their natural structure. The intervening linkers are not well ordered in the crystal structures. This work shows that the shaft sequences retain their capacity to fold into their native beta-spiral fibrous fold when fused to a foreign C-terminal trimerisation motif. It provides a structural basis to artificially trimerise longer adenovirus shaft segments and segments from other trimeric beta-structured fibre proteins. Such artificial fibrous constructs, amenable to crystallisation and solution studies, can offer tractable model systems for the study of beta-fibrous structure. They can also prove useful for gene therapy and fibre engineering applications.

  4. Predicting folding-unfolding transitions in proteins without a priori knowledge of the folded state

    NASA Astrophysics Data System (ADS)

    Okan, Osman; Turgut, Deniz; Garcia, Angel; Ozisik, Rahmi

    2013-03-01

    The common computational method of studying folding transitions in proteins is to compare simulated conformations against the folded structure, but this method obviously requires the folded structure to be known beforehand. In the current study, we show that the use of the bond orientational order parameter (BOOP) Q_l [Steinhardt PJ, Nelson DR, Ronchetti M, Phys. Rev. B 1983, 28, 784] is a viable alternative to the commonly adopted root mean squared distance (RMSD) measure in probing conformational transitions. Replica exchange molecular dynamics simulations of the trp-cage protein (with 20 residues) in TIP-3P water were used to compare BOOP against RMSD. The results indicate that the correspondence between BOOP and RMSD time series becomes stronger with increasing l. We finally show that robust linear models that incorporate different Q_l can be parameterized from a given replica run and can be used to study other replica trajectories. This work is partially supported by NSF DUE-1003574.
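
As an illustration of the order parameter named in the record above, the sketch below evaluates the Steinhardt Q_l for a set of bond vectors. It relies on the spherical-harmonic addition theorem, which for equally weighted bonds reduces Q_l^2 to the average of the Legendre polynomial P_l over all pairs of bond directions. The function name `steinhardt_q` is hypothetical, and how "bonds" are chosen for a protein (for example, vectors between nearby C-alpha atoms) is an assumption that the abstract does not specify.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def steinhardt_q(bonds, l):
    """Steinhardt bond orientational order parameter Q_l.

    bonds: (N, 3) array of bond vectors (any length; only directions matter).
    Uses Q_l^2 = (1/N^2) * sum_{i,j} P_l(cos gamma_ij), which follows from
    the addition theorem for spherical harmonics.
    """
    b = np.asarray(bonds, dtype=float)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)  # unit bond directions
    cos_gamma = np.clip(b @ b.T, -1.0, 1.0)           # pairwise cosines
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0                                   # selects P_l in legval
    q_sq = legval(cos_gamma, coeffs).mean()
    return float(np.sqrt(max(q_sq, 0.0)))             # guard tiny negatives

# Six bonds pointing to the vertices of an octahedron (simple-cubic neighbors).
octahedron = [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]]
print(round(steinhardt_q(octahedron, 4), 3))  # 0.764
print(round(steinhardt_q(octahedron, 6), 3))  # 0.354
```

Because Q_l depends only on the bond directions within a single conformation, no reference structure is needed, which is the property the abstract contrasts with RMSD.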

  5. Enhanced Wang Landau sampling of adsorbed protein conformations.

    PubMed

    Radhakrishna, Mithun; Sharma, Sumit; Kumar, Sanat K

    2012-03-21

    Using computer simulations to model the folding of proteins into their native states is computationally expensive due to the extraordinarily low degeneracy of the ground state. In this paper, we develop an efficient way to sample these folded conformations using Wang Landau sampling coupled with the configurational bias method (which uses an unphysical "temperature" that lies between the collapse and folding transition temperatures of the protein). This method speeds up the folding process by roughly an order of magnitude over existing algorithms for the sequences studied. We apply this method to study the adsorption of intrinsically disordered hydrophobic polar protein fragments on a hydrophobic surface. We find that these fragments, which are unstructured in the bulk, acquire secondary structure upon adsorption onto a strong hydrophobic surface. Apparently, the presence of a hydrophobic surface allows these random coil fragments to fold by providing hydrophobic contacts that were lost in protein fragmentation. © 2012 American Institute of Physics

  6. The Energy Landscapes of Repeat-Containing Proteins: Topology, Cooperativity, and the Folding Funnels of One-Dimensional Architectures

    PubMed Central

    Komives, Elizabeth A.; Wolynes, Peter G.

    2008-01-01

    Repeat-proteins are made up of near repetitions of 20– to 40–amino acid stretches. These polypeptides usually fold up into non-globular, elongated architectures that are stabilized by the interactions within each repeat and those between adjacent repeats, but that lack contacts between residues distant in sequence. The inherent symmetries both in primary sequence and three-dimensional structure are reflected in a folding landscape that may be analyzed as a quasi–one-dimensional problem. We present a general description of repeat-protein energy landscapes based on a formal Ising-like treatment of the elementary interaction energetics in and between foldons, whose collective ensemble are treated as spin variables. The overall folding properties of a complete “domain” (the stability and cooperativity of the repeating array) can be derived from this microscopic description. The one-dimensional nature of the model implies there are simple relations for the experimental observables: folding free-energy (ΔGwater) and the cooperativity of denaturation (m-value), which do not ordinarily apply for globular proteins. We show how the parameters for the “coarse-grained” description in terms of foldon spin variables can be extracted from more detailed folding simulations on perfectly funneled landscapes. To illustrate the ideas, we present a case-study of a family of tetratricopeptide (TPR) repeat proteins and quantitatively relate the results to the experimentally observed folding transitions. Based on the dramatic effect that single point mutations exert on the experimentally observed folding behavior, we speculate that natural repeat proteins are “poised” at particular ratios of inter- and intra-element interaction energetics that allow them to readily undergo structural transitions in physiologically relevant conditions, which may be intrinsically related to their biological functions. PMID:18483553
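
The quasi-one-dimensional treatment described in the record above can be illustrated with a generic nearest-neighbor Ising-like model. The sketch below is an assumption-laden toy, not the authors' parameterization: each repeat is either unfolded (0) or folded (1), a folded repeat contributes an intrinsic free energy `eps`, and two adjacent folded repeats contribute a coupling `j`. The partition function is accumulated one repeat at a time, which is the transfer-matrix recursion for a 1D chain.

```python
import math

def partition_function(n, eps, j, beta=1.0):
    """Partition function of n two-state repeats with nearest-neighbor coupling.

    Energy of a configuration s (s_i in {0, 1}):
        E = eps * sum_i s_i + j * sum_i s_i * s_{i+1}
    eps, j and beta are hypothetical illustration parameters.
    """
    # v[s] = total Boltzmann weight of chains so far whose last repeat is in state s.
    v = [1.0, math.exp(-beta * eps)]
    for _ in range(n - 1):
        v = [
            v[0] + v[1],                                              # next repeat unfolded
            (v[0] + v[1] * math.exp(-beta * j)) * math.exp(-beta * eps),  # next repeat folded
        ]
    return v[0] + v[1]

def fully_folded_population(n, eps, j, beta=1.0):
    """Equilibrium probability that all n repeats are folded."""
    e_folded = n * eps + (n - 1) * j
    return math.exp(-beta * e_folded) / partition_function(n, eps, j, beta)

# An unfavorable intrinsic term with a favorable interface coupling:
# the fully folded population grows as the coupling becomes more stabilizing.
for coupling in (0.0, -2.0, -4.0):
    print(coupling, fully_folded_population(6, 1.0, coupling))
```

In this toy picture the balance between the intrinsic term and the interface term controls how sharply the array switches between unfolded and folded ensembles, which is the kind of inter- versus intra-element ratio the abstract speculates about.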

  7. Revisiting the NMR structure of the ultrafast downhill folding protein gpW from bacteriophage λ.

    PubMed

    Sborgi, Lorenzo; Verma, Abhinav; Muñoz, Victor; de Alba, Eva

    2011-01-01

    GpW is a 68-residue protein from bacteriophage λ that participates in virus head morphogenesis. Previous NMR studies revealed a novel α+β fold for this protein. Recent experiments have shown that gpW folds in microseconds by crossing a marginal free energy barrier (i.e., downhill folding). These features make gpW a highly desirable target for further experimental and computational folding studies. As a step in that direction, we have re-determined the high-resolution structure of gpW by multidimensional NMR on a construct that eliminates the purification tags and unstructured C-terminal tail present in the prior study. In contrast to the previous work, we have obtained a full manual assignment and calculated the structure using only unambiguous distance restraints. This new structure confirms the α+β topology, but reveals important differences in tertiary packing. Namely, the two α-helices are rotated along their main axis to form a leucine zipper. The β-hairpin is orthogonal to the helical interface rather than parallel, displaying most tertiary contacts through strand 1. There also are differences in secondary structure: longer and less curved helices and a hairpin that now shows the typical right-hand twist. Molecular dynamics simulations starting from both gpW structures, and calculations with CS-Rosetta, all converge to our gpW structure. This confirms that the original structure has strange tertiary packing and strained secondary structure. A comparison of NMR datasets suggests that the problems were mainly caused by incomplete chemical shift assignments, mistakes in NOE assignment and the inclusion of ambiguous distance restraints during the automated procedure used in the original study. The new gpW corrects these problems, providing the appropriate structural reference for future work. Furthermore, our results are a cautionary tale against the inclusion of ambiguous experimental information in the determination of protein structures.

  8. The role of atomic level steric effects and attractive forces in protein folding.

    PubMed

    Lammert, Heiko; Wolynes, Peter G; Onuchic, José N

    2012-02-01

    Protein folding into tertiary structures is controlled by an interplay of attractive contact interactions and steric effects. We investigate the balance between these contributions using structure-based models that combine an all-atom representation of the structure with a coarse-grained contact potential. Tertiary contact interactions between atoms are collected into a single broad attractive well between the Cβ atoms of each residue pair in a native contact. Through the width of these contact potentials we control their tolerance for deviations from the ideal structure and the spatial range of attractive interactions. In the compact native state dominant packing constraints limit the effects of a coarse-grained contact potential. During folding, however, the broad attractive potentials allow an early collapse that starts before the native local structure is completely adopted. As a consequence the folding transition is broadened and the free energy barrier is decreased. Eventually two-state folding behavior is lost completely for systems with very broad attractive potentials. The stabilization of native-like residue interactions in non-perfect geometries early in the folding process frequently leads to structural traps. Global mirror images are a notable example. These traps are penalized by the details of the repulsive interactions only after further collapse. Successful folding to the native state requires simultaneous guidance from both attractive and repulsive interactions. Copyright © 2011 Wiley Periodicals, Inc.

  9. Balancing energy and entropy: A minimalist model for the characterization of protein folding landscapes

    PubMed Central

    Das, Payel; Matysiak, Silvina; Clementi, Cecilia

    2005-01-01

    Coarse-grained models have been extremely valuable in promoting our understanding of protein folding. However, the quantitative accuracy of existing simplified models is strongly hindered either from the complete removal of frustration (as in the widely used Gō-like models) or from the compromise with the minimal frustration principle and/or realistic protein geometry (as in the simple on-lattice models). We present a coarse-grained model that “naturally” incorporates sequence details and energetic frustration into an overall minimally frustrated folding landscape. The model is coupled with an optimization procedure to design the parameters of the protein Hamiltonian to fold into a desired native structure. The application to the study of src-Src homology 3 domain shows that this coarse-grained model contains the main physical-chemical ingredients that are responsible for shaping the folding landscape of this protein. The results illustrate the importance of nonnative interactions and energetic heterogeneity for a quantitative characterization of folding mechanisms. PMID:16006532

  10. Discrete Molecular Dynamics Approach to the Study of Disordered and Aggregating Proteins.

    PubMed

    Emperador, Agustí; Orozco, Modesto

    2017-03-14

    We present a refinement of the Coarse Grained PACSAB force field for Discrete Molecular Dynamics (DMD) simulations of proteins in aqueous conditions. Like the original version, the refined method provides a good representation of the structure and dynamics of folded proteins but provides much better representations of a variety of unfolded proteins, including some very large ones that are impossible to analyze by atomistic simulation methods. The PACSAB/DMD method also accurately reproduces aggregation properties, providing good pictures of the structural ensembles of proteins showing a folded core and an intrinsically disordered region. The combination of accuracy and speed makes the method presented here a good alternative for the exploration of unstructured protein systems.

  11. The IntFOLD server: an integrated web resource for protein fold recognition, 3D model quality assessment, intrinsic disorder prediction, domain prediction and ligand binding site prediction.

    PubMed

    Roche, Daniel B; Buenavista, Maria T; Tetchner, Stuart J; McGuffin, Liam J

    2011-07-01

    The IntFOLD server is a novel independent server that integrates several cutting edge methods for the prediction of structure and function from sequence. Our guiding principles behind the server development were as follows: (i) to provide a simple unified resource that makes our prediction software accessible to all and (ii) to produce integrated output for predictions that can be easily interpreted. The output for predictions is presented as a simple table that summarizes all results graphically via plots and annotated 3D models. The raw machine readable data files for each set of predictions are also provided for developers, which comply with the Critical Assessment of Methods for Protein Structure Prediction (CASP) data standards. The server comprises an integrated suite of five novel methods: nFOLD4, for tertiary structure prediction; ModFOLD 3.0, for model quality assessment; DISOclust 2.0, for disorder prediction; DomFOLD 2.0 for domain prediction; and FunFOLD 1.0, for ligand binding site prediction. Predictions from the IntFOLD server were found to be competitive in several categories in the recent CASP9 experiment. The IntFOLD server is available at the following web site: http://www.reading.ac.uk/bioinf/IntFOLD/.

  12. Nicked apomyoglobin: a noncovalent complex of two polypeptide fragments comprising the entire protein chain.

    PubMed

    Musi, Valeria; Spolaore, Barbara; Picotti, Paola; Zambonin, Marcello; De Filippis, Vincenzo; Fontana, Angelo

    2004-05-25

    Limited proteolysis of the 153-residue chain of horse apomyoglobin (apoMb) by thermolysin results in the selective cleavage of the peptide bond Pro88-Leu89. The N-terminal (residues 1-88) and C-terminal (residues 89-153) fragments of apoMb were isolated to homogeneity and their conformational and association properties investigated in detail. Far-UV circular dichroism (CD) measurements revealed that both fragments in isolation acquire a high content of helical secondary structure, while near-UV CD indicated the absence of tertiary structure. A 1:1 mixture of the fragments leads to a tight noncovalent protein complex (1-88/89-153, nicked apoMb), characterized by secondary and tertiary structures similar to those of intact apoMb. The apoMb complex binds heme in a nativelike manner, as given by CD measurements in the Soret region. Second-derivative absorption spectra in the 250-300 nm region provided evidence that the degree of exposure of Tyr residues in the nicked species is similar to that of the intact protein at neutral pH. Also, the microenvironment of Trp residues, located in positions 7 and 14 of the 153-residue chain of the protein, is similar in both protein species, as given by fluorescence emission data. Moreover, in analogy to intact apoMb, the nicked protein binds the hydrophobic dye 1-anilinonaphthalene-8-sulfonate (ANS). Taken together, our results indicate that the two proteolytic fragments 1-88 and 89-153 of apoMb adopt partly folded states characterized by sufficiently nativelike conformational features that promote their specific association and mutual stabilization into a nicked protein species much resembling in its structural features intact apoMb. It is suggested that the formation of a noncovalent complex upon fragment complementation can mimic the protein folding process of the entire protein chain, with the difference that the folding of the complementary fragments is an intermolecular process. 
    In particular, this study emphasizes the importance of interactions between marginally stable elements of secondary structure in promoting the tertiary contacts of a native protein. Considering that apoMb has been extensively used as a paradigm in protein folding studies for the past few decades, the novel fragment complementing system of apoMb here described appears to be very useful for investigating the initial as well as late events in protein folding.

  13. Formation of highly stable chimeric trimers by fusion of an adenovirus fiber shaft fragment with the foldon domain of bacteriophage t4 fibritin.

    PubMed

    Papanikolopoulou, Katerina; Forge, Vincent; Goeltz, Pierrette; Mitraki, Anna

    2004-03-05

    The folding of beta-structured, fibrous proteins is a largely unexplored area. A class of such proteins is used by viruses as adhesins, and recent studies revealed novel beta-structured motifs for them. We have been studying the folding and assembly of adenovirus fibers that consist of a globular C-terminal domain, a central fibrous shaft, and an N-terminal part that attaches to the viral capsid. The globular C-terminal, or "head" domain, has been postulated to be necessary for the trimerization of the fiber and might act as a registration signal that directs its correct folding and assembly. In this work, we replaced the head of the fiber by the trimerization domain of the bacteriophage T4 fibritin, termed "foldon." Two chimeric proteins, comprising the foldon domain connected at the C-terminal end of four fiber shaft repeats with or without the use of a natural linker sequence, fold into highly stable, SDS-resistant trimers. The structural signatures of the chimeric proteins as seen by CD and infrared spectroscopy are reported. The results suggest that the foldon domain can successfully replace the fiber head domain in ensuring correct trimerization of the shaft sequences. Biological implications and implications for engineering highly stable, beta-structured nanorods are discussed.

  14. Atomistic structural ensemble refinement reveals non-native structure stabilizes a sub-millisecond folding intermediate of CheY

    NASA Astrophysics Data System (ADS)

    Shi, Jade; Nobrega, R. Paul; Schwantes, Christian; Kathuria, Sagar V.; Bilsel, Osman; Matthews, C. Robert; Lane, T. J.; Pande, Vijay S.

    2017-03-01

    The dynamics of globular proteins can be described in terms of transitions between a folded native state and less-populated intermediates, or excited states, which can play critical roles in both protein folding and function. Excited states are by definition transient species, and therefore are difficult to characterize using current experimental techniques. Here, we report an atomistic model of the excited state ensemble of a stabilized mutant of an extensively studied flavodoxin fold protein CheY. We employed a hybrid simulation and experimental approach in which an aggregate 42 milliseconds of all-atom molecular dynamics were used as an informative prior for the structure of the excited state ensemble. This prior was then refined against small-angle X-ray scattering (SAXS) data employing an established method (EROS). The most striking feature of the resulting excited state ensemble was an unstructured N-terminus stabilized by non-native contacts in a conformation that is topologically simpler than the native state. Using these results, we then predict incisive single molecule FRET experiments as a means of model validation. This study demonstrates the paradigm of uniting simulation and experiment in a statistical model to study the structure of protein excited states and rationally design validating experiments.

  15. Context-dependent effects of asparagine glycosylation on Pin WW folding kinetics and thermodynamics.

    PubMed

    Price, Joshua L; Shental-Bechor, Dalit; Dhar, Apratim; Turner, Maurice J; Powers, Evan T; Gruebele, Martin; Levy, Yaakov; Kelly, Jeffery W

    2010-11-03

    Asparagine glycosylation is one of the most common and important post-translational modifications of proteins in eukaryotic cells. N-glycosylation occurs when a triantennary glycan precursor is transferred en bloc to a nascent polypeptide (harboring the N-X-T/S sequon) as the peptide is cotranslationally translocated into the endoplasmic reticulum (ER). In addition to facilitating binding interactions with components of the ER proteostasis network, N-glycans can also have intrinsic effects on protein folding by directly altering the folding energy landscape. Previous work from our laboratories (Hanson et al. Proc. Natl. Acad. Sci. U.S.A. 2009, 109, 3131-3136; Shental-Bechor, D.; Levy, Y. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 8256-8261) suggested that the three sugar residues closest to the protein are sufficient for accelerating protein folding and stabilizing the resulting structure in vitro; even a monosaccharide can have a dramatic effect. The highly conserved nature of these three proximal sugars in N-glycans led us to speculate that introducing an N-glycosylation site into a protein that is not normally glycosylated would stabilize the protein and increase its folding rate in a manner that does not depend on the presence of specific stabilizing protein-saccharide interactions. Here, we test this hypothesis experimentally and computationally by incorporating an N-linked GlcNAc residue at various positions within the Pin WW domain, a small β-sheet-rich protein. The results show that an increased folding rate and enhanced thermodynamic stability are not general, context-independent consequences of N-glycosylation. Comparison between computational predictions and experimental observations suggests that generic glycan-based excluded volume effects are responsible for the destabilizing effect of glycosylation at highly structured positions. 
    However, this reasoning does not adequately explain the observed destabilizing effect of glycosylation within flexible loops. Our data are consistent with the hypothesis that specific, evolved protein-glycan contacts must also play an important role in mediating the beneficial energetic effects on protein folding that glycosylation can confer.

  16. CASP10-BCL::Fold efficiently samples topologies of large proteins.

    PubMed

    Heinze, Sten; Putnam, Daniel K; Fischer, Axel W; Kohlmann, Tim; Weiner, Brian E; Meiler, Jens

    2015-03-01

    During CASP10 in summer 2012, we tested BCL::Fold for prediction of free modeling (FM) and template-based modeling (TBM) targets. BCL::Fold assembles the tertiary structure of a protein from predicted secondary structure elements (SSEs) omitting more flexible loop regions early on. This approach enables the sampling of conformational space for larger proteins with more complex topologies. In preparation for CASP11, we analyzed the quality of CASP10 models throughout the prediction pipeline to understand BCL::Fold's ability to sample the native topology, identify native-like models by scoring and/or clustering approaches, and our ability to add loop regions and side chains to initial SSE-only models. The standout observation is that BCL::Fold sampled topologies with a GDT_TS score > 33% for 12 of 18 and with a topology score > 0.8 for 11 of 18 test cases de novo. Despite the sampling success of BCL::Fold, significant challenges still exist in clustering and loop generation stages of the pipeline. The clustering approach employed for model selection often failed to identify the most native-like assembly of SSEs for further refinement and submission. It was also observed that for some β-strand proteins model refinement failed as β-strands were not properly aligned to form hydrogen bonds, removing otherwise accurate models from the pool. Further, BCL::Fold frequently samples non-natural topologies that require loop regions to pass through the center of the protein. © 2015 Wiley Periodicals, Inc.

  17. Lattice model simulation of interchain protein interactions and the folding dynamics and dimerization of the GCN4 Leucine zipper

    NASA Astrophysics Data System (ADS)

    Liu, Yanxin; Chapagain, Prem P.; Parra, Jose L.; Gerstman, Bernard S.

    2008-01-01

    The highest level in the hierarchy of protein structure and folding is the formation of protein complexes through protein-protein interactions. We have made modifications to a well established computer lattice model to expand its applicability to two-protein dimerization and aggregation. Based on Brownian dynamics, we implement translation and rotation moves of two peptide chains relative to each other, in addition to the intrachain motions already present in the model. We use this two-chain model to study the folding dynamics of the yeast transcription factor GCN4 leucine zipper. The calculated heat capacity curves agree well with experimental measurements. Free energy landscapes and median first passage times for the folding process are calculated and elucidate experimentally measured characteristics such as the multistate nature of the dimerization process.

  18. Structural Conservation of the Myoviridae Phage Tail Sheath Protein Fold

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aksyuk, Anastasia A.; Kurochkina, Lidia P.; Fokine, Andrei

    2012-02-21

    Bacteriophage phiKZ is a giant phage that infects Pseudomonas aeruginosa, a human pathogen. The phiKZ virion consists of a 1450 Å diameter icosahedral head and a 2000 Å-long contractile tail. The structure of the whole virus was previously reported, showing that its tail organization in the extended state is similar to the well-studied Myovirus bacteriophage T4 tail. The crystal structure of a tail sheath protein fragment of phiKZ was determined to 2.4 Å resolution. Furthermore, crystal structures of two prophage tail sheath proteins were determined to 1.9 and 3.3 Å resolution. Despite low sequence identity between these proteins, all of these structures have a similar fold. The crystal structure of the phiKZ tail sheath protein has been fitted into cryo-electron-microscopy reconstructions of the extended tail sheath and of a polysheath. The structural rearrangement of the phiKZ tail sheath contraction was found to be similar to that of phage T4.

  19. Beta-structures in fibrous proteins.

    PubMed

    Kajava, Andrey V; Squire, John M; Parry, David A D

    2006-01-01

    The beta-form of protein folding, one of the earliest protein structures to be defined, was originally observed in studies of silks. It was then seen in early studies of synthetic polypeptides and, of course, is now known to be present in a variety of guises as an essential component of globular protein structures. However, in the last decade or so it has become clear that the beta-conformation of chains is present not only in many of the amyloid structures associated with, for example, Alzheimer's Disease, but also in the prion structures associated with the spongiform encephalopathies. Furthermore, X-ray crystallography studies have revealed the high incidence of the beta-fibrous proteins among virulence factors of pathogenic bacteria and viruses. Here we describe the basic forms of the beta-fold, summarize the many different new forms of beta-structural fibrous arrangements that have been discovered, and review advances in structural studies of amyloid and prion fibrils. These and other issues are described in detail in later chapters.

  20. Cofactor-binding sites in proteins of deviating sequence: comparative analysis and clustering in torsion angle, cavity, and fold space.

    PubMed

    Stegemann, Björn; Klebe, Gerhard

    2012-02-01

    Small molecules are recognized in protein-binding pockets through surface-exposed physicochemical properties. To optimize binding, they have to adopt a conformation corresponding to a local energy minimum within the formed protein-ligand complex. However, their conformational flexibility makes them competent to bind not only to homologous proteins of the same family but also to proteins of remote similarity with respect to the shape of the binding pockets and folding pattern. Considering drug action, such observations can give rise to unexpected and undesired cross reactivity. In this study, datasets of six different cofactors (ADP, ATP, NAD(P)(H), FAD, and acetyl CoA, sharing an adenosine diphosphate moiety as common substructure), observed in multiple crystal structures of protein-cofactor complexes exhibiting sequence identity below 25%, have been analyzed for the conformational properties of the bound ligands, the distribution of physicochemical properties in the accommodating protein-binding pockets, and the local folding patterns next to the cofactor-binding site. State-of-the-art clustering techniques have been applied to group the different protein-cofactor complexes in the different spaces. Interestingly, clustering in cavity (Cavbase) and fold space (DALI) reveals virtually the same data structuring. Remarkable relationships can be found among the different spaces. They provide information on how conformations are conserved across the host proteins and which distinct local cavity and fold motifs recognize the different portions of the cofactors. In those cases, where different cofactors are found to be accommodated in a similar fashion to the same fold motifs, only a commonly shared substructure of the cofactors is used for the recognition process. Copyright © 2011 Wiley Periodicals, Inc.

  1. Structural biology of intrinsically disordered proteins: Revisiting unsolved mysteries.

    PubMed

    Sigalov, Alexander B

    2016-06-01

    The emergence of intrinsically disordered proteins (IDPs) has challenged the classical protein structure-function paradigm by introducing a new paradigm of "coupled binding and folding". This paradigm suggests that IDPs fold upon binding to their partners. Further studies, however, revealed a novel and previously unrecognized phenomenon of "uncoupled binding and folding" suggesting that IDPs do not necessarily fold upon interaction with their lipid and protein partners. The complex and often unusual biophysics of IDPs makes structural characterization of these proteins and their complexes not only challenging but also prone to yielding opposite conclusions. For this reason, some crucial questions in this field have remained unsolved for well over a decade. Considering the important role of IDPs in cellular regulation, signaling and control in health and disease, more efforts are needed to solve these mysteries. Here, I focus on two long-standing contradictions in the literature concerning dimerization and membrane-binding activities of IDPs. A molecular explanation of these discrepancies is provided. I also demonstrate how resolution of these critical issues in the field of IDPs results in our expanded understanding of cell function and has multiple applications in biology and medicine. Copyright © 2016 Elsevier B.V. and Société Française de Biochimie et Biologie Moléculaire (SFBBM). All rights reserved.

  2. Analyzing the effect of homogeneous frustration in protein folding.

    PubMed

    Contessoto, Vinícius G; Lima, Debora T; Oliveira, Ronaldo J; Bruni, Aline T; Chahine, Jorge; Leite, Vitor B P

    2013-10-01

    The energy landscape theory has been an invaluable theoretical framework in the understanding of biological processes such as protein folding, oligomerization, and functional transitions. According to the theory, the energy landscape of protein folding is funneled toward the native state, a conformational state that is consistent with the principle of minimal frustration. It has been accepted that real proteins are selected through natural evolution, satisfying the minimum frustration criterion. However, there is evidence that a low degree of frustration accelerates folding. We examined the interplay between topological and energetic protein frustration. We employed a Cα structure-based model for simulations with a controlled nonspecific energetic frustration added to the potential energy function. Thermodynamics and kinetics of a group of 19 proteins are completely characterized as a function of increasing level of energetic frustration. We observed two well-separated groups of proteins: one group where a little frustration enhances folding rates to an optimal value and another where any energetic frustration slows down folding. Protein energetic frustration regimes and their mechanisms are explained by the role of non-native contact interactions in different folding scenarios. These findings strongly correlate with the protein free-energy folding barrier and the absolute contact order parameters. These computational results are corroborated by principal component analysis and partial least square techniques. One simple theoretical model is proposed as a useful tool for experimentalists to predict the limits of improvements in real proteins. Copyright © 2013 Wiley Periodicals, Inc.

  3. Conserved nucleation sites reinforce the significance of Phi value analysis in protein-folding studies.

    PubMed

    Gianni, Stefano; Jemth, Per

    2014-07-01

    The only experimental strategy to address the structure of folding transition states, the so-called Φ value analysis, relies on the synergy between site directed mutagenesis and the measurement of reaction kinetics. Despite its importance, the Φ value analysis has been often criticized and its power to pinpoint structural information has been questioned. In this hypothesis, we demonstrate that comparing the Φ values between proteins not only allows highlighting the robustness of folding pathways but also provides per se a strong validation of the method. © 2014 International Union of Biochemistry and Molecular Biology.

  4. Water promotes the sealing of nanoscale packing defects in folding proteins.

    PubMed

    Fernández, Ariel

    2014-05-21

    A net dipole moment is shown to arise from a non-Debye component of water polarization created by nanoscale packing defects on the protein surface. Accordingly, the protein electrostatic field exerts a torque on the induced dipole, locally impeding the nucleation of ice at the protein-water interface. We evaluate the solvent orientation steering (SOS) as the reversible work needed to align the induced dipoles with the Debye electrostatic field and compute the SOS for the variable interface of a folding protein. The minimization of the SOS is shown to drive protein folding as evidenced by the entrainment of the total free energy by the SOS energy along trajectories that approach a Debye limit state where no torque arises. This result suggests that the minimization of anomalous water polarization at the interface promotes the sealing of packing defects, thereby maintaining structural integrity and committing the protein chain to fold.

  5. SAAFEC: Predicting the Effect of Single Point Mutations on Protein Folding Free Energy Using a Knowledge-Modified MM/PBSA Approach.

    PubMed

    Getov, Ivan; Petukh, Marharyta; Alexov, Emil

    2016-04-07

    Folding free energy is an important biophysical characteristic of proteins that reflects the overall stability of the 3D structure of macromolecules. Changes in the amino acid sequence, naturally occurring or made in vitro, may affect the stability of the corresponding protein and thus could be associated with disease. Several approaches that predict the changes of the folding free energy caused by mutations have been proposed, but there is no method that is clearly superior to the others. The optimal goal is not only to accurately predict the folding free energy changes, but also to characterize the structural changes induced by mutations and the physical nature of the predicted folding free energy changes. Here we report a new method to predict the Single Amino Acid Folding free Energy Changes (SAAFEC) based on a knowledge-modified Molecular Mechanics Poisson-Boltzmann (MM/PBSA) approach. The method is comprised of two main components: a MM/PBSA component and a set of knowledge based terms delivered from a statistical study of the biophysical characteristics of proteins. The predictor utilizes a multiple linear regression model with weighted coefficients of various terms optimized against a set of experimental data. The aforementioned approach yields a correlation coefficient of 0.65 when benchmarked against 983 cases from 42 proteins in the ProTherm database. The webserver can be accessed via http://compbio.clemson.edu/SAAFEC/.

  6. Thermosensitivity of growth is determined by chaperone-mediated proteome reallocation

    PubMed Central

    Chen, Ke; Gao, Ye; Mih, Nathan; O’Brien, Edward J.; Yang, Laurence; Palsson, Bernhard O.

    2017-01-01

    Maintenance of a properly folded proteome is critical for bacterial survival at notably different growth temperatures. Understanding the molecular basis of thermoadaptation has progressed in two main directions, the sequence and structural basis of protein thermostability and the mechanistic principles of protein quality control assisted by chaperones. Yet we do not fully understand how structural integrity of the entire proteome is maintained under stress and how it affects cellular fitness. To address this challenge, we reconstruct a genome-scale protein-folding network for Escherichia coli and formulate a computational model, FoldME, that provides statistical descriptions of multiscale cellular response consistent with many datasets. FoldME simulations show (i) that the chaperones act as a system when they respond to unfolding stress rather than achieving efficient folding of any single component of the proteome, (ii) how the proteome is globally balanced between chaperones for folding and the complex machinery synthesizing the proteins in response to perturbation, (iii) how this balancing determines growth rate dependence on temperature and is achieved through nonspecific regulation, and (iv) how thermal instability of the individual protein affects the overall functional state of the proteome. Overall, these results expand our view of cellular regulation, from targeted specific control mechanisms to global regulation through a web of nonspecific competing interactions that modulate the optimal reallocation of cellular resources. The methodology developed in this study enables genome-scale integration of environment-dependent protein properties and a proteome-wide study of cellular stress responses. PMID:29073085

  7. Design of tryptophan-containing mutants of the symmetrical Pizza protein for biophysical studies.

    PubMed

    Noguchi, Hiroki; Mylemans, Bram; De Zitter, Elke; Van Meervelt, Luc; Tame, Jeremy R H; Voet, Arnout

    2018-03-18

    β-propeller proteins are highly symmetrical, being composed of a repeated motif with four anti-parallel β-sheets arranged around a central axis. Recently we designed the first completely symmetrical β-propeller protein, Pizza6, consisting of six identical tandem repeats. Pizza6 is expected to prove a useful building block for bionanotechnology, and also a tool to investigate the folding and evolution of β-propeller proteins. Folding studies are made difficult by the high stability and the lack of buried Trp residues to act as monitor fluorophores, so we have designed and characterized several Trp-containing Pizza6 derivatives. In total four proteins were designed, of which three could be purified and characterized. Crystal structures confirm these mutant proteins maintain the expected structure, and a clear redshift of Trp fluorescence emission could be observed upon denaturation. Among the derivative proteins, Pizza6-AYW appears to be the most suitable model protein for future folding/unfolding kinetics studies, as its stability is comparable to that of natural β-propeller proteins. Copyright © 2018 Elsevier Inc. All rights reserved.

  8. Large-scale structure prediction by improved contact predictions and model quality assessment.

    PubMed

    Michel, Mirco; Menéndez Hurtado, David; Uziela, Karolis; Elofsson, Arne

    2017-07-15

    Accurate contact predictions can be used for predicting the structure of proteins. Until recently these methods were limited to very big protein families, decreasing their utility. However, recent progress by combining direct coupling analysis with machine learning methods has made it possible to predict accurate contact maps for smaller families. To what extent these predictions can be used to produce accurate models of the families is not known. We present the PconsFold2 pipeline that uses contact predictions from PconsC3, the CONFOLD folding algorithm and model quality estimations to predict the structure of a protein. We show that the model quality estimation significantly increases the number of models that reliably can be identified. Finally, we apply PconsFold2 to 6379 Pfam families of unknown structure and find that PconsFold2 can, with an estimated 90% specificity, predict the structure of up to 558 Pfam families of unknown structure. Out of these, 415 have not been reported before. Datasets as well as models of all the 558 Pfam families are available at http://c3.pcons.net/ . All programs used here are freely available. © The Author 2017. Published by Oxford University Press. All rights reserved.

  9. The topomer-sampling model of protein folding

    PubMed Central

    Debe, Derek A.; Carlson, Matt J.; Goddard, William A.

    1999-01-01

    Clearly, a protein cannot sample all of its conformations (e.g., ≈3¹⁰⁰ ≈ 10⁴⁸ for a 100 residue protein) on an in vivo folding timescale (<1 s). To investigate how the conformational dynamics of a protein can accommodate subsecond folding time scales, we introduce the concept of the native topomer, which is the set of all structures similar to the native structure (obtainable from the native structure through local backbone coordinate transformations that do not disrupt the covalent bonding of the peptide backbone). We have developed a computational procedure for estimating the number of distinct topomers required to span all conformations (compact and semicompact) for a polypeptide of a given length. For 100 residues, we find ≈3 × 10⁷ distinct topomers. Based on the distance calculated between different topomers, we estimate that a 100-residue polypeptide diffusively samples one topomer every ≈3 ns. Hence, a 100-residue protein can find its native topomer by random sampling in just ≈100 ms. These results suggest that subsecond folding of modest-sized, single-domain proteins can be accomplished by a two-stage process of (i) topomer diffusion: random, diffusive sampling of the 3 × 10⁷ distinct topomers to find the native topomer (≈0.1 s), followed by (ii) intratopomer ordering: nonrandom, local conformational rearrangements within the native topomer to settle into the precise native state. PMID:10077555
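
    The order-of-magnitude estimates in the abstract above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted there (three states per residue, as implied by the 3-to-the-100th-power count; about 3 × 10^7 topomers; one topomer sampled roughly every 3 ns). The constant and function names are illustrative assumptions, not taken from the paper.

```python
import math

# Figures quoted in the abstract; all names here are illustrative.
RESIDUES = 100
STATES_PER_RESIDUE = 3        # implied by the 3**100 conformation count
N_TOPOMERS = 3e7              # distinct topomers estimated for 100 residues
SECONDS_PER_TOPOMER = 3e-9    # one topomer sampled roughly every 3 ns


def log10_conformations(residues=RESIDUES, states=STATES_PER_RESIDUE):
    """Base-10 exponent of the total conformation count, states**residues."""
    return residues * math.log10(states)


def topomer_search_time(n_topomers=N_TOPOMERS, dt=SECONDS_PER_TOPOMER):
    """Time in seconds to visit every topomer once at the quoted rate."""
    return n_topomers * dt


if __name__ == "__main__":
    print(f"conformations: about 10^{log10_conformations():.1f}")
    print(f"topomer search: about {topomer_search_time() * 1e3:.0f} ms")
```

    Visiting every topomer once takes on the order of 0.1 s, which is the source of the subsecond figure in the abstract, whereas an exhaustive search over all conformations at the same rate would be longer by roughly forty orders of magnitude.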

  10. Structure and activity of the Pseudomonas aeruginosa hotdog-fold thioesterases PA5202 and PA2801.

    PubMed

    Gonzalez, Claudio F; Tchigvintsev, Anatoli; Brown, Greg; Flick, Robert; Evdokimova, Elena; Xu, Xiaohui; Osipiuk, Jerzy; Cuff, Marianne E; Lynch, Susan; Joachimiak, Andrzej; Savchenko, Alexei; Yakunin, Alexander F

    2012-06-15

    The hotdog fold is one of the basic protein folds widely present in bacteria, archaea and eukaryotes. Many of these proteins exhibit thioesterase activity against fatty acyl-CoAs and play important roles in lipid metabolism, cellular signalling and degradation of xenobiotics. The genome of the opportunistic pathogen Pseudomonas aeruginosa contains over 20 genes encoding predicted hotdog-fold proteins, none of which have been experimentally characterized. We have found that two P. aeruginosa hotdog proteins display high thioesterase activity against 3-hydroxy-3-methylglutaryl-CoA and glutaryl-CoA (PA5202), and octanoyl-CoA (PA2801). Crystal structures of these proteins were solved (at 1.70 and 1.75 Å for PA5202 and PA2801 respectively) and revealed a hotdog fold with a potential catalytic carboxylate residue located on the long α-helix (Asp(57) in PA5202 and Glu(35) in PA2801). Alanine residue replacement mutagenesis of PA5202 identified four residues (Asn(42), Arg(43), Asp(57) and Thr(76)) that are critical for its activity and are located in the active site. A P. aeruginosa PA5202 deletion strain showed an increased secretion of the antimicrobial pigment pyocyanine and an increased expression of genes involved in pyocyanin biosynthesis, suggesting a functional link between PA5202 activity and pyocyanin production. Thus the P. aeruginosa hotdog thioesterases PA5202 and PA2801 have similar structures, but exhibit different substrate preferences and functions.

  11. Proteome-level interplay between folding and aggregation propensities of proteins.

    PubMed

    Tartaglia, Gian Gaetano; Vendruscolo, Michele

    2010-10-08

    With the advent of proteomics, there is an increasing need of tools for predicting the properties of large numbers of proteins by using the information provided by their amino acid sequences, even in the absence of the knowledge of their structures. One of the most important types of predictions concerns whether proteins will fold or aggregate. Here, we study the competition between these two processes by analyzing the relationship between the folding and aggregation propensity profiles for the human and Escherichia coli proteomes. These profiles are calculated, respectively, using the CamFold method, which we introduce in this work, and the Zyggregator method. Our results indicate that the kinetic behavior of proteins is, to a large extent, determined by the interplay between regions of low folding and high aggregation propensities. Copyright © 2010. Published by Elsevier Ltd.

  12. Quality control in the secretory assembly line.

    PubMed Central

    Helenius, A

    2001-01-01

    As a rule, only proteins that have reached a native, folded and assembled structure are transported to their target organelles and compartments within the cell. In the secretory pathway of eukaryotic cells, this type of sorting is particularly important. A variety of molecular mechanisms are involved that distinguish between folded and unfolded proteins, modulate their intracellular transport, and induce degradation if they fail to fold. This phenomenon, called quality control, occurs at several levels and involves different types of folding sensors. The quality control system provides a stringent and versatile molecular sorting system that guarantees fidelity of protein expression in the secretory pathway. PMID:11260794

  13. A limited universe of membrane protein families and folds

    PubMed Central

    Oberai, Amit; Ihm, Yungok; Kim, Sanguk; Bowie, James U.

    2006-01-01

    One of the goals of structural genomics is to obtain a structural representative of almost every fold in nature. A recent estimate suggests that 70%–80% of soluble protein domains identified in the first 1000 genome sequences should be covered by about 25,000 structures—a reasonably achievable goal. As no current estimates exist for the number of membrane protein families, however, it is not possible to know whether family coverage is a realistic goal for membrane proteins. Here we find that virtually all polytopic helical membrane protein families are present in the already known sequences so we can make an estimate of the total number of families. We find that only ∼700 polytopic membrane protein families account for 80% of structured residues and ∼1700 cover 90% of structured residues. While apparently a finite and reachable goal, we estimate that it will likely take more than three decades to obtain the structures needed for 90% residue coverage, if current trends continue. PMID:16815920

  14. Coupling ligand recognition to protein folding in an engineered variant of rabbit ileal lipid binding protein.

    PubMed

    Kouvatsos, Nikolaos; Meldrum, Jill K; Searle, Mark S; Thomas, Neil R

    2006-11-28

    We have engineered a variant of the beta-clam shell protein ILBP which lacks the alpha-helical motif that caps the central binding cavity; the mutant protein is sufficiently destabilised that it is unfolded under physiological conditions, however, it unexpectedly binds its natural bile acid substrates with high affinity forming a native-like beta-sheet rich structure and demonstrating strong thermodynamic coupling between ligand binding and protein folding.

  15. Guiding the folding pathway of DNA origami

    NASA Astrophysics Data System (ADS)

    Dunn, Katherine E.; Dannenberg, Frits; Ouldridge, Thomas E.; Kwiatkowska, Marta; Turberfield, Andrew J.; Bath, Jonathan

    2015-09-01

    DNA origami is a robust assembly technique that folds a single-stranded DNA template into a target structure by annealing it with hundreds of short 'staple' strands. Its guiding design principle is that the target structure is the single most stable configuration. The folding transition is cooperative and, as in the case of proteins, is governed by information encoded in the polymer sequence. A typical origami folds primarily into the desired shape, but misfolded structures can kinetically trap the system and reduce the yield. Although adjusting assembly conditions or following empirical design rules can improve yield, well-folded origami often need to be separated from misfolded structures. The problem could in principle be avoided if assembly pathway and kinetics were fully understood and then rationally optimized. To this end, here we present a DNA origami system with the unusual property of being able to form a small set of distinguishable and well-folded shapes that represent discrete and approximately degenerate energy minima in a vast folding landscape, thus allowing us to probe the assembly process. The obtained high yield of well-folded origami structures confirms the existence of efficient folding pathways, while the shape distribution provides information about individual trajectories through the folding landscape. We find that, similarly to protein folding, the assembly of DNA origami is highly cooperative; that reversible bond formation is important in recovering from transient misfoldings; and that the early formation of long-range connections can very effectively enforce particular folds. We use these insights to inform the design of the system so as to steer assembly towards desired structures. Expanding the rational design process to include the assembly pathway should thus enable more reproducible synthesis, particularly when targeting more complex structures. We anticipate that this expansion will be essential if DNA origami is to continue its rapid development and become a reliable manufacturing technology.

  16. Guiding the folding pathway of DNA origami.

    PubMed

    Dunn, Katherine E; Dannenberg, Frits; Ouldridge, Thomas E; Kwiatkowska, Marta; Turberfield, Andrew J; Bath, Jonathan

    2015-09-03

    DNA origami is a robust assembly technique that folds a single-stranded DNA template into a target structure by annealing it with hundreds of short 'staple' strands. Its guiding design principle is that the target structure is the single most stable configuration. The folding transition is cooperative and, as in the case of proteins, is governed by information encoded in the polymer sequence. A typical origami folds primarily into the desired shape, but misfolded structures can kinetically trap the system and reduce the yield. Although adjusting assembly conditions or following empirical design rules can improve yield, well-folded origami often need to be separated from misfolded structures. The problem could in principle be avoided if assembly pathway and kinetics were fully understood and then rationally optimized. To this end, here we present a DNA origami system with the unusual property of being able to form a small set of distinguishable and well-folded shapes that represent discrete and approximately degenerate energy minima in a vast folding landscape, thus allowing us to probe the assembly process. The obtained high yield of well-folded origami structures confirms the existence of efficient folding pathways, while the shape distribution provides information about individual trajectories through the folding landscape. We find that, similarly to protein folding, the assembly of DNA origami is highly cooperative; that reversible bond formation is important in recovering from transient misfoldings; and that the early formation of long-range connections can very effectively enforce particular folds. We use these insights to inform the design of the system so as to steer assembly towards desired structures. Expanding the rational design process to include the assembly pathway should thus enable more reproducible synthesis, particularly when targeting more complex structures. We anticipate that this expansion will be essential if DNA origami is to continue its rapid development and become a reliable manufacturing technology.

  17. The intracellular region of ClC-3 chloride channel is in a partially folded state and a monomer.

    PubMed

    Li, Shu Jie; Kawazaki, Masanobu; Ogasahara, Kyoko; Nakagawa, Atsushi

    2006-05-01

    In contrast to bacterial ClC chloride channels, all eukaryotic ClC chloride channels have a conserved long intracellular region that makes up the carboxyl terminus of the protein and is necessary for channel function as a channel gate. Little is known so far, however, about the molecular structure of the intracellular region of ClC chloride channels. Here, for the first time, we have expressed and purified the intracellular region of the rat ClC-3 chloride channel (C-ClC-3) as a water-soluble protein under physiological conditions, and investigated its structural characteristics and assembly behavior by means of circular dichroism (CD) spectroscopy, differential scanning calorimetry (DSC), size exclusion chromatography and analytical ultracentrifugation. The far-UV CD spectra of C-ClC-3 in the native state and in the presence of urea clearly show that the protein has a significantly folded secondary structure consisting of alpha-helices and beta-sheets, while the near-UV CD spectra and DSC experiments indicate the protein is deficient in well-defined tertiary packing. Its Stokes radius, as determined by size exclusion chromatography, is larger than expected for a folded globular protein of its size. Furthermore, the DisEMBL program, a useful computational tool for the prediction of disordered/unstructured regions within a protein sequence, predicts that the protein is in a partially folded state. Based on these results, we conclude that C-ClC-3 is partially folded. On the other hand, both size exclusion chromatography and sedimentation equilibrium analysis show that C-ClC-3 exists as a monomer in solution, not a dimer like the whole ClC-3 molecule.

  18. Physical-chemical features of non-detergent sulfobetaines active as protein-folding helpers.

    PubMed

    Expert-Bezançon, Nicole; Rabilloud, Thierry; Vuillard, Laurent; Goldberg, Michel E

    2003-01-01

    Some non-detergent sulfobetaines had been shown to prevent aggregation and improve the yield of active proteins when added to the buffer during in vitro protein renaturation. With the aim of designing more efficient folding helpers, a series of non-detergent sulfobetaines has been synthesized and their efficiency in improving the renaturation of a variety of proteins (E. coli tryptophan synthase and beta-D-galactosidase, hen lysozyme, bovine serum albumin, a monoclonal antibody) has been investigated. Attempts to correlate the structure of each sulfobetaine with its effect on folding revealed some molecular features that appear important in helping renaturation. This enabled us to design and synthesize new non-detergent sulfobetaines that act as potent folding helpers.

  19. Structure Prediction and Analysis of DNA Transposon and LINE Retrotransposon Proteins

    PubMed Central

    Abrusán, György; Zhang, Yang; Szilágyi, András

    2013-01-01

    Despite the considerable amount of research on transposable elements, no large-scale structural analyses of the TE proteome have been performed so far. We predicted the structures of hundreds of proteins from a representative set of DNA and LINE transposable elements and used the obtained structural data to provide the first general structural characterization of TE proteins and to estimate the frequency of TE domestication and horizontal transfer events. We show that 1) ORF1 and Gag proteins of retrotransposons contain high amounts of structural disorder; thus, despite their very low conservation, the presence of disordered regions and probably their chaperone function is conserved. 2) The distribution of SCOP classes in DNA transposons and LINEs indicates that the proteins of DNA transposons are more ancient, containing folds that already existed when the first cellular organisms appeared. 3) DNA transposon proteins have lower contact order than randomly selected reference proteins, indicating rapid folding, most likely to avoid protein aggregation. 4) Structure-based searches for TE homologs indicate that the overall frequency of TE domestication events is low, whereas we found a relatively high number of cases where horizontal transfer, frequently involving parasites, is the most likely explanation for the observed homology. PMID:23530042
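
    Point 3 of the abstract above relies on contact order, a measure of how far apart in sequence the contacting residues of a fold are. The following is a minimal sketch of the standard relative contact order, the mean sequence separation of contacting residue pairs divided by chain length. It assumes one coordinate per residue (for example Cα positions) and an 8 Å distance cutoff; the function name, the cutoff and the toy coordinates in the usage note are illustrative assumptions, not the procedure used in the paper.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_sep=1):
    """Relative contact order of a chain.

    coords  : list of (x, y, z) tuples, one per residue (assumed input format)
    cutoff  : distance in angstroms below which two residues are in contact
    min_sep : smallest sequence separation counted as a contact
    """
    n = len(coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    # Mean sequence separation of contacts, normalised by chain length.
    return total_separation / (n_contacts * n)
```

    As a usage example, a fully extended ten-residue chain with 3.8 Å spacing has only short-range contacts, so its value is low, while a hairpin that folds back on itself brings sequence-distant residues together and scores higher. Lower values are generally associated with faster folding, which is the link the abstract draws.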

  20. Theoretical and computational studies in protein folding, design, and function

    NASA Astrophysics Data System (ADS)

    Morrissey, Michael Patrick

    2000-10-01

    In this work, simplified statistical models are used to understand an array of processes related to protein folding and design. In Part I, lattice models are utilized to test several theories about the statistical properties of protein-like systems. In Part II, sequence analysis and all-atom simulations are used to advance a novel theory for the behavior of a particular protein. Part I is divided into five chapters. In Chapter 2, a method of sequence design for model proteins, based on statistical mechanical first-principles, is developed. The cumulant design method uses a mean-field approximation to expand the free energy of a sequence in temperature. The method successfully designs sequences which fold to a target lattice structure at a specific temperature, a feat which was not possible using previous design methods. The next three chapters are computational studies of the double mutant cycle, which has been used experimentally to predict intra-protein interactions. Complete structure prediction is demonstrated for a model system using exhaustive, and also sub-exhaustive, double mutants. Nonadditivity of enthalpy, rather than of free energy, is proposed and demonstrated to be a superior marker for inter-residue contact. Next, a new double mutant protocol, called exchange mutation, is introduced. Although simple statistical arguments predict exchange mutation to be a more accurate contact predictor than standard mutant cycles, this hypothesis was not upheld in lattice simulations. Reasons for this inconsistency will be discussed. Finally, a multi-chain folding algorithm is introduced. Known as LINKS, this algorithm was developed to test a method of structure prediction which utilizes chain-break mutants. While structure prediction was not successful, LINKS should nevertheless be a useful tool for the study of protein-protein and protein-ligand interactions. 
    The last chapter of Part I utilizes the lattice to explore the differences between standard folding, from the fully denatured state, and cotranslational folding, whereby one end of a protein is synthesized and released before the other. Cotranslational folding is shown to accelerate folding kinetics, particularly when the target backbone contains many local contacts. Additionally, cotranslation is shown capable of "guiding" a model protein into a metastable, local contact-rich state, despite the existence of a true native state of much lower energy. In Part II, a model is developed for the behavior of PrP, a unique mammalian protein which has been shown to possess two native states. The pathogenic "scrapie" state PrPSc, which has not been structurally characterized, is known to trigger conversion of the characterized endogenous conformation PrPC into additional PrPSc. Residues 144-153 are shown to form the most hydrophilic naturally occurring alpha-helix, out of a broad database with more than 10,000 candidates. The novel beta-nucleation model proposes that PrPSc is not a distinct mono-molecular state, but is rather a beta-sheet-like aggregate centered around helix-1 components of multiple PrP molecules. The remainder of Part II uses molecular dynamics simulations to support the beta-nucleation hypothesis, and to propose a system of peptide ligands which may arrest the process of prion propagation.

  1. Chemical Ligation of Folded Recombinant Proteins: Segmental Isotopic Labeling of Domains for NMR Studies

    NASA Astrophysics Data System (ADS)

    Xu, Rong; Ayers, Brenda; Cowburn, David; Muir, Tom W.

    1999-01-01

    A convenient in vitro chemical ligation strategy has been developed that allows folded recombinant proteins to be joined together. This strategy permits segmental, selective isotopic labeling of the product. The src homology type 3 and 2 domains (SH3 and SH2) of Abelson protein tyrosine kinase, which constitute the regulatory apparatus of the protein, were individually prepared in reactive forms that can be ligated together under normal protein-folding conditions to form a normal peptide bond at the ligation junction. This strategy was used to prepare NMR sample quantities of the Abelson protein tyrosine kinase-SH(32) domain pair, in which only one of the domains was labeled with 15N. Mass spectrometry and NMR analyses were used to confirm the structure of the ligated protein, which was also shown to have appropriate ligand-binding properties. The ability to prepare recombinant proteins with selectively labeled segments having a single-site mutation, by using a combination of expression of fusion proteins and chemical ligation in vitro, will increase the size limits for protein structural determination in solution with NMR methods. In vitro chemical ligation of expressed protein domains will also provide a combinatorial approach to the synthesis of linked protein domains.

  2. Synthetic Biology of Proteins: Tuning GFPs Folding and Stability with Fluoroproline

    PubMed Central

    Steiner, Thomas; Hess, Petra; Bae, Jae Hyun; Wiltschi, Birgit; Moroder, Luis; Budisa, Nediljko

    2008-01-01

    Background: Proline residues affect protein folding and stability via cis/trans isomerization of peptide bonds and by the Cγ-exo or -endo puckering of their pyrrolidine rings. Peptide bond conformation as well as puckering propensity can be manipulated by proper choice of ring substituents, e.g. Cγ-fluorination. Synthetic chemistry has routinely exploited ring-substituted proline analogs in order to change, modulate or control folding and stability of peptides. Methodology/Principal Findings: In order to transmit this synthetic strategy to complex proteins, the ten proline residues of enhanced green fluorescent protein (EGFP) were globally replaced by (4R)- and (4S)-fluoroprolines (FPro). By this approach, we expected to affect the cis/trans peptidyl-proline bond isomerization and pyrrolidine ring puckering, which are responsible for the slow folding of this protein. Expression of both protein variants occurred at levels comparable to the parent protein, but the (4R)-FPro-EGFP resulted in irreversibly unfolded inclusion bodies, whereas the (4S)-FPro-EGFP led to a soluble fluorescent protein. Upon thermal denaturation, refolding of this variant occurs at significantly higher rates than the parent EGFP. Comparative inspection of the X-ray structures of EGFP and (4S)-FPro-EGFP allowed us to correlate the significantly improved refolding with the Cγ-endo puckering of the pyrrolidine rings, which is favored by 4S-fluorination, and to a lesser extent with the cis/trans isomerization of the prolines. Conclusions/Significance: We discovered that the folding rates and stability of GFP are affected to a lesser extent by cis/trans isomerization of the proline bonds than by the puckering of pyrrolidine rings. In the Cγ-endo conformation the fluorine atoms are positioned in the structural context of the GFP such that a network of favorable local interactions is established. From these results the combined use of synthetic amino acids along with detailed structural knowledge and existing protein engineering methods can be envisioned as a promising strategy for the design of complex tailor-made proteins and even cellular structures of superior properties compared to the native forms. PMID:18301757

  3. High-resolution structure of a retroviral protease folded as a monomer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilski, Miroslaw; Polish Academy of Sciences, 61-704 Poznan; Kazmierczyk, Maciej

    2011-11-01

    The crystal structure of Mason–Pfizer monkey virus protease folded as a monomer has been solved by molecular replacement using a model generated by players of the online game Foldit. The structure shows at high resolution the details of a retroviral protease folded as a monomer which can guide rational design of protease dimerization inhibitors as retroviral drugs. Mason–Pfizer monkey virus (M-PMV), a D-type retrovirus assembling in the cytoplasm, causes simian acquired immunodeficiency syndrome (SAIDS) in rhesus monkeys. Its pepsin-like aspartic protease (retropepsin) is an integral part of the expressed retroviral polyproteins. As in all retroviral life cycles, release and dimerization of the protease (PR) is strictly required for polyprotein processing and virion maturation. Biophysical and NMR studies have indicated that in the absence of substrates or inhibitors M-PMV PR should fold into a stable monomer, but the crystal structure of this protein could not be solved by molecular replacement despite countless attempts. Ultimately, a solution was obtained in mr-rosetta using a model constructed by players of the online protein-folding game Foldit. The structure indeed shows a monomeric protein, with the N- and C-termini completely disordered. On the other hand, the flap loop, which normally gates access to the active site of homodimeric retropepsins, is clearly traceable in the electron density. The flap has an unusual curled shape and a different orientation from both the open and closed states known from dimeric retropepsins. The overall fold of the protein follows the retropepsin canon, but the Cα deviations are large and the active-site ‘DTG’ loop (here NTG) deviates up to 2.7 Å from the standard conformation. This structure of a monomeric retropepsin determined at high resolution (1.6 Å) provides important extra information for the design of dimerization inhibitors that might be developed as drugs for the treatment of retroviral infections, including AIDS.

  4. PASS2: an automated database of protein alignments organised as structural superfamilies.

    PubMed

    Bhaduri, Anirban; Pugalenthi, Ganesan; Sowdhamini, Ramanathan

    2004-04-02

    The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated structural alignments for evolutionarily related, sequentially distant proteins. An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single-member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position-specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members based on both sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members. The search engine has been improved for simpler browsing of the database. The database resolves alignments among the structural domains consisting of an evolutionarily diverged set of sequences. Availability of reliable sequence alignments of distantly related proteins despite poor sequence identity and single-member superfamilies permit better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. PASS2 is accessible at http://www.ncbs.res.in/~faculty/mini/campass/pass2.html

  5. The Intrinsically Disordered Regions of the Drosophila melanogaster Hox Protein Ultrabithorax Select Interacting Proteins Based on Partner Topology

    PubMed Central

    Hsiao, Hao-Ching; Gonzalez, Kim L.; Catanese, Daniel J.; Jordy, Kristopher E.; Matthews, Kathleen S.; Bondos, Sarah E.

    2014-01-01

    Interactions between structured proteins require a complementary topology and surface chemistry to form sufficient contacts for stable binding. However, approximately one third of protein interactions are estimated to involve intrinsically disordered regions of proteins. The dynamic nature of disordered regions before and, in some cases, after binding calls into question the role of partner topology in forming protein interactions. To understand how intrinsically disordered proteins identify the correct interacting partner proteins, we evaluated interactions formed by the Drosophila melanogaster Hox transcription factor Ultrabithorax (Ubx), which contains both structured and disordered regions. Ubx binding proteins are enriched in specific folds: 23 of its 39 partners include one of 7 folds, out of the 1195 folds recognized by SCOP. For the proteins harboring the two most populated folds, DNA-RNA binding 3-helical bundles and α-α superhelices, the regions of the partner proteins that exhibit these preferred folds are sufficient for Ubx binding. Three disorder-containing regions in Ubx are required to bind these partners. These regions are either alternatively spliced or multiply phosphorylated, providing a mechanism for cellular processes to regulate Ubx-partner interactions. Indeed, partner topology correlates with the ability of individual partner proteins to bind Ubx spliceoforms. Partners bind different disordered regions within Ubx to varying extents, creating the potential for competition between partners and cooperative binding by partners. The ability of partners to bind regions of Ubx that activate transcription and regulate DNA binding provides a mechanism for partners to modulate transcription regulation by Ubx, and suggests that one role of disorder in Ubx is to coordinate multiple molecular functions in response to tissue-specific cues. PMID:25286318

  6. A universal molecular clock of protein folds and its power in tracing the early history of aerobic metabolism and planet oxygenation.

    PubMed

    Wang, Minglei; Jiang, Ying-Ying; Kim, Kyung Mo; Qu, Ge; Ji, Hong-Fang; Mittenthal, Jay E; Zhang, Hong-Yu; Caetano-Anollés, Gustavo

    2011-01-01

    The standard molecular clock describes a constant rate of molecular evolution and provides a powerful framework for evolutionary timescales. Here, we describe the existence and implications of a molecular clock of folds, a universal recurrence in the discovery of new structures in the world of proteins. Using a phylogenomic structural census in hundreds of proteomes, we build phylogenies and time lines of domains at fold and fold superfamily levels of structural complexity. These time lines correlate approximately linearly with geological timescales and were here used to date two crucial events in life history, planet oxygenation and organism diversification. We first dissected the structures and functions of enzymes in simulated metabolic networks. The placement of anaerobic and aerobic enzymes in the time line revealed that aerobic metabolism emerged about 2.9 billion years (giga-annum; Ga) ago and expanded during a period of about 400 My, reaching what is known as the Great Oxidation Event. During this period, enzymes recruited old and new folds for oxygen-mediated enzymatic activities. Remarkably, the first fold lost by a superkingdom disappeared in Archaea 2.6 Ga ago, within the span of oxygen rise, suggesting that oxygen also triggered diversification of life. The implications of a molecular clock of folds are many and important for the neutral theory of molecular evolution and for understanding the growth and diversity of the protein world. The clock also extends the standard concept that was specific to molecules and their timescales and turns it into a universal timescale-generating tool.

  7. 1H, 15N and 13C resonance assignments of the J-domain of co-chaperone Sis1 from Saccharomyces cerevisiae.

    PubMed

    Pinheiro, Glaucia M S; Amorim, Gisele C; Iqbal, Anwar; Ramos, C H I; Almeida, Fabio C L

    2018-04-30

    Protein folding in the cell is usually aided by molecular chaperones, among which the Hsp70 (Hsp = heat shock protein) family has many important roles, such as aiding nascent folding and participating in translocation. Hsp70 has ATPase activity which is stimulated by binding to the J-domain present in co-chaperones from the Hsp40 family. Hsp40s have many functions, for instance binding partially folded proteins for delivery to Hsp70. However, it is the presence of the J-domain that characterizes Hsp40s, which are for this reason also known as J-proteins. The J-domain alone can stimulate Hsp70 ATPase activity. Apparently, it also maintains the same conformation as in the whole protein, although structural information on full J-proteins is still missing. This work reports the 1H, 15N and 13C resonance assignments of the J-domain of an Hsp40 from Saccharomyces cerevisiae, named Sis1. Secondary structure and order parameter prediction from chemical shifts are also reported. Altogether, the data show that the Sis1 J-domain is highly structured and predominantly formed by α-helices, results that are in very good agreement with those previously reported for the crystallographic structure.

  8. Amyloid Formation by Human Carboxypeptidase D Transthyretin-like Domain under Physiological Conditions*

    PubMed Central

    Garcia-Pardo, Javier; Graña-Montes, Ricardo; Fernandez-Mendez, Marc; Ruyra, Angels; Roher, Nerea; Aviles, Francesc X.; Lorenzo, Julia; Ventura, Salvador

    2014-01-01

    Protein aggregation is linked to a growing list of diseases, but it is also an intrinsic property of polypeptides, because the formation of functional globular proteins comes at the expense of an inherent aggregation propensity. Certain proteins can access aggregation-prone states from native-like conformations without the need to cross the energy barrier for unfolding. This is the case of transthyretin (TTR), a homotetrameric protein whose dissociation into its monomers initiates the aggregation cascade. Domains with structural homology to TTR exist in a number of proteins, including the M14B subfamily carboxypeptidases. We show here that the monomeric transthyretin-like domain of human carboxypeptidase D aggregates under close to physiological conditions into amyloid structures, with the population of folded but aggregation-prone states being controlled by the conformational stability of the domain. We thus confirm that the TTR fold keeps a generic residual aggregation propensity upon folding, resulting from the presence of preformed amyloidogenic β-strands in the native state. These structural elements should serve for functional/structural purposes, because they have not been purged out by evolution, but at the same time they put proteins like carboxypeptidase D at risk of aggregation in biological environments and thus can potentially lead to deposition diseases. PMID:25294878

  9. On Ramachandran angles, closed strings and knots in protein structure

    NASA Astrophysics Data System (ADS)

    Chen, Si; Niemi, Antti J.

    2016-08-01

    The Ramachandran angles (φ, ψ) of a protein backbone form the vertices of a piecewise geodesic curve on the surface of a torus. When the ends of the curve are connected to each other similarly, by a geodesic, the result is a closed string that in general wraps around the torus a number of times both in the meridional and the longitudinal directions. The two wrapping numbers are global characteristics of the protein structure. A statistical analysis of the wrapping numbers in terms of crystallographic x-ray structures in the protein data bank (PDB) reveals that proteins have no net chirality in the φ direction but in the ψ direction, proteins prefer to display chirality. A comparison between the wrapping numbers and the concept of folding index discloses a non-linearity in their relationship. Thus these three integer valued invariants can be used in tandem, to scrutinize and classify the global loop structure of individual PDB proteins, in terms of the overall fold topology.
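The wrapping numbers described in this abstract can be illustrated with a minimal sketch. It assumes a flat torus, so that the geodesic between consecutive (φ, ψ) vertices is the shortest arc in each angle taken independently; the function names and the sign convention are hypothetical and are not taken from the paper.

```python
import math


def wrap(delta):
    """Map an angle difference (radians) onto the shortest arc in [-pi, pi)."""
    return (delta + math.pi) % (2.0 * math.pi) - math.pi


def wrapping_numbers(angles):
    """Return (n_phi, n_psi) for the closed curve through the given vertices.

    `angles` is a list of (phi, psi) pairs in radians. Consecutive vertices,
    and the last vertex back to the first, are joined by shortest arcs
    (flat-torus assumption). Each wrapping number is the net signed angle
    travelled in that direction divided by 2*pi.
    """
    total_phi = 0.0
    total_psi = 0.0
    n = len(angles)
    for i in range(n):
        phi0, psi0 = angles[i]
        phi1, psi1 = angles[(i + 1) % n]  # the modulo closes the string
        total_phi += wrap(phi1 - phi0)
        total_psi += wrap(psi1 - psi0)
    two_pi = 2.0 * math.pi
    return round(total_phi / two_pi), round(total_psi / two_pi)


# Toy curve (not real protein data): three steps of 120 degrees in phi.
third = 2.0 * math.pi / 3.0
toy_loop = [(0.0, 0.0), (third, 0.0), (2.0 * third, 0.0)]
print(wrapping_numbers(toy_loop))
```

The toy curve winds once around the φ direction and not at all around ψ; reversing the vertex order flips the sign of the φ wrapping number.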

  10. Computational Modeling of Proteins based on Cellular Automata: A Method of HP Folding Approximation.

    PubMed

    Madain, Alia; Abu Dalhoum, Abdel Latif; Sleit, Azzam

    2018-06-01

    The design of a protein folding approximation algorithm is not straightforward even when a simplified model is used. The folding problem is a combinatorial problem, where approximation and heuristic algorithms are usually used to find near-optimal folds of proteins' primary structures. Approximation algorithms provide guarantees on the distance to the optimal solution. The folding approximation approach proposed here depends on two-dimensional cellular automata (CA) to fold proteins presented in a well-studied simplified model called the hydrophobic-hydrophilic (HP) model. Cellular automata are discrete computational models that rely on local rules to produce some overall global behavior. One-third and one-fourth approximation algorithms choose a subset of the hydrophobic amino acids to form H-H contacts. Those algorithms start with finding a point to fold the protein sequence into two sides where one side ignores H's at even positions and the other side ignores H's at odd positions. In addition, blocks or groups of amino acids fold the same way according to a predefined normal form. We intend to improve approximation algorithms by considering all hydrophobic amino acids and folding based on the local neighborhood instead of using normal forms. The CA does not assume a fixed folding point. The proposed approach guarantees one half approximation minus the H-H endpoints. This guaranteed lower bound applies to short sequences only. This is proved as the core and the folds of the protein will have two identical sides for all short sequences.
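The quantity that HP folding algorithms try to maximise, the number of H-H contacts, can be sketched as follows. This is only the standard scoring step of the two-dimensional square-lattice HP model, not the cellular-automaton method of the paper; the function name and the toy fold are hypothetical.

```python
def hh_contacts(sequence, coords):
    """Count H-H contacts of a 2D square-lattice HP fold.

    `sequence` is a string over {"H", "P"}; `coords` is a list of (x, y)
    integer lattice sites, one per residue, in chain order. A contact is a
    pair of H residues on neighbouring lattice sites that are not
    consecutive in the chain.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("fold is not self-avoiding")
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        if abs(x0 - x1) + abs(y0 - y1) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    index_at = {site: i for i, site in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighbouring pair is seen once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


# Toy example: a four-residue chain folded around a unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hh_contacts("HPPH", square))
```

In the toy fold the first and last residues are both H and sit on neighbouring sites, so the fold scores one contact; the same chain laid out in a straight line scores none.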

  11. Using Local States To Drive the Sampling of Global Conformations in Proteins

    PubMed Central

    2016-01-01

    Conformational changes associated with protein function often occur beyond the time scale currently accessible to unbiased molecular dynamics (MD) simulations, so that different approaches have been developed to accelerate their sampling. Here we investigate how the knowledge of backbone conformations preferentially adopted by protein fragments, as contained in precalculated libraries known as structural alphabets (SA), can be used to explore the landscape of protein conformations in MD simulations. We find that (a) enhancing the sampling of native local states in both metadynamics and steered MD simulations allows the recovery of global folded states in small proteins; (b) folded states can still be recovered when the amount of information on the native local states is reduced by using a low-resolution version of the SA, where states are clustered into macrostates; and (c) sequences of SA states derived from collections of structural motifs can be used to sample alternative conformations of preselected protein regions. The present findings have potential impact on several applications, ranging from protein model refinement to protein folding and design. PMID:26808351

  12. Using Local States To Drive the Sampling of Global Conformations in Proteins.

    PubMed

    Pandini, Alessandro; Fornili, Arianna

    2016-03-08

    Conformational changes associated with protein function often occur beyond the time scale currently accessible to unbiased molecular dynamics (MD) simulations, so that different approaches have been developed to accelerate their sampling. Here we investigate how the knowledge of backbone conformations preferentially adopted by protein fragments, as contained in precalculated libraries known as structural alphabets (SA), can be used to explore the landscape of protein conformations in MD simulations. We find that (a) enhancing the sampling of native local states in both metadynamics and steered MD simulations allows the recovery of global folded states in small proteins; (b) folded states can still be recovered when the amount of information on the native local states is reduced by using a low-resolution version of the SA, where states are clustered into macrostates; and (c) sequences of SA states derived from collections of structural motifs can be used to sample alternative conformations of preselected protein regions. The present findings have potential impact on several applications, ranging from protein model refinement to protein folding and design.

  13. Deciphering the Hidden Informational Content of Protein Sequences

    PubMed Central

    Liu, Ming; Hua, Qing-xin; Hu, Shi-Quan; Jia, Wenhua; Yang, Yanwu; Saith, Sunil Evan; Whittaker, Jonathan; Arvan, Peter; Weiss, Michael A.

    2010-01-01

    Protein sequences encode both structure and foldability. Whereas the interrelationship of sequence and structure has been extensively investigated, the origins of folding efficiency are enigmatic. We demonstrate that the folding of proinsulin requires a flexible N-terminal hydrophobic residue that is dispensable for the structure, activity, and stability of the mature hormone. This residue (PheB1 in placental mammals) is variably positioned within crystal structures and exhibits 1H NMR motional narrowing in solution. Despite such flexibility, its deletion impaired insulin chain combination and led in cell culture to formation of non-native disulfide isomers with impaired secretion of the variant proinsulin. Cellular folding and secretion were maintained by hydrophobic substitutions at B1 but markedly perturbed by polar or charged side chains. We propose that, during folding, a hydrophobic side chain at B1 anchors transient long-range interactions by a flexible N-terminal arm (residues B1–B8) to mediate kinetic or thermodynamic partitioning among disulfide intermediates. Evidence for the overall contribution of the arm to folding was obtained by alanine scanning mutagenesis. Together, our findings demonstrate that efficient folding of proinsulin requires N-terminal sequences that are dispensable in the native state. Such arm-dependent folding can be abrogated by mutations associated with β-cell dysfunction and neonatal diabetes mellitus. PMID:20663888

  14. Acceleration of protein folding by four orders of magnitude through a single amino acid substitution

    PubMed Central

    Roderer, Daniel J. A.; Schärer, Martin A.; Rubini, Marina; Glockshuber, Rudi

    2015-01-01

    Cis prolyl peptide bonds are conserved structural elements in numerous protein families, although their formation is energetically unfavorable, intrinsically slow and often rate-limiting for folding. Here we investigate the reasons underlying the conservation of the cis proline that is diagnostic for the fold of thioredoxin-like thiol-disulfide oxidoreductases. We show that replacement of the conserved cis proline in thioredoxin by alanine can accelerate spontaneous folding to the native, thermodynamically most stable state by more than four orders of magnitude. However, the resulting trans alanine bond leads to small structural rearrangements around the active site that impair the function of thioredoxin as catalyst of electron transfer reactions by more than 100-fold. Our data provide evidence for the absence of a strong evolutionary pressure to achieve intrinsically fast folding rates, which is most likely a consequence of proline isomerases and molecular chaperones that guarantee high in vivo folding rates and yields. PMID:26121966

  15. Semiempirical prediction of protein folds

    NASA Astrophysics Data System (ADS)

    Fernández, Ariel; Colubri, Andrés; Appignanesi, Gustavo

    2001-08-01

    We introduce a semiempirical approach to predict ab initio expeditious pathways and native backbone geometries of proteins that fold under in vitro renaturation conditions. The algorithm is engineered to incorporate a discrete codification of local steric hindrances that constrain the movements of the peptide backbone throughout the folding process. Thus, the torsional state of the chain is assumed to be conditioned by the fact that hopping from one basin of attraction to another in the Ramachandran map (local potential energy surface) of each residue is energetically more costly than the search for a specific (Φ, Ψ) torsional state within a single basin. A combinatorial procedure is introduced to evaluate coarsely defined torsional states of the chain defined "modulo basins" and translate them into meaningful patterns of long range interactions. Thus, an algorithm for structure prediction is designed based on the fact that local contributions to the potential energy may be subsumed into time-evolving conformational constraints defining sets of restricted backbone geometries whereupon the patterns of nonbonded interactions are constructed. The predictive power of the algorithm is assessed by (a) computing ab initio folding pathways for mammalian ubiquitin that ultimately yield a stable structural pattern reproducing all of its native features, (b) determining the nucleating event that triggers the hydrophobic collapse of the chain, and (c) comparing coarse predictions of the stable folds of moderately large proteins (N ≈ 100) with structural information extracted from the protein data bank.

  16. Helix formation and stability in membranes.

    PubMed

    McKay, Matthew J; Afrose, Fahmida; Koeppe, Roger E; Greathouse, Denise V

    2018-02-13

    In this article we review current understanding of basic principles for the folding of membrane proteins, focusing on the more abundant alpha-helical class. Membrane proteins, vital to many biological functions and implicated in numerous diseases, fold into their active conformations in the complex environment of the cell bilayer membrane. While many membrane proteins rely on the translocon and chaperone proteins to fold correctly, others can achieve their functional form in the absence of any translation apparatus or other aids. Nevertheless, the spontaneous folding process is not well understood at the molecular level. Recent findings suggest that helix fraying and loop formation may be important for overall structure, dynamics and regulation of function. Several types of membrane helices with ionizable amino acids change their topology with pH. Additionally, we note that some peptides, including many that are rich in arginine, and a particular analogue of gramicidin, are able to translocate passively across cell membranes. The findings indicate that a final protein structure in a lipid-bilayer membrane is sequence-based, with lipids contributing to stability and regulation. While much progress has been made toward understanding the folding process for alpha-helical membrane proteins, it remains a work in progress. This article is part of a Special Issue entitled: Emergence of Complex Behavior in Biomembranes edited by Marjorie Longo. Copyright © 2018 Elsevier B.V. All rights reserved.

  17. Exploring the Evolutionary Accident Hypothesis: Are Extant Protein Folds the Fittest or the Luckiest?

    NASA Technical Reports Server (NTRS)

    Shannon, G.; Wei, C.; Pohorille, A.

    2017-01-01

    Considering the range of functions proteins perform, it is surprising they fold into a relatively small set of structures or "folds" that facilitate such function. One explanation is that only a minority were fit enough to emerge from Darwinian selection during the early evolution of life. Alternatively, perhaps only a fraction of all possible folds were trialed. Understanding proto-catalyst selection will aid understanding of the origins and early evolution of life. To investigate which explanation is correct, we study a protein evolved in vitro to bind ATP by Jack Szostak (Fig. 1). This protein adopts a fold which is absent from nature. We are testing whether this fold would have possessed the capability to evolve that would have been essential to survive natural selection on early Earth. Folds that couldn't improve their fitness and evolve to perform new functions would have been replaced by rivals that could. To determine whether the fold is evolvable, we are attempting to change the function of the protein by rationally redesigning it to bind GTP. Two design strategies in the region of the nucleobase have been implemented to provide hydrogen bonding partners for the ligand: (i) an insertion and (ii) a Met-to-Asn mutation. Redesigns are being studied computationally at Ames Research Center, including free energy of binding calculations. Binding affinities of promising redesigns are to be validated by experimental collaborators at ForteBio using Super Streptavidin Biosensors. If the fold is found to be non-evolvable, this may suggest that many structures were trialed, but the majority were pruned on the basis of their evolvability. Alternatively, if the fold is demonstrated to be evolvable, it would be difficult to explain its absence from nature without considering the possibility that the fold simply wasn't sampled on early Earth. This would not only further our understanding of the origins of life on Earth but also suggest a common phenomenon of proto-catalyst evolution.

  18. Protein vivisection reveals elusive intermediates in folding

    PubMed Central

    Zheng, Zhongzhou; Sosnick, Tobin R.

    2010-01-01

    Although most folding intermediates escape detection, their characterization is crucial to the elucidation of folding mechanisms. Here we outline a powerful strategy to populate partially unfolded intermediates: A buried aliphatic residue is substituted with a charged residue (e.g., Leu→Glu−) to destabilize and unfold a specific region of the protein. We apply this strategy to Ubiquitin, reversibly trapping a folding intermediate in which the β5 strand is unfolded. The intermediate refolds to a native-like structure upon charge neutralization under mildly acidic conditions. Characterization of the trapped intermediate using NMR and hydrogen exchange methods identifies a second folding intermediate and reveals the order and free energies of the two major folding events on the native side of the rate-limiting step. This general strategy may be combined with other methods and have broad applications in the study of protein folding and other reactions that require trapping of high energy states. PMID:20144618

  19. The E. coli thioredoxin folding mechanism: the key role of the C-terminal helix.

    PubMed

    Vazquez, Diego S; Sánchez, Ignacio E; Garrote, Ana; Sica, Mauricio P; Santos, Javier

    2015-02-01

    In this work, the unfolding mechanism of oxidized Escherichia coli thioredoxin (EcTRX) was investigated experimentally and computationally. We characterized seven point mutants distributed along the C-terminal α-helix (CTH) and the preceding loop. The mutations destabilized the protein against global unfolding while leaving the native structure unchanged. Global analysis of the unfolding kinetics of all variants revealed a linear unfolding route with a high-energy on-pathway intermediate state flanked by two transition state ensembles TSE1 and TSE2. The experiments show that CTH is mainly unfolded in TSE1 and the intermediate and becomes structured in TSE2. Structure-based molecular dynamics are in agreement with these experiments and provide protein-wide structural information on transient states. In our model, EcTRX folding starts with structure formation in the β-sheet, while the protein helices coalesce later. As a whole, our results indicate that the CTH is a critical module in the folding process, restraining a heterogeneous intermediate ensemble into a biologically active native state and providing the native protein with thermodynamic and kinetic stability. Copyright © 2014 Elsevier B.V. All rights reserved.

  20. Protein 3D Structure Computed from Evolutionary Sequence Variation

    PubMed Central

    Sheridan, Robert; Hopf, Thomas A.; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris

    2011-01-01

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7–4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein structures, new strategies in protein and drug design, and the identification of functional genetic variants in normal and disease genomes. PMID:22163331
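The "observed correlations" between alignment columns that this abstract contrasts with true couplings can be illustrated with a minimal sketch. It computes plain mutual information between two columns of a multiple sequence alignment, which is the kind of raw pairwise statistic the abstract describes as noisy; it is not the maximum entropy model of the paper. The function name and the four-sequence toy alignment are hypothetical.

```python
import math
from collections import Counter


def mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j.

    `msa` is a list of equal-length aligned sequences. Frequencies are
    plain counts with no pseudocounts or sequence reweighting, so this is
    only a raw correlation measure.
    """
    n = len(msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    counts_ij = Counter((seq[i], seq[j]) for seq in msa)

    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 1 covary perfectly, column 2 is conserved.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
print(mutual_information(toy_msa, 0, 1))
print(mutual_information(toy_msa, 0, 2))
```

In the toy alignment the covarying pair scores ln 2 and the pair involving the conserved column scores zero. In real alignments such raw scores also pick up indirect and phylogenetic correlations, which is the noise the abstract refers to.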

  1. Interactions of hyaluronan grafted on protein surfaces studied using a quartz crystal microbalance and a surface force balance.

    PubMed

    Jiang, Lei; Han, Juan; Yang, Limin; Ma, Hongchao; Huang, Bo

    2015-10-07

    Vocal folds are complex, multilayered structures whose main layer is largely composed of hyaluronan (HA). The viscoelasticity of HA is key to voice production in the vocal fold as it affects the initiation and maintenance of phonation. In this study a simple layer-structured surface model was set up to mimic the structure of the vocal folds. The interactions between two opposing surfaces bearing HA were measured and characterised to analyse HA's response to the normal and shear compression at a stress level similar to that in the vocal fold. From the measurements of the quartz crystal microbalance, atomic force microscopy and the surface force balance, the osmotic pressure, normal interactions, elasticity change, volume fraction, refractive index and friction of both HA and the supporting protein layer were obtained. These findings may shed light on the physical mechanism of HA function in the vocal fold and the specific role of HA as an important component in the effective treatment of vocal fold disease.

  2. Structure of the human TRiC/CCT Subunit 5 associated with hereditary sensory neuropathy.

    PubMed

    Pereira, Jose H; McAndrew, Ryan P; Sergeeva, Oksana A; Ralston, Corie Y; King, Jonathan A; Adams, Paul D

    2017-06-16

    The human chaperonin TRiC consists of eight non-identical subunits, and its protein-folding activity is critical for cellular health. Misfolded proteins are associated with many human diseases, such as amyloid diseases, cancer, and neuropathies, making TRiC a potential therapeutic target. A detailed structural understanding of its ATP-dependent folding mechanism and substrate recognition is therefore of great importance. Of particular health-related interest is the mutation Histidine 147 to Arginine (H147R) in human TRiC subunit 5 (CCT5), which has been associated with hereditary sensory neuropathy. In this paper, we describe the crystal structures of CCT5 and the CCT5-H147R mutant, which provide important structural information for this vital protein-folding machine in humans. This first X-ray crystallographic study of a single human CCT subunit in the context of a hexadecameric complex can be expanded in the future to the other 7 subunits that form the TRiC complex.

  3. Cryo-EM structure of aerolysin variants reveals a novel protein fold and the pore-formation process

    NASA Astrophysics Data System (ADS)

    Iacovache, Ioan; de Carlo, Sacha; Cirauqui, Nuria; Dal Peraro, Matteo; van der Goot, F. Gisou; Zuber, Benoît

    2016-07-01

    Owing to their pathogenic role and unique ability to exist both as soluble proteins and transmembrane complexes, pore-forming toxins (PFTs) have been a focus of microbiologists and structural biologists for decades. PFTs are generally secreted as water-soluble monomers and subsequently bind the membrane of target cells. Then, they assemble into circular oligomers, which undergo conformational changes that allow membrane insertion leading to pore formation and potentially cell death. Aerolysin, produced by the human pathogen Aeromonas hydrophila, is the founding member of a major PFT family found throughout all kingdoms of life. We report cryo-electron microscopy structures of three conformational intermediates and of the final aerolysin pore, jointly providing insight into the conformational changes that allow pore formation. Moreover, the structures reveal a protein fold consisting of two concentric β-barrels, tightly kept together by hydrophobic interactions. This fold suggests a basis for the prion-like ultrastability of the aerolysin pore and its stoichiometry.

  4. A Systematic Analysis of the Structures of Heterologously Expressed Proteins and Those from Their Native Hosts in the RCSB PDB Archive.

    PubMed

    Zhou, Ren-Bin; Lu, Hui-Meng; Liu, Jie; Shi, Jian-Yu; Zhu, Jing; Lu, Qin-Qin; Yin, Da-Chuan

    2016-01-01

    Recombinant expression of proteins has become an indispensable tool in modern day research. The large yields of recombinantly expressed proteins accelerate the structural and functional characterization of proteins. Nevertheless, some reports in the literature indicate that recombinant proteins show some differences in structure and function compared with the native ones. More than 100,000 structures (from both recombinant and native sources) are now publicly available in the Protein Data Bank (PDB) archive, which makes it possible to investigate whether any proteins in the RCSB PDB archive have identical sequences but differ in structure. In this paper, we present the results of a systematic comparative study of the 3D structures of identical naturally purified versus recombinantly expressed proteins. The structural data and sequence information of the proteins were mined from the RCSB PDB archive. The combinatorial extension (CE), FATCAT-flexible and TM-Align methods were employed to align the protein structures. The root-mean-square distance (RMSD), TM-score, P-value, Z-score, secondary structural elements and hydrogen bonds were used to assess the structure similarity. A thorough analysis of the PDB archive yielded 517 pairs of native and recombinant proteins that have identical sequences. No pair of proteins had the same sequence and a significantly different structural fold, which supports the hypothesis that proteins expressed in a heterologous host usually fold correctly into their native forms.
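
    As a reminder of what the RMSD similarity measure computes, here is a minimal sketch. It assumes the two structures have already been superimposed and matched residue by residue (the job of alignment tools such as CE, FATCAT or TM-Align, which is not reproduced here); the function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between two equal-length lists of (x, y, z) points.

    Assumes the structures are already superimposed and that point k in
    coords_a corresponds to point k in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom example: one atom coincides, the other is displaced by 2 Å.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
recombinant = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
value = rmsd(native, recombinant)
```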

  5. A Systematic Analysis of the Structures of Heterologously Expressed Proteins and Those from Their Native Hosts in the RCSB PDB Archive

    PubMed Central

    Zhou, Ren-Bin; Lu, Hui-Meng; Liu, Jie; Shi, Jian-Yu; Zhu, Jing; Lu, Qin-Qin; Yin, Da-Chuan

    2016-01-01

    Recombinant expression of proteins has become an indispensable tool in modern day research. The large yields of recombinantly expressed proteins accelerate the structural and functional characterization of proteins. Nevertheless, some reports in the literature indicate that recombinant proteins show some differences in structure and function compared with the native ones. More than 100,000 structures (from both recombinant and native sources) are now publicly available in the Protein Data Bank (PDB) archive, which makes it possible to investigate whether any proteins in the RCSB PDB archive have identical sequences but differ in structure. In this paper, we present the results of a systematic comparative study of the 3D structures of identical naturally purified versus recombinantly expressed proteins. The structural data and sequence information of the proteins were mined from the RCSB PDB archive. The combinatorial extension (CE), FATCAT-flexible and TM-Align methods were employed to align the protein structures. The root-mean-square distance (RMSD), TM-score, P-value, Z-score, secondary structural elements and hydrogen bonds were used to assess the structure similarity. A thorough analysis of the PDB archive yielded 517 pairs of native and recombinant proteins that have identical sequences. No pair of proteins had the same sequence and a significantly different structural fold, which supports the hypothesis that proteins expressed in a heterologous host usually fold correctly into their native forms. PMID:27517583

  6. "Sequence space soup" of proteins and copolymers

    NASA Astrophysics Data System (ADS)

    Chan, Hue Sun; Dill, Ken A.

    1991-09-01

    To study the protein folding problem, we use exhaustive computer enumeration to explore "sequence space soup," an imaginary solution containing the "native" conformations (i.e., of lowest free energy) under folding conditions, of every possible copolymer sequence. The model is of short self-avoiding chains of hydrophobic (H) and polar (P) monomers configured on the two-dimensional square lattice. By exhaustive enumeration, we identify all native structures for every possible sequence. We find that random sequences of H/P copolymers will bear striking resemblance to known proteins: Most sequences under folding conditions will be approximately as compact as known proteins, will have considerable amounts of secondary structure, and it is most probable that an arbitrary sequence will fold to a number of lowest free energy conformations that is of order one. In these respects, this simple model shows that proteinlike behavior should arise simply in copolymers in which one monomer type is highly solvent averse. It suggests that the structures and uniquenesses of native proteins are not consequences of having 20 different monomer types, or of unique properties of amino acid monomers with regard to special packing or interactions, and thus that simple copolymers might be designable to collapse to proteinlike structures and properties. A good strategy for designing a sequence to have a minimum possible number of native states is to strategically insert many P monomers. Thus known proteins may be marginally stable due to a balance: More H residues stabilize the desired native state, but more P residues prevent simultaneous stabilization of undesired native states.
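
    The exhaustive enumeration described above can be sketched for very short chains as follows. This is an illustrative reimplementation under stated assumptions, not the authors' code: the energy is taken to be minus the number of non-bonded H-H lattice contacts, conformations related by rotation or reflection are counted once, and all function names are hypothetical.

```python
def enumerate_conformations(n):
    """Yield self-avoiding walks of n monomers on the 2D square lattice.

    The first bond is fixed along +x (removes rotations) and the first
    turn must be toward +y (removes mirror images).
    """
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(path, occupied, turned):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in moves:
            if not turned and dy == -1:
                continue  # a downward first turn is the mirror image of an upward one
            nxt = (x + dx, y + dy)
            if nxt in occupied:
                continue
            path.append(nxt)
            occupied.add(nxt)
            yield from extend(path, occupied, turned or dy != 0)
            path.pop()
            occupied.remove(nxt)

    start = [(0, 0), (1, 0)][:n]
    yield from extend(start, set(start), False)


def hh_contacts(sequence, path):
    """Count H-H pairs that are lattice neighbours but not chain neighbours."""
    index = {pos: k for k, pos in enumerate(path)}
    contacts = 0
    for k, (x, y) in enumerate(path):
        if sequence[k] != "H":
            continue
        for neighbour in ((x + 1, y), (x, y + 1)):  # each pair is visited once
            m = index.get(neighbour)
            if m is not None and abs(m - k) > 1 and sequence[m] == "H":
                contacts += 1
    return contacts


def native_states(sequence):
    """Return (max H-H contacts, list of conformations achieving it)."""
    best, natives = -1, []
    for path in enumerate_conformations(len(sequence)):
        c = hh_contacts(sequence, path)
        if c > best:
            best, natives = c, [path]
        elif c == best:
            natives.append(path)
    return best, natives


contacts, natives = native_states("HPPH")
```

    The number of walks grows exponentially with chain length, so this brute-force approach is only practical for short chains, consistent with the "short self-avoiding chains" the abstract refers to.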

  7. Spontaneous Unfolding-Refolding of Fibronectin Type III Domains Assayed by Thiol Exchange

    PubMed Central

    Shah, Riddhi; Ohashi, Tomoo; Erickson, Harold P.; Oas, Terrence G.

    2017-01-01

    Globular proteins are not permanently folded but spontaneously unfold and refold on time scales that can span orders of magnitude for different proteins. A longstanding debate in the protein-folding field is whether unfolding rates or folding rates correlate to the stability of a protein. In the present study, we have determined the unfolding and folding kinetics of 10 FNIII domains. FNIII domains are one of the most common protein folds and are present in 2% of animal proteins. FNIII domains are ideal for this study because they have an identical seven-strand β-sandwich structure, but they vary widely in sequence and thermodynamic stability. We assayed thermodynamic stability of each domain by equilibrium denaturation in urea. We then assayed the kinetics of domain opening and closing by a technique known as thiol exchange. For this we introduced a buried Cys at the identical location in each FNIII domain and measured the kinetics of labeling with DTNB over a range of urea concentrations. A global fit of the kinetics data gave the kinetics of spontaneous unfolding and refolding in zero urea. We found that the folding rates were relatively similar, ∼0.1–1 s⁻¹, for the different domains. The unfolding rates varied widely and correlated with thermodynamic stability. Our study is the first to address this question using a set of domains that are structurally homologous but evolved with widely varying sequence identity and thermodynamic stability. These data add new evidence that thermodynamic stability correlates primarily with unfolding rate rather than folding rate. The study also has implications for the question of whether opening of FNIII domains contributes to the stretching of fibronectin matrix fibrils. PMID:27909052
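
    The link between rates and stability can be illustrated with the textbook two-state relation ΔG(unfolding) = RT ln(k_fold / k_unfold). The sketch below assumes a simple two-state model, which the abstract does not state, and uses hypothetical rate constants rather than values from the study.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def unfolding_free_energy(k_fold, k_unfold, temperature=298.15):
    """Two-state stability in kJ/mol: dG_unfold = R T ln(k_fold / k_unfold)."""
    if k_fold <= 0 or k_unfold <= 0:
        raise ValueError("rate constants must be positive")
    return R_KJ * temperature * math.log(k_fold / k_unfold)


# Hypothetical domains sharing one folding rate (1 per second) but
# differing in unfolding rate; slower unfolding means higher stability.
stabilities = [unfolding_free_energy(1.0, k_u) for k_u in (1e-3, 1e-4, 1e-5)]
```

    Under this assumption, when folding rates are nearly constant across domains, each tenfold decrease in unfolding rate adds RT ln 10 to the stability, which is why stability would track the unfolding rate.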

  8. Structural analysis of Bacillus pumilus phenolic acid decarboxylase, a lipocalin-fold enzyme

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Matte, Allan; Grosse, Stephan; Bergeron, Hélène

    The decarboxylation of phenolic acids, including ferulic and p-coumaric acids, to their corresponding vinyl derivatives is of importance in the flavoring and polymer industries. Here, the crystal structure of phenolic acid decarboxylase (PAD) from Bacillus pumilus strain UI-670 is reported. The enzyme is a 161-residue polypeptide that forms dimers both in the crystal and in solution. The structure of PAD as determined by X-ray crystallography revealed a β-barrel structure and two α-helices, with a cleft formed at one edge of the barrel. The PAD structure resembles those of the lipocalin-fold proteins, which often bind hydrophobic ligands. Superposition of structurally related proteins bound to their cognate ligands shows that they and PAD bind their ligands in a conserved location within the β-barrel. Analysis of the residue-conservation pattern for PAD-related sequences mapped onto the PAD structure reveals that the conservation mainly includes residues found within the hydrophobic core of the protein, defining a common lipocalin-like fold for this enzyme family. A narrow cleft containing several conserved amino acids was observed as a structural feature and a potential ligand-binding site.

  9. Disulfide bonds in ER protein folding and homeostasis

    PubMed Central

    Feige, Matthias J.; Hendershot, Linda M.

    2010-01-01

    Proteins that are expressed outside the cell must be synthesized, folded and assembled in a way that ensures they can function in their designated location. Accordingly, these proteins are primarily synthesized in the endoplasmic reticulum (ER), which has developed a chemical environment more similar to that outside the cell. This organelle is equipped with a variety of molecular chaperones and folding enzymes that assist the folding process while at the same time exerting tight quality control measures that are largely absent outside the cell. A major post-translational modification of ER-synthesized proteins is disulfide bridge formation, which is catalyzed by the family of protein disulfide isomerases. As this covalent modification provides unique structural advantages to extracellular proteins, multiple pathways to their formation have evolved. However, the advantages that disulfide bonds impart to these proteins come at a high cost to the cell. Very recent reports have shed light on how the cell can deal with or even exploit the side reactions of disulfide bond formation to maintain homeostasis of the ER and its folding machinery. PMID:21144725

  10. Structures composing protein domains.

    PubMed

    Kubrycht, Jaroslav; Sigler, Karel; Souček, Pavel; Hudeček, Jiří

    2013-08-01

    This review summarizes available data concerning intradomain structures (IS) such as functionally important amino acid residues, short linear motifs, conserved or disordered regions, peptide repeats, broadly occurring secondary structures or folds, etc. IS form structural features (units or elements) necessary for interactions with proteins or non-peptidic ligands, enzyme reactions and some structural properties of proteins. These features have often been related to a single structural level (e.g. primary structure) mostly requiring certain structural context of other levels (e.g. secondary structures or supersecondary folds) as follows also from some examples reported or demonstrated here. In addition, we deal with some functionally important dynamic properties of IS (e.g. flexibility and different forms of accessibility), and more special dynamic changes of IS during enzyme reactions and allosteric regulation. Selected notes concern also some experimental methods, still more necessary tools of bioinformatic processing and clinically interesting relationships. Copyright © 2013 Elsevier Masson SAS. All rights reserved.

  11. Statistical theory for protein combinatorial libraries. Packing interactions, backbone flexibility, and the sequence variability of a main-chain structure.

    PubMed

    Kono, H; Saven, J G

    2001-02-23

    Combinatorial experiments provide new ways to probe the determinants of protein folding and to identify novel folding amino acid sequences. These types of experiments, however, are complicated both by enormous conformational complexity and by large numbers of possible sequences. Therefore, a quantitative computational theory would be helpful in designing and interpreting these types of experiment. Here, we present and apply a statistically based, computational approach for identifying the properties of sequences compatible with a given main-chain structure. Protein side-chain conformations are included in an atom-based fashion. Calculations are performed for a variety of similar backbone structures to identify sequence properties that are robust with respect to minor changes in main-chain structure. Rather than specific sequences, the method yields the likelihood of each of the amino acids at preselected positions in a given protein structure. The theory may be used to quantify the characteristics of sequence space for a chosen structure without explicitly tabulating sequences. To account for hydrophobic effects, we introduce an environmental energy that is consistent with other simple hydrophobicity scales and show that it is effective for side-chain modeling. We apply the method to calculate the identity probabilities of selected positions of the immunoglobulin light chain-binding domain of protein L, for which many variant folding sequences are available. The calculations compare favorably with the experimentally observed identity probabilities.

  12. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification.

    PubMed

    Sinclair, Robert M; Ravantti, Janne J; Bamford, Dennis H

    2017-04-15

    Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. Copyright © 2017 Sinclair et al.

  13. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification

    PubMed Central

    Sinclair, Robert M.; Ravantti, Janne J.

    2017-01-01

    Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. PMID:28122979

  14. A method for partitioning the information contained in a protein sequence between its structure and function.

    PubMed

    Possenti, Andrea; Vendruscolo, Michele; Camilloni, Carlo; Tiana, Guido

    2018-05-23

    Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility to encode for such functions is controlled by the balance between the amount of information supplied by the sequence and that left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding for its structure (the 'information gap') is very close to what is needed to encode for its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially-designed protein sequences. This article is protected by copyright. All rights reserved. © 2018 Wiley Periodicals, Inc.

  15. Exploring the repeat protein universe through computational protein design

    DOE PAGES

    Brunette, TJ; Parmeggiani, Fabio; Huang, Po-Ssu; ...

    2015-12-16

    A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structure unit are widespread in nature and have critical roles in molecular recognition, signalling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. In this paper, we use computational protein design to investigate the space of folded structures that can be generated by tandem repeating a simple helix–loop–helix–loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 are monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models with root mean square deviations ranging from 0.7 to 2.5 Å. Finally, our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering.

  16. RNA 3D Structural Motifs: Definition, Identification, Annotation, and Database Searching

    NASA Astrophysics Data System (ADS)

    Nasalean, Lorena; Stombaugh, Jesse; Zirbel, Craig L.; Leontis, Neocles B.

    Structured RNA molecules resemble proteins in the hierarchical organization of their global structures, folding and broad range of functions. Structured RNAs are composed of recurrent modular motifs that play specific functional roles. Some motifs direct the folding of the RNA or stabilize the folded structure through tertiary interactions. Others bind ligands or proteins or catalyze chemical reactions. Therefore, it is desirable, starting from the RNA sequence, to be able to predict the locations of recurrent motifs in RNA molecules. Conversely, the potential occurrence of one or more known 3D RNA motifs may indicate that a genomic sequence codes for a structured RNA molecule. To identify known RNA structural motifs in new RNA sequences, precise structure-based definitions are needed that specify the core nucleotides of each motif and their conserved interactions. By comparing instances of each recurrent motif and applying base pair isostericity relations, one can identify neutral mutations that preserve its structure and function in the contexts in which it occurs.

  17. X-ray solution scattering combined with computation characterizing protein folds and multiple conformational states : computation and application.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, S.; Park, S.; Makowski, L.

    Small angle X-ray scattering (SAXS) is an increasingly powerful technique to characterize the structure of biomolecules in solution. We present a computational method for accurately and efficiently computing the solution scattering curve from a protein with dynamical fluctuations. The method is built upon a coarse-grained (CG) representation of the protein. This CG approach takes advantage of the low-resolution character of solution scattering. It allows rapid determination of the scattering pattern from conformations extracted from CG simulations to obtain scattering characterization of the protein conformational landscapes. Important elements incorporated in the method include an effective residue-based structure factor for each amino acid, an explicit treatment of the hydration layer at the surface of the protein, and an ensemble average of scattering from all accessible conformations to account for macromolecular flexibility. The CG model is calibrated and illustrated to accurately reproduce the experimental scattering curve of hen egg white lysozyme. We then illustrate the computational method by calculating the solution scattering pattern of several representative protein folds and multiple conformational states. The results suggest that solution scattering data, when combined with a reliable computational method, have great potential for a better structural description of multi-domain complexes in different functional states, and for recognizing structural folds when sequence similarity to a protein of known structure is low. Possible applications of the method are discussed.
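
    A coarse-grained scattering calculation of this general kind can be sketched with the standard Debye formula, I(q) = Σ_i Σ_j f_i f_j sin(q r_ij) / (q r_ij), averaged over an ensemble of conformations. The sketch below is not the authors' method: it assumes one bead per residue with a constant form factor unless one is supplied, omits the hydration layer, and uses hypothetical function names and coordinates.

```python
import math


def debye_intensity(coords, q, form_factors=None):
    """Debye scattering intensity at momentum transfer q for a set of beads."""
    n = len(coords)
    f = form_factors if form_factors is not None else [1.0] * n
    total = sum(fi * fi for fi in f)  # i == j terms, where sin(x)/x -> 1
    for i in range(n):
        for j in range(i + 1, n):
            x = q * math.dist(coords[i], coords[j])
            sinc = 1.0 if x == 0.0 else math.sin(x) / x
            total += 2.0 * f[i] * f[j] * sinc  # pair (i, j) and (j, i)
    return total


def ensemble_intensity(conformations, q):
    """Average the scattering over all supplied conformations."""
    return sum(debye_intensity(c, q) for c in conformations) / len(conformations)


# Hypothetical three-bead chain in a compact and an extended conformation.
compact = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
curve = [(q, ensemble_intensity([compact, extended], q)) for q in (0.0, 0.1, 0.2, 0.3)]
```

    The double sum costs O(N²) per conformation, so reducing N from atoms to residues is what makes averaging over many conformations affordable.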

  18. Topology-based modeling of intrinsically disordered proteins: balancing intrinsic folding and intermolecular interactions.

    PubMed

    Ganguly, Debabani; Chen, Jianhan

    2011-04-01

    Coupled binding and folding is frequently involved in specific recognition of so-called intrinsically disordered proteins (IDPs), a newly recognized class of proteins that rely on a lack of stable tertiary fold for function. Here, we exploit topology-based Gō-like modeling as an effective tool for the mechanism of IDP recognition within the theoretical framework of minimally frustrated energy landscape. Importantly, substantial differences exist between IDPs and globular proteins in both amino acid sequence and binding interface characteristics. We demonstrate that established Gō-like models designed for folded proteins tend to over-estimate the level of residual structures in unbound IDPs, whereas under-estimating the strength of intermolecular interactions. Such systematic biases have important consequences in the predicted mechanism of interaction. A strategy is proposed to recalibrate topology-derived models to balance intrinsic folding propensities and intermolecular interactions, based on experimental knowledge of the overall residual structure level and binding affinity. Applied to pKID/KIX, the calibrated Gō-like model predicts a dominant multistep sequential pathway for binding-induced folding of pKID that is initiated by KIX binding via the C-terminus in disordered conformations, followed by binding and folding of the rest of C-terminal helix and finally the N-terminal helix. This novel mechanism is consistent with key observations derived from a recent NMR titration and relaxation dispersion study and provides a molecular-level interpretation of kinetic rates derived from dispersion curve analysis. These case studies provide important insight into the applicability and potential pitfalls of topology-based modeling for studying IDP folding and interaction in general. Copyright © 2011 Wiley-Liss, Inc.

  19. Computational design of water-soluble α-helical barrels.

    PubMed

    Thomson, Andrew R; Wood, Christopher W; Burton, Antony J; Bartlett, Gail J; Sessions, Richard B; Brady, R Leo; Woolfson, Derek N

    2014-10-24

    The design of protein sequences that fold into prescribed de novo structures is challenging. General solutions to this problem require geometric descriptions of protein folds and methods to fit sequences to these. The α-helical coiled coils present a promising class of protein for this and offer considerable scope for exploring hitherto unseen structures. For α-helical barrels, which have more than four helices and accessible central channels, many of the possible structures remain unobserved. Here, we combine geometrical considerations, knowledge-based scoring, and atomistic modeling to facilitate the design of new channel-containing α-helical barrels. X-ray crystal structures of the resulting designs match predicted in silico models. Furthermore, the observed channels are chemically defined and have diameters related to oligomer state, which present routes to design protein function. Copyright © 2014, American Association for the Advancement of Science.

  20. Deletion of internal structured repeats increases the stability of a leucine-rich repeat protein, YopM

    PubMed Central

    Barrick, Doug

    2011-01-01

    Mapping the stability distributions of proteins in their native folded states provides a critical link between structure, thermodynamics, and function. Linear repeat proteins have proven more amenable to this kind of mapping than globular proteins. C-terminal deletion studies of YopM, a large, linear leucine-rich repeat (LRR) protein, show that stability is distributed quite heterogeneously, yet a high level of cooperativity is maintained [1]. Key components of this distribution are three interfaces that strongly stabilize adjacent sequences, thereby maintaining structural integrity and promoting cooperativity. To better understand the distribution of interaction energy around these critical interfaces, we studied internal (rather than terminal) deletions of three LRRs in this region, including one of these stabilizing interfaces. Contrary to our expectation that deletion of structured repeats should be destabilizing, we find that internal deletion of folded repeats can actually stabilize the native state, suggesting that these repeats are destabilizing, although paradoxically, they are folded in the native state. We identified two residues within this destabilizing segment that deviate from the consensus sequence at a position that normally forms a stacked leucine ladder in the hydrophobic core. Replacement of these nonconsensus residues with leucine is stabilizing. This stability enhancement can be reproduced in the context of nonnative interfaces, but it requires an extended hydrophobic core. Our results demonstrate that different LRRs vary widely in their contribution to stability, and that this variation is context-dependent. These two factors are likely to determine the types of rearrangements that lead to folded, functional proteins, and in turn, are likely to restrict the pathways available for the evolution of linear repeat proteins. PMID:21764506

  1. De Novo Proteins with Life-Sustaining Functions Are Structurally Dynamic.

    PubMed

    Murphy, Grant S; Greisman, Jack B; Hecht, Michael H

    2016-01-29

    Designing and producing novel proteins that fold into stable structures and provide essential biological functions are key goals in synthetic biology. In initial steps toward achieving these goals, we constructed a combinatorial library of de novo proteins designed to fold into 4-helix bundles. As described previously, screening this library for sequences that function in vivo to rescue conditionally lethal mutants of Escherichia coli (auxotrophs) yielded several de novo sequences, termed SynRescue proteins, which rescued four different E. coli auxotrophs. In an effort to understand the structural requirements necessary for auxotroph rescue, we investigated the biophysical properties of the SynRescue proteins, using both computational and experimental approaches. Results from circular dichroism, size-exclusion chromatography, and NMR demonstrate that the SynRescue proteins are α-helical and relatively stable. Surprisingly, however, they do not form well-ordered structures. Instead, they form dynamic structures that fluctuate between monomeric and dimeric states. These findings show that a well-ordered structure is not a prerequisite for life-sustaining functions, and suggest that dynamic structures may have been important in the early evolution of protein function. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. Kinetic studies of the folding of heterodimeric monellin: evidence for switching between alternative parallel pathways.

    PubMed

    Aghera, Nilesh; Udgaonkar, Jayant B

    2012-07-13

    Determining whether or not a protein uses multiple pathways to fold is an important goal in protein folding studies. When multiple pathways are present, defined by transition states that differ in their compactness and structure but not significantly in energy, they may manifest themselves by causing the dependence on denaturant concentration of the logarithm of the observed rate constant of folding to have an upward curvature. In this study, the folding mechanism of heterodimeric monellin [double-chain monellin (dcMN)] has been studied over a range of protein and guanidine hydrochloride (GdnHCl) concentrations, using the intrinsic tryptophan fluorescence of the protein as the probe for the folding reaction. Refolding is shown to occur in multiple kinetic phases. In the first stage of refolding, which is silent to any change in intrinsic fluorescence, the two chains of monellin bind to one another to form an encounter complex. Interrupted folding experiments show that the initial encounter complex folds to native dcMN via two folding routes. A productive folding intermediate population is identified on one route but not on both of these routes. Two intermediate subpopulations appear to form in a fast kinetic phase, and native dcMN forms in a slow kinetic phase. The chevron arms for both the fast and slow phases of refolding are shown to have upward curvatures, suggesting that at least two pathways each defined by a different intermediate are operational during these kinetic phases of structure formation. Refolding switches from one pathway to the other as the GdnHCl concentration is increased. Copyright © 2012 Elsevier Ltd. All rights reserved.
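
    The link between parallel pathways and an upward-curving chevron arm can be illustrated numerically. The sketch below is not the authors' analysis: it assumes each pathway contributes a rate whose logarithm is linear in denaturant concentration, and that the observed rate is the sum over pathways. All rate constants and slopes are hypothetical values chosen for illustration.

```python
import math

def log_k_obs(denaturant, pathways):
    """Natural log of the observed rate, summed over parallel pathways.

    Each pathway is (k_water, m), with ln k_i = ln k_water + m * [denaturant].
    """
    return math.log(sum(k0 * math.exp(m * denaturant) for k0, m in pathways))

def curvature(denaturant, pathways, h=0.1):
    """Central finite-difference second derivative of ln k_obs vs. denaturant."""
    return (
        log_k_obs(denaturant + h, pathways)
        - 2.0 * log_k_obs(denaturant, pathways)
        + log_k_obs(denaturant - h, pathways)
    ) / h ** 2

# Hypothetical refolding-arm parameters (rates fall as denaturant rises).
ONE_PATHWAY = [(10.0, -2.0)]
TWO_PATHWAYS = [(10.0, -2.0), (0.5, -0.5)]

if __name__ == "__main__":
    for conc in (0.5, 2.0, 3.5):
        print(conc, curvature(conc, ONE_PATHWAY), curvature(conc, TWO_PATHWAYS))
```

    With a single pathway the arm is a straight line (curvature near zero). With two pathways of different slope the curvature is positive, largest near the concentration where the two rates are equal, which is where the dominant pathway switches.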

  3. The crystal structure of choline kinase reveals a eukaryotic protein kinase fold

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Peisach, D.; Gee, P.; Kent, K.

    2010-03-08

    Choline kinase catalyzes the ATP-dependent phosphorylation of choline, the first committed step in the CDP-choline pathway for the biosynthesis of phosphatidylcholine. The 2.0 Å crystal structure of a choline kinase from C. elegans (CKA-2) reveals that the enzyme is a homodimeric protein with each monomer organized into a two-domain fold. The structure is remarkably similar to those of protein kinases and aminoglycoside phosphotransferases, despite no significant similarity in amino acid sequence. Comparisons to the structures of other kinases suggest that ATP binds to CKA-2 in a pocket formed by highly conserved and catalytically important residues. In addition, a choline binding site is proposed to be near the ATP binding pocket and formed by several structurally flexible loops.

  4. How cooperative are protein folding and unfolding transitions?

    PubMed Central

    Malhotra, Pooja

    2016-01-01

    A thermodynamically and kinetically simple picture of protein folding envisages only two states, native (N) and unfolded (U), separated by a single activation free energy barrier, and interconverting by cooperative two‐state transitions. The folding/unfolding transitions of many proteins occur, however, in multiple discrete steps associated with the formation of intermediates, which is indicative of reduced cooperativity. Furthermore, much advancement in experimental and computational approaches has demonstrated entirely non‐cooperative (gradual) transitions via a continuum of states and a multitude of small energetic barriers between the N and U states of some proteins. These findings have been instrumental towards providing a structural rationale for cooperative versus noncooperative transitions, based on the coupling between interaction networks in proteins. The cooperativity inherent in a folding/unfolding reaction appears to be context dependent, and can be tuned via experimental conditions which change the stabilities of N and U. The evolution of cooperativity in protein folding transitions is linked closely to the evolution of function as well as the aggregation propensity of the protein. A large activation energy barrier in a fully cooperative transition can provide the kinetic control required to prevent the accumulation of partially unfolded forms, which may promote aggregation. Nevertheless, increasing evidence for barrier‐less “downhill” folding, as well as for continuous “uphill” unfolding transitions, indicates that gradual non‐cooperative processes may be ubiquitous features on the free energy landscape of protein folding. PMID:27522064

  5. Evaluation of the effect of post-translational modification toward protein structure: Chemical synthesis of glycosyl crambins having either a high mannose-type or a complex-type oligosaccharide.

    PubMed

    Dedola, Simone; Izumi, Masayuki; Makimura, Yutaka; Ito, Yukishige; Kajihara, Yasuhiro

    2016-11-04

    Glycoproteins are assembled and folded in the endoplasmic reticulum (ER) and transported to the Golgi for further processing of their oligosaccharides. During these processes, two types of oligosaccharides are used: that is, high mannose-type oligosaccharide in the ER and complex-type oligosaccharide in the Golgi. We were interested to know how two different types of oligosaccharides could influence the folding pathway or the final three-dimensional structure of the glycoproteins. For this purpose, we synthesized a new glycosyl crambin having complex-type oligosaccharide, evaluated the folding process, analyzed the final protein structure by NMR, and compared the CD spectra with those of previously synthesized glycosyl crambin bearing high mannose-type oligosaccharides. From our analysis, we found that the two different oligosaccharides do not influence the folding pathway in vitro or the final structure of the small glycoproteins. © 2015 Wiley Periodicals, Inc. Biopolymers (Pept Sci) 106: 446-452, 2016.

  6. Micro-scale NMR Screening of New Detergents for Membrane Protein Structural Biology

    PubMed Central

    Zhang, Qinghai; Horst, Reto; Geralt, Michael; Ma, Xingquan; Hong, Wen-Xu; Finn, M. G.; Stevens, Raymond C.; Wüthrich, Kurt

    2008-01-01

    The rate limiting step in biophysical characterization of membrane proteins is often the availability of suitable amounts of protein material. It was therefore of interest to demonstrate that micro-coil nuclear magnetic resonance (NMR) technology can be used to screen microscale quantities of membrane proteins for proper folding in samples destined for structural studies. Microscale NMR was then used to screen a series of newly designed zwitterionic phosphocholine detergents for their ability to reconstitute membrane proteins, using the previously well characterized β-barrel E. coli outer membrane protein OmpX as a test case. Fold screening was thus achieved with μg-amounts of uniformly 2H,15N-labeled OmpX and affordable amounts of the detergents, and prescreening with SDS-gel electrophoresis ensured efficient selection of the targets for NMR studies. A systematic approach to optimize the phosphocholine motif for membrane protein refolding led to the identification of two new detergents, 138-Fos and 179-Fos, that yield 2D [15N,1H]-TROSY correlation NMR spectra of natively folded reconstituted OmpX. PMID:18479092

  7. The inverted free energy landscape of an intrinsically disordered peptide by simulations and experiments.

    PubMed

    Granata, Daniele; Baftizadeh, Fahimeh; Habchi, Johnny; Galvagnion, Celine; De Simone, Alfonso; Camilloni, Carlo; Laio, Alessandro; Vendruscolo, Michele

    2015-10-26

    The free energy landscape theory has been very successful in rationalizing the folding behaviour of globular proteins, as this representation provides intuitive information on the number of states involved in the folding process, their populations and pathways of interconversion. We extend here this formalism to the case of the Aβ40 peptide, a 40-residue intrinsically disordered protein fragment associated with Alzheimer's disease. By using an advanced sampling technique that enables free energy calculations to reach convergence also in the case of highly disordered states of proteins, we provide a precise structural characterization of the free energy landscape of this peptide. We find that such landscape has inverted features with respect to those typical of folded proteins. While the global free energy minimum consists of highly disordered structures, higher free energy regions correspond to a large variety of transiently structured conformations with secondary structure elements arranged in several different manners, and are not separated from each other by sizeable free energy barriers. From this peculiar structure of the free energy landscape we predict that this peptide should become more structured and not only more compact, with increasing temperatures, and we show that this is the case through a series of biophysical measurements.

  8. The inverted free energy landscape of an intrinsically disordered peptide by simulations and experiments

    PubMed Central

    Granata, Daniele; Baftizadeh, Fahimeh; Habchi, Johnny; Galvagnion, Celine; De Simone, Alfonso; Camilloni, Carlo; Laio, Alessandro; Vendruscolo, Michele

    2015-01-01

    The free energy landscape theory has been very successful in rationalizing the folding behaviour of globular proteins, as this representation provides intuitive information on the number of states involved in the folding process, their populations and pathways of interconversion. We extend here this formalism to the case of the Aβ40 peptide, a 40-residue intrinsically disordered protein fragment associated with Alzheimer’s disease. By using an advanced sampling technique that enables free energy calculations to reach convergence also in the case of highly disordered states of proteins, we provide a precise structural characterization of the free energy landscape of this peptide. We find that such landscape has inverted features with respect to those typical of folded proteins. While the global free energy minimum consists of highly disordered structures, higher free energy regions correspond to a large variety of transiently structured conformations with secondary structure elements arranged in several different manners, and are not separated from each other by sizeable free energy barriers. From this peculiar structure of the free energy landscape we predict that this peptide should become more structured and not only more compact, with increasing temperatures, and we show that this is the case through a series of biophysical measurements. PMID:26498066

  9. Evidence for close side-chain packing in an early protein folding intermediate previously assumed to be a molten globule

    PubMed Central

    Rosen, Laura E.; Connell, Katelyn B.; Marqusee, Susan

    2014-01-01

    The molten globule, a conformational ensemble with significant secondary structure but only loosely packed tertiary structure, has been suggested to be a ubiquitous intermediate in protein folding. However, it is difficult to assess the tertiary packing of transiently populated species to evaluate this hypothesis. Escherichia coli RNase H is known to populate an intermediate before the rate-limiting barrier to folding that has long been thought to be a molten globule. We investigated this hypothesis by making mimics of the intermediate that are the ground-state conformation at equilibrium, using two approaches: a truncation to generate a fragment mimic of the intermediate, and selective destabilization of the native state using point mutations. Spectroscopic characterization and the response of the mimics to further mutation are consistent with studies on the transient kinetic intermediate, indicating that they model the early intermediate. Both mimics fold cooperatively and exhibit NMR spectra indicative of a closely packed conformation, in contrast to the hypothesis of molten tertiary packing. This result is important for understanding the nature of the subsequent rate-limiting barrier to folding and has implications for the assumption that many other proteins populate molten globule folding intermediates. PMID:25258414

  10. Evidence for close side-chain packing in an early protein folding intermediate previously assumed to be a molten globule.

    PubMed

    Rosen, Laura E; Connell, Katelyn B; Marqusee, Susan

    2014-10-14

    The molten globule, a conformational ensemble with significant secondary structure but only loosely packed tertiary structure, has been suggested to be a ubiquitous intermediate in protein folding. However, it is difficult to assess the tertiary packing of transiently populated species to evaluate this hypothesis. Escherichia coli RNase H is known to populate an intermediate before the rate-limiting barrier to folding that has long been thought to be a molten globule. We investigated this hypothesis by making mimics of the intermediate that are the ground-state conformation at equilibrium, using two approaches: a truncation to generate a fragment mimic of the intermediate, and selective destabilization of the native state using point mutations. Spectroscopic characterization and the response of the mimics to further mutation are consistent with studies on the transient kinetic intermediate, indicating that they model the early intermediate. Both mimics fold cooperatively and exhibit NMR spectra indicative of a closely packed conformation, in contrast to the hypothesis of molten tertiary packing. This result is important for understanding the nature of the subsequent rate-limiting barrier to folding and has implications for the assumption that many other proteins populate molten globule folding intermediates.

  11. Outer Membrane Protein Folding and Topology from a Computational Transfer Free Energy Scale.

    PubMed

    Lin, Meishan; Gessmann, Dennis; Naveed, Hammad; Liang, Jie

    2016-03-02

    Knowledge of the transfer free energy of amino acids from aqueous solution to a lipid bilayer is essential for understanding membrane protein folding and for predicting membrane protein structure. Here we report a computational approach that can calculate the folding free energy of the transmembrane region of outer membrane β-barrel proteins (OMPs) by combining an empirical energy function with a reduced discrete state space model. We quantitatively analyzed the transfer free energies of 20 amino acid residues at the center of the lipid bilayer of OmpLA. Our results are in excellent agreement with the experimentally derived hydrophobicity scales. We further exhaustively calculated the transfer free energies of 20 amino acids at all positions in the TM region of OmpLA. We found that the asymmetry of the Gram-negative bacterial outer membrane as well as the TM residues of an OMP determine its functional fold in vivo. Our results suggest that the folding process of an OMP is driven by the lipid-facing residues in its hydrophobic core, and its NC-IN topology is determined by the differential stabilities of OMPs in the asymmetrical outer membrane. The folding free energy is further reduced by lipid A and assisted by general depth-dependent cooperativities that exist between polar and ionizable residues. Moreover, context-dependency of transfer free energies at specific positions in OmpLA predicts regions important for protein function as well as structural anomalies. Our computational approach is fast, efficient and applicable to any OMP.

  12. Mapping the energy landscape for second-stage folding of a single membrane protein

    PubMed Central

    Min, Duyoung; Jefferson, Robert E; Bowie, James U; Yoon, Tae-Young

    2016-01-01

    Membrane proteins are designed to fold and function in a lipid membrane, yet folding experiments within a native membrane environment are challenging to design. Here we show that single-molecule forced unfolding experiments can be adapted to study helical membrane protein folding under native-like bicelle conditions. Applying force using magnetic tweezers, we find that a transmembrane helix protein, Escherichia coli rhomboid protease GlpG, unfolds in a highly cooperative manner, largely unraveling as one physical unit in response to mechanical tension above 25 pN. Considerable hysteresis is observed, with refolding occurring only at forces below 5 pN. Characterizing the energy landscape reveals only modest thermodynamic stability (ΔG = 6.5 kBT) but a large unfolding barrier (21.3 kBT) that can maintain the protein in a folded state for long periods of time (t1/2 ~3.5 h). The observed energy landscape may have evolved to limit the existence of troublesome partially unfolded states and impart rigidity to the structure. PMID:26479439
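
    For readers more used to molar units, the reported values can be converted with simple arithmetic. The sketch below assumes a temperature of 298.15 K (the abstract does not state one) and assumes first-order unfolding kinetics when turning the half-life into a rate constant; the function names are illustrative only.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)

def kbt_to_kcal_per_mol(energy_kbt, temperature_k=298.15):
    """Convert an energy in units of kB*T to kcal/mol (temperature is an assumption)."""
    return energy_kbt * GAS_CONSTANT_KCAL * temperature_k

def rate_from_half_life(half_life_s):
    """First-order rate constant (1/s) from a half-life in seconds: k = ln 2 / t_half."""
    return math.log(2.0) / half_life_s

stability_kcal = kbt_to_kcal_per_mol(6.5)      # reported stability, 6.5 kBT
barrier_kcal = kbt_to_kcal_per_mol(21.3)       # reported unfolding barrier, 21.3 kBT
k_unfold = rate_from_half_life(3.5 * 3600.0)   # reported half-life, about 3.5 h

if __name__ == "__main__":
    print(f"stability: {stability_kcal:.2f} kcal/mol")
    print(f"barrier:   {barrier_kcal:.2f} kcal/mol")
    print(f"k_unfold:  {k_unfold:.2e} 1/s")
```

    Under these assumptions the stability is a little under 4 kcal/mol, the barrier is close to 12.6 kcal/mol, and the half-life corresponds to an unfolding rate constant of order 10^-5 per second.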

  13. Time-resolved distance determination by tryptophan fluorescence quenching: probing intermediates in membrane protein folding.

    PubMed

    Kleinschmidt, J H; Tamm, L K

    1999-04-20

    The mechanism of insertion and folding of an integral membrane protein has been investigated with the beta-barrel forming outer membrane protein A (OmpA) of Escherichia coli. This work describes a new approach to this problem by combining structural information obtained from tryptophan fluorescence quenching at different depths in the lipid bilayer with the kinetics of the refolding process. Experiments carried out over a temperature range between 2 and 40 degrees C allowed us to detect, trap, and characterize previously unidentified folding intermediates on the pathway of OmpA insertion and folding into lipid bilayers. Three membrane-bound intermediates were found in which the average distances of the Trps were 14-16, 10-11, and 0-5 Å, respectively, from the bilayer center. The first folding intermediate is stable at 2 degrees C for at least 1 h. A second intermediate has been isolated at temperatures between 7 and 20 degrees C. The Trps move 4-5 Å closer to the center of the bilayer at this stage. Subsequently, in an intermediate that is observable at 26-28 degrees C, the Trps move another 5-10 Å closer to the center of the bilayer. The final (native) structure is observed at higher temperatures of refolding. In this structure, the Trps are located on average about 9-10 Å from the bilayer center. Monitoring the evolution of Trp fluorescence quenching by a set of brominated lipids during refolding at various temperatures therefore allowed us to identify and characterize intermediate states in the folding process of an integral membrane protein.

  14. GPCR-I-TASSER: A Hybrid Approach to G Protein-Coupled Receptor Structure Modeling and the Application to the Human Genome.

    PubMed

    Zhang, Jian; Yang, Jianyi; Jang, Richard; Zhang, Yang

    2015-08-04

    Experimental structure determination remains difficult for G protein-coupled receptors (GPCRs). We propose a new hybrid protocol to construct GPCR structure models that integrates experimental mutagenesis data with ab initio transmembrane (TM) helix assembly simulations. The method was tested on 24 known GPCRs where the ab initio TM-helix assembly procedure constructed the correct fold for 20 cases. When combined with weak homology and sparse mutagenesis restraints, the method generated correct folds for all the tested cases with an average Cα root-mean-square deviation of 2.4 Å in the TM regions. The new hybrid protocol was applied to model all 1,026 GPCRs in the human genome, where 923 have a high confidence score and are expected to have correct folds; these contain many pharmaceutically important families with no previously solved structures, including Trace amine, Prostanoids, Releasing hormones, Melanocortins, Vasopressin, and Neuropeptide Y receptors. The results demonstrate new progress on genome-wide structure modeling of TM proteins. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Taxonomic distribution, repeats, and functions of the S1 domain-containing proteins as members of the OB-fold family.

    PubMed

    Deryusheva, Evgeniia I; Machulin, Andrey V; Selivanova, Olga M; Galzitskaya, Oxana V

    2017-04-01

    Proteins of the nucleic acid-binding proteins superfamily perform such functions as processing, transport, storage, stretching, translation, and degradation of RNA. It is one of the 16 superfamilies containing the OB-fold in protein structures. Here, we have analyzed the superfamily of nucleic acid-binding proteins (the number of sequences exceeds 200,000) and found that this superfamily consists predominantly of proteins containing the cold shock DNA-binding domain (ca. 131,000 protein sequences). Proteins containing the S1 domain make up 57% of the cold shock DNA-binding domain family. Furthermore, we have found that the S1 domain was identified mainly in the bacterial proteins (ca. 83%) compared to the eukaryotic and archaeal proteins, which are available in the UniProt database. We have found that the number of multiple repeats of S1 domain in the S1 domain-containing proteins depends on the taxonomic affiliation. All archaeal proteins contain one copy of the S1 domain, while the number of repeats in the eukaryotic proteins varies between 1 and 15 and correlates with the protein size. In the bacterial proteins, the number of repeats is no more than 6, regardless of the protein size. The large variation of the repeat number of S1 domain as one of the structural variants of the OB-fold is a distinctive feature of S1 domain-containing proteins. Proteins from the other families and superfamilies either have one OB-fold or vary only slightly in repeat number. On the whole, it can be supposed that the repeat number is vital for multifunctional activity of the S1 domain-containing proteins. Proteins 2017; 85:602-613. © 2016 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.

  16. What amyloidoses may tell us about normal protein folding: The Alzheimer's disease story

    NASA Astrophysics Data System (ADS)

    Teplow, David B.

    2002-03-01

    Alzheimer's disease (AD) is a progressive, neurodegenerative disorder characterized by severe neuronal injury and death. A prominent histopathologic feature of AD is disseminated parenchymal and vascular amyloid deposition. The fibrils in these deposits are composed of the amyloid β-protein (Aβ), a peptide of 4 kDa mass. In vitro and in vivo studies of Aβ fibril formation have shown that both oligomeric and polymeric Aβ assemblies have neurotoxic activity. Understanding how these assemblies form thus could be of direct therapeutic relevance. However, the aggregation and fibril-forming propensities of Aβ have complicated structure determination. Nevertheless, careful morphologic, spectroscopic, protein chemical, and physiologic analyses of the time-dependent changes in Aβ conformation, assembly state, and biological activity which occur during fibrillogenesis have significantly advanced our understanding of this clinically important process. Here, I will discuss recent findings about the pathway(s) of Aβ folding and assembly and about key structural features of Aβ which control the associated kinetics. Interestingly, the amyloidogenic folding pathway of Aβ is in some respects the mirror image of that through which natively folded amyloidogenic proteins proceed.

  17. Protein denaturation in vacuo: intrinsic unfolding pathways associated with the native tertiary structure of lysozyme

    NASA Astrophysics Data System (ADS)

    Arteca, Gustavo A.; Tapia, O.

    Using computer-simulated molecular dynamics, we study the effect of sequence mutation on the unfolding mechanism of a native fold. The system considered is the native fold of hen egg-white lysozyme, exposed to centrifugal unfolding in vacuo. This unfolding bias elicits configurational transitions that imitate the behaviour of anhydrous proteins diffusing after electrospraying from neutral-pH solutions. By changing the sequences threaded onto the native fold of lysozyme, we probe the role of disulfide bridges and the effect of a global mutation. We find that the initial denaturing steps share common characteristics for the tested sequences. Recurrent features are: (i) the presence of dumbbell conformers with significant residual secondary structure, (ii) the ubiquitous formation of hairpins and two-stranded β-sheets regardless of disulfide bridges, and (iii) an unfolding pattern where the reduction in folding complexity is highly correlated with the decrease in chain compactness. These findings appear to be intrinsic to the shape of the native fold, suggesting that similar unfolding pathways may be accessible to many protein sequences.

  18. Thermodynamics and kinetics of protein folding on the ribosome: Alteration in energy landscapes, denatured state, and transition state ensembles

    NASA Astrophysics Data System (ADS)

    O'Brien, Edward; Vendruscolo, Michele; Dobson, Christopher

    2010-03-01

    In vitro experiments examining cotranslational folding utilize ribosome-nascent chain complexes (RNCs) in which the nascent chain is stalled at different points of its biosynthesis on the ribosome. We investigate the thermodynamics, kinetics, and structural properties of RNCs containing five different globular and repeat proteins stalled at ten different nascent chain lengths using coarse grained replica exchange simulations. We find that when the proteins are stalled near the ribosome exit tunnel opening they exhibit altered folding cooperativity, quantified by the van't Hoff enthalpy criterion; a significantly altered denatured state ensemble, in terms of Rg and shape parameters (Rg tensor); and the appearance of partially folded intermediates during cotranslation, evidenced by the appearance of a third basin in the free energy profile. These trends are due in part to excluded volume (crowding) interactions between the ribosome and nascent chain. We perform in silico temperature-jump experiments on the RNCs and examine nascent chain folding kinetics and structural changes in the transition state ensemble at various stall lengths.

  19. A new topology of the HK97-like fold revealed in Bordetella bacteriophage by cryoEM at 3.5 Å resolution

    PubMed Central

    Zhang, Xing; Guo, Huatao; Jin, Lei; Czornyj, Elizabeth; Hodes, Asher; Hui, Wong H; Nieh, Angela W; Miller, Jeff F; Zhou, Z Hong

    2013-01-01

    Bacteriophage BPP-1 infects and kills Bordetella species that cause whooping cough. Its diversity-generating retroelement (DGR) provides a naturally occurring phage-display system, but engineering efforts are hampered without atomic structures. Here, we report a cryo electron microscopy structure of the BPP-1 head at 3.5 Å resolution. Our atomic model shows two of the three protein folds representing major viral lineages: jellyroll for its cement protein (CP) and HK97-like (‘Johnson’) for its major capsid protein (MCP). Strikingly, the fold topology of MCP is permuted non-circularly from the Johnson fold topology previously seen in viral and cellular proteins. We illustrate that the new topology is likely the only feasible alternative of the old topology. β-sheet augmentation and electrostatic interactions contribute to the formation of non-covalent chainmail in BPP-1, unlike covalent inter-protein linkages of the HK97 chainmail. Despite these complex interactions, the termini of both CP and MCP are ideally positioned for DGR-based phage-display engineering. DOI: http://dx.doi.org/10.7554/eLife.01299.001 PMID:24347545

  20. Recent developments in the theory of protein folding: searching for the global energy minimum.

    PubMed

    Scheraga, H A

    1996-04-16

    Statistical mechanical theories and computer simulation are being used to gain an understanding of the fundamental features of protein folding. A major obstacle in the computation of protein structures is the multiple-minima problem arising from the existence of many local minima in the multidimensional energy landscape of the protein. This problem has been surmounted for small open-chain and cyclic peptides, and for regular-repeating sequences of models of fibrous proteins. Progress is being made in resolving this problem for globular proteins.

  1. Folding anomalies of neuroligin3 caused by a mutation in the alpha/beta-hydrolase fold domain.

    PubMed

    De Jaco, Antonella; Dubi, Noga; Comoletti, Davide; Taylor, Palmer

    2010-09-06

    Proteins of the alpha/beta-hydrolase fold family share a common structural fold, but perform a diverse set of functions. We have been studying natural mutations occurring in association with congenital disorders in the alpha/beta-hydrolase fold domain of neuroligin (NLGN), butyrylcholinesterase (BChE), and acetylcholinesterase (AChE). Starting from the autism-related R451C mutation in the alpha/beta-hydrolase fold domain of NLGN3, we had previously shown that the Arg to Cys substitution is responsible for endoplasmic reticulum (ER) retention of the mutant protein and that a similar trafficking defect is observed when the mutation is inserted at the homologous positions in AChE and BChE. Herein we show further characterization of the R451C mutation in NLGN3 when expressed in HEK-293, and by protease digestion sensitivity, we reveal that the phenotype results from protein misfolding. However, the presence of an extra Cys does not interfere with the formation of disulfide bonds as shown by reaction with PEG-maleimide and estimation of the molecular mass changes. These findings highlight the role of proper protein folding in protein processing and localization. Copyright (c) 2010 Elsevier Ireland Ltd. All rights reserved.

  2. FOLDING ANOMALIES OF NEUROLIGIN3 CAUSED BY A MUTATION IN THE α/β-HYDROLASE FOLD DOMAIN

    PubMed Central

    De Jaco, Antonella; Dubi, Noga; Comoletti, Davide; Taylor, Palmer

    2017-01-01

    Proteins of the α/β-hydrolase fold family share a common structural fold, but perform a diverse set of functions. We have been studying natural mutations occurring in association with congenital disorders in the α/β-hydrolase fold domain of neuroligin (NLGN), butyrylcholinesterase (BChE), and acetylcholinesterase (AChE). Starting from the autism-related R451C mutation in the α/β-hydrolase fold domain of NLGN3, we had previously shown that the Arg to Cys substitution is responsible for endoplasmic reticulum (ER) retention of the mutant protein and that a similar trafficking defect is observed when the mutation is inserted at the homologous positions in AChE and BChE. Herein we show further characterization of the R451C mutation in NLGN3 when expressed in HEK-293, and by protease digestion sensitivity, we reveal that the phenotype results from protein misfolding. However, the presence of an extra Cys doesn’t interfere with the formation of disulfide bonds as shown by reaction with PEG-maleimide and estimation of the molecular mass changes. These findings highlight the role of proper protein folding in protein processing and localization. PMID:20227402

  3. Topography of funneled landscapes determines the thermodynamics and kinetics of protein folding

    PubMed Central

    Wang, Jin; Oliveira, Ronaldo J.; Chu, Xiakun; Whitford, Paul C.; Chahine, Jorge; Han, Wei; Wang, Erkang; Onuchic, José N.; Leite, Vitor B.P.

    2012-01-01

    The energy landscape approach has played a fundamental role in advancing our understanding of protein folding. Here, we quantify protein folding energy landscapes by exploring the underlying density of states. We identify three quantities essential for characterizing landscape topography: the stabilizing energy gap between the native and nonnative ensembles δE, the energetic roughness ΔE, and the scale of landscape measured by the entropy S. We show that the dimensionless ratio Λ between the gap, roughness, and entropy of the system accurately predicts the thermodynamics, as well as the kinetics of folding. Large Λ implies that the energy gap (or landscape slope towards the native state) is dominant, leading to more funneled landscapes. We investigate the role of topological and energetic roughness for proteins of different sizes and for proteins of the same size, but with different structural topologies. The landscape topography ratio Λ is shown to be monotonically correlated with the thermodynamic stability against trapping, as characterized by the ratio of folding temperature versus trapping temperature. Furthermore, Λ also monotonically correlates with the folding kinetic rates. These results provide the quantitative bridge between the landscape topography and experimental folding measurements. PMID:23019359
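
    The abstract names the three ingredients of the ratio Λ but the record does not show its explicit form. As a hedged sketch, the form below is the one usually quoted for this landscape topography ratio; the exact expression, including the factor of 2 under the square root, is an assumption here and should be checked against the full paper.

```latex
% Assumed form of the landscape topography ratio (not given in the record above):
%   \delta E : energy gap between native and nonnative ensembles
%   \Delta E : energetic roughness
%   S        : entropy (scale of the landscape)
\Lambda = \frac{\delta E}{\Delta E \sqrt{2S}}
```

    Read this way, Λ grows when the gap δE is large relative to the roughness ΔE and shrinks as the entropy S increases, which matches the abstract's statement that large Λ corresponds to more funneled landscapes.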

  4. A First-Principles Model of Early Evolution: Emergence of Gene Families, Species, and Preferred Protein Folds

    PubMed Central

    Zeldovich, Konstantin B; Chen, Peiqiu; Shakhnovich, Boris E; Shakhnovich, Eugene I

    2007-01-01

    In this work we develop a microscopic physical model of early evolution where phenotype—organism life expectancy—is directly related to genotype—the stability of its proteins in their native conformations—which can be determined exactly in the model. Simulating the model on a computer, we consistently observe the “Big Bang” scenario whereby exponential population growth ensues as soon as favorable sequence–structure combinations (precursors of stable proteins) are discovered. Upon that, random diversity of the structural space abruptly collapses into a small set of preferred proteins. We observe that protein folds remain stable and abundant in the population at timescales much greater than mutation or organism lifetime, and the distribution of the lifetimes of dominant folds in a population approximately follows a power law. The separation of evolutionary timescales between discovery of new folds and generation of new sequences gives rise to emergence of protein families and superfamilies whose sizes are power-law distributed, closely matching the same distributions for real proteins. On the population level we observe emergence of species—subpopulations that carry similar genomes. Further, we present a simple theory that relates stability of evolving proteins to the sizes of emerging genomes. Together, these results provide a microscopic first-principles picture of how first-gene families developed in the course of early evolution. PMID:17630830

  5. A first-principles model of early evolution: emergence of gene families, species, and preferred protein folds.

    PubMed

    Zeldovich, Konstantin B; Chen, Peiqiu; Shakhnovich, Boris E; Shakhnovich, Eugene I

    2007-07-01

    In this work we develop a microscopic physical model of early evolution where phenotype--organism life expectancy--is directly related to genotype--the stability of its proteins in their native conformations--which can be determined exactly in the model. Simulating the model on a computer, we consistently observe the "Big Bang" scenario whereby exponential population growth ensues as soon as favorable sequence-structure combinations (precursors of stable proteins) are discovered. Upon that, random diversity of the structural space abruptly collapses into a small set of preferred proteins. We observe that protein folds remain stable and abundant in the population at timescales much greater than mutation or organism lifetime, and the distribution of the lifetimes of dominant folds in a population approximately follows a power law. The separation of evolutionary timescales between discovery of new folds and generation of new sequences gives rise to emergence of protein families and superfamilies whose sizes are power-law distributed, closely matching the same distributions for real proteins. On the population level we observe emergence of species--subpopulations that carry similar genomes. Further, we present a simple theory that relates stability of evolving proteins to the sizes of emerging genomes. Together, these results provide a microscopic first-principles picture of how first-gene families developed in the course of early evolution.

  6. Leishmania replication protein A-1 binds in vivo single-stranded telomeric DNA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Neto, J.L. Siqueira; Instituto de Biologia, UNICAMP, Campinas, SP; Lira, C.B.B.

    Replication protein A (RPA) is a highly conserved heterotrimeric single-stranded DNA-binding protein involved in different events of DNA metabolism. In yeast, subunits 1 (RPA-1) and 2 (RPA-2) work also as telomerase recruiters and, in humans, the complex unfolds G-quartet structures formed by the 3' G-rich telomeric strand. In most eukaryotes, RPA-1 and RPA-2 bind DNA using multiple OB fold domains. In trypanosomatids, including Leishmania, RPA-1 has a canonical OB fold and a truncated RFA-1 structural domain. In Leishmania amazonensis, RPA-1 alone can form a complex in vitro with the telomeric G-rich strand. In this work, we show that LaRPA-1 is a nuclear protein that associates in vivo with Leishmania telomeres. We mapped the boundaries of the OB fold DNA-binding domain using deletion mutants. Since Leishmania and other trypanosomatids lack homologues of known telomere end binding proteins, our results raise questions about the function of RPA-1 in parasite telomeres.

  7. Thermodynamic properties of an extremely rapid protein folding reaction.

    PubMed

    Schindler, T; Schmid, F X

    1996-12-24

    The cold-shock protein CspB from Bacillus subtilis is a very small beta-barrel protein, which folds with a time constant of 1 ms (at 25 °C) in a U ⇌ N two-state reaction. To elucidate the energetics of this extremely fast reaction we investigated the folding kinetics of CspB as a function of both temperature and denaturant concentration between 2 and 45 °C and between 1 and 8 M urea. Under all these conditions unfolding and refolding were reversible monoexponential reactions. By using transition state theory, data from 327 kinetic curves were jointly analyzed to determine the thermodynamic activation parameters ΔH‡(H2O), ΔS‡(H2O), ΔG‡(H2O), and ΔCp‡(H2O) for unfolding and refolding and their dependences on the urea concentration. 90% of the total change in heat capacity and 96% of the change in the m value (m = dΔG/d[urea]) occur between the unfolded state and the activated state. This suggests that for CspB the activated state of folding is unusually well structured and almost equivalent to the native protein in its interactions with the solvent. As a consequence of this native-like activated state a strong temperature-dependent enthalpy/entropy compensation is observed for the refolding kinetics, and the barrier to refolding shifts from being largely enthalpic at low temperature to largely entropic at high temperature. This shift originates not from the changes in the folding protein chain itself, but from the changes in the protein-solvent interactions. We speculate that the absence of intermediates and the native-like activated state in the folding of CspB are correlated with the small size and the structural type of this protein. The stabilization of a small beta-sheet as in CspB requires extensive non-local interactions, and therefore incomplete sheets are unstable. As a consequence, the critical activated state is reached only very late in folding.
    The instability of partially folded structure is a means to avoid misfolding prior to the rate-limiting step, and a native-like activated state reduces the risk of non-productive side reactions during the final steps to the native state.
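
    To give a feel for the transition state theory analysis mentioned in this abstract, the following minimal sketch converts a rate constant into an apparent activation free energy with the standard Eyring equation, k = (k_B·T/h)·exp(−ΔG‡/RT). It is an illustration only, not the authors' fitting procedure: the function name is hypothetical, the pre-exponential factor k_B·T/h is the textbook choice (its suitability for protein folding is debated), and the 1 ms time constant is treated as if it were a single rate constant of 1000 s⁻¹.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 8.314462618       # gas constant, J/(mol*K)


def eyring_barrier(rate_per_s: float, temp_k: float) -> float:
    """Apparent activation free energy (J/mol) from the Eyring equation.

    Rearranges k = (k_B*T/h) * exp(-dG/(R*T)) to dG = R*T*ln(k_B*T/(h*k)).
    """
    return R * temp_k * math.log(K_B * temp_k / (H * rate_per_s))


# Illustrative input: a 1 ms time constant at 25 degrees C, read as k = 1/tau.
tau_s = 1e-3
barrier = eyring_barrier(1.0 / tau_s, 298.15)
print(f"apparent activation free energy: {barrier / 1000:.1f} kJ/mol")
```

    Repeating the calculation over a range of temperatures and fitting the temperature dependence is what separates the enthalpic and entropic contributions discussed in the abstract.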

  8. A lattice protein with an amyloidogenic latent state: stability and folding kinetics.

    PubMed

    Palyanov, Andrey Yu; Krivov, Sergei V; Karplus, Martin; Chekmarev, Sergei F

    2007-03-15

    We have designed a model lattice protein that has two stable folded states, the lower free energy native state and a latent state of somewhat higher energy. The two states have a sizable part of their structures in common (two "alpha-helices") and differ in the content of "alpha-helices" and "beta-strands" in the rest of their structures; i.e. for the native state, this part is alpha-helical, and for the latent state it is composed of beta-strands. Thus, the lattice protein free energy surface mimics that of amyloidogenic proteins that form well organized fibrils under appropriate conditions. A Go-like potential was used and the folding process was simulated with a Monte Carlo method. To gain insight into the equilibrium free energy surface and the folding kinetics, we have combined standard approaches (reduced free energy surfaces, contact maps, time-dependent populations of the characteristic states, and folding time distributions) with a new approach. The latter is based on a principal coordinate analysis of the entire set of contacts, which makes possible the introduction of unbiased reaction coordinates and the construction of a kinetic network for the folding process. The system is found to have four characteristic basins, namely a semicompact globule, an on-pathway intermediate (the bifurcation basin), and the native and latent states. The bifurcation basin is shallow and consists of the structure common to the native and latent states, with the rest disorganized. On the basis of the simulation results, a simple kinetic model describing the transitions between the characteristic states was developed, and the rate constants for the essential transitions were estimated. During the folding process the system dwells in the bifurcation basin for a relatively short time before it proceeds to the native or latent state. 
    We suggest that such a bifurcation may occur generally for proteins in which native and latent states have a sizable part of their structures in common. Moreover, there is the possibility of introducing changes in the system (e.g., mutations), which guide the system toward the native or misfolded state.

  9. Biophysical and structural considerations for protein sequence evolution

    PubMed Central

    2011-01-01

    Background: Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolution do not take this into account. Recent attempts to integrate the influence of structure and biophysics into phylogenetic models via statistical/informational approaches have not resulted in expected improvements in model performance. This suggests that further innovations are needed for progress in this field. Results: Here we develop a coarse-grained physics-based model of protein folding and binding function, and compare it to a popular informational model. We find that both models violate the assumption of the native sequence being close to a thermodynamic optimum, causing directional selection away from the native state. Sampling and simulation show that the physics-based model is more specific for fold-defining interactions that vary less among residue type. The informational model diffuses further in sequence space with fewer barriers and tends to provide less support for an invariant sites model, although amino acid substitutions are generally conservative. Both approaches produce sequences with natural features like dN/dS < 1 and gamma-distributed rates across sites. Conclusions: Simple coarse-grained models of protein folding can describe some natural features of evolving proteins but are currently not accurate enough to use in evolutionary inference. This is partly due to improper packing of the hydrophobic core. We suggest possible improvements on the representation of structure, folding energy, and binding function, as regards both native and non-native conformations, and describe a large number of possible applications for such a model. PMID:22171550

  10. Hill-Climbing search and diversification within an evolutionary approach to protein structure prediction.

    PubMed

    Chira, Camelia; Horvath, Dragos; Dumitrescu, D

    2011-07-30

    Proteins are complex structures made of amino acids having a fundamental role in the correct functioning of living cells. The structure of a protein is the result of the protein folding process. However, the general principles that govern the folding of natural proteins into a native structure are unknown. The problem of predicting a protein structure with minimum energy starting from the unfolded amino acid sequence is a highly complex and important task in molecular and computational biology. Protein structure prediction has important applications in fields such as drug design and disease prediction. The protein structure prediction problem is NP-hard even in simplified lattice protein models. An evolutionary model based on hill-climbing genetic operators is proposed for protein structure prediction in the hydrophobic-polar (HP) model. Problem-specific search operators are implemented and applied using a steepest-ascent hill-climbing approach. Furthermore, the proposed model enforces an explicit diversification stage during the evolution in order to avoid local optima. The main features of the resulting evolutionary algorithm (the hill-climbing mechanism and the diversification strategy) are evaluated in a set of numerical experiments for the protein structure prediction problem to assess their impact on the efficiency of the search process. Furthermore, the emerging consolidated model is compared to relevant algorithms from the literature for a set of difficult bidimensional instances from lattice protein models. The results obtained by the proposed algorithm are promising and competitive with those of related methods.
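
    The HP model and hill-climbing search described in this abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes the usual 2D square-lattice HP energy (−1 for every pair of H residues that are lattice neighbours but not adjacent in the chain), an absolute-direction encoding (U, D, L, R), and a plain single-move neighbourhood with no genetic operators or diversification. All function names are hypothetical.

```python
from itertools import product

# Absolute moves on a 2D square lattice (an assumed encoding).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold(directions):
    """Return lattice coordinates for a move string, or None if the chain overlaps itself."""
    pos = (0, 0)
    coords = [pos]
    for d in directions:
        dx, dy = MOVES[d]
        pos = (pos[0] + dx, pos[1] + dy)
        if pos in coords:
            return None
        coords.append(pos)
    return coords


def hp_energy(sequence, directions):
    """HP energy: -1 per H-H lattice contact between residues not adjacent in the chain."""
    coords = fold(directions)
    if coords is None:
        return None
    index = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Looking only right and up counts each neighbouring pair once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


def hill_climb(sequence, directions):
    """Steepest-descent search over single-move changes; stops at a local optimum.

    The starting move string must describe a valid (self-avoiding) chain.
    """
    current, current_e = directions, hp_energy(sequence, directions)
    while True:
        best, best_e = current, current_e
        for k, d in product(range(len(current)), MOVES):
            candidate = current[:k] + d + current[k + 1:]
            e = hp_energy(sequence, candidate)
            if e is not None and e < best_e:
                best, best_e = candidate, e
        if best_e == current_e:
            return current, current_e
        current, current_e = best, best_e


folded_energy = hp_energy("HPPH", "RUL")    # square conformation, one H-H contact
stuck = hill_climb("HPPH", "RRR")           # straight chain cannot improve by one move
```

    In this toy case the straight chain is a local optimum for the single-move neighbourhood even though a lower-energy square conformation exists, which is the kind of trap the diversification stage in the abstract is meant to escape.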

  11. Streptococcus pneumonia YlxR at 1.35 A shows a putative new fold.

    PubMed

    Osipiuk, J; Górnicki, P; Maj, L; Dementieva, I; Laskowski, R; Joachimiak, A

    2001-11-01

    The structure of the YlxR protein of unknown function from Streptococcus pneumonia was determined to 1.35 Å. YlxR is expressed from the nusA/infB operon in bacteria and belongs to a small protein family (COG2740) that shares a conserved sequence motif GRGA(Y/W). The family shows no significant amino-acid sequence similarity with other proteins. Three-wavelength diffraction MAD data were collected to 1.7 Å from orthorhombic crystals using synchrotron radiation and the structure was determined using a semi-automated approach. The YlxR structure resembles a two-layer alpha/beta sandwich with the overall shape of a cylinder and shows no structural homology to proteins of known structure. Structural analysis revealed that the YlxR structure represents a new protein fold that belongs to the alpha-beta plait superfamily. The distribution of the electrostatic surface potential shows a large positively charged patch on one side of the protein, a feature often found in nucleic acid-binding proteins. Three sulfate ions bind to this positively charged surface. Analysis of potential binding sites uncovered several substantial clefts, with the largest spanning 3/4 of the protein. A similar distribution of binding sites and a large sharply bent cleft are observed in RNA-binding proteins that are unrelated in sequence and structure. It is proposed that YlxR is an RNA-binding protein.

  12. Engineering Aromatic-Aromatic Interactions To Nucleate Folding in Intrinsically Disordered Regions of Proteins.

    PubMed

    Balakrishnan, Swati; Sarma, Siddhartha P

    2017-08-22

    Aromatic interactions are an important force in protein folding as they combine the stability of a hydrophobic interaction with the selectivity of a hydrogen bond. Much of our understanding of aromatic interactions comes from "bioinformatics" based analyses of protein structures and from the contribution of these interactions to stabilizing secondary structure motifs in model peptides. In this study, the structural consequences of aromatic interactions on protein folding have been explored in engineered mutants of the molten globule protein apo-cytochrome b5. Structural changes from disorder to order due to aromatic interactions in two variants of the protein, viz., WF-cytb5 and FF-cytb5, result in significant long-range secondary and tertiary structure. The results show that 54 and 52% of the residues in WF-cytb5 and FF-cytb5, respectively, occupy ordered regions versus 26% in apo-cytochrome b5. The interactions between the aromatic groups are offset-stacked and edge-to-face for the Trp-Phe and Phe-Phe mutants, respectively. Urea denaturation studies indicate that both mutants have a Cm higher than that of apo-cytochrome b5 and are more stable to chaotropic agents than apo-cytochrome b5. The introduction of these aromatic residues also results in "trimer" interactions with existing aromatic groups, reaffirming the selectivity of the aromatic interactions. These studies provide insights into the aromatic interactions that drive disorder-to-order transitions in intrinsically disordered regions of proteins and will aid in de novo protein design beyond small peptide scaffolds.

  13. Layers: A molecular surface peeling algorithm and its applications to analyze protein structures

    PubMed Central

    Karampudi, Naga Bhushana Rao; Bahadur, Ranjit Prasad

    2015-01-01

    We present an algorithm ‘Layers’ to peel the atoms of proteins as layers. Using Layers we show an efficient way to transform protein structures into a 2D pattern, named residue transition pattern (RTP), which is independent of molecular orientations. RTP explains the folding patterns of proteins and hence identification of similarity between proteins is simpler and more reliable using RTP than with the standard sequence or structure based methods. Moreover, Layers generates a fine-tunable coarse model for the molecular surface by using non-random sampling. The coarse model can be used for shape comparison, protein recognition and ligand design. Additionally, Layers can be used to develop biased initial configurations of molecules for protein folding simulations. We have developed a random forest classifier to predict the RTP of a given polypeptide sequence. Layers is a standalone application; however, it can be merged with other applications to reduce the computational load when working with large datasets of protein structures. Layers is available freely at http://www.csb.iitkgp.ernet.in/applications/mol_layers/main. PMID:26553411

  14. Folding of a LysM Domain: Entropy-Enthalpy Compensation in the Transition State of an Ideal Two-state Folder

    PubMed Central

    Nickson, Adrian A.; Stoll, Kate E.; Clarke, Jane

    2008-01-01

    Protein-engineering methods (Φ-values) were used to investigate the folding transition state of a lysin motif (LysM) domain from Escherichia coli membrane-bound lytic murein transglycosylase D. This domain consists of just 48 structured residues in a symmetrical βααβ arrangement and is the smallest αβ protein yet investigated using these methods. An extensive mutational analysis revealed a highly robust folding pathway with no detectable transition state plasticity, indicating that LysM is an example of an ideal two-state folder. The pattern of Φ-values denotes a highly polarised transition state, with significant formation of the helices but no structure within the β-sheet. Remarkably, this transition state remains polarised after circularisation of the domain, and exhibits an identical Φ-value pattern; however, the interactions within the transition state are uniformly weaker in the circular variant. This observation is supported by results from an Eyring analysis of the folding rates of the two proteins. We propose that the folding pathway of LysM is dominated by enthalpic rather than entropic considerations, and suggest that the lower entropy cost of formation of the circular transition state is balanced, to some extent, by the lower enthalpy of contacts within this structure. PMID:18538343
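
    For readers unfamiliar with Φ-values, the sketch below shows the standard textbook definition rather than anything specific to this study: Φ is the change in the free energy of the transition state relative to the denatured state upon mutation, divided by the corresponding change for the native state. The function name, the sign convention (positive values mean destabilization by the mutation) and the example numbers are all assumptions made for illustration.

```python
import math

R_KJ = 8.314462618e-3   # gas constant, kJ/(mol*K)


def phi_value(kf_wt: float, kf_mut: float, ddg_native_kj: float,
              temp_k: float = 298.15) -> float:
    """Phi = ddG(TS-D) / ddG(N-D), using ddG(TS-D) = R*T*ln(kf_wt / kf_mut).

    kf_wt, kf_mut : folding rate constants of wild type and mutant (same units)
    ddg_native_kj : loss of native-state stability on mutation, kJ/mol (must be nonzero)
    """
    ddg_ts = R_KJ * temp_k * math.log(kf_wt / kf_mut)
    return ddg_ts / ddg_native_kj


# Hypothetical numbers: the mutant folds 3x slower and is 5 kJ/mol less stable.
example_phi = phi_value(kf_wt=300.0, kf_mut=100.0, ddg_native_kj=5.0)
print(f"phi = {example_phi:.2f}")
```

    A Φ near 1 is conventionally read as the mutated site being native-like in the transition state and a Φ near 0 as unstructured, which is how the polarised pattern of helices versus β-sheet in the abstract would be interpreted.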

  15. Atomistic structural ensemble refinement reveals non-native structure stabilizes a sub-millisecond folding intermediate of CheY

    DOE PAGES

    Shi, Jade; Nobrega, R. Paul; Schwantes, Christian; ...

    2017-03-08

    The dynamics of globular proteins can be described in terms of transitions between a folded native state and less-populated intermediates, or excited states, which can play critical roles in both protein folding and function. Excited states are by definition transient species, and therefore are difficult to characterize using current experimental techniques. We report an atomistic model of the excited state ensemble of a stabilized mutant of an extensively studied flavodoxin fold protein CheY. We employed a hybrid simulation and experimental approach in which an aggregate 42 milliseconds of all-atom molecular dynamics were used as an informative prior for the structure of the excited state ensemble. The resulting prior was then refined against small-angle X-ray scattering (SAXS) data employing an established method (EROS). The most striking feature of the resulting excited state ensemble was an unstructured N-terminus stabilized by non-native contacts in a conformation that is topologically simpler than the native state. We then predict incisive single molecule FRET experiments, using these results, as a means of model validation. Our study demonstrates the paradigm of uniting simulation and experiment in a statistical model to study the structure of protein excited states and rationally design validating experiments.

  16. Histopathologic study of human vocal fold mucosa unphonated over a decade.

    PubMed

    Sato, Kiminori; Umeno, Hirohito; Ono, Takeharu; Nakashima, Tadashi

    2011-12-01

    Mechanotransduction caused by vocal fold vibration could possibly be an important factor in the maintenance of extracellular matrices and layered structure of the human adult vocal fold mucosa as a vibrating tissue after the layered structure has been completed. Vocal fold stellate cells (VFSCs) in the human maculae flavae of the vocal fold mucosa are inferred to be involved in the metabolism of extracellular matrices of the vocal fold mucosa. Maculae flavae are also considered to be an important structure in the growth and development of the human vocal fold mucosa. Tension caused by phonation (vocal fold vibration) is hypothesized to stimulate the VFSCs to accelerate production of extracellular matrices. A human adult vocal fold mucosa unphonated over a decade was investigated histopathologically. Vocal fold mucosa unphonated for 11 years and 2 months of a 64-year-old male with cerebral hemorrhage was investigated by light and electron microscopy. The vocal fold mucosae (including maculae flavae) were atrophic. The vocal fold mucosa did not have a vocal ligament, Reinke's space or a layered structure. The lamina propria appeared as a uniform structure. Morphologically, the VFSCs synthesized fewer extracellular matrices, such as fibrous protein and glycosaminoglycan. Consequently, VFSCs appeared to decrease their level of activity.

  17. Local energetic frustration affects the dependence of green fluorescent protein folding on the chaperonin GroEL.

    PubMed

    Bandyopadhyay, Boudhayan; Goldenzweig, Adi; Unger, Tamar; Adato, Orit; Fleishman, Sarel J; Unger, Ron; Horovitz, Amnon

    2017-12-15

    The GroE chaperonin system in Escherichia coli comprises GroEL and GroES and facilitates ATP-dependent protein folding in vivo and in vitro. Proteins with very similar sequences and structures can differ in their dependence on GroEL for efficient folding. One potential but unverified source for GroEL dependence is frustration, wherein not all interactions in the native state are optimized energetically, thereby potentiating slow folding and misfolding. Here, we chose enhanced green fluorescent protein as a model system and subjected it to random mutagenesis, followed by screening for variants whose in vivo folding displays increased or decreased GroEL dependence. We confirmed the altered GroEL dependence of these variants with in vitro folding assays. Strikingly, mutations at positions predicted to be highly frustrated were found to correlate with decreased GroEL dependence. Conversely, mutations at positions with low frustration were found to correlate with increased GroEL dependence. Further support for this finding was obtained by showing that folding of an enhanced green fluorescent protein variant designed computationally to have reduced frustration is indeed less GroEL-dependent. Our results indicate that changes in local frustration also affect partitioning in vivo between spontaneous and chaperonin-mediated folding. Hence, the design of minimally frustrated sequences can reduce chaperonin dependence and improve protein expression levels. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  18. An alternative view of protein fold space.

    PubMed

    Shindyalov, I N; Bourne, P E

    2000-02-15

    Comparing and subsequently classifying protein structure information has received significant attention concurrent with the increase in the number of experimentally derived 3-dimensional structures. Classification schemes have focused on biological function found within protein domains and on structure classification based on topology. Here an alternative view is presented that groups substructures. Substructures are long (50-150 residue) highly repetitive near-contiguous pieces of polypeptide chain that occur frequently in a set of proteins from the PDB defined as structurally non-redundant over the complete polypeptide chain. The substructure classification is based on a previously reported Combinatorial Extension (CE) algorithm that provides a significantly different set of structure alignments than those previously described, having, for example, only a 40% overlap with FSSP. Qualitatively the algorithm provides longer contiguous aligned segments at the price of a slightly higher root-mean-square deviation (rmsd). Clustering these alignments gives a discrete and highly repetitive set of substructures not detectable by sequence similarity alone. In some cases different substructures represent all or different parts of well known folds indicative of the Russian doll effect--the continuity of protein fold space. In other cases they fall into different structure and functional classifications. It is too early to determine whether these newly classified substructures represent new insights into the evolution of a structural framework important to many proteins. What is apparent from on-going work is that these substructures have the potential to be useful probes in finding remote sequence homology and in structure prediction studies.
    The characteristics of the complete all-by-all comparison of the polypeptide chains present in the PDB and details of the filtering procedure by pair-wise structure alignment that led to the emergent substructure gallery are discussed. Substructure classification, alignments, and tools to analyze them are available at http://cl.sdsc.edu/ce.html.

  19. Purifying Properly Folded Cysteine-rich, Zinc Finger Containing Recombinant Proteins for Structural Drug Targeting Studies: the CH1 Domain of p300 as a Case Example

    PubMed Central

    Kim, Yong Joon; Kaluz, Stefan; Mehta, Anil; Weinert, Emily; Rivera, Shannon; Van Meir, Erwin G.

    2017-01-01

    The transcription factor Hypoxia-Inducible Factor (HIF) complexes with the coactivator p300, activating the hypoxia response pathway and allowing tumors to grow. The CH1 and CAD domains of each respective protein form the interface between p300 and HIF. Small molecule compounds are in development that target and inhibit HIF/p300 complex formation, with the goal of reducing tumor growth. High resolution NMR spectroscopy is necessary to study ligand interaction with p300-CH1, and purifying high quantities of properly folded p300-CH1 is needed for pursuing structural and biophysical studies. p300-CH1 has 3 zinc fingers and 9 cysteine residues, posing challenges associated with reagent compatibility and protein oxidation. A protocol has been developed to overcome such issues by incorporating zinc during expression and streamlining the purification time, resulting in a high yield of optimally folded protein (120 mg per 4 L expression media) that is suitable for structural NMR studies. The structural integrity of the final recombinant p300-CH1 has been verified to be optimal using one-dimensional 1H NMR spectroscopy and circular dichroism. This protocol is applicable for the purification of other zinc finger containing proteins. PMID:28966947

  20. De Novo Evolutionary Emergence of a Symmetrical Protein Is Shaped by Folding Constraints

    PubMed Central

    Smock, Robert G.; Yadid, Itamar; Dym, Orly; Clarke, Jane; Tawfik, Dan S.

    2016-01-01

    Summary: Molecular evolution has focused on the divergence of molecular functions, yet we know little about how structurally distinct protein folds emerge de novo. We characterized the evolutionary trajectories and selection forces underlying emergence of β-propeller proteins, a globular and symmetric fold group with diverse functions. The identification of short propeller-like motifs (<50 amino acids) in natural genomes indicated that they expanded via tandem duplications to form extant propellers. We phylogenetically reconstructed 47-residue ancestral motifs that form five-bladed lectin propellers via oligomeric assembly. We demonstrate a functional trajectory of tandem duplications of these motifs leading to monomeric lectins. Foldability, i.e., higher efficiency of folding, was the main parameter leading to improved functionality along the entire evolutionary trajectory. However, folding constraints changed along the trajectory: initially, conflicts between monomer folding and oligomer assembly dominated, whereas subsequently, upon tandem duplication, tradeoffs between monomer stability and foldability took precedence. PMID:26806127

  1. Extending CATH: increasing coverage of the protein structure universe and linking structure with function

    PubMed Central

    Cuff, Alison L.; Sillitoe, Ian; Lewis, Tony; Clegg, Andrew B.; Rentzsch, Robert; Furnham, Nicholas; Pellegrini-Calace, Marialuisa; Jones, David; Thornton, Janet; Orengo, Christine A.

    2011-01-01

    CATH version 3.3 (class, architecture, topology, homology) contains 128 688 domains, 2386 homologous superfamilies and 1233 fold groups, and reflects a major focus on classifying structural genomics (SG) structures and transmembrane proteins, both of which are likely to add structural novelty to the database and therefore increase the coverage of protein fold space within CATH. For CATH version 3.4 we have significantly improved the presentation of sequence information and associated functional information for CATH superfamilies. The CATH superfamily pages now reflect both the functional and structural diversity within the superfamily and include structural alignments of close and distant relatives within the superfamily, annotated with functional information and details of conserved residues. A significantly more efficient search function for CATH has been established by implementing the search server Solr (http://lucene.apache.org/solr/). The CATH v3.4 webpages have been built using the Catalyst web framework. PMID:21097779

  2. Characterization of the Structural Gene Promoter of Aedes aegypti Densovirus

    PubMed Central

    Ward, Todd W.; Kimmick, Michael W.; Afanasiev, Boris N.; Carlson, Jonathan O.

    2001-01-01

    Aedes aegypti densonucleosis virus (AeDNV) has two promoters that have been shown to be active by reporter gene expression analysis (B. N. Afanasiev, Y. V. Koslov, J. O. Carlson, and B. J. Beaty, Exp. Parasitol. 79:322–339, 1994). Northern blot analysis of cells infected with AeDNV revealed two transcripts 1,200 and 3,500 nucleotides in length that are assumed to express the structural protein (VP) gene and nonstructural protein genes, respectively. Primer extension was used to map the transcriptional start site of the structural protein gene. Surprisingly, the structural protein gene transcript began at an initiator consensus sequence, CAGT, 60 nucleotides upstream from the map unit 61 TATAA sequence previously thought to define the promoter. Constructs with the β-galactosidase gene fused to the structural protein gene were used to determine elements necessary for promoter function. Deletion or mutation of the initiator sequence, CAGT, reduced protein expression by 93%, whereas mutation of the TATAA sequence at map unit 61 had little effect. An additional open reading frame was observed upstream of the structural protein gene that can express β-galactosidase at a low level (20% of that of VP fusions). Expression of the AeDNV structural protein gene was shown to be stimulated by the major nonstructural protein NS1 (Afanasiev et al., Exp. Parasitol., 1994). To determine the sequences required for transactivation, expression of structural protein gene–β-galactosidase gene fusion constructs differing in AeDNV genome content was measured with and without NS1. The presence of NS1 led to an 8- to 10-fold increase in expression when either genomic end was present, compared to a 2-fold increase with a construct lacking the genomic ends. An even higher (37-fold) increase in expression occurred with both genomic ends present; however, this was in part due to template replication as shown by Southern blot analysis. These data indicate the location and importance of various elements necessary for efficient protein expression and transactivation from the structural protein gene promoter of AeDNV. PMID:11152505

  3. pH dependent unfolding characteristics of DLC8 dimer: Residue level details from NMR.

    PubMed

    Mohan, P M Krishna; Hosur, Ramakrishna V

    2008-11-01

    Environment dependence of folding and unfolding of a protein is central to its function. In the same vein, knowledge of pH dependence of stability and folding/unfolding is crucial for many biophysical equilibrium and kinetic studies designed to understand protein folding mechanisms. In the present study we investigated the guanidine-induced unfolding transition of dynein light chain protein (DLC8), a cargo adaptor of the dynein complex, in the pH range 7-10. It is observed that while the protein remains a dimer in the entire pH range, its stability is somewhat reduced at alkaline pH. Global unfolding features monitored using fluorescence spectroscopy revealed that the unfolding transition of DLC8 at pH 7 is best described by a three-state model, whereas that at pH 10 is best described by a two-state model. Chemical shift perturbations due to pH change provided insights into the corresponding residue level structural perturbations in the DLC8 dimer. Likewise, backbone ¹⁵N relaxation measurements shed light on the corresponding motional changes in the dimeric protein. These observations have been rationalized on the basis of expected changes with increasing pH in the protonation states of the titratable residues on the structure of the protein. These, in turn, provide an explanation for the change from three-state to two-state guanidine-induced unfolding transition as the pH is increased from 7 to 10. All these results exemplify and highlight the role of environment vis-à-vis the sequence and structure of a given protein in dictating its folding/unfolding characteristics.
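
    The two-state description mentioned in this record can be illustrated with a short sketch. It assumes the widely used linear extrapolation model, in which the unfolding free energy falls linearly with denaturant concentration; the abstract does not say which fitting model the authors used, and the function name, parameter names, and numbers below are illustrative only. A three-state fit would add an intermediate population and is not shown.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def fraction_unfolded(denaturant_m: float, dg_water: float, m_value: float,
                      temperature_k: float = 298.15) -> float:
    """Two-state fraction unfolded under the linear extrapolation model.

    dg_water: unfolding free energy without denaturant (kcal/mol), assumed.
    m_value:  slope of free energy vs. denaturant (kcal/mol/M), assumed.
    """
    # Unfolding free energy at this denaturant concentration.
    dg = dg_water - m_value * denaturant_m
    # Equilibrium constant K = exp(-dg/RT); fraction unfolded = K / (1 + K).
    return 1.0 / (1.0 + math.exp(dg / (R_KCAL * temperature_k)))
```

    With these hypothetical parameters the transition midpoint lies at `dg_water / m_value`, where folded and unfolded populations are equal.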

  4. Camps 2.0: exploring the sequence and structure space of prokaryotic, eukaryotic, and viral membrane proteins.

    PubMed

    Neumann, Sindy; Hartmann, Holger; Martin-Galiano, Antonio J; Fuchs, Angelika; Frishman, Dmitrij

    2012-03-01

    Structural bioinformatics of membrane proteins is still in its infancy, and the picture of their fold space is only beginning to emerge. Because only a handful of three-dimensional structures are available, sequence comparison and structure prediction remain the main tools for investigating sequence-structure relationships in membrane protein families. Here we present a comprehensive analysis of the structural families corresponding to α-helical membrane proteins with at least three transmembrane helices. The new version of our CAMPS database (CAMPS 2.0) covers nearly 1300 eukaryotic, prokaryotic, and viral genomes. Using an advanced classification procedure, which is based on high-order hidden Markov models and considers both sequence similarity as well as the number of transmembrane helices and loop lengths, we identified 1353 structurally homogeneous clusters roughly corresponding to membrane protein folds. Only 53 clusters are associated with experimentally determined three-dimensional structures, and for these clusters CAMPS is in reasonable agreement with structure-based classification approaches such as SCOP and CATH. We therefore estimate that ∼1300 structures would need to be determined to provide a sufficient structural coverage of polytopic membrane proteins. CAMPS 2.0 is available at http://webclu.bio.wzw.tum.de/CAMPS2.0/. Copyright © 2011 Wiley Periodicals, Inc.

  5. Mutational analysis of the folding transition state of the C-terminal domain of ribosomal protein L9: a protein with an unusual beta-sheet topology.

    PubMed

    Li, Ying; Gupta, Ruchi; Cho, Jae-Hyun; Raleigh, Daniel P

    2007-01-30

    The C-terminal domain of ribosomal protein L9 (CTL9) is a 92-residue alpha-beta protein which contains an unusual three-stranded mixed parallel and antiparallel beta-sheet. The protein folds in a two-state fashion, and the folding rate is slow. It is thought that the slow folding may be caused by the necessity of forming this unusual beta-sheet architecture in the transition state for folding. This hypothesis makes CTL9 an interesting target for folding studies. The transition state for the folding of CTL9 was characterized by phi-value analysis. The folding of a set of hydrophobic core mutants was analyzed together with a set of truncation mutants. The results revealed a few positions with high phi-values (≥ 0.5), notably, V131, L133, H134, V137, and L141. All of these residues were found in the beta-hairpin region, indicating that the formation of this structure is likely to be the rate-limiting step in the folding of CTL9. One face of the beta-hairpin docks against the N-terminal helix. Analysis of truncation mutants of this helix confirmed its importance in folding. Mutations at other sites in the protein gave small phi-values, despite the fact that some of them had major effects on stability. The analysis indicates that formation of the antiparallel hairpin is critical and its interactions with the first helix are also important. Thus, the slow folding is not a consequence of the need to fully form the unusual three-stranded beta-sheet in the transition state. Analysis of the urea dependence of the folding rates indicates that mutations modulate the unfolded state. The folding of CTL9 is broadly consistent with the nucleation-condensation model of protein folding.
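
    For readers unfamiliar with phi-value analysis, the sketch below shows the standard definition: the mutation-induced change in the folding activation free energy divided by the change in equilibrium stability. It is a generic illustration, not the authors' analysis pipeline; the function name, the sign convention (unfolding free energies, positive for a stable protein), and all numbers are assumptions.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def phi_value(kf_wt: float, kf_mut: float,
              dg_unfold_wt: float, dg_unfold_mut: float,
              temperature_k: float = 298.15) -> float:
    """Phi-value from folding rates and unfolding free energies (kcal/mol)."""
    # Change in equilibrium stability caused by the mutation.
    ddg_eq = dg_unfold_wt - dg_unfold_mut
    if abs(ddg_eq) < 1e-9:
        raise ValueError("stability change too small for a meaningful phi-value")
    # Change in the folding barrier, from the ratio of folding rates.
    ddg_ts = R_KCAL * temperature_k * math.log(kf_wt / kf_mut)
    return ddg_ts / ddg_eq
```

    A value near 1 suggests the mutated site is as structured in the transition state as in the native state, and a value near 0 suggests it is as unstructured as in the unfolded state. Values computed from very small stability changes are generally unreliable, which is why the sketch rejects them.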

  6. The CWB2 Cell Wall-Anchoring Module Is Revealed by the Crystal Structures of the Clostridium difficile Cell Wall Proteins Cwp8 and Cwp6.

    PubMed

    Usenik, Aleksandra; Renko, Miha; Mihelič, Marko; Lindič, Nataša; Borišek, Jure; Perdih, Andrej; Pretnar, Gregor; Müller, Uwe; Turk, Dušan

    2017-03-07

    Bacterial cell wall proteins play crucial roles in cell survival, growth, and environmental interactions. In Gram-positive bacteria, cell wall proteins include several types that are non-covalently attached via cell wall binding domains. Of the two conserved surface-layer (S-layer)-anchoring modules composed of three tandem SLH or CWB2 domains, the latter have so far eluded structural insight. The crystal structures of Cwp8 and Cwp6 reveal multi-domain proteins, each containing an embedded CWB2 module. It consists of a triangular trimer of Rossmann-fold CWB2 domains, a feature common to 29 cell wall proteins in Clostridium difficile 630. The structural basis of the intact module fold necessary for its binding to the cell wall is revealed. A comparison with previously reported atomic force microscopy data of S-layers suggests that C. difficile S-layers are complex oligomeric structures, likely composed of several different proteins. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. RNA chaperoning and intrinsic disorder in the core proteins of Flaviviridae.

    PubMed

    Ivanyi-Nagy, Roland; Lavergne, Jean-Pierre; Gabus, Caroline; Ficheux, Damien; Darlix, Jean-Luc

    2008-02-01

    RNA chaperone proteins are essential partners of RNA in living organisms and viruses. They are thought to assist in the correct folding and structural rearrangements of RNA molecules by resolving misfolded RNA species in an ATP-independent manner. RNA chaperoning is probably an entropy-driven process, mediated by the coupled binding and folding of intrinsically disordered protein regions and the kinetically trapped RNA. Previously, we have shown that the core protein of hepatitis C virus (HCV) is a potent RNA chaperone that can drive profound structural modifications of HCV RNA in vitro. We now examined the RNA chaperone activity and the disordered nature of core proteins from different Flaviviridae genera, namely that of HCV, GBV-B (GB virus B), WNV (West Nile virus) and BVDV (bovine viral diarrhoea virus). Despite low sequence similarities, all four proteins demonstrated general nucleic acid annealing and RNA chaperone activities. Furthermore, heat resistance of core proteins, as well as far-UV circular dichroism spectroscopy suggested that a well-defined 3D protein structure is not necessary for core-induced RNA structural rearrangements. These data provide evidence that RNA chaperoning, possibly mediated by intrinsically disordered protein segments, is conserved in Flaviviridae core proteins. Thus, besides nucleocapsid formation, core proteins may function in RNA structural rearrangements taking place during virus replication.

  8. RNA chaperoning and intrinsic disorder in the core proteins of Flaviviridae

    PubMed Central

    Ivanyi-Nagy, Roland; Lavergne, Jean-Pierre; Gabus, Caroline; Ficheux, Damien; Darlix, Jean-Luc

    2008-01-01

    RNA chaperone proteins are essential partners of RNA in living organisms and viruses. They are thought to assist in the correct folding and structural rearrangements of RNA molecules by resolving misfolded RNA species in an ATP-independent manner. RNA chaperoning is probably an entropy-driven process, mediated by the coupled binding and folding of intrinsically disordered protein regions and the kinetically trapped RNA. Previously, we have shown that the core protein of hepatitis C virus (HCV) is a potent RNA chaperone that can drive profound structural modifications of HCV RNA in vitro. We now examined the RNA chaperone activity and the disordered nature of core proteins from different Flaviviridae genera, namely that of HCV, GBV-B (GB virus B), WNV (West Nile virus) and BVDV (bovine viral diarrhoea virus). Despite low-sequence similarities, all four proteins demonstrated general nucleic acid annealing and RNA chaperone activities. Furthermore, heat resistance of core proteins, as well as far-UV circular dichroism spectroscopy suggested that a well-defined 3D protein structure is not necessary for core-induced RNA structural rearrangements. These data provide evidence that RNA chaperoning—possibly mediated by intrinsically disordered protein segments—is conserved in Flaviviridae core proteins. Thus, besides nucleocapsid formation, core proteins may function in RNA structural rearrangements taking place during virus replication. PMID:18033802

  9. Apoprotein Structure and Metal Binding Characterization of a de Novo Designed Peptide, α3DIV, that Sequesters Toxic Heavy Metals.

    PubMed

    Plegaria, Jefferson S; Dzul, Stephen P; Zuiderweg, Erik R P; Stemmler, Timothy L; Pecoraro, Vincent L

    2015-05-12

    De novo protein design is a biologically relevant approach that provides a novel process in elucidating protein folding and modeling the metal centers of metalloproteins in a completely unrelated or simplified fold. An integral step in de novo protein design is the establishment of a well-folded scaffold with one conformation, which is a fundamental characteristic of many native proteins. Here, we report the NMR solution structure of apo α3DIV at pH 7.0, a de novo designed three-helix bundle peptide containing a triscysteine motif (Cys18, Cys28, and Cys67) that binds toxic heavy metals. The structure comprises 1067 NOE restraints derived from multinuclear multidimensional NOESY, as well as 138 dihedral angles (ψ, φ, and χ1). The backbone and heavy atoms of the 20 lowest energy structures have a root mean square deviation from the mean structure of 0.79 (0.16) Å and 1.31 (0.15) Å, respectively. When compared to the parent structure α3D, the substitution of Leu residues to Cys enhanced the α-helical content of α3DIV while maintaining the same overall topology and fold. In addition, solution studies on the metalated species illustrated metal-induced stability. An increase in the melting temperatures was observed for Hg(II), Pb(II), or Cd(II) bound α3DIV by 18-24 °C compared to its apo counterpart. Further, the extended X-ray absorption fine structure analysis on Hg(II)-α3DIV produced an average Hg(II)-S bond length at 2.36 Å, indicating a trigonal T-shaped coordination environment. Overall, the structure of apo α3DIV reveals an asymmetric distorted triscysteine metal binding site, which offers a model for native metalloregulatory proteins with thiol-rich ligands that function in regulating toxic heavy metals, such as ArsR, CadC, MerR, and PbrR.

  10. Predictive energy landscapes for folding membrane protein assemblies

    NASA Astrophysics Data System (ADS)

    Truong, Ha H.; Kim, Bobby L.; Schafer, Nicholas P.; Wolynes, Peter G.

    2015-12-01

    We study the energy landscapes for membrane protein oligomerization using the Associative memory, Water mediated, Structure and Energy Model with an implicit membrane potential (AWSEM-membrane), a coarse-grained molecular dynamics model previously optimized under the assumption that the energy landscapes for folding α-helical membrane protein monomers are funneled once their native topology within the membrane is established. In this study we show that the AWSEM-membrane force field is able to sample near native binding interfaces of several oligomeric systems. By predicting candidate structures using simulated annealing, we further show that degeneracies in predicting structures of membrane protein monomers are generally resolved in the folding of the higher order assemblies as is the case in the assemblies of both nicotinic acetylcholine receptor and V-type Na+-ATPase dimers. The physics of the phenomenon resembles domain swapping, which is consistent with the landscape following the principle of minimal frustration. We revisit also the classic Khorana study of the reconstitution of bacteriorhodopsin from its fragments, which is the close analogue of the early Anfinsen experiment on globular proteins. Here, we show the retinal cofactor likely plays a major role in selecting the final functional assembly.

  11. Computational approaches for rational design of proteins with novel functionalities

    PubMed Central

    Tiwari, Manish Kumar; Singh, Ranjitha; Singh, Raushan Kumar; Kim, In-Won; Lee, Jung-Kul

    2012-01-01

    Proteins are the most multifaceted macromolecules in living systems and have various important functions, including structural, catalytic, sensory, and regulatory functions. Rational design of enzymes is a great challenge to our understanding of protein structure and physical chemistry and has numerous potential applications. Protein design algorithms have been applied to design or engineer proteins that fold, fold faster, catalyze, catalyze faster, signal, and adopt preferred conformational states. The field of de novo protein design, although only a few decades old, is beginning to produce exciting results. Developments in this field are already having a significant impact on biotechnology and chemical biology. The application of powerful computational methods for functional protein designing has recently succeeded at engineering target activities. Here, we review recently reported de novo functional proteins that were developed using various protein design approaches, including rational design, computational optimization, and selection from combinatorial libraries, highlighting recent advances and successes. PMID:24688643

  12. Protocols for efficient simulations of long-time protein dynamics using coarse-grained CABS model.

    PubMed

    Jamroz, Michal; Kolinski, Andrzej; Kmiecik, Sebastian

    2014-01-01

    Coarse-grained (CG) modeling is a well-acknowledged simulation approach for getting insight into long-time scale protein folding events at reasonable computational cost. Depending on the design of a CG model, the simulation protocols vary from highly case-specific (requiring user-defined assumptions about the folding scenario) to more sophisticated blind prediction methods for which only a protein sequence is required. Here we describe the framework protocol for the simulations of long-term dynamics of globular proteins, with the use of the CABS CG protein model and sequence data. The simulations can start from a random or a selected (e.g., native) structure. The described protocol has been validated using experimental data for protein folding model systems; the prediction results agreed well with the experimental results.

  13. Co-evolutionary constraints of globular proteins correlate with their folding rates.

    PubMed

    Mallik, Saurav; Kundu, Sudip

    2015-08-04

    Folding rates (ln kf) of globular proteins correlate with their biophysical properties, but the relationship between ln kf and patterns of sequence evolution remains elusive. We introduce 'relative co-evolution order' (rCEO) as the length-normalized average primary chain separation of co-evolving pairs (CEPs), which negatively correlates with ln kf. In addition to pairs in native 3D contact, indirectly connected and structurally remote CEPs probably also play critical roles in protein folding. Correlation between rCEO and ln kf is stronger in multi-state proteins than two-state proteins, contrasting the case of contact order (co), where stronger correlation is found in two-state proteins. Finally, rCEO, co and ln kf are fitted into a 3D linear correlation. Copyright © 2015 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
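
    The phrase "length-normalized average primary chain separation" can be made concrete with a small sketch. It assumes rCEO is computed by analogy with relative contact order: the mean sequence separation |i - j| over the pairs, divided by the chain length. The abstract does not give the exact formula, so the function name, the normalization, and the example pairs are assumptions.

```python
from typing import Iterable, Tuple


def relative_order(pairs: Iterable[Tuple[int, int]], chain_length: int) -> float:
    """Mean sequence separation of residue pairs, divided by chain length.

    pairs: residue index pairs (i, j), e.g. co-evolving pairs for an
           rCEO-like value or native contacts for relative contact order.
    """
    pair_list = list(pairs)
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    if not pair_list:
        raise ValueError("at least one residue pair is required")
    # Sum of primary-chain separations over all pairs.
    total_separation = sum(abs(i - j) for i, j in pair_list)
    return total_separation / (len(pair_list) * chain_length)
```

    Passing the list of native contacts gives a relative contact order, and passing a list of co-evolving pairs gives an rCEO-like value, so the two quantities contrasted in the abstract differ only in which pairs are counted.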

  14. Redox-Assisted Protein Folding Systems in Eukaryotic Parasites

    PubMed Central

    Haque, Saikh Jaharul; Majumdar, Tanmay

    2012-01-01

    Abstract Significance: The cysteine (Cys) residues of proteins play two fundamentally important roles. They serve as sites of post-translational redox modifications as well as influence the conformation of the protein through the formation of disulfide bonds. Recent Advances: Redox-related and redox-associated protein folding in protozoan parasites has been found to be a major mode of regulation, affecting myriad aspects of the parasitic life cycle, host-parasite interactions, and the disease pathology. Available genome sequences of various parasites have begun to complement the classical biochemical and enzymological studies of these processes. In this article, we summarize the reversible Cys disulfide (S-S) bond formation in various classes of strategically important parasitic proteins, and its structural consequence and functional relevance. Critical Issues: Molecular mechanisms of folding remain under-studied and often disconnected from functional relevance. Future Directions: The clinical benefit of redox research will require a comprehensive characterization of the various isoforms and paralogs of the redox enzymes and their concerted effect on the structure and function of the specific parasitic client proteins. Antioxid. Redox Signal. 17, 674–683. PMID:22122448

  15. Influence of protein fold stability on immunogenicity and its implications for vaccine design

    PubMed Central

    Scheiblhofer, Sandra; Laimer, Josef; Machado, Yoan; Weiss, Richard; Thalhamer, Josef

    2017-01-01

    ABSTRACT Introduction: In modern vaccinology and immunotherapy, recombinant proteins more and more replace whole organisms to induce protective or curative immune responses. Structural stability of proteins is of crucial importance for efficient presentation of antigenic peptides on MHC, which plays a decisive role for triggering strong immune reactions. Areas covered: In this review, we discuss structural stability as a key factor for modulating the potency of recombinant vaccines and its importance for antigen proteolysis, presentation, and stimulation of B and T cells. Moreover, the impact of fold stability on downstream events determining the differentiation of T cells into effector cells is reviewed. We summarize studies investigating the impact of protein fold stability on the outcome of the immune response and provide an overview on computational methods to estimate the effects of point mutations on protein stability. Expert commentary: Based on this information, the rational design of up-to-date vaccines is discussed. A model for predicting immunogenicity of proteins based on their conformational stability at different pH values is proposed. PMID:28290225

  16. Assessment of Detection and Refinement Strategies for de novo Protein Structures using Force Field and Statistical Potentials

    DTIC Science & Technology

    2007-01-01

    energy landscape of real proteins. As such, real proteins may have a subtle free energy gradient toward the native that requires long folding times...some leaning, however slight, toward the lowest free-energy basin.9 One caveat in the connection between the scoring funnel and the folding funnel is... protein sets. The average DFIRE-AA scores from each cluster were ranked, and the lowest-energy conformers from each of the top 16 clusters

  17. Structural and Functional Characterization of the Recombinant Death Domain from Death-Associated Protein Kinase

    PubMed Central

    Dioletis, Evangelos; Dingley, Andrew J.; Driscoll, Paul C.

    2013-01-01

    Death-associated protein kinase (DAPk) is a calcium/calmodulin-regulated Ser/Thr-protein kinase that functions at an important point of integration for cell death signaling pathways. DAPk has a structurally unique multi-domain architecture, including a C-terminally positioned death domain (DD) that is a positive regulator of DAPk activity. In this study, recombinant DAPk-DD was observed to aggregate readily and could not be prepared in sufficient yield for structural analysis. However, DAPk-DD could be obtained as a soluble protein in the form of a translational fusion protein with the B1 domain of streptococcal protein G. In contrast to other DDs that adopt the canonical six amphipathic α-helices arranged in a compact fold, the DAPk-DD was found to possess surprisingly low regular secondary structure content and an absence of a stable globular fold, as determined by circular dichroism (CD), NMR spectroscopy and a temperature-dependent fluorescence assay. Furthermore, we measured the in vitro interaction between extracellular-regulated kinase-2 (ERK2) and various recombinant DAPk-DD constructs. Despite the low level of structural order, the recombinant DAPk-DD retained the ability to interact with ERK2 in a 1∶1 ratio with a Kd in the low micromolar range. Only the full-length DAPk-DD could bind ERK2, indicating that the apparent ‘D-motif’ located in the putative sixth helix of DAPk-DD is not sufficient for ERK2 recognition. CD analysis revealed that binding of DAPk-DD to ERK2 is not accompanied by a significant change in secondary structure. Taken together our data argue that the DAPk-DD, when expressed in isolation, does not adopt a classical DD fold, yet in this state retains the capacity to interact with at least one of its binding partners. The lack of a stable globular structure for the DAPk-DD may reflect either that its folding would be supported by interactions absent in our experimental set-up, or a limitation in the structural bioinformatics assignment of the three-dimensional structure. PMID:23922916

  18. Structural and functional characterization of the recombinant death domain from death-associated protein kinase.

    PubMed

    Dioletis, Evangelos; Dingley, Andrew J; Driscoll, Paul C

    2013-01-01

    Death-associated protein kinase (DAPk) is a calcium/calmodulin-regulated Ser/Thr-protein kinase that functions at an important point of integration for cell death signaling pathways. DAPk has a structurally unique multi-domain architecture, including a C-terminally positioned death domain (DD) that is a positive regulator of DAPk activity. In this study, recombinant DAPk-DD was observed to aggregate readily and could not be prepared in sufficient yield for structural analysis. However, DAPk-DD could be obtained as a soluble protein in the form of a translational fusion protein with the B1 domain of streptococcal protein G. In contrast to other DDs that adopt the canonical six amphipathic α-helices arranged in a compact fold, the DAPk-DD was found to possess surprisingly low regular secondary structure content and an absence of a stable globular fold, as determined by circular dichroism (CD), NMR spectroscopy and a temperature-dependent fluorescence assay. Furthermore, we measured the in vitro interaction between extracellular-regulated kinase-2 (ERK2) and various recombinant DAPk-DD constructs. Despite the low level of structural order, the recombinant DAPk-DD retained the ability to interact with ERK2 in a 1∶1 ratio with a Kd in the low micromolar range. Only the full-length DAPk-DD could bind ERK2, indicating that the apparent 'D-motif' located in the putative sixth helix of DAPk-DD is not sufficient for ERK2 recognition. CD analysis revealed that binding of DAPk-DD to ERK2 is not accompanied by a significant change in secondary structure. Taken together our data argue that the DAPk-DD, when expressed in isolation, does not adopt a classical DD fold, yet in this state retains the capacity to interact with at least one of its binding partners. The lack of a stable globular structure for the DAPk-DD may reflect either that its folding would be supported by interactions absent in our experimental set-up, or a limitation in the structural bioinformatics assignment of the three-dimensional structure.

  19. Peptide models XLV: conformational properties of N-formyl-L-methioninamide and its relevance to methionine in proteins.

    PubMed

    Láng, András; Csizmadia, Imre G; Perczel, András

    2005-02-15

    The conformational space of the most biologically significant backbone folds of a suitable methionine peptide model was explored by density functional computational method. Using a medium [6-31G(d)] and a larger basis set [6-311++G(2d,2p)], the systematic exploration of low-energy backbone structures restricted for the "L-region" in the Ramachandran map of N-formyl-L-methioninamide results in conformers corresponding to the building units of an extended backbone structure (βL), an inverse γ-turn (γL), or a right-handed helical structure (αL). However, no poly-proline II type (εL) fold was found, indicating that this conformer has no intrinsic stability, and highlighting the effect of molecular environment in stabilizing this backbone structure. This is in agreement with the abundance of the εL-type backbone conformation of methionine found in proteins. Stability properties (ΔE) and distinct backbone-side-chain interactions support the idea that specific intramolecular contacts are operative in the selection of the lowest energy conformers. Apart from the number of different folds, all stable conformers are within a 10 kcal·mol⁻¹ energy range, indicating the highly flexible behavior of methionine. This conformational feature can be important in supporting catalytic processes, facilitating protein folding and dimerization via metal ion binding. In both of the biological examples discussed (HIV-1 reverse transcriptase and PcoC copper-resistant protein), the conformational properties of Met residues were found to be of key importance. Spatial proximity to other types of residues or the same type of residue seems to be crucial for the structural integrity of a protein, whether Met is buried or exposed.

  20. Structural analysis of kinetic folding intermediates for a TIM barrel protein, indole-3-glycerol phosphate synthase, by hydrogen exchange mass spectrometry and Gō-model simulation

    PubMed Central

    Gu, Zhenyu; Rao, Maithreyi K.; Forsyth, William R.

    2009-01-01

    The structures of partially-folded states appearing during the folding of a (βα)8 TIM barrel protein, the indole-3-glycerol phosphate synthase from S. solfataricus (sIGPS), were assessed by hydrogen exchange mass spectrometry (HX-MS) and Gō-model simulations. HX-MS analysis of the peptic peptides derived from the pulse-labeled product of the sub-millisecond folding reaction from the urea-denatured state revealed strong protection in the (βα)4 region, modest protection in the neighboring (βα)1–3 and (βα)5β6 segments and no significant protection in the remaining N- and C-terminal segments. These results demonstrate that this species is not a collapsed form of the unfolded state under native-favoring conditions nor is it the native state formed via fast-track folding. However, the striking contrast of these results with the strong protection observed in the (βα)2–5β6 region after 5 s of folding demonstrates that these species represent kinetically-distinct folding intermediates that are not identical as previously thought. A re-examination of the kinetic folding mechanism by chevron analysis of fluorescence data confirmed distinct roles for these two species: the burst-phase intermediate is predicted to be a misfolded, off-pathway intermediate while the subsequent 5 s intermediate corresponds to an on-pathway equilibrium intermediate. Comparison with the predictions using a Cα Gō-model simulation of the kinetic folding reaction for sIGPS shows good agreement with the core of structure offering protection against exchange in the on-pathway intermediate(s). Because the native-centric Gō-model simulations do not explicitly include sequence-specific information, the simulation results support the hypothesis that the topology of TIM barrel proteins is a primary determinant of the folding free energy surface for the productive folding reaction. The early misfolding reaction must involve aspects of non-native structure not detected by the Gō-model simulation. PMID:17942114
