Sample records for protein structure analysis

  1. Residue-residue contacts: application to analysis of secondary structure interactions.

    PubMed

    Potapov, Vladimir; Edelman, Marvin; Sobolev, Vladimir

    2013-01-01

    Protein structures and their complexes are formed and stabilized by interactions, both inside and outside of the protein. Analysis of such interactions helps in understanding different levels of structures (secondary, super-secondary, and oligomeric states). It can also assist molecular biologists in understanding structural consequences of modifying proteins and/or ligands. In this chapter, our definition of atom-atom and residue-residue contacts is described and applied to analysis of protein-protein interactions in dimeric β-sandwich proteins.
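    The abstract does not spell out the chapter's contact definition, so the following is only a minimal sketch of one common way to derive residue-residue contacts from atom-atom distances. The 4.5 Å cutoff, the function names, and the input layout (a dictionary mapping residue identifiers to atom coordinates) are illustrative assumptions, not the authors' method.

```python
from math import dist

# Hypothetical cutoff in angstroms; the chapter's own contact
# definition is not given in the abstract and may differ.
CONTACT_CUTOFF = 4.5


def residues_in_contact(atoms_a, atoms_b, cutoff=CONTACT_CUTOFF):
    """True if any atom of residue A is within `cutoff` of any atom of residue B."""
    return any(dist(a, b) <= cutoff for a in atoms_a for b in atoms_b)


def contact_pairs(residues, cutoff=CONTACT_CUTOFF):
    """List contacting residue pairs for a dict {residue_id: [(x, y, z), ...]}."""
    ids = sorted(residues)
    return [
        (i, j)
        for n, i in enumerate(ids)
        for j in ids[n + 1:]
        if residues_in_contact(residues[i], residues[j], cutoff)
    ]
```

    Restricting the residue pairs to those that span two chains would give an interface contact list of the kind used to study dimers.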

  2. Protein structure similarity from Principle Component Correlation analysis.

    PubMed

    Zhou, Xiaobo; Chou, James; Wong, Stephen T C

    2006-01-25

    Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square-deviation (RMSD) in their best-superimposed atomic coordinates. RMSD is the golden rule of measuring structural similarity when the structures are nearly identical; it, however, fails to detect the higher order topological similarities in proteins evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities. We measure structural similarity between proteins by correlating the principal components of their secondary structure interaction matrix. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N to C terminal sense, there are strong correlations between the principal components of interaction matrices of structurally or topologically similar proteins. The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum eigenvalues can be highly effective in clustering structurally or topologically similar proteins. We believe that the PCC analysis of interaction matrix is highly flexible in adopting various structural parameters for protein structure comparison.
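    The idea of comparing interaction matrices through their leading eigen-components can be sketched as follows. This is a simplified illustration, not the published PCC algorithm: it assumes each secondary-structure element is reduced to a single centroid, uses plain Euclidean distance as the relationship parameter, and compares only the first component (obtained by power iteration) rather than several. All function names are hypothetical.

```python
from math import dist, sqrt


def interaction_matrix(centroids):
    """Symmetric matrix of distances between secondary-structure element centroids."""
    return [[dist(p, q) for q in centroids] for p in centroids]


def leading_component(matrix, iterations=500):
    """Power iteration: leading eigenvalue and unit eigenvector of a
    symmetric, non-negative matrix (assumes three or more elements)."""
    n = len(matrix)
    vector = [1.0 / sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(row[k] * vector[k] for k in range(n)) for row in matrix]
        norm = sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break
        vector = [x / norm for x in product]
        eigenvalue = norm
    return eigenvalue, vector


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    denom = sqrt(var_u * var_v)
    return cov / denom if denom > 0 else 0.0
```

    In this sketch, two structures with the same number of elements are compared by calling `pearson` on their leading eigenvectors, while the leading eigenvalues give a single size-dependent number per structure, loosely mirroring the maximum-eigenvalue comparison mentioned in the abstract.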

  3. PDBFlex: exploring flexibility in protein structures

    PubMed Central

    Hrabe, Thomas; Li, Zhanwen; Sedova, Mayya; Rotkiewicz, Piotr; Jaroszewski, Lukasz; Godzik, Adam

    2016-01-01

    The PDBFlex database, available freely and with no login requirements at http://pdbflex.org, provides information on flexibility of protein structures as revealed by the analysis of variations between depositions of different structural models of the same protein in the Protein Data Bank (PDB). PDBFlex collects information on all instances of such depositions, identifying them by a 95% sequence identity threshold, performs analysis of their structural differences and clusters them according to their structural similarities for easy analysis. The PDBFlex contains tools and viewers enabling in-depth examination of structural variability including: 2D-scaling visualization of RMSD distances between structures of the same protein, graphs of average local RMSD in the aligned structures of protein chains, graphical presentation of differences in secondary structure and observed structural disorder (unresolved residues), difference distance maps between all sets of coordinates and 3D views of individual structures and simulated transitions between different conformations, the latter displayed using JSMol visualization software. PMID:26615193
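    One of the views listed above, the difference distance map, can be illustrated with a short sketch. It assumes two conformations of the same chain given as equal-length lists of coordinates (for example one Cα atom per residue, already matched residue by residue); the function names are hypothetical and this is not PDBFlex code.

```python
from math import dist


def distance_map(coords):
    """All-against-all distance matrix for one conformation."""
    return [[dist(p, q) for q in coords] for p in coords]


def difference_distance_map(coords_a, coords_b):
    """Absolute difference between the distance maps of two conformations."""
    if len(coords_a) != len(coords_b):
        raise ValueError("conformations must contain the same number of atoms")
    map_a, map_b = distance_map(coords_a), distance_map(coords_b)
    return [
        [abs(a - b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(map_a, map_b)
    ]
```

    Because only internal distances are compared, the map does not depend on how the two structures are superimposed; large entries mark residue pairs whose separation changes between conformations.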

  4. Advances in structural and functional analysis of membrane proteins by electron crystallography

    PubMed Central

    Wisedchaisri, Goragot; Reichow, Steve L.; Gonen, Tamir

    2011-01-01

    Summary: Electron crystallography is a powerful technique for the study of membrane protein structure and function in the lipid environment. When well-ordered two-dimensional crystals are obtained the structure of both protein and lipid can be determined and lipid-protein interactions analyzed. Protons and ionic charges can be visualized by electron crystallography and the protein of interest can be captured for structural analysis in a variety of physiologically distinct states. This review highlights the strengths of electron crystallography and the momentum that is building up in automation and the development of high throughput tools and methods for structural and functional analysis of membrane proteins by electron crystallography. PMID:22000511

  5. Advances in structural and functional analysis of membrane proteins by electron crystallography.

    PubMed

    Wisedchaisri, Goragot; Reichow, Steve L; Gonen, Tamir

    2011-10-12

    Electron crystallography is a powerful technique for the study of membrane protein structure and function in the lipid environment. When well-ordered two-dimensional crystals are obtained the structure of both protein and lipid can be determined and lipid-protein interactions analyzed. Protons and ionic charges can be visualized by electron crystallography and the protein of interest can be captured for structural analysis in a variety of physiologically distinct states. This review highlights the strengths of electron crystallography and the momentum that is building up in automation and the development of high throughput tools and methods for structural and functional analysis of membrane proteins by electron crystallography. Copyright © 2011 Elsevier Ltd. All rights reserved.

  6. Synchrotron IR microspectroscopy for protein structure analysis: Potential and questions

    DOE PAGES

    Yu, Peiqiang

    2006-01-01

    Synchrotron radiation-based Fourier transform infrared microspectroscopy (S-FTIR) has been developed as a rapid, direct, non-destructive, bioanalytical technique. This technique takes advantage of synchrotron light brightness and small effective source size and is capable of exploring the molecular chemical make-up within microstructures of a biological tissue without destruction of inherent structures at ultra-spatial resolutions within cellular dimension. To date there has been very little application of this advanced technique to the study of pure protein inherent structure at a cellular level in biological tissues. In this review, a novel approach was introduced to show the potential of the newly developed, advanced synchrotron-based analytical technology, which can be used to localize relatively “pure” protein in the plant tissues and relatively reveal protein inherent structure and protein molecular chemical make-up within intact tissue at cellular and subcellular levels. Several complex protein IR spectra data analytical techniques (Gaussian and Lorentzian multi-component peak modeling, univariate and multivariate analysis, principal component analysis (PCA), and hierarchical cluster analysis (CLA)) are employed to relatively reveal features of protein inherent structure and distinguish protein inherent structure differences between varieties/species and treatments in plant tissues. By using a multi-peak modeling procedure, RELATIVE estimates (but not EXACT determinations) for protein secondary structure analysis can be made for comparison purposes. The issues of pro- and anti-multi-peaking modeling/fitting procedure for relative estimation of protein structure were discussed. By using the PCA and CLA analyses, the plant molecular structures can be qualitatively separated one group from another, statistically, even though the spectral assignments are not known. The synchrotron-based technology provides a new approach for protein structure research in biological tissues at ultraspatial resolutions.

  7. Robust, high-throughput solution structural analyses by small angle X-ray scattering (SAXS)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hura, Greg L.; Menon, Angeli L.; Hammel, Michal

    2009-07-20

    We present an efficient pipeline enabling high-throughput analysis of protein structure in solution with small angle X-ray scattering (SAXS). Our SAXS pipeline combines automated sample handling of microliter volumes, temperature and anaerobic control, rapid data collection and data analysis, and couples structural analysis with automated archiving. We subjected 50 representative proteins, mostly from Pyrococcus furiosus, to this pipeline and found that 30 were multimeric structures in solution. SAXS analysis allowed us to distinguish aggregated and unfolded proteins, define global structural parameters and oligomeric states for most samples, identify shapes and similar structures for 25 unknown structures, and determine envelopes for 41 proteins. We believe that high-throughput SAXS is an enabling technology that may change the way that structural genomics research is done.

  8. Evaluation of variability in high-resolution protein structures by global distance scoring.

    PubMed

    Anzai, Risa; Asami, Yoshiki; Inoue, Waka; Ueno, Hina; Yamada, Koya; Okada, Tetsuji

    2018-01-01

    Systematic analysis of the statistical and dynamical properties of proteins is critical to understanding cellular events. Extraction of biologically relevant information from a set of high-resolution structures is important because it can provide mechanistic details behind the functional properties of protein families, enabling rational comparison between families. Most of the current structural comparisons are pairwise-based, which hampers the global analysis of increasing contents in the Protein Data Bank. Additionally, pairing of protein structures introduces uncertainty with respect to reproducibility because it frequently accompanies other settings for superimposition. This study introduces intramolecular distance scoring for the global analysis of proteins, for each of which at least several high-resolution structures are available. As a pilot study, we have tested 300 human proteins and showed that the method is comprehensively used to overview advances in each protein and protein family at the atomic level. This method, together with the interpretation of the model calculations, provide new criteria for understanding specific structural variation in a protein, enabling global comparison of the variability in proteins from different species.
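    The abstract does not give the scoring formula, so the sketch below only illustrates the general idea of superposition-free, intramolecular distance statistics over several structures of one protein. It assumes the score for an atom pair is the mean of its distance across structures divided by the population standard deviation, which may differ from the published definition; the function name and input layout are hypothetical.

```python
from math import dist, inf
from statistics import mean, pstdev


def pair_scores(structures):
    """Score every atom pair across structures of the same protein.

    `structures` is a list of coordinate lists with identical atom order.
    Assumed score: mean distance / standard deviation, so rigid pairs
    score high and variable pairs score low.
    """
    n_atoms = len(structures[0])
    scores = {}
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            distances = [dist(s[i], s[j]) for s in structures]
            spread = pstdev(distances)
            scores[(i, j)] = mean(distances) / spread if spread > 0 else inf
    return scores
```

    Since each score is computed from internal distances only, no pairwise superposition settings are involved, which is the reproducibility concern the abstract raises about pairwise comparison.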

  9. Structural determination of intact proteins using mass spectrometry

    DOEpatents

    Kruppa, Gary [San Francisco, CA]; Schoeniger, Joseph S [Oakland, CA]; Young, Malin M [Livermore, CA]

    2008-05-06

    The present invention relates to novel methods of determining the sequence and structure of proteins. Specifically, the present invention allows for the analysis of intact proteins within a mass spectrometer. Therefore, preparatory separations need not be performed prior to introducing a protein sample into the mass spectrometer. Also disclosed herein are new instrumental developments for enhancing the signal from the desired modified proteins, methods for producing controlled protein fragments in the mass spectrometer, eliminating complex microseparations, and protein preparatory chemical steps necessary for cross-linking based protein structure determination. Additionally, the preferred method of the present invention involves the determination of protein structures utilizing a top-down analysis of protein structures to search for covalent modifications. In the preferred method, intact proteins are ionized and fragmented within the mass spectrometer.

  10. Analysis of protein-protein docking decoys using interaction fingerprints: application to the reconstruction of CaM-ligand complexes.

    PubMed

    Uchikoga, Nobuyuki; Hirokawa, Takatsugu

    2010-05-11

    Protein-protein docking for proteins with large conformational changes was analyzed by using interaction fingerprints, one of the scales for measuring similarities among complex structures, utilized especially for searching near-native protein-ligand or protein-protein complex structures. Here, we have proposed a combined method for analyzing protein-protein docking by taking large conformational changes into consideration. This combined method consists of ensemble soft docking with multiple protein structures, refinement of complexes, and cluster analysis using interaction fingerprints and energy profiles. To test for the applicability of this combined method, various CaM-ligand complexes were reconstructed from the NMR structures of unbound CaM. For the purpose of reconstruction, we used three known CaM-ligands, namely, the CaM-binding peptides of cyclic nucleotide gateway (CNG), CaM kinase kinase (CaMKK) and the plasma membrane Ca2+ ATPase pump (PMCA), and thirty-one structurally diverse CaM conformations. For each ligand, 62000 CaM-ligand complexes were generated in the docking step and the relationship between their energy profiles and structural similarities to the native complex were analyzed using interaction fingerprint and RMSD. Near-native clusters were obtained in the case of CNG and CaMKK. The interaction fingerprint method discriminated near-native structures better than the RMSD method in cluster analysis. We showed that a combined method that includes the interaction fingerprint is very useful for protein-protein docking analysis of certain cases.
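    As a rough illustration of how interaction fingerprints can score docking decoys against a reference complex, the sketch below represents each fingerprint as a set of interacting residue pairs and compares sets with the Tanimoto coefficient. Both the set representation and the choice of Tanimoto similarity are assumptions made for illustration; the abstract does not specify the encoding or the similarity measure, and the names are hypothetical.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto similarity: shared interactions divided by all interactions."""
    a, b = set(fingerprint_a), set(fingerprint_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def rank_decoys(reference, decoys):
    """Decoy names sorted from most to least similar to the reference fingerprint.

    `decoys` maps a decoy name to its fingerprint (an iterable of
    (receptor_residue, ligand_residue) pairs).
    """
    return sorted(
        decoys,
        key=lambda name: tanimoto(reference, decoys[name]),
        reverse=True,
    )
```

    Unlike RMSD, this kind of measure depends only on which residues interact, so two decoys with slightly different ligand placement but the same interface contacts receive the same score.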

  11. In Silico Analysis for the Study of Botulinum Toxin Structure

    NASA Astrophysics Data System (ADS)

    Suzuki, Tomonori; Miyazaki, Satoru

    2010-01-01

    Protein-protein interactions play many important roles in biological function. Knowledge of protein-protein complex structure is required for understanding the function. The determination of protein-protein complex structure by experimental studies remains difficult; therefore, computational prediction of protein structures by structure modeling and docking studies is a valuable method. In addition, MD simulation is one of the most popular methods for protein structure modeling and characterization. Here, we attempt to predict protein-protein complex structure and properties using several bioinformatic methods, focusing on the botulinum toxin complex as the target structure.

  12. XLinkDB 2.0: integrated, large-scale structural analysis of protein crosslinking data

    PubMed Central

    Schweppe, Devin K.; Zheng, Chunxiang; Chavez, Juan D.; Navare, Arti T.; Wu, Xia; Eng, Jimmy K.; Bruce, James E.

    2016-01-01

    Motivation: Large-scale chemical cross-linking with mass spectrometry (XL-MS) analyses are quickly becoming a powerful means for high-throughput determination of protein structural information and protein–protein interactions. Recent studies have garnered thousands of cross-linked interactions, yet the field lacks an effective tool to compile experimental data or access the network and structural knowledge for these large scale analyses. We present XLinkDB 2.0 which integrates tools for network analysis, Protein Databank queries, modeling of predicted protein structures and modeling of docked protein structures. The novel, integrated approach of XLinkDB 2.0 enables the holistic analysis of XL-MS protein interaction data without limitation to the cross-linker or analytical system used for the analysis. Availability and Implementation: XLinkDB 2.0 can be found here, including documentation and help: http://xlinkdb.gs.washington.edu/. Contact: jimbruce@uw.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153666

  13. Integrating protein structural dynamics and evolutionary analysis with Bio3D.

    PubMed

    Skjærven, Lars; Yao, Xin-Qiu; Scarabelli, Guido; Grant, Barry J

    2014-12-10

    Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information of molecular structure, dynamics and evolution. Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case. The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond a prediction of single protein dynamics to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform independent R package under a GPL2 license from http://thegrantlab.org/bio3d/ .

  14. Proteins as sponges: a statistical journey along protein structure organization principles.

    PubMed

    Paola, Luisa Di; Paci, Paola; Santoni, Daniele; Ruvo, Micol De; Giuliani, Alessandro

    2012-02-27

    The analysis of a large database of protein structures by means of topological and shape indexes inspired by complex network and fractal analysis shed light on some organizational principles of proteins. Proteins appear much more similar to "fractal" sponges than to closely packed spheres, casting doubts on the tenability of the hydrophobic core concept. Principal component analysis highlighted three main order parameters shaping the protein universe: (1) "size", with the consequent generation of progressively less dense and more empty structures at an increasing number of residues, (2) "microscopic structuring", linked to the existence of a spectrum going from the prevalence of heterologous (different hydrophobicity) to the prevalence of homologous (similar hydrophobicity) contacts, and (3) "fractal shape", an organizing protein data set along a continuum going from approximately linear to very intermingled structures. Perhaps the time has come for seriously taking into consideration the real relevance of time-honored principles like the hydrophobic core and hydrophobic effect.

  15. Exploring representations of protein structure for automated remote homology detection and mapping of protein structure space

    PubMed Central

    2014-01-01

    Background: Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, as additionally supported by analysis in this paper, that such maps preserve functional co-localization of the protein structure space. Methods: Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain. Results: We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership. Conclusions: This work opens exciting avenues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools. PMID:25080993

  16. Deciphering the shape and deformation of secondary structures through local conformation analysis

    PubMed Central

    2011-01-01

    Background: Protein deformation has been extensively analysed through global methods based on RMSD, torsion angles and Principal Components Analysis calculations. Here we use a local approach, able to distinguish among the different backbone conformations within loops, α-helices and β-strands, to address the question of secondary structures' shape variation within proteins and deformation at interface upon complexation. Results: Using a structural alphabet, we translated the 3D structures of large sets of protein-protein complexes into sequences of structural letters. The shape of the secondary structures can be assessed by the structural letters that modeled them in the structural sequences. The distribution analysis of the structural letters in the three protein compartments (surface, core and interface) reveals that secondary structures tend to adopt preferential conformations that differ among the compartments. The local description of secondary structures highlights that curved conformations are preferred on the surface while straight ones are preferred in the core. Interfaces display a mixture of local conformations either preferred in core or surface. The analysis of the structural letters transition occurring between protein-bound and unbound conformations shows that the deformation of secondary structure is tightly linked to the compartment preference of the local conformations. Conclusion: The conformation of secondary structures can be further analysed and detailed thanks to a structural alphabet which allows a better description of protein surface, core and interface in terms of secondary structures' shape and deformation. Induced-fit modification tendencies described here should be valuable information to identify and characterize regions under strong structural constraints for functional reasons. PMID:21284872

  17. Deciphering the shape and deformation of secondary structures through local conformation analysis.

    PubMed

    Baussand, Julie; Camproux, Anne-Claude

    2011-02-01

    Protein deformation has been extensively analysed through global methods based on RMSD, torsion angles and Principal Components Analysis calculations. Here we use a local approach, able to distinguish among the different backbone conformations within loops, α-helices and β-strands, to address the question of secondary structures' shape variation within proteins and deformation at interface upon complexation. Using a structural alphabet, we translated the 3D structures of large sets of protein-protein complexes into sequences of structural letters. The shape of the secondary structures can be assessed by the structural letters that modeled them in the structural sequences. The distribution analysis of the structural letters in the three protein compartments (surface, core and interface) reveals that secondary structures tend to adopt preferential conformations that differ among the compartments. The local description of secondary structures highlights that curved conformations are preferred on the surface while straight ones are preferred in the core. Interfaces display a mixture of local conformations either preferred in core or surface. The analysis of the structural letters transition occurring between protein-bound and unbound conformations shows that the deformation of secondary structure is tightly linked to the compartment preference of the local conformations. The conformation of secondary structures can be further analysed and detailed thanks to a structural alphabet which allows a better description of protein surface, core and interface in terms of secondary structures' shape and deformation. Induced-fit modification tendencies described here should be valuable information to identify and characterize regions under strong structural constraints for functional reasons.

  18. Structural Analysis of PTM Hotspots (SAPH-ire) – A Quantitative Informatics Method Enabling the Discovery of Novel Regulatory Elements in Protein Families*

    PubMed Central

    Dewhurst, Henry M.; Choudhury, Shilpa; Torres, Matthew P.

    2015-01-01

    Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)—a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits—conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit–N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data. PMID:26070665

  19. StructAlign, a Program for Alignment of Structures of DNA-Protein Complexes.

    PubMed

    Popov, Ya V; Galitsyna, A A; Alexeevski, A V; Karyagina, A S; Spirin, S A

    2015-11-01

    Comparative analysis of structures of complexes of homologous proteins with DNA is important in the analysis of DNA-protein recognition. Alignment is a necessary stage of the analysis. An alignment is a matching of amino acid residues and nucleotides of one complex to residues and nucleotides of the other. Currently, there are no programs available for aligning structures of DNA-protein complexes. We present the program StructAlign, which should fill this gap. The program inputs a pair of complexes of DNA double helix with proteins and outputs an alignment of DNA chains corresponding to the best spatial fit of the protein chains.

  20. Mining protein loops using a structural alphabet and statistical exceptionality

    PubMed Central

    2010-01-01

    Background: Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding site, catalytic pocket... However, the description of protein loops with conventional tools is not an easy task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. Results: We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words (out of 28274 observed words). These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented, and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints. Conclusions: We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, it is the first time that pattern mining helps to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might help to decrease the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/. PMID:20132552
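    The word-counting step described above can be sketched in a few lines once loops are already encoded as strings of structural letters. The encoding itself (HMM-SA) is not reproduced here. The sketch assumes consecutive four-residue letters are shifted by one residue, so that a seven-residue fragment corresponds to a word of four letters; that word length, the function names, and the toy input are illustrative assumptions.

```python
from collections import Counter

# Assumption: four overlapping four-residue letters span seven residues.
WORD_LENGTH = 4
# The abstract keeps words "observed more than 30 times".
MIN_COUNT = 30


def count_words(loops, word_length=WORD_LENGTH):
    """Count every structural word (substring of letters) over all loops."""
    counts = Counter()
    for loop in loops:
        for start in range(len(loop) - word_length + 1):
            counts[loop[start:start + word_length]] += 1
    return counts


def recurrent_words(counts, min_count=MIN_COUNT):
    """Keep only words observed more than `min_count` times."""
    return {word: n for word, n in counts.items() if n > min_count}
```

    Because words are fixed-length substrings, loops of any length contribute in the same way, which is what lets short and long loops be analysed together.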

  1. Mining protein loops using a structural alphabet and statistical exceptionality.

    PubMed

    Regad, Leslie; Martin, Juliette; Nuel, Gregory; Camproux, Anne-Claude

    2010-02-04

    Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding sites and catalytic pockets. However, describing protein loops with conventional tools is not an easy task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called a structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named a structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop lengths are covered by only 3310 highly recurrent structural words (out of 28274 observed words). These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but, interestingly, two thirds are shared by short (less than 12 residues) and long loops. 
Moreover, half of the recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions, and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented, and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints. We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, this is the first time that pattern mining has been used to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might reduce the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/.
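
    The word-extraction step described in this abstract can be illustrated with a minimal Python sketch. It assumes each loop has already been encoded as a string of structural letters (the HMM-SA encoding itself is not shown), and that a seven-residue fragment corresponds to four consecutive, overlapping four-residue letters. The function names and the coverage definition are hypothetical illustrations, not the authors' implementation.

```python
from collections import Counter

# Assumption: four overlapping four-residue letters span seven residues.
WORD_LEN = 4


def count_structural_words(loops, word_len=WORD_LEN):
    """Count every overlapping word of `word_len` letters across all loops."""
    counts = Counter()
    for letters in loops:
        for i in range(len(letters) - word_len + 1):
            counts[letters[i:i + word_len]] += 1
    return counts


def recurrent_words(counts, min_occurrences=30):
    """Keep only words observed more than `min_occurrences` times."""
    return {w: n for w, n in counts.items() if n > min_occurrences}


def length_coverage(loops, words, word_len=WORD_LEN):
    """Fraction of letter positions covered by at least one selected word."""
    covered = total = 0
    for letters in loops:
        mask = [False] * len(letters)
        for i in range(len(letters) - word_len + 1):
            if letters[i:i + word_len] in words:
                for j in range(i, i + word_len):
                    mask[j] = True
        covered += sum(mask)
        total += len(letters)
    return covered / total if total else 0.0
```

    Because every loop is cut into fixed-size words, short and long loops feed the same counter, which is the property the abstract relies on to analyze loops of all sizes together.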

  2. PDBStat: a universal restraint converter and restraint analysis software package for protein NMR.

    PubMed

    Tejero, Roberto; Snyder, David; Mao, Binchen; Aramini, James M; Montelione, Gaetano T

    2013-08-01

    The heterogeneous array of software tools used in the process of protein NMR structure determination presents organizational challenges in the structure determination and validation processes, and creates a learning curve that limits the broader use of protein NMR in biology. These challenges, including accurate use of data in different data formats required by software carrying out similar tasks, continue to confound the efforts of novices and experts alike. These important issues need to be addressed robustly in order to standardize protein NMR structure determination and validation. PDBStat is a C/C++ computer program originally developed as a universal coordinate and protein NMR restraint converter. Its primary function is to provide a user-friendly tool for interconverting between protein coordinate and protein NMR restraint data formats. It also provides an integrated set of computational methods for protein NMR restraint analysis and structure quality assessment, relabeling of prochiral atoms with correct IUPAC names, as well as multiple methods for analysis of the consistency of atomic positions indicated by their convergence across a protein NMR ensemble. In this paper we provide a detailed description of the PDBStat software, and highlight some of its valuable computational capabilities. As an example, we demonstrate the use of the PDBStat restraint converter for restrained CS-Rosetta structure generation calculations, and compare the resulting protein NMR structure models with those generated from the same NMR restraint data using more traditional structure determination methods. These results demonstrate the value of a universal restraint converter in allowing the use of multiple structure generation methods with the same restraint data for consensus analysis of protein NMR structures and the underlying restraint data.

  3. PDBStat: A Universal Restraint Converter and Restraint Analysis Software Package for Protein NMR

    PubMed Central

    Tejero, Roberto; Snyder, David; Mao, Binchen; Aramini, James M.; Montelione, Gaetano T

    2013-01-01

    The heterogeneous array of software tools used in the process of protein NMR structure determination presents organizational challenges in the structure determination and validation processes, and creates a learning curve that limits the broader use of protein NMR in biology. These challenges, including accurate use of data in different data formats required by software carrying out similar tasks, continue to confound the efforts of novices and experts alike. These important issues need to be addressed robustly in order to standardize protein NMR structure determination and validation. PDBStat is a C/C++ computer program originally developed as a universal coordinate and protein NMR restraint converter. Its primary function is to provide a user-friendly tool for interconverting between protein coordinate and protein NMR restraint data formats. It also provides an integrated set of computational methods for protein NMR restraint analysis and structure quality assessment, relabeling of prochiral atoms with correct IUPAC names, as well as multiple methods for analysis of the consistency of atomic positions indicated by their convergence across a protein NMR ensemble. In this paper we provide a detailed description of the PDBStat software, and highlight some of its valuable computational capabilities. As an example, we demonstrate the use of the PDBStat restraint converter for restrained CS-Rosetta structure generation calculations, and compare the resulting protein NMR structure models with those generated from the same NMR restraint data using more traditional structure determination methods. These results demonstrate the value of a universal restraint converter in allowing the use of multiple structure generation methods with the same restraint data for consensus analysis of protein NMR structures and the underlying restraint data. PMID:23897031

  4. Future directions of electron crystallography.

    PubMed

    Fujiyoshi, Yoshinori

    2013-01-01

    In biological science, there are still many interesting and fundamental yet difficult questions, such as those in neuroscience, remaining to be answered. Structural and functional studies of membrane proteins, which are key molecules of signal transduction in neural and other cells, are essential for understanding the molecular mechanisms of many fundamental biological processes. Technological and instrumental advancements of electron microscopy have facilitated comprehension of structural studies of biological components, such as membrane proteins. While X-ray crystallography has been the main method of structure analysis of proteins including membrane proteins, electron crystallography is now an established technique to analyze structures of membrane proteins in the lipid bilayer, which is close to their natural biological environment. By utilizing cryo-electron microscopes with helium-cooled specimen stages, structures of membrane proteins were analyzed at a resolution better than 3 Å. Such high-resolution structural analysis of membrane proteins by electron crystallography opens up the new research field of structural physiology. Considering the fact that the structures of integral membrane proteins in their native membrane environment without artifacts from crystal contacts are critical in understanding their physiological functions, electron crystallography will continue to be an important technology for structural analysis. In this chapter, I will present several examples to highlight important advantages and to suggest future directions of this technique.

  5. Taking advantage of local structure descriptors to analyze interresidue contacts in protein structures and protein complexes.

    PubMed

    Martin, Juliette; Regad, Leslie; Etchebest, Catherine; Camproux, Anne-Claude

    2008-11-15

    Interresidue contacts in protein structures and at protein-protein interfaces are classically described by the amino acid types of the interacting residues, and the local structural context of the contact, if any, is described using secondary structures. In this study, we present an alternative analysis of interresidue contacts using local structures defined by the structural alphabet introduced by Camproux et al. This structural alphabet allows a 3D structure to be described as a sequence of prototype fragments, called structural letters, of 27 different types. Each residue can then be assigned to a particular local structure, even in loop regions. The analysis of interresidue contacts within protein structures, defined using Voronoï tessellations, reveals that pairwise contact specificity is greater in terms of structural letters than amino acids. Using a simple heuristic based on specificity score comparison, we find that 74% of the long-range contacts within protein structures are better described using structural letters than amino acid types. The investigation is extended to a set of protein-protein complexes, showing that similar global rules apply as for intraprotein contacts, with 64% of the interprotein contacts best described by local structures. We then present an evaluation of pairing functions integrating structural letters into decoy scoring and show that some complexes could benefit from the use of structural letter-based pairing functions.

  6. An overview of the structures of protein-DNA complexes

    PubMed Central

    Luscombe, Nicholas M; Austin, Susan E; Berman, Helen M; Thornton, Janet M

    2000-01-01

    On the basis of a structural analysis of 240 protein-DNA complexes contained in the Protein Data Bank (PDB), we have classified the DNA-binding proteins involved into eight different structural/functional groups, which are further classified into 54 structural families. Here we present this classification and review the functions, structures and binding interactions of these protein-DNA complexes. PMID:11104519

  7. ExDom: an integrated database for comparative analysis of the exon–intron structures of protein domains in eukaryotes

    PubMed Central

    Bhasi, Ashwini; Philip, Philge; Manikandan, Vinu; Senapathy, Periannan

    2009-01-01

    We have developed ExDom, a unique database for the comparative analysis of the exon–intron structures of 96 680 protein domains from seven eukaryotic organisms (Homo sapiens, Mus musculus, Bos taurus, Rattus norvegicus, Danio rerio, Gallus gallus and Arabidopsis thaliana). ExDom provides integrated access to exon-domain data through a sophisticated web interface which has the following analytical capabilities: (i) intergenomic and intragenomic comparative analysis of exon–intron structure of domains; (ii) color-coded graphical display of the domain architecture of proteins correlated with their corresponding exon-intron structures; (iii) graphical analysis of multiple sequence alignments of amino acid and coding nucleotide sequences of homologous protein domains from seven organisms; (iv) comparative graphical display of exon distributions within the tertiary structures of protein domains; and (v) visualization of exon–intron structures of alternative transcripts of a gene correlated to variations in the domain architecture of corresponding protein isoforms. These novel analytical features are highly suited for detailed investigations on the exon–intron structure of domains and make ExDom a powerful tool for exploring several key questions concerning the function, origin and evolution of genes and proteins. ExDom database is freely accessible at: http://66.170.16.154/ExDom/. PMID:18984624

  8. What are the structural features that drive partitioning of proteins in aqueous two-phase systems?

    PubMed

    Wu, Zhonghua; Hu, Gang; Wang, Kui; Zaslavsky, Boris Yu; Kurgan, Lukasz; Uversky, Vladimir N

    2017-01-01

    Protein partitioning in aqueous two-phase systems (ATPSs) represents a convenient, inexpensive, and easy to scale-up protein separation technique. Since partition behavior of a protein dramatically depends on an ATPS composition, it would be highly beneficial to have reliable means for (even qualitative) prediction of partitioning of a target protein under different conditions. Our aim was to understand which structural features of proteins contribute to partitioning of a query protein in a given ATPS. We undertook a systematic empirical analysis of relations between 57 numerical structural descriptors derived from the corresponding amino acid sequences and crystal structures of 10 well-characterized proteins and the partition behavior of these proteins in 29 different ATPSs. This analysis revealed that just a few structural characteristics of proteins can accurately determine behavior of these proteins in a given ATPS. However, partition behavior of proteins in different ATPSs relies on different structural features. In other words, we could not find a unique set of protein structural features derived from their crystal structures that could be used for the description of the protein partition behavior of all proteins in all ATPSs analyzed in this study. We likely need to gain better insight into relationships between protein-solvent interactions and protein structure peculiarities, in particular given the limitations of the crystal structures used here, to be able to construct a model that accurately predicts protein partition behavior across all ATPSs. Copyright © 2016 Elsevier B.V. All rights reserved.

  9. Structure-sequence based analysis for identification of conserved regions in proteins

    DOEpatents

    Zemla, Adam T; Zhou, Carol E; Lam, Marisa W; Smith, Jason R; Pardes, Elizabeth

    2013-05-28

    Disclosed are computational methods, and associated hardware and software products, for scoring conservation in a protein structure based on a computationally identified family or cluster of protein structures. A method of computationally identifying a family or cluster of protein structures is also disclosed herein.

  10. Molecular basis of protein structure in proanthocyanidin and anthocyanin-enhanced Lc-transgenic alfalfa in relation to nutritive value using synchrotron-radiation FTIR microspectroscopy: A novel approach

    NASA Astrophysics Data System (ADS)

    Yu, Peiqiang; Jonker, Arjan; Gruber, Margaret

    2009-09-01

    To date there has been very little application of synchrotron radiation-based Fourier transform infrared microspectroscopy (SRFTIRM) to the study of molecular structures in plant forage in relation to livestock digestive behavior and nutrient availability. Protein inherent structure, among other factors such as protein matrix, affects nutritive quality, fermentation and degradation behavior in both humans and animals. The relative percentage of protein secondary structure influences protein value. A high percentage of β-sheets usually reduces the access of gastrointestinal digestive enzymes to the protein. Reduced accessibility results in poor digestibility and, as a result, low protein value. The objective of this study was to use SRFTIRM to compare protein molecular structure of alfalfa plant tissues transformed with the maize Lc regulatory gene with non-transgenic alfalfa protein within cellular and subcellular dimensions and to quantify protein inherent structure profiles using Gaussian and Lorentzian methods of multi-component peak modeling. Protein molecular structure revealed by this method included α-helices, β-sheets and other structures such as β-turns and random coils. Hierarchical cluster analysis and principal component analysis of the synchrotron data, as well as accurate spectral analysis based on curve fitting, showed that transgenic alfalfa contained a relatively lower (P < 0.05) percentage of the model-fitted α-helices (29 vs. 34) and model-fitted β-sheets (22 vs. 27) and a higher (P < 0.05) percentage of other model-fitted structures (49 vs. 39). Transgenic alfalfa protein displayed no difference (P > 0.05) in the ratio of α-helices to β-sheets (average: 1.4) and higher (P < 0.05) ratios of α-helices to others (0.7 vs. 0.9) and β-sheets to others (0.5 vs. 0.8) than the non-transgenic alfalfa protein. 
The transgenic protein structures also exhibited no difference (P > 0.05) in the vibrational intensity of protein amide I (average of 24) and amide II areas (average of 10) and their ratio (average of 2.4) compared with non-transgenic alfalfa. Cluster analysis and principal component analysis showed no significant differences between the two genotypes in the broad molecular fingerprint region, amides I and II regions, and the carbohydrate molecular region, indicating they are highly related to each other. The results suggest that transgenic Lc-alfalfa leaves contain similar proteins to non-transgenic alfalfa (because amide I and II intensities were identical), but a subtle difference in protein molecular structure after freeze drying. Further study is needed to understand the relationship between these structural profiles and biological features such as protein nutrient availability, protein bypass and digestive behavior of livestock fed with this type of forage.

  11. Molecular Basis of Protein Structure in Proanthocyanidin and Anthocyanin-Enhanced Lc-transgenic Alfalfa in Relation to Nutritive Value Using Synchrotron-Radiation FTIR Microspectroscopy: A Novel Approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu, P.; Jonker, A; Gruber, M

    2009-01-01

    To date there has been very little application of synchrotron radiation-based Fourier transform infrared microspectroscopy (SRFTIRM) to the study of molecular structures in plant forage in relation to livestock digestive behavior and nutrient availability. Protein inherent structure, among other factors such as protein matrix, affects nutritive quality, fermentation and degradation behavior in both humans and animals. The relative percentage of protein secondary structure influences protein value. A high percentage of β-sheets usually reduces the access of gastrointestinal digestive enzymes to the protein. Reduced accessibility results in poor digestibility and, as a result, low protein value. The objective of this study was to use SRFTIRM to compare protein molecular structure of alfalfa plant tissues transformed with the maize Lc regulatory gene with non-transgenic alfalfa protein within cellular and subcellular dimensions and to quantify protein inherent structure profiles using Gaussian and Lorentzian methods of multi-component peak modeling. Protein molecular structure revealed by this method included α-helices, β-sheets and other structures such as β-turns and random coils. Hierarchical cluster analysis and principal component analysis of the synchrotron data, as well as accurate spectral analysis based on curve fitting, showed that transgenic alfalfa contained a relatively lower (P < 0.05) percentage of the model-fitted α-helices (29 vs. 34) and model-fitted β-sheets (22 vs. 27) and a higher (P < 0.05) percentage of other model-fitted structures (49 vs. 39). Transgenic alfalfa protein displayed no difference (P > 0.05) in the ratio of α-helices to β-sheets (average: 1.4) and higher (P < 0.05) ratios of α-helices to others (0.7 vs. 0.9) and β-sheets to others (0.5 vs. 0.8) than the non-transgenic alfalfa protein. 
The transgenic protein structures also exhibited no difference (P > 0.05) in the vibrational intensity of protein amide I (average of 24) and amide II areas (average of 10) and their ratio (average of 2.4) compared with non-transgenic alfalfa. Cluster analysis and principal component analysis showed no significant differences between the two genotypes in the broad molecular fingerprint region, amides I and II regions, and the carbohydrate molecular region, indicating they are highly related to each other. The results suggest that transgenic Lc-alfalfa leaves contain similar proteins to non-transgenic alfalfa (because amide I and II intensities were identical), but a subtle difference in protein molecular structure after freeze drying. Further study is needed to understand the relationship between these structural profiles and biological features such as protein nutrient availability, protein bypass and digestive behavior of livestock fed with this type of forage.

  12. Protein sectors: evolutionary units of three-dimensional structure

    PubMed Central

    Halabi, Najeeb; Rivoire, Olivier; Leibler, Stanislas; Ranganathan, Rama

    2011-01-01

    Proteins display a hierarchy of structural features at primary, secondary, tertiary, and higher-order levels, an organization that guides our current understanding of their biological properties and evolutionary origins. Here, we reveal a structural organization distinct from this traditional hierarchy by statistical analysis of correlated evolution between amino acids. Applied to the S1A serine proteases, the analysis indicates a decomposition of the protein into three quasi-independent groups of correlated amino acids that we term “protein sectors”. Each sector is physically connected in the tertiary structure, has a distinct functional role, and constitutes an independent mode of sequence divergence in the protein family. Functionally relevant sectors are evident in other protein families as well, suggesting that they may be general features of proteins. We propose that sectors represent a structural organization of proteins that reflects their evolutionary histories. PMID:19703402

  13. Analysis of Translocation-Competent Secretory Proteins by HDX-MS.

    PubMed

    Tsirigotaki, A; Papanastasiou, M; Trelle, M B; Jørgensen, T J D; Economou, A

    2017-01-01

    Protein folding is an intricate and precise process in living cells. Most exported proteins evade cytoplasmic folding, become targeted to the membrane, and then trafficked into/across membranes. Their targeting and translocation-competent states are nonnatively folded. However, once they reach the appropriate cellular compartment, they can fold to their native states. The nonnative states of preproteins remain structurally poorly characterized since increased disorder, protein sizes, aggregation propensity, and the observation timescale are often limiting factors for typical structural approaches such as X-ray crystallography and NMR. Here, we present an alternative approach for the in vitro analysis of nonfolded translocation-competent protein states and their comparison with their native states. We make use of hydrogen/deuterium exchange coupled with mass spectrometry (HDX-MS), a method based on differentiated isotope exchange rates in structured vs unstructured protein states/regions, and highly dynamic vs more rigid regions. We present a complete structural characterization pipeline, starting from the preparation of the polypeptides to data analysis and interpretation. Proteolysis and mass spectrometric conditions for the analysis of the labeled proteins are discussed, followed by the analysis and interpretation of HDX-MS data. We highlight the suitability of HDX-MS for identifying short structured regions within otherwise highly flexible protein states, as illustrated by an exported protein example, experimentally tested in our lab. Finally, we discuss statistical analysis in comparative HDX-MS. The protocol is applicable to any protein and protein size, exhibiting slow or fast loss of translocation competence. It could be easily adapted to more complex assemblies, such as the interaction of chaperones with nonnative protein states. © 2017 Elsevier Inc. All rights reserved.

  14. Using Molecular Visualization to Explore Protein Structure and Function and Enhance Student Facility with Computational Tools

    ERIC Educational Resources Information Center

    Terrell, Cassidy R.; Listenberger, Laura L.

    2017-01-01

    Recognizing that undergraduate students can benefit from analysis of 3D protein structure and function, we have developed a multiweek, inquiry-based molecular visualization project for Biochemistry I students. This project uses a virtual model of cyclooxygenase-1 (COX-1) to guide students through multiple levels of protein structure analysis. The…

  15. Can natural proteins designed with 'inverted' peptide sequences adopt native-like protein folds?

    PubMed

    Sridhar, Settu; Guruprasad, Kunchur

    2014-01-01

    We have carried out a systematic computational analysis on a representative dataset of proteins of known three-dimensional structure, in order to evaluate whether it would be possible to 'swap' certain short peptide sequences in naturally occurring proteins with their corresponding 'inverted' peptides and generate 'artificial' proteins that are predicted to retain a native-like protein fold. The analysis of 3,967 representative proteins from the Protein Data Bank revealed 102,677 unique identical inverted peptide sequence pairs that vary in sequence length between 5-12 and 18 amino acid residues. Our analysis illustrates with examples that such 'artificial' proteins may be generated by identifying peptides with 'similar structural environment' and by using comparative protein modeling and validation studies. Our analysis suggests that natural proteins may be tolerant to accommodating such peptides.
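
    The search for inverted peptide pairs can be sketched in Python as follows. This is a hedged illustration only: it assumes that 'inverted' means the same residues read in reverse order, it excludes palindromic peptides, and it ignores the structural-environment and modeling steps of the study. The function name and the default length range are hypothetical.

```python
def inverted_pairs(sequences, min_len=5, max_len=12):
    """Return unique (peptide, reversed peptide) pairs found in the dataset.

    Assumption: an 'inverted' peptide is the reversed sequence, and
    palindromic peptides (identical to their own reverse) are skipped.
    """
    # Collect every peptide of the requested lengths from all sequences.
    seen = set()
    for seq in sequences:
        for k in range(min_len, max_len + 1):
            for i in range(len(seq) - k + 1):
                seen.add(seq[i:i + k])

    # A pair exists when both a peptide and its reverse were observed.
    pairs = set()
    for pep in seen:
        rev = pep[::-1]
        if rev != pep and rev in seen:
            pairs.add(tuple(sorted((pep, rev))))
    return pairs
```

    Storing each pair as a sorted tuple means a peptide and its reverse are reported once rather than twice.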

  16. Visualisation of variable binding pockets on protein surfaces by probabilistic analysis of related structure sets.

    PubMed

    Ashford, Paul; Moss, David S; Alex, Alexander; Yeap, Siew K; Povia, Alice; Nobeli, Irene; Williams, Mark A

    2012-03-14

    Protein structures provide a valuable resource for rational drug design. For a protein with no known ligand, computational tools can predict surface pockets that are of suitable size and shape to accommodate a complementary small-molecule drug. However, pocket prediction against single static structures may miss features of pockets that arise from proteins' dynamic behaviour. In particular, ligand-binding conformations can be observed as transiently populated states of the apo protein, so it is possible to gain insight into ligand-bound forms by considering conformational variation in apo proteins. This variation can be explored by considering sets of related structures: computationally generated conformers, solution NMR ensembles, multiple crystal structures, homologues or homology models. It is non-trivial to compare pockets, either from different programs or across sets of structures. For a single structure, difficulties arise in defining a particular pocket's boundaries. For a set of conformationally distinct structures the challenge is how to make reasonable comparisons between them given that a perfect structural alignment is not possible. We have developed a computational method, Provar, that provides a consistent representation of predicted binding pockets across sets of related protein structures. The outputs are probabilities that each atom or residue of the protein borders a predicted pocket. These probabilities can be readily visualised on a protein using existing molecular graphics software. We show how Provar simplifies comparison of the outputs of different pocket prediction algorithms, of pockets across multiple simulated conformations and between homologous structures. 
We demonstrate the benefits of use of multiple structures for protein-ligand and protein-protein interface analysis on a set of complexes and consider three case studies in detail: i) analysis of a kinase superfamily highlights the conserved occurrence of surface pockets at the active and regulatory sites; ii) a simulated ensemble of unliganded Bcl2 structures reveals extensions of a known ligand-binding pocket not apparent in the apo crystal structure; iii) visualisations of interleukin-2 and its homologues highlight conserved pockets at the known receptor interfaces and regions whose conformation is known to change on inhibitor binding. Through post-processing of the output of a variety of pocket prediction software, Provar provides a flexible approach to the analysis and visualization of the persistence or variability of pockets in sets of related protein structures.
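
    The idea of a per-residue pocket probability over a set of related structures can be illustrated with a short Python sketch. It is not Provar's algorithm: it simply assumes that a pocket predictor has already produced, for each structure, the set of residue identifiers lining any predicted pocket, and that residues are matched across structures by a shared numbering. The function name is hypothetical.

```python
def pocket_lining_probability(pocket_sets, residues):
    """Fraction of structures in which each residue lines a predicted pocket.

    pocket_sets: one set of pocket-lining residue ids per structure
                 (assumed output of any pocket-prediction program).
    residues:    residue ids to report, assumed comparable across structures.
    """
    n = len(pocket_sets)
    if n == 0:
        raise ValueError("need at least one structure")
    return {r: sum(r in s for s in pocket_sets) / n for r in residues}
```

    A value near 1 marks a residue that borders a pocket in almost every conformer, while intermediate values flag pockets that open only in part of the ensemble. The resulting numbers could, for example, be written to a per-residue column for display in molecular graphics software.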

  17. Binding free energy analysis of protein-protein docking model structures by evERdock.

    PubMed

    Takemura, Kazuhiro; Matubayasi, Nobuyuki; Kitao, Akio

    2018-03-14

    To aid the evaluation of protein-protein complex model structures generated by protein docking prediction (decoys), we previously developed a method to calculate the binding free energies for complexes. The method combines a short (2 ns) all-atom molecular dynamics simulation with explicit solvent and solution theory in the energy representation (ER). We showed that this method successfully selected structures similar to the native complex structure (near-native decoys) as the lowest binding free energy structures. In our current work, we applied this method (evERdock) to 100 or 300 model structures of four protein-protein complexes. The crystal structures and the near-native decoys showed the lowest binding free energy of all the examined structures, indicating that evERdock can successfully evaluate decoys. Several decoys that show low interface root-mean-square distance but relatively high binding free energy were also identified. Analysis of the fraction of native contacts, hydrogen bonds, and salt bridges at the protein-protein interface indicated that these decoys were insufficiently optimized at the interface. After optimizing the interactions around the interface by including interfacial water molecules, the binding free energies of these decoys were improved. We also investigated the effect of solute entropy on binding free energy and found that consideration of the entropy term does not necessarily improve the evaluations of decoys using the normal mode analysis for entropy calculation.

  18. Binding free energy analysis of protein-protein docking model structures by evERdock

    NASA Astrophysics Data System (ADS)

    Takemura, Kazuhiro; Matubayasi, Nobuyuki; Kitao, Akio

    2018-03-01

    To aid the evaluation of protein-protein complex model structures generated by protein docking prediction (decoys), we previously developed a method to calculate the binding free energies for complexes. The method combines a short (2 ns) all-atom molecular dynamics simulation with explicit solvent and solution theory in the energy representation (ER). We showed that this method successfully selected structures similar to the native complex structure (near-native decoys) as the lowest binding free energy structures. In our current work, we applied this method (evERdock) to 100 or 300 model structures of four protein-protein complexes. The crystal structures and the near-native decoys showed the lowest binding free energy of all the examined structures, indicating that evERdock can successfully evaluate decoys. Several decoys that show low interface root-mean-square distance but relatively high binding free energy were also identified. Analysis of the fraction of native contacts, hydrogen bonds, and salt bridges at the protein-protein interface indicated that these decoys were insufficiently optimized at the interface. After optimizing the interactions around the interface by including interfacial water molecules, the binding free energies of these decoys were improved. We also investigated the effect of solute entropy on binding free energy and found that consideration of the entropy term does not necessarily improve the evaluations of decoys using the normal mode analysis for entropy calculation.

  19. PDB2Graph: A toolbox for identifying critical amino acids map in proteins based on graph theory.

    PubMed

    Niknam, Niloofar; Khakzad, Hamed; Arab, Seyed Shahriar; Naderi-Manesh, Hossein

    2016-05-01

    The integrative and cooperative nature of protein structure involves the assessment of topological and global features of constituent parts. Network concept takes complete advantage of both of these properties in the analysis concomitantly. High compatibility to structural concepts or physicochemical properties in addition to exploiting a remarkable simplification in the system has made network an ideal tool to explore biological systems. There are numerous examples in which different protein structural and functional characteristics have been clarified by the network approach. Here, we present an interactive and user-friendly Matlab-based toolbox, PDB2Graph, devoted to protein structure network construction, visualization, and analysis. Moreover, PDB2Graph is an appropriate tool for identifying critical nodes involved in protein structural robustness and function based on centrality indices. It maps critical amino acids in protein networks and can greatly aid structural biologists in selecting proper amino acid candidates for manipulating protein structures in a more reasonable and rational manner. To introduce the capability and efficiency of PDB2Graph in detail, the structural modification of Calmodulin through allosteric binding of Ca(2+) is considered. In addition, a mutational analysis for three well-identified model proteins including Phage T4 lysozyme, Barnase and Ribonuclease HI, was performed to inspect the influence of mutating important central residues on protein activity. Copyright © 2016 Elsevier Ltd. All rights reserved.
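
    The network idea in this abstract can be illustrated with a minimal Python sketch. It is not PDB2Graph's own code (the toolbox is Matlab-based); the function names, the C-alpha distance cutoff, and the toy coordinates are all assumptions made for illustration. Residues become nodes, spatial proximity defines edges, and centrality indices rank the nodes.

```python
from collections import deque
from math import dist


def contact_graph(ca_coords, cutoff=7.0):
    """Link residues i and j when their C-alpha atoms lie within `cutoff` angstroms.

    The 7.0 A default is an illustrative assumption, not a value from the abstract.
    """
    n = len(ca_coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(ca_coords[i], ca_coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree_centrality(adj):
    """Fraction of the other residues that each residue contacts."""
    denom = max(len(adj) - 1, 1)
    return {i: len(neighbours) / denom for i, neighbours in adj.items()}


def closeness_centrality(adj):
    """(number of reachable nodes) / (sum of shortest-path lengths), via BFS."""
    result = {}
    for src in adj:
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        total = sum(seen.values())
        result[src] = (len(seen) - 1) / total if total else 0.0
    return result


# Hypothetical toy input: five residues on a straight line, 3.8 A apart.
coords = [(3.8 * k, 0.0, 0.0) for k in range(5)]
graph = contact_graph(coords, cutoff=4.0)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
most_central = max(closeness, key=closeness.get)
```

    In this toy chain the middle residue has the highest closeness, which is the kind of ranking a centrality-based search for critical residues relies on. Real use would read the coordinates from a PDB file and might well use a different contact definition.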

  20. Antibody Epitope Analysis to Investigate Folded Structure, Allosteric Conformation, and Evolutionary Lineage of Proteins.

    PubMed

    Wong, Sienna; Jin, J-P

    2017-01-01

    Study of the folded structure of proteins provides insights into their biological functions, conformational dynamics and molecular evolution. Current methods of elucidating folded structure of proteins are laborious, low-throughput, and constrained by various limitations. Arising from these methods is the need for a sensitive, quantitative, rapid and high-throughput method not only for analysing the folded structure of proteins, but also for monitoring dynamic changes under physiological or experimental conditions. In this focused review, we outline the foundation and limitations of current protein structure-determination methods prior to discussing the advantages of an emerging antibody epitope analysis for applications in structural, conformational and evolutionary studies of proteins. We discuss the application of this method using representative examples in monitoring allosteric conformation of regulatory proteins and the determination of the evolutionary lineage of related proteins and protein isoforms. The versatility of the method described herein is validated by the ability to modulate a variety of assay parameters to meet the needs of the user in order to monitor protein conformation. Furthermore, the assay has been used to clarify the lineage of troponin isoforms beyond what has been depicted by sequence homology alone, demonstrating the nonlinear evolutionary relationship between primary structure and tertiary structure of proteins. The antibody epitope analysis method is a highly adaptable technique of protein conformation elucidation, which can be easily applied without the need for specialized equipment or technical expertise. When applied in a systematic and strategic manner, this method has the potential to reveal novel and biomedically meaningful information for structure-function relationship and evolutionary lineage of proteins. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  1. Replica exchange molecular dynamics simulation of structure variation from α/4β-fold to 3α-fold protein.

    PubMed

    Lazim, Raudah; Mei, Ye; Zhang, Dawei

    2012-03-01

    Replica exchange molecular dynamics (REMD) simulation provides an efficient conformational sampling tool for the study of protein folding. In this study, we explore the mechanism directing the structure variation from α/4β-fold protein to 3α-fold protein after mutation by conducting REMD simulation on 42 replicas with temperatures ranging from 270 K to 710 K. The simulation began from a protein possessing the primary structure of GA88 but the tertiary structure of GB88, two G proteins with "high sequence identity." Despite the large Cα root-mean-square deviation (RMSD) of the folded protein (4.34 Å at 270 K and 4.75 Å at 304 K), a variation in tertiary structure was observed. Together with secondary structure assignment, cluster analysis and principal component analysis, this provides insights into the folding and unfolding pathways of the 3α-fold and α/4β-fold proteins, respectively, paving the way toward understanding the events that occur during conformational variation.
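
    The abstract gives only the number of replicas (42) and the temperature range (270 K to 710 K), not how the intermediate temperatures were chosen. The sketch below assumes a geometric (constant-ratio) ladder, a common default in REMD setups; both the spacing rule and the function name are assumptions, not details from the study.

```python
def geometric_ladder(t_min, t_max, n_replicas):
    """Temperatures in kelvin spaced by a constant ratio between t_min and t_max.

    Geometric spacing is an assumed convention here; the source abstract does
    not state how its replica temperatures were distributed.
    """
    if n_replicas < 2:
        raise ValueError("need at least two replicas")
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** k for k in range(n_replicas)]


# Range and replica count taken from the abstract; the spacing is assumed.
ladder = geometric_ladder(270.0, 710.0, 42)
```

    A constant ratio between neighbouring temperatures is often chosen because it tends to keep exchange acceptance rates roughly uniform across the ladder when the heat capacity varies slowly with temperature.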

  2. A complementation assay for in vivo protein structure/function analysis in Physcomitrella patens (Funariaceae)

    DOE PAGES

    Scavuzzo-Duggan, Tess R.; Chaves, Arielle M.; Roberts, Alison W.

    2015-07-14

    Here, a method for rapid in vivo functional analysis of engineered proteins was developed using Physcomitrella patens. A complementation assay was designed for testing structure/function relationships in cellulose synthase (CESA) proteins. The components of the assay include (1) construction of test vectors that drive expression of epitope-tagged PpCESA5 carrying engineered mutations, (2) transformation of a ppcesa5 knockout line that fails to produce gametophores with test and control vectors, (3) scoring the stable transformants for gametophore production, (4) statistical analysis comparing complementation rates for test vectors to positive and negative control vectors, and (5) analysis of transgenic protein expression by Western blotting. The assay distinguished mutations that generate fully functional, nonfunctional, and partially functional proteins. In conclusion, compared with existing methods for in vivo testing of protein function, this complementation assay provides a rapid method for investigating protein structure/function relationships in plants.

  3. HIV-1 Protease Function and Structure Studies with the Simplicial Neighborhood Analysis of Protein Packing (SNAPP) Method

    PubMed Central

    Zhang, Shuxing; Kaplan, Andrew H.; Tropsha, Alexander

    2009-01-01

    The Simplicial Neighborhood Analysis of Protein Packing (SNAPP) method was used to predict the effect of mutagenesis on the enzymatic activity of the HIV-1 protease (HIVP). SNAPP relies on a four-body statistical scoring function derived from the analysis of spatially nearest neighbor residue compositional preferences in a diverse and representative subset of protein structures from the Protein Data Bank. The method was applied to the analysis of HIVP mutants with residue substitutions in the hydrophobic core as well as at the interface between the two protease monomers. Both wild type and tethered structures were employed in the calculations. We obtained a strong correlation, with R² as high as 0.96, between ΔSNAPP score (i.e., the difference in SNAPP scores between wild type and mutant proteins) and the protease catalytic activity for tethered structures. A weaker but significant correlation was also obtained for non-tethered structures. Our analysis identified residues both in the hydrophobic core and at the dimeric interface (DI) that are very important for the protease function. This study demonstrates a potential utility of the SNAPP method for rational design of mutagenesis studies and protein engineering. PMID:18498108

  4. Structural Analysis of PTM Hotspots (SAPH-ire)--A Quantitative Informatics Method Enabling the Discovery of Novel Regulatory Elements in Protein Families.

    PubMed

    Dewhurst, Henry M; Choudhury, Shilpa; Torres, Matthew P

    2015-08-01

    Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)--a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits--conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit-N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data. 
    © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  5. MD simulations of papillomavirus DNA-E2 protein complexes hints at a protein structural code for DNA deformation.

    PubMed

    Falconi, M; Oteri, F; Eliseo, T; Cicero, D O; Desideri, A

    2008-08-01

    The structural dynamics of the DNA binding domains of the human papillomavirus strain 16 and the bovine papillomavirus strain 1, complexed with their DNA targets, has been investigated by modeling, molecular dynamics simulations, and nuclear magnetic resonance analysis. The simulations underline different dynamical features of the protein scaffolds and a different mechanical interaction of the two proteins with DNA. The two protein structures, although very similar, show differences in the relative mobility of secondary structure elements. Protein structural analyses, principal component analysis, and geometrical and energetic DNA analyses indicate that the two transcription factors utilize a different strategy in DNA recognition and deformation. Results show that the protein indirect DNA readout is not only addressable to the DNA molecule flexibility but it is finely tuned by the mechanical and dynamical properties of the protein scaffold involved in the interaction.

  6. Discriminative structural approaches for enzyme active-site prediction.

    PubMed

    Kato, Tsuyoshi; Nagano, Nozomi

    2011-02-15

    Predicting enzyme active-sites in proteins is an important issue not only for protein sciences but also for a variety of practical applications such as drug design. Because enzyme reaction mechanisms are based on the local structures of enzyme active-sites, various template-based methods that compare local structures in proteins have been developed to date. In comparing such local sites, a simple measurement, RMSD, has been used so far. This paper introduces new machine learning algorithms that refine the similarity/deviation for comparison of local structures. The similarity/deviation is applied to two types of applications, single template analysis and multiple template analysis. In the single template analysis, a single template is used as a query to search proteins for active sites, whereas a protein structure is examined as a query to discover the possible active-sites using a set of templates in the multiple template analysis. This paper experimentally illustrates that the machine learning algorithms effectively improve the similarity/deviation measurements for both the analyses.

  7. Evaluation of stereo-array isotope labeling (SAIL) patterns for automated structural analysis of proteins with CYANA.

    PubMed

    Ikeya, Teppei; Terauchi, Tsutomu; Güntert, Peter; Kainosho, Masatsune

    2006-07-01

    Recently we have developed the stereo-array isotope labeling (SAIL) technique to overcome the conventional molecular size limitation in NMR protein structure determination by employing complete stereo- and regiospecific patterns of stable isotopes. SAIL sharpens signals and simplifies spectra without the loss of requisite structural information, thus making large classes of proteins newly accessible to detailed solution structure determination. The automated structure calculation program CYANA can efficiently analyze SAIL-NOESY spectra and calculate structures without manual analysis. Nevertheless, the original SAIL method might not be capable of determining the structures of proteins larger than 50 kDa or membrane proteins, for which the spectra are characterized by many broadened and overlapped peaks. Here we have carried out simulations of new SAIL patterns optimized for minimal relaxation and overlap, to evaluate the combined use of SAIL and CYANA for solving the structures of larger proteins and membrane proteins. The modified approach reduces the number of peaks to nearly half of that observed with uniform labeling, while still yielding well-defined structures and is expected to enable NMR structure determinations of these challenging systems.

  8. DWARF – a data warehouse system for analyzing protein families

    PubMed Central

    Fischer, Markus; Thai, Quan K; Grieb, Melanie; Pleiss, Jürgen

    2006-01-01

    Background: The emerging field of integrative bioinformatics provides the tools to organize and systematically analyze vast amounts of highly diverse biological data and thus makes it possible to gain a novel understanding of complex biological systems. The data warehouse DWARF applies integrative bioinformatics approaches to the analysis of large protein families. Description: The data warehouse system DWARF integrates data on sequence, structure, and functional annotation for protein fold families. The underlying relational data model consists of three major sections representing entities related to the protein (biochemical function, source organism, classification to homologous families and superfamilies), the protein sequence (position-specific annotation, mutant information), and the protein structure (secondary structure information, superimposed tertiary structure). Tools for extracting, transforming and loading data from publicly available resources (ExPDB, GenBank, DSSP) are provided to populate the database. The data can be accessed by an interface for searching and browsing, and by analysis tools that operate on annotation, sequence, or structure. We applied DWARF to the family of α/β-hydrolases to host the Lipase Engineering database. Release 2.3 contains 6138 sequences and 167 experimentally determined protein structures, which are assigned to 37 superfamilies and 103 homologous families. Conclusion: DWARF has been designed for constructing databases of large structurally related protein families and for evaluating their sequence-structure-function relationships by a systematic analysis of sequence, structure and functional annotation. It has been applied to predict biochemical properties from sequence, and serves as a valuable tool for protein engineering. PMID:17094801

  9. An Introductory Classroom Exercise on Protein Molecular Model Visualization and Detailed Analysis of Protein-Ligand Binding

    ERIC Educational Resources Information Center

    Poeylaut-Palena, Andres, A.; de los Angeles Laborde, Maria

    2013-01-01

    A learning module for molecular level analysis of protein structure and ligand/drug interaction through the visualization of X-ray diffraction is presented. Using DeepView as molecular model visualization software, students learn about the general concepts of protein structure. This Biochemistry classroom exercise is designed to be carried out by…

  10. WEBnm@ v2.0: Web server and services for comparing protein flexibility.

    PubMed

    Tiwari, Sandhya P; Fuglebakk, Edvin; Hollup, Siv M; Skjærven, Lars; Cragnolini, Tristan; Grindhaug, Svenn H; Tekle, Kidane M; Reuter, Nathalie

    2014-12-30

    Normal mode analysis (NMA) using elastic network models is a reliable and cost-effective computational method to characterise protein flexibility and by extension, their dynamics. Further insight into the dynamics-function relationship can be gained by comparing protein motions between protein homologs and functional classifications. This can be achieved by comparing normal modes obtained from sets of evolutionary related proteins. We have developed an automated tool for comparative NMA of a set of pre-aligned protein structures. The user can submit a sequence alignment in the FASTA format and the corresponding coordinate files in the Protein Data Bank (PDB) format. The computed normalised squared atomic fluctuations and atomic deformation energies of the submitted structures can be easily compared on graphs provided by the web user interface. The web server provides pairwise comparison of the dynamics of all proteins included in the submitted set using two measures: the Root Mean Squared Inner Product and the Bhattacharyya Coefficient. The Comparative Analysis has been implemented on our web server for NMA, WEBnm@, which also provides recently upgraded functionality for NMA of single protein structures. This includes new visualisations of protein motion, visualisation of inter-residue correlations and the analysis of conformational change using the overlap analysis. In addition, programmatic access to WEBnm@ is now available through a SOAP-based web service. Webnm@ is available at http://apps.cbu.uib.no/webnma . WEBnm@ v2.0 is an online tool offering unique capability for comparative NMA on multiple protein structures. Along with a convenient web interface, powerful computing resources, and several methods for mode analyses, WEBnm@ facilitates the assessment of protein flexibility within protein families and superfamilies. These analyses can give a good view of how the structures move and how the flexibility is conserved over the different structures.
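
    One of the two comparison measures named above, the Root Mean Squared Inner Product (RMSIP), can be sketched in a few lines of Python. The code follows the commonly used definition, RMSIP = sqrt((1/n) * sum over i, j of (u_i . v_j)^2), for two sets of n orthonormal mode vectors. Whether WEBnm@ uses exactly this normalisation and how many modes it includes are assumptions here, and the function name and toy vectors are illustrative only.

```python
from math import sqrt


def rmsip(modes_a, modes_b):
    """Root mean squared inner product between two sets of mode vectors.

    Each argument is a list of equal-length vectors, assumed orthonormal within
    each set. The sum of squared dot products is divided by len(modes_a).
    """
    n = len(modes_a)
    if n == 0:
        raise ValueError("modes_a must contain at least one vector")
    total = 0.0
    for u in modes_a:
        for v in modes_b:
            dot = sum(x * y for x, y in zip(u, v))
            total += dot * dot
    return sqrt(total / n)


# Hypothetical toy modes in a 3-dimensional space.
set_a = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
same_subspace = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]   # same plane, reordered
half_overlap = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]    # shares one direction

score_same = rmsip(set_a, same_subspace)
score_half = rmsip(set_a, half_overlap)
```

    With orthonormal sets of equal size, the score is 1 when both sets span the same subspace and falls toward 0 as the subspaces become orthogonal, which makes it a convenient summary of how similar the low-frequency motions of two aligned structures are.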

  11. Modularity in protein structures: study on all-alpha proteins.

    PubMed

    Khan, Taushif; Ghosh, Indira

    2015-01-01

    Modularity is known as one of the most important features of protein's robust and efficient design. The architecture and topology of proteins play a vital role by providing necessary robust scaffolds to support organism's growth and survival in constant evolutionary pressure. These complex biomolecules can be represented by several layers of modular architecture, but it is pivotal to understand and explore the smallest biologically relevant structural component. In the present study, we have developed a component-based method, using protein's secondary structures and their arrangements (i.e. patterns) in order to investigate its structural space. Our result on all-alpha protein shows that the known structural space is highly populated with limited set of structural patterns. We have also noticed that these frequently observed structural patterns are present as modules or "building blocks" in large proteins (i.e. higher secondary structure content). From structural descriptor analysis, observed patterns are found to be within similar deviation; however, frequent patterns are found to be distinctly occurring in diverse functions e.g. in enzymatic classes and reactions. In this study, we are introducing a simple approach to explore protein structural space using combinatorial- and graph-based geometry methods, which can be used to describe modularity in protein structures. Moreover, analysis indicates that protein function seems to be the driving force that shapes the known structure space.

  12. Introduction to Protein Structure through Genetic Diseases

    ERIC Educational Resources Information Center

    Schneider, Tanya L.; Linton, Brian R.

    2008-01-01

    An illuminating way to learn about protein function is to explore high-resolution protein structures. Analysis of the proteins involved in genetic diseases has been used to introduce students to protein structure and the role that individual mutations can play in the onset of disease. Known mutations can be correlated to changes in protein…

  13. Conformational analysis of processivity clamps in solution demonstrates that tertiary structure does not correlate with protein dynamics.

    PubMed

    Fang, Jing; Nevin, Philip; Kairys, Visvaldas; Venclovas, Česlovas; Engen, John R; Beuning, Penny J

    2014-04-08

    The relationship between protein sequence, structure, and dynamics has been elusive. Here, we report a comprehensive analysis using an in-solution experimental approach to study how the conservation of tertiary structure correlates with protein dynamics. Hydrogen exchange measurements of eight processivity clamp proteins from different species revealed that, despite highly similar three-dimensional structures, clamp proteins display a wide range of dynamic behavior. Differences were apparent both for structurally similar domains within proteins and for corresponding domains of different proteins. Several of the clamps contained regions that underwent local unfolding with different half-lives. We also observed a conserved pattern of alternating dynamics of the α helices lining the inner pore of the clamps as well as a correlation between dynamics and the number of salt bridges in these α helices. Our observations reveal that tertiary structure and dynamics are not directly correlated and that primary structure plays an important role in dynamics. Copyright © 2014 Elsevier Ltd. All rights reserved.

  14. Hot-spot analysis for drug discovery targeting protein-protein interactions.

    PubMed

    Rosell, Mireia; Fernández-Recio, Juan

    2018-04-01

    Protein-protein interactions are important for biological processes and pathological situations, and are attractive targets for drug discovery. However, rational drug design targeting protein-protein interactions is still highly challenging. Hot-spot residues are seen as the best option to target such interactions, but their identification requires detailed structural and energetic characterization, which is only available for a tiny fraction of protein interactions. Areas covered: In this review, the authors cover a variety of computational methods that have been reported for the energetic analysis of protein-protein interfaces in search of hot-spots, and the structural modeling of protein-protein complexes by docking. This can help to rationalize the discovery of small-molecule inhibitors of protein-protein interfaces of therapeutic interest. Computational analysis and docking can help to locate the interface, molecular dynamics can be used to find suitable cavities, and hot-spot predictions can focus the search for inhibitors of protein-protein interactions. Expert opinion: A major difficulty for applying rational drug design methods to protein-protein interactions is that in the majority of cases the complex structure is not available. Fortunately, computational docking can complement experimental data. An interesting aspect to explore in the future is the integration of these strategies for targeting PPIs with large-scale mutational analysis.

  15. Molecular and ultrastructural analysis of forisome subunits reveals the principles of forisome assembly

    PubMed Central

    Müller, Boje; Groscurth, Sira; Menzel, Matthias; Rüping, Boris A.; Twyman, Richard M.; Prüfer, Dirk; Noll, Gundula A.

    2014-01-01

    Background and Aims Forisomes are specialized structural phloem proteins that mediate sieve element occlusion after wounding exclusively in papilionoid legumes, but most studies of forisome structure and function have focused on the Old World clade rather than the early lineages. A comprehensive phylogenetic, molecular, structural and functional analysis of forisomes from species covering a broad spectrum of the papilionoid legumes was therefore carried out, including the first analysis of Dipteryx panamensis forisomes, representing the earliest branch of the Papilionoideae lineage. The aim was to study the molecular, structural and functional conservation among forisomes from different tribes and to establish the roles of individual forisome subunits. Methods Sequence analysis and bioinformatics were combined with structural and functional analysis of native forisomes and artificial forisome-like protein bodies, the latter produced by expressing forisome genes from different legumes in a heterologous background. The structure of these bodies was analysed using a combination of confocal laser scanning microscopy (CLSM), scanning electron microscopy (SEM) and transmission electron microscopy (TEM), and the function of individual subunits was examined by combinatorial expression, micromanipulation and light microscopy. Key Results Dipteryx panamensis native forisomes and homomeric protein bodies assembled from the single sieve element occlusion by forisome (SEO-F) subunit identified in this species were structurally and functionally similar to forisomes from the Old World clade. In contrast, homomeric protein bodies assembled from individual SEO-F subunits from Old World species yielded artificial forisomes differing in proportion to their native counterparts, suggesting that multiple SEO-F proteins are required for forisome assembly in these plants. 
    Structural differences between Medicago truncatula native forisomes, homomeric protein bodies and heteromeric bodies containing all possible subunit combinations suggested that combinations of SEO-F proteins may fine-tune the geometric proportions and reactivity of forisomes. Conclusions It is concluded that forisome structure and function have been strongly conserved during evolution and that species-dependent subsets of SEO-F proteins may have evolved to fine-tune the structure of native forisomes. PMID:24694827

  16. A Systematic Analysis of the Structures of Heterologously Expressed Proteins and Those from Their Native Hosts in the RCSB PDB Archive.

    PubMed

    Zhou, Ren-Bin; Lu, Hui-Meng; Liu, Jie; Shi, Jian-Yu; Zhu, Jing; Lu, Qin-Qin; Yin, Da-Chuan

    2016-01-01

    Recombinant expression of proteins has become an indispensable tool in modern research. The large yields of recombinantly expressed proteins accelerate the structural and functional characterization of proteins. Nevertheless, there are reports in the literature that recombinant proteins show some differences in structure and function compared with the native ones. More than 100,000 structures (from both recombinant and native sources) are now publicly available in the Protein Data Bank (PDB) archive, which makes it possible to investigate whether any proteins in the RCSB PDB archive have identical sequences but differ in structure. In this paper, we present the results of a systematic comparative study of the 3D structures of identical naturally purified versus recombinantly expressed proteins. The structural data and sequence information of the proteins were mined from the RCSB PDB archive. The combinatorial extension (CE), FATCAT-flexible and TM-Align methods were employed to align the protein structures. The root-mean-square distance (RMSD), TM-score, P-value, Z-score, secondary structural elements and hydrogen bonds were used to assess the structure similarity. A thorough analysis of the PDB archive generated 517 pairs of native and recombinant proteins that have identical sequences. No pair of proteins had the same sequence and a significantly different structural fold, which supports the hypothesis that proteins expressed in a heterologous host usually fold correctly into their native forms.

  17. A Systematic Analysis of the Structures of Heterologously Expressed Proteins and Those from Their Native Hosts in the RCSB PDB Archive

    PubMed Central

    Zhou, Ren-Bin; Lu, Hui-Meng; Liu, Jie; Shi, Jian-Yu; Zhu, Jing; Lu, Qin-Qin; Yin, Da-Chuan

    2016-01-01

    Recombinant expression of proteins has become an indispensable tool in modern research. The large yields of recombinantly expressed proteins accelerate the structural and functional characterization of proteins. Nevertheless, there are reports in the literature that recombinant proteins show some differences in structure and function compared with the native ones. More than 100,000 structures (from both recombinant and native sources) are now publicly available in the Protein Data Bank (PDB) archive, which makes it possible to investigate whether any proteins in the RCSB PDB archive have identical sequences but differ in structure. In this paper, we present the results of a systematic comparative study of the 3D structures of identical naturally purified versus recombinantly expressed proteins. The structural data and sequence information of the proteins were mined from the RCSB PDB archive. The combinatorial extension (CE), FATCAT-flexible and TM-Align methods were employed to align the protein structures. The root-mean-square distance (RMSD), TM-score, P-value, Z-score, secondary structural elements and hydrogen bonds were used to assess the structure similarity. A thorough analysis of the PDB archive generated 517 pairs of native and recombinant proteins that have identical sequences. No pair of proteins had the same sequence and a significantly different structural fold, which supports the hypothesis that proteins expressed in a heterologous host usually fold correctly into their native forms. PMID:27517583

  18. Online interactive analysis of protein structure ensembles with Bio3D-web.

    PubMed

    Skjærven, Lars; Jariwala, Shashank; Yao, Xin-Qiu; Grant, Barry J

    2016-11-15

    Bio3D-web is an online application for analyzing the sequence, structure and conformational heterogeneity of protein families. Major functionality is provided for identifying protein structure sets for analysis, their alignment and refined structure superposition, sequence and structure conservation analysis, mapping and clustering of conformations and the quantitative comparison of their predicted structural dynamics. Bio3D-web is based on the Bio3D and Shiny R packages. All major browsers are supported and full source code is available under a GPL2 license from http://thegrantlab.org/bio3d-web. Contact: bjgrant@umich.edu or lars.skjarven@uib.no. © The Author 2016. Published by Oxford University Press.

  19. Web-ware bioinformatical analysis and structure modelling of N-terminus of human multisynthetase complex auxiliary component protein p43.

    PubMed

    Deineko, Viktor

    2006-01-01

    Human multisynthetase complex auxiliary component, protein p43 is an endothelial monocyte-activating polypeptide II precursor. In this study, comprehensive sequence analysis of N-terminus has been performed to identify structural domains, motifs, sites of post-translation modification and other functionally important parameters. The spatial structure model of full-chain protein p43 is obtained.

  20. The Prediction of Botulinum Toxin Structure Based on in Silico and in Vitro Analysis

    NASA Astrophysics Data System (ADS)

    Suzuki, Tomonori; Miyazaki, Satoru

    2011-01-01

    Many biological systems are mediated through protein-protein interactions. Knowledge of the protein-protein complex structure is required for understanding function. Determining the structure of large, flexible protein-protein complexes experimentally remains difficult, costly and time-consuming; computational prediction of protein structures by homology modeling and docking studies is therefore a valuable method. In addition, MD simulation is one of the most powerful methods for observing the dynamics of proteins. Here, we predict the protein-protein complex structure of botulinum toxin to analyze its properties. These bioinformatics methods are useful for reporting the relation between the flexibility of the backbone structure and activity.

  1. NMR-based automated protein structure determination.

    PubMed

    Würz, Julia M; Kazemi, Sina; Schmidt, Elena; Bagaria, Anurag; Güntert, Peter

    2017-08-15

    NMR spectra analysis for protein structure determination can now in many cases be performed by automated computational methods. This overview of the computational methods for NMR protein structure analysis presents recent automated methods for signal identification in multidimensional NMR spectra, sequence-specific resonance assignment, collection of conformational restraints, and structure calculation, as implemented in the CYANA software package. These algorithms are sufficiently reliable and integrated into one software package to enable the fully automated structure determination of proteins starting from NMR spectra without manual interventions or corrections at intermediate steps, with an accuracy of 1-2 Å backbone RMSD in comparison with manually solved reference structures. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. An ambiguity principle for assigning protein structural domains.

    PubMed

    Postic, Guillaume; Ghouzam, Yassine; Chebrek, Romain; Gelly, Jean-Christophe

    2017-01-01

    Ambiguity is the quality of being open to several interpretations. For an image, it arises when the contained elements can be delimited in two or more distinct ways, which may cause confusion. We postulate that it also applies to the analysis of protein three-dimensional structure, which consists in dividing the molecule into subunits called domains. Because different definitions of what constitutes a domain can be used to partition a given structure, the same protein may have different but equally valid domain annotations. However, knowledge and experience generally displace our ability to accept more than one way to decompose the structure of an object, in this case a protein. This human bias in structure analysis is particularly harmful because it leads to ignoring potential avenues of research. We present an automated method capable of producing multiple alternative decompositions of protein structure (web server and source code available at www.dsimb.inserm.fr/sword/). Our innovative algorithm assigns structural domains through the hierarchical merging of protein units, which are evolutionarily preserved substructures that describe protein architecture at an intermediate level, between domain and secondary structure. To validate the use of these protein units for decomposing protein structures into domains, we set up an extensive benchmark made of expert annotations of structural domains and including state-of-the-art domain parsing algorithms. The relevance of our "multipartitioning" approach is shown through numerous examples of applications covering protein function, evolution, folding, and structure prediction. Finally, we introduce a measure for the structural ambiguity of protein molecules.

  3. Developing a Multiplexed Quantitative Cross-Linking Mass Spectrometry Platform for Comparative Structural Analysis of Protein Complexes.

    PubMed

    Yu, Clinton; Huszagh, Alexander; Viner, Rosa; Novitsky, Eric J; Rychnovsky, Scott D; Huang, Lan

    2016-10-18

    Cross-linking mass spectrometry (XL-MS) represents a recently popularized hybrid methodology for defining protein-protein interactions (PPIs) and analyzing structures of large protein assemblies. In particular, XL-MS strategies have been demonstrated to be effective in elucidating molecular details of PPIs at the peptide resolution, providing a complementary set of structural data that can be utilized to refine existing complex structures or direct de novo modeling of unknown protein structures. To study structural and interaction dynamics of protein complexes, quantitative cross-linking mass spectrometry (QXL-MS) strategies based on isotope-labeled cross-linkers have been developed. Although successful, these approaches are mostly limited to pairwise comparisons. In order to establish a robust workflow enabling comparative analysis of multiple cross-linked samples simultaneously, we have developed a multiplexed QXL-MS strategy, namely, QMIX (Quantitation of Multiplexed, Isobaric-labeled cross (X)-linked peptides) by integrating MS-cleavable cross-linkers with isobaric labeling reagents. This study has established a new analytical platform for quantitative analysis of cross-linked peptides, which can be directly applied for multiplexed comparisons of the conformational dynamics of protein complexes and PPIs at the proteome scale in future studies.

  4. Reaction trajectory revealed by a joint analysis of protein data bank.

    PubMed

    Ren, Zhong

    2013-01-01

    Structural motions along a reaction pathway hold the secret about how a biological macromolecule functions. If each static structure were considered as a snapshot of the protein molecule in action, a large collection of structures would constitute a multidimensional conformational space of an enormous size. Here I present a joint analysis of hundreds of known structures of human hemoglobin in the Protein Data Bank. By applying singular value decomposition to distance matrices of these structures, I demonstrate that this large collection of structural snapshots, derived under a wide range of experimental conditions, arrange orderly along a reaction pathway. The structural motions along this extensive trajectory, including several helical transformations, arrive at a reverse engineered mechanism of the cooperative machinery (Ren, companion article), and shed light on pathological properties of the abnormal homotetrameric hemoglobins from α-thalassemia. This method of meta-analysis provides a general approach to structural dynamics based on static protein structures in this post genomics era.
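    The idea of applying singular value decomposition to distance matrices of many structures can be sketched as follows. This is only an illustrative outline under stated assumptions, not the procedure of the paper: it assumes NumPy, assumes every structure has the same atoms in the same order, and uses the hypothetical helper names `distance_vector` and `ensemble_svd`. Each structure is reduced to its vector of pairwise interatomic distances (which is independent of orientation), the vectors are stacked and mean-centered, and the SVD then gives per-structure scores along the dominant modes of variation.

```python
import numpy as np


def distance_vector(coords):
    """Flatten the upper triangle of the pairwise distance matrix of one structure."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(coords), k=1)
    return dist[upper]


def ensemble_svd(structures):
    """SVD of the mean-centered (n_structures, n_distances) data matrix.

    Returns (scores, singular_values); row i of scores holds the coordinates
    of structure i along the singular directions, strongest first.
    """
    X = np.stack([distance_vector(s) for s in structures])
    X = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U * S, S
```

    Plotting the first two columns of `scores` for a large ensemble would be one way to check whether the snapshots line up along a path, as the abstract describes for hemoglobin.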

  5. Reaction Trajectory Revealed by a Joint Analysis of Protein Data Bank

    PubMed Central

    Ren, Zhong

    2013-01-01

    Structural motions along a reaction pathway hold the secret about how a biological macromolecule functions. If each static structure were considered as a snapshot of the protein molecule in action, a large collection of structures would constitute a multidimensional conformational space of an enormous size. Here I present a joint analysis of hundreds of known structures of human hemoglobin in the Protein Data Bank. By applying singular value decomposition to distance matrices of these structures, I demonstrate that this large collection of structural snapshots, derived under a wide range of experimental conditions, arrange orderly along a reaction pathway. The structural motions along this extensive trajectory, including several helical transformations, arrive at a reverse engineered mechanism of the cooperative machinery (Ren, companion article), and shed light on pathological properties of the abnormal homotetrameric hemoglobins from α-thalassemia. This method of meta-analysis provides a general approach to structural dynamics based on static protein structures in this post genomics era. PMID:24244274

  6. DNA mimic proteins: functions, structures, and bioinformatic analysis.

    PubMed

    Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J

    2014-05-13

    DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.

  7. Acyl carrier protein structural classification and normal mode analysis

    PubMed Central

    Cantu, David C; Forrester, Michael J; Charov, Katherine; Reilly, Peter J

    2012-01-01

    All acyl carrier protein primary and tertiary structures were gathered into the ThYme database. They are classified into 16 families by amino acid sequence similarity, with members of the different families having sequences with statistically highly significant differences. These classifications are supported by tertiary structure superposition analysis. Tertiary structures from a number of families are very similar, suggesting that these families may come from a single distant ancestor. Normal vibrational mode analysis was conducted on experimentally determined freestanding structures, showing greater fluctuations at chain termini and loops than in most helices. Their modes overlap more so within families than between different families. The tertiary structures of three acyl carrier protein families that lacked any known structures were predicted as well. PMID:22374859

  8. Resource for structure related information on transmembrane proteins

    NASA Astrophysics Data System (ADS)

    Tusnády, Gábor E.; Simon, István

    Transmembrane proteins are involved in a wide variety of vital biological processes including transport of water-soluble molecules, flow of information and energy production. Despite significant efforts to determine the structures of these proteins, only a few thousand solved structures are known so far. Here, we review the various resources for structure-related information on these types of proteins ranging from the 3D structure to the topology and from the up-to-date databases to the various Internet sites and servers dealing with structure prediction and structure analysis. Abbreviations: 3D, three dimensional; PDB, Protein Data Bank; TMP, transmembrane protein.

  9. Extraction, integration and analysis of alternative splicing and protein structure distributed information

    PubMed Central

    D'Antonio, Matteo; Masseroli, Marco

    2009-01-01

    Background: Alternative splicing has been demonstrated to affect most human genes; different isoforms from the same gene encode proteins that differ in a limited number of residues, thus yielding similar structures. This suggests possible correlations between alternative splicing and protein structure. In order to support the investigation of such relationships, we have developed the Alternative Splicing and Protein Structure Scrutinizer (PASS), a Web application to automatically extract, integrate and analyze human alternative splicing and protein structure data sparsely available in the Alternative Splicing Database, Ensembl databank and Protein Data Bank. Primary data from these databases have been integrated and analyzed using the Protein Identifier Cross-Reference, BLAST, CLUSTALW and FeatureMap3D software tools. Results: A database has been developed to store the considered primary data and the results from their analysis; a system of Perl scripts has been implemented to automatically create and update the database and analyze the integrated data; a Web interface has been implemented to make the analyses easily accessible; a database has been created to manage user accesses to the PASS Web application and store user's data and searches. Conclusion: PASS automatically integrates data from the Alternative Splicing Database with protein structure data from the Protein Data Bank. Additionally, it comprehensively analyzes the integrated data with publicly available well-known bioinformatics tools in order to generate structural information of isoform pairs. Further analysis of such valuable information might reveal interesting relationships between alternative splicing and protein structure differences, which may be significantly associated with different functions. PMID:19828075

  10. NMR studies of protein-nucleic acid interactions.

    PubMed

    Varani, Gabriele; Chen, Yu; Leeper, Thomas C

    2004-01-01

    Protein-DNA and protein-RNA complexes play key functional roles in every living organism. Therefore, the elucidation of their structure and dynamics is an important goal of structural and molecular biology. Nuclear magnetic resonance (NMR) studies of protein and nucleic acid complexes have common features with studies of protein-protein complexes: the interaction surfaces between the molecules must be carefully delineated, the relative orientation of the two species needs to be accurately and precisely determined, and close intermolecular contacts defined by nuclear Overhauser effects (NOEs) must be obtained. However, differences in NMR properties (e.g., chemical shifts) and biosynthetic pathways for sample productions generate important differences. Chemical shift differences between the protein and nucleic acid resonances can aid the NMR structure determination process; however, the relatively limited dispersion of the RNA ribose resonances makes the process of assigning intermolecular NOEs more difficult. The analysis of the resulting structures requires computational tools unique to nucleic acid interactions. This chapter summarizes the most important elements of the structure determination by NMR of protein-nucleic acid complexes and their analysis. The main emphasis is on recent developments (e.g., residual dipolar couplings and new Web-based analysis tools) that have facilitated NMR studies of these complexes and expanded the type of biological problems to which NMR techniques of structural elucidation can now be applied.

  11. In-situ and real-time growth observation of high-quality protein crystals under quasi-microgravity on earth.

    PubMed

    Nakamura, Akira; Ohtsuka, Jun; Kashiwagi, Tatsuki; Numoto, Nobutaka; Hirota, Noriyuki; Ode, Takahiro; Okada, Hidehiko; Nagata, Koji; Kiyohara, Motosuke; Suzuki, Ei-Ichiro; Kita, Akiko; Wada, Hitoshi; Tanokura, Masaru

    2016-02-26

    Precise protein structure determination provides significant information for life science research, although high-quality crystals are not easily obtained. We developed a system for producing high-quality protein crystals with high throughput. Using this system, gravity-controlled crystallization is made possible by a magnetic microgravity environment. In addition, in-situ and real-time observation and time-lapse imaging of crystal growth are feasible for over 200 solution samples independently. In this paper, we also report results of crystallization experiments for two protein samples. Crystals grown in the system exhibited magnetic orientation and showed higher and more homogeneous quality compared with the control crystals. The structural analysis reveals that making use of the magnetic microgravity during the crystallization process helps us to build a well-refined protein structure model, which has no significant structural differences from a control structure. Therefore, the system contributes to improvement in efficiency of structural analysis for "difficult" proteins, such as membrane proteins and supermolecular complexes.

  12. Isolation and in silico analysis of a novel H+-pyrophosphatase gene orthologue from the halophytic grass Leptochloa fusca

    NASA Astrophysics Data System (ADS)

    Rauf, Muhammad; Saeed, Nasir A.; Habib, Imran; Ahmed, Moddassir; Shahzad, Khurram; Mansoor, Shahid; Ali, Rashid

    2017-02-01

    Structure prediction can provide information about the function and active sites of a protein, which helps in designing new functional proteins. H+-pyrophosphatase is a transmembrane protein involved in establishing the proton motive force for active transport of Na+ across the membrane by Na+/H+ antiporters. A full-length novel H+-pyrophosphatase gene was isolated from the halophytic grass Leptochloa fusca using RT-PCR and the RACE method. The full-length LfVP1 gene sequence of 2292 nucleotides encodes a protein of 764 amino acids. DNA and protein sequences were characterized using bioinformatics tools. Various important potential sites were predicted by the PROSITE webserver. Primary structural analysis showed LfVP1 to be a stable protein, and the grand average of hydropathy (GRAVY) indicated that the LfVP1 protein has good hydrosolubility. Secondary structure analysis showed that the LfVP1 protein sequence contains a significant proportion of alpha helix and random coil. Protein membrane topology suggested the presence of 14 transmembrane domains and of a catalytic domain in TM3. The three-dimensional structure predicted from the LfVP1 protein sequence also indicated the presence of 14 transmembrane domains, and a hydrophobicity surface model showed amino acid hydrophobicity. The Ramachandran plot showed that 98% of amino acid residues were predicted in the favored region.
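    The GRAVY index mentioned in this abstract is the mean Kyte-Doolittle hydropathy value over all residues of a sequence. A minimal sketch follows, assuming a sequence of the 20 standard one-letter amino acid codes; the function name `gravy` is hypothetical and the abstract does not say which tool the authors used. Negative values indicate a hydrophilic protein on average, positive values a hydrophobic one.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: sum of residue values divided by length."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    try:
        total = sum(KYTE_DOOLITTLE[aa] for aa in seq)
    except KeyError as exc:
        raise ValueError(f"non-standard residue: {exc.args[0]!r}") from exc
    return total / len(seq)
```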

  13. Structural alphabets derived from attractors in conformational space

    PubMed Central

    2010-01-01

    Background: The hierarchical and partially redundant nature of protein structures justifies the definition of frequently occurring conformations of short fragments as 'states'. Collections of selected representatives for these states define Structural Alphabets, describing the most typical local conformations within protein structures. These alphabets form a bridge between the string-oriented methods of sequence analysis and the coordinate-oriented methods of protein structure analysis. Results: A Structural Alphabet has been derived by clustering all four-residue fragments of a high-resolution subset of the protein data bank and extracting the high-density states as representative conformational states. Each fragment is uniquely defined by a set of three independent angles corresponding to its degrees of freedom, capturing in simple and intuitive terms the properties of the conformational space. The fragments of the Structural Alphabet are equivalent to the conformational attractors and therefore yield a most informative encoding of proteins. Proteins can be reconstructed within the experimental uncertainty in structure determination and ensembles of structures can be encoded with accuracy and robustness. Conclusions: The density-based Structural Alphabet provides a novel tool to describe local conformations and it is specifically suitable for application in studies of protein dynamics. PMID:20170534
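    To make the "three independent angles" description concrete, here is a small sketch that reduces a four-residue fragment to three angles. The abstract does not name the angles, so this sketch assumes a common choice for a Cα trace: the two pseudo-bond angles and the one pseudo-dihedral defined by four consecutive Cα positions. All function names are hypothetical, and the clustering step that would turn these angle triples into alphabet letters is not shown.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def bond_angle(a, b, c):
    """Angle at b (degrees) formed by points a-b-c."""
    u, v = _sub(a, b), _sub(c, b)
    cos_t = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def dihedral(a, b, c, d):
    """Signed torsion angle (degrees) about the b-c axis."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def fragment_angles(ca):
    """Three angles describing four consecutive C-alpha positions."""
    c0, c1, c2, c3 = ca
    return (bond_angle(c0, c1, c2), bond_angle(c1, c2, c3),
            dihedral(c0, c1, c2, c3))


def encode_chain(ca_trace):
    """Slide a four-residue window along a C-alpha trace."""
    return [fragment_angles(ca_trace[i:i + 4])
            for i in range(len(ca_trace) - 3)]
```

    A chain of N residues yields N - 3 overlapping fragments, each of which could then be assigned to its nearest representative state.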

  14. Protein 3D Structure and Electron Microscopy Map Retrieval Using 3D-SURFER2.0 and EM-SURFER.

    PubMed

    Han, Xusi; Wei, Qing; Kihara, Daisuke

    2017-12-08

    With the rapid growth in the number of solved protein structures stored in the Protein Data Bank (PDB) and the Electron Microscopy Data Bank (EMDB), it is essential to develop tools to perform real-time structure similarity searches against the entire structure database. Since conventional structure alignment methods need to sample different orientations of proteins in the three-dimensional space, they are time consuming and unsuitable for rapid, real-time database searches. To this end, we have developed 3D-SURFER and EM-SURFER, which utilize 3D Zernike descriptors (3DZD) to conduct high-throughput protein structure comparison, visualization, and analysis. Taking an atomic structure or an electron microscopy map of a protein or a protein complex as input, the 3DZD of a query protein is computed and compared with the 3DZD of all other proteins in PDB or EMDB. In addition, local geometrical characteristics of a query protein can be analyzed using VisGrid and LIGSITE CSC in 3D-SURFER. This article describes how to use 3D-SURFER and EM-SURFER to carry out protein surface shape similarity searches, local geometric feature analysis, and interpretation of the search results. © 2017 by John Wiley & Sons, Inc.

  15. G2S: a web-service for annotating genomic variants on 3D protein structures.

    PubMed

    Wang, Juexin; Sheridan, Robert; Sumer, S Onur; Schultz, Nikolaus; Xu, Dong; Gao, Jianjiong

    2018-06-01

    Accurately mapping and annotating genomic locations on 3D protein structures is a key step in structure-based analysis of genomic variants detected by recent large-scale sequencing efforts. There are several mapping resources currently available, but none of them provides a web API (Application Programming Interface) that supports programmatic access. We present G2S, a real-time web API that provides automated mapping of genomic variants on 3D protein structures. G2S can align genomic locations of variants, protein locations, or protein sequences to protein structures and retrieve the mapped residues from structures. The G2S API uses a REST-inspired design and can be used by various clients such as web browsers, command terminals, programming languages and other bioinformatics tools for bringing 3D structures into genomic variant analysis. The webserver and source code are freely available at https://g2s.genomenexus.org. Contact: g2s@genomenexus.org. Supplementary data are available at Bioinformatics online.

  16. Structure Prediction and Analysis of Neuraminidase Sequence Variants

    ERIC Educational Resources Information Center

    Thayer, Kelly M.

    2016-01-01

    Analyzing protein structure has become an integral aspect of understanding systems of biochemical import. The laboratory experiment endeavors to introduce protein folding to ascertain structures of proteins for which the structure is unavailable, as well as to critically evaluate the quality of the prediction obtained. The model system used is the…

  17. Phylogenetic analysis and protein structure modelling identifies distinct Ca(2+)/Cation antiporters and conservation of gene family structure within Arabidopsis and rice species.

    PubMed

    Pittman, Jon K; Hirschi, Kendal D

    2016-12-01

    The Ca(2+)/Cation Antiporter (CaCA) superfamily is an ancient and widespread family of ion-coupled cation transporters found in nearly all kingdoms of life. In animals, K(+)-dependent and K(+)-independent Na(+)/Ca(2+) exchangers (NCKX and NCX) are important CaCA members. Recently it was proposed that all rice and Arabidopsis CaCA proteins should be classified as NCX proteins. Here we performed phylogenetic analysis of CaCA genes and protein structure homology modelling to further characterise members of this transporter superfamily. Phylogenetic analysis of rice and Arabidopsis CaCAs in comparison with selected CaCA members from non-plant species demonstrated that these genes form clearly distinct families, with the H(+)/Cation exchanger (CAX) and cation/Ca(2+) exchanger (CCX) families dominant in higher plants but the NCKX and NCX families absent. NCX-related Mg(2+)/H(+) exchanger (MHX) and CAX-related Na(+)/Ca(2+) exchanger-like (NCL) proteins are instead present. Analysis of genomes of ten closely-related rice species and four Arabidopsis-related species found that CaCA gene family structures are highly conserved within related plants, apart from minor variation. Protein structures were modelled for OsCAX1a and OsMHX1. Despite exhibiting broad structural conservation, there are clear structural differences observed between the different CaCA types. Members of the CaCA superfamily form clearly distinct families with different phylogenetic, structural and functional characteristics, and therefore should not be simply classified as NCX proteins, which should remain as a separate gene family.

  18. Crysalis: an integrated server for computational analysis and design of protein crystallization.

    PubMed

    Wang, Huilin; Feng, Liubin; Zhang, Ziding; Webb, Geoffrey I; Lin, Donghai; Song, Jiangning

    2016-02-24

    The failure of multi-step experimental procedures to yield diffraction-quality crystals is a major bottleneck in protein structure determination. Accordingly, several bioinformatics methods have been successfully developed and employed to select crystallizable proteins. Unfortunately, the majority of existing in silico methods only allow the prediction of crystallization propensity, seldom enabling computational design of protein mutants that can be targeted for enhancing protein crystallizability. Here, we present Crysalis, an integrated crystallization analysis tool that builds on support-vector regression (SVR) models to facilitate computational protein crystallization prediction, analysis, and design. More specifically, the functionality of this new tool includes: (1) rapid selection of target crystallizable proteins at the proteome level, (2) identification of site non-optimality for protein crystallization and systematic analysis of all potential single-point mutations that might enhance protein crystallization propensity, and (3) annotation of target protein based on predicted structural properties. We applied the design mode of Crysalis to identify site non-optimality for protein crystallization on a proteome-scale, focusing on proteins currently classified as non-crystallizable. Our results revealed that site non-optimality is based on biases related to residues, predicted structures, physicochemical properties, and sequence loci, which provides in-depth understanding of the features influencing protein crystallization. Crysalis is freely available at http://nmrcen.xmu.edu.cn/crysalis/.

  19. Crysalis: an integrated server for computational analysis and design of protein crystallization

    PubMed Central

    Wang, Huilin; Feng, Liubin; Zhang, Ziding; Webb, Geoffrey I.; Lin, Donghai; Song, Jiangning

    2016-01-01

    The failure of multi-step experimental procedures to yield diffraction-quality crystals is a major bottleneck in protein structure determination. Accordingly, several bioinformatics methods have been successfully developed and employed to select crystallizable proteins. Unfortunately, the majority of existing in silico methods only allow the prediction of crystallization propensity, seldom enabling computational design of protein mutants that can be targeted for enhancing protein crystallizability. Here, we present Crysalis, an integrated crystallization analysis tool that builds on support-vector regression (SVR) models to facilitate computational protein crystallization prediction, analysis, and design. More specifically, the functionality of this new tool includes: (1) rapid selection of target crystallizable proteins at the proteome level, (2) identification of site non-optimality for protein crystallization and systematic analysis of all potential single-point mutations that might enhance protein crystallization propensity, and (3) annotation of target protein based on predicted structural properties. We applied the design mode of Crysalis to identify site non-optimality for protein crystallization on a proteome-scale, focusing on proteins currently classified as non-crystallizable. Our results revealed that site non-optimality is based on biases related to residues, predicted structures, physicochemical properties, and sequence loci, which provides in-depth understanding of the features influencing protein crystallization. Crysalis is freely available at http://nmrcen.xmu.edu.cn/crysalis/. PMID:26906024

  20. An ambiguity principle for assigning protein structural domains

    PubMed Central

    Postic, Guillaume; Ghouzam, Yassine; Chebrek, Romain; Gelly, Jean-Christophe

    2017-01-01

    Ambiguity is the quality of being open to several interpretations. For an image, it arises when the contained elements can be delimited in two or more distinct ways, which may cause confusion. We postulate that it also applies to the analysis of protein three-dimensional structure, which consists in dividing the molecule into subunits called domains. Because different definitions of what constitutes a domain can be used to partition a given structure, the same protein may have different but equally valid domain annotations. However, knowledge and experience generally displace our ability to accept more than one way to decompose the structure of an object—in this case, a protein. This human bias in structure analysis is particularly harmful because it leads to ignoring potential avenues of research. We present an automated method capable of producing multiple alternative decompositions of protein structure (web server and source code available at www.dsimb.inserm.fr/sword/). Our innovative algorithm assigns structural domains through the hierarchical merging of protein units, which are evolutionarily preserved substructures that describe protein architecture at an intermediate level, between domain and secondary structure. To validate the use of these protein units for decomposing protein structures into domains, we set up an extensive benchmark made of expert annotations of structural domains and including state-of-the-art domain parsing algorithms. The relevance of our “multipartitioning” approach is shown through numerous examples of applications covering protein function, evolution, folding, and structure prediction. Finally, we introduce a measure for the structural ambiguity of protein molecules. PMID:28097215

  1. Structural genomics analysis of uncharacterized protein families overrepresented in human gut bacteria identifies a novel glycoside hydrolase

    PubMed Central

    2014-01-01

    Background: Bacteroides spp. form a significant part of our gut microbiome and are well known for optimized metabolism of diverse polysaccharides. Initial analysis of the archetypal Bacteroides thetaiotaomicron genome identified 172 glycosyl hydrolases and a large number of uncharacterized proteins associated with polysaccharide metabolism. Results: BT_1012 from Bacteroides thetaiotaomicron VPI-5482 is a protein of unknown function and a member of a large protein family consisting entirely of uncharacterized proteins. Initial sequence analysis predicted that this protein has two domains, one on the N- and one on the C-terminal. A PSI-BLAST search found over 150 full length and over 90 half size homologs consisting only of the N-terminal domain. The experimentally determined three-dimensional structure of the BT_1012 protein confirms its two-domain architecture and structural analysis of both domains suggests their specific functions. The N-terminal domain is a putative catalytic domain with significant similarity to known glycoside hydrolases; the C-terminal domain has a beta-sandwich fold typically found in C-terminal domains of other glycosyl hydrolases, although these domains are typically involved in substrate binding. We describe the structure of the BT_1012 protein and discuss its sequence-structure relationship and their possible functional implications. Conclusions: Structural and sequence analyses of the BT_1012 protein identify it as a glycosyl hydrolase, expanding an already impressive catalog of enzymes involved in polysaccharide metabolism in Bacteroides spp. Based on this we have renamed the Pfam families representing the two domains found in the BT_1012 protein, PF13204 and PF12904, as putative glycoside hydrolase and glycoside hydrolase-associated C-terminal domain respectively. PMID:24742328

  2. Restricted mobility of side chains on concave surfaces of solenoid proteins may impart heightened potential for intermolecular interactions.

    PubMed

    Ramya, L; Gautham, N; Chaloin, Laurent; Kajava, Andrey V

    2015-09-01

    Significant progress has been made in the determination of the protein structures with their number today passing over a hundred thousand structures. The next challenge is the understanding and prediction of protein-protein and protein-ligand interactions. In this work we address this problem by analyzing curved solenoid proteins. Many of these proteins are considered as "hub molecules" for their high potential to interact with many different molecules and to be a scaffold for multisubunit protein machineries. Our analysis of these structures through molecular dynamics simulations reveals that the mobility of the side-chains on the concave surfaces of the solenoids is lower than on the convex ones. This result provides an explanation to the observed preferential binding of the ligands, including small and flexible ligands, to the concave surface of the curved solenoid proteins. The relationship between the landscapes and dynamic properties of the protein surfaces can be further generalized to the other types of protein structures and eventually used in the computer algorithms, allowing prediction of protein-ligand interactions by analysis of protein surfaces. © 2015 Wiley Periodicals, Inc.

  3. Computational Analysis Reveals the Association of Threonine 118 Methionine Mutation in PMP22 Resulting in CMT-1A

    PubMed Central

    Swetha, Rayapadi G.

    2014-01-01

    The T118M mutation in PMP22 gene is associated with Charcot Marie Tooth, type 1A (CMT1A). CMT1A is a form of Charcot-Marie-Tooth disease, the most common inherited disorder of the peripheral nervous system. Mutations in CMT related disorder are seen to increase the stability of the protein resulting in the diseased state. We performed SNP analysis for all the nsSNPs of PMP22 protein and carried out molecular dynamics simulation for T118M mutation to compare the stability difference between the wild type protein structure and the mutant protein structure. The mutation T118M resulted in the overall increase in the stability of the mutant protein. The superimposed structure shows marked structural variation between the wild type and the mutant protein structures. PMID:25400662

  4. Design and structure of an equilibrium protein folding intermediate: a hint into dynamical regions of proteins.

    PubMed

    Ayuso-Tejedor, Sara; Angarica, Vladimir Espinosa; Bueno, Marta; Campos, Luis A; Abián, Olga; Bernadó, Pau; Sancho, Javier; Jiménez, M Angeles

    2010-07-23

    Partly unfolded protein conformations close to the native state may play important roles in protein function and in protein misfolding. Structural analyses of such conformations, which are essential for a full physicochemical understanding of them, are complicated by their characteristically low populations at equilibrium. We stabilize here with a single mutation the equilibrium intermediate of apoflavodoxin thermal unfolding and determine its solution structure by NMR. It consists of a large native region identical with that observed in the X-ray structure of the wild-type protein plus an unfolded region. Small-angle X-ray scattering analysis indicates that the calculated ensemble of structures is consistent with the actual degree of expansion of the intermediate. The unfolded region encompasses discontinuous sequence segments that cluster in the 3D structure of the native protein forming the FMN cofactor binding loops and the binding site of a variety of partner proteins. Analysis of the apoflavodoxin inner interfaces reveals that those becoming destabilized in the intermediate are more polar than other inner interfaces of the protein. Natively folded proteins contain hydrophobic cores formed by the packing of hydrophobic surfaces, while natively unfolded proteins are rich in polar residues. The structure of the apoflavodoxin thermal intermediate suggests that the regions of natively folded proteins that are easily responsive to thermal activation may contain cores of intermediate hydrophobicity. Copyright (c) 2010 Elsevier Ltd. All rights reserved.

  5. Secure web book to store structural genomics research data.

    PubMed

    Manjasetty, Babu A; Höppner, Klaus; Mueller, Uwe; Heinemann, Udo

    2003-01-01

    Recently established collaborative structural genomics programs aim at significantly accelerating the crystal structure analysis of proteins. These large-scale projects require efficient data management systems to ensure seamless collaboration between different groups of scientists working towards the same goal. Within the Berlin-based Protein Structure Factory, the synchrotron X-ray data collection and the subsequent crystal structure analysis tasks are located at BESSY, a third-generation synchrotron source. To organize file-based communication and data transfer at the BESSY site of the Protein Structure Factory, we have developed the web-based BCLIMS, the BESSY Crystallography Laboratory Information Management System. BCLIMS is a relational data management system which is powered by MySQL as the database engine and Apache HTTP as the web server. The database interface routines are written in the Python programming language. The software is freely available to academic users. Here we describe the storage, retrieval and manipulation of laboratory information, mainly pertaining to the synchrotron X-ray diffraction experiments and the subsequent protein structure analysis, using BCLIMS.

  6. BAYESIAN PROTEIN STRUCTURE ALIGNMENT.

    PubMed

    Rodriguez, Abel; Schmidler, Scott C

    The analysis of the three-dimensional structure of proteins is an important topic in molecular biochemistry. Structure plays a critical role in defining the function of proteins and is more strongly conserved than amino acid sequence over evolutionary timescales. A key challenge is the identification and evaluation of structural similarity between proteins; such analysis can aid in understanding the role of newly discovered proteins and help elucidate evolutionary relationships between organisms. Computational biologists have developed many clever algorithmic techniques for comparing protein structures, however, all are based on heuristic optimization criteria, making statistical interpretation somewhat difficult. Here we present a fully probabilistic framework for pairwise structural alignment of proteins. Our approach has several advantages, including the ability to capture alignment uncertainty and to estimate key "gap" parameters which critically affect the quality of the alignment. We show that several existing alignment methods arise as maximum a posteriori estimates under specific choices of prior distributions and error models. Our probabilistic framework is also easily extended to incorporate additional information, which we demonstrate by including primary sequence information to generate simultaneous sequence-structure alignments that can resolve ambiguities obtained using structure alone. This combined model also provides a natural approach for the difficult task of estimating evolutionary distance based on structural alignments. The model is illustrated by comparison with well-established methods on several challenging protein alignment examples.

  7. Sequence, structure and function relationships in flaviviruses as assessed by evolutive aspects of its conserved non-structural protein domains.

    PubMed

    da Fonseca, Néli José; Lima Afonso, Marcelo Querino; Pedersolli, Natan Gonçalves; de Oliveira, Lucas Carrijo; Andrade, Dhiego Souto; Bleicher, Lucas

    2017-10-28

    Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and zika fever. Their genomes encode a polyprotein which, after cleavage, results in three structural and seven non-structural proteins. Homologous proteins can be studied by conservation and coevolution analysis as detected in multiple sequence alignments, usually reporting positions which are strictly necessary for the structure and/or function of all members in a protein family or which are involved in a specific sub-class feature requiring the coevolution of residue sets. This study provides a complete conservation and coevolution analysis of all flavivirus non-structural proteins, with results mapped on all well-annotated available sequences. A literature review on the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. Also, we provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as supplementary material, so that particularities in different viruses can be easily analyzed. Copyright © 2017 Elsevier Inc. All rights reserved.
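
    As a rough illustration of what a per-position conservation measure over a multiple sequence alignment can look like, the sketch below scores each column by Shannon entropy (0 bits means fully conserved). The entropy measure, the function names, and the toy alignment are assumptions made for illustration; the abstract does not say which conservation or coevolution statistics the authors used.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conservation_profile(alignment):
    """Per-column entropy for a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*alignment)]


# Hypothetical toy alignment, not taken from any flavivirus protein.
toy_alignment = ["MKTA", "MKSA", "MRTA", "MK-A"]
profile = conservation_profile(toy_alignment)
```

    Columns with entropy near zero are candidates for strictly conserved positions; coevolution analysis would additionally need a pairwise statistic between columns, which this sketch does not attempt.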

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Osipiuk, J.; Gornicki, P.; Maj, L.

    The structure of the YlxR protein of unknown function from Streptococcus pneumonia was determined to 1.35 Angstroms. YlxR is expressed from the nusA/infB operon in bacteria and belongs to a small protein family (COG2740) that shares a conserved sequence motif GRGA(Y/W). The family shows no significant amino-acid sequence similarity with other proteins. Three-wavelength diffraction MAD data were collected to 1.7 Angstroms from orthorhombic crystals using synchrotron radiation and the structure was determined using a semi-automated approach. The YlxR structure resembles a two-layer α/β sandwich with the overall shape of a cylinder and shows no structural homology to proteins of known structure. Structural analysis revealed that the YlxR structure represents a new protein fold that belongs to the α-β plait superfamily. The distribution of the electrostatic surface potential shows a large positively charged patch on one side of the protein, a feature often found in nucleic acid-binding proteins. Three sulfate ions bind to this positively charged surface. Analysis of potential binding sites uncovered several substantial clefts, with the largest spanning 3/4 of the protein. A similar distribution of binding sites and a large sharply bent cleft are observed in RNA-binding proteins that are unrelated in sequence and structure. It is proposed that YlxR is an RNA-binding protein.

  9. Streptococcus pneumonia YlxR at 1.35 A shows a putative new fold.

    PubMed

    Osipiuk, J; Górnicki, P; Maj, L; Dementieva, I; Laskowski, R; Joachimiak, A

    2001-11-01

    The structure of the YlxR protein of unknown function from Streptococcus pneumonia was determined to 1.35 A. YlxR is expressed from the nusA/infB operon in bacteria and belongs to a small protein family (COG2740) that shares a conserved sequence motif GRGA(Y/W). The family shows no significant amino-acid sequence similarity with other proteins. Three-wavelength diffraction MAD data were collected to 1.7 A from orthorhombic crystals using synchrotron radiation and the structure was determined using a semi-automated approach. The YlxR structure resembles a two-layer alpha/beta sandwich with the overall shape of a cylinder and shows no structural homology to proteins of known structure. Structural analysis revealed that the YlxR structure represents a new protein fold that belongs to the alpha-beta plait superfamily. The distribution of the electrostatic surface potential shows a large positively charged patch on one side of the protein, a feature often found in nucleic acid-binding proteins. Three sulfate ions bind to this positively charged surface. Analysis of potential binding sites uncovered several substantial clefts, with the largest spanning 3/4 of the protein. A similar distribution of binding sites and a large sharply bent cleft are observed in RNA-binding proteins that are unrelated in sequence and structure. It is proposed that YlxR is an RNA-binding protein.

  10. Functional Evolution of PLP-dependent Enzymes based on Active-Site Structural Similarities

    PubMed Central

    Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert

    2014-01-01

    Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5’-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the Comparison of Protein Active Site Structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. PMID:24920327

  11. Functional evolution of PLP-dependent enzymes based on active-site structural similarities.

    PubMed

    Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert

    2014-10-01

    Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5'-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the comparison of protein active site structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional-fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. © 2014 Wiley Periodicals, Inc.

  12. High-Throughput Characterization of Intrinsic Disorder in Proteins from the Protein Structure Initiative

    PubMed Central

    Johnson, Derrick E.; Xue, Bin; Sickmeier, Megan D.; Meng, Jingwei; Cortese, Marc S.; Oldfield, Christopher J.; Le Gall, Tanguy; Dunker, A. Keith; Uversky, Vladimir N.

    2012-01-01

    The identification of intrinsically disordered proteins (IDPs) among the targets that fail to form satisfactory crystal structures in the Protein Structure Initiative represents a key to reducing the costs and time for determining three-dimensional structures of proteins. To help in this endeavor, several Protein Structure Initiative Centers were asked to send samples of both crystallizable proteins and proteins that failed to crystallize. The abundance of intrinsic disorder in these proteins was evaluated via computational analysis using Predictors of Natural Disordered Regions (PONDR®) and the potential cleavage sites and corresponding fragments were determined. Then, the target proteins were analyzed for intrinsic disorder by their resistance to limited proteolysis. The rates of tryptic digestion of sample target proteins were compared to those of lysozyme/myoglobin, apo-myoglobin and α-casein as standards of ordered, partially disordered and completely disordered proteins, respectively. At the next stage, the protein samples were subjected to both far-UV and near-UV circular dichroism (CD) analysis. For most of the samples, a good agreement between CD data, predictions of disorder and the rates of limited tryptic digestion was established. Further experimentation is being performed on a smaller subset of these samples in order to obtain more detailed information on the ordered/disordered nature of the proteins. PMID:22651963
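
    The step of determining potential cleavage sites and fragments can be sketched with the commonly used trypsin rule: cut after lysine (K) or arginine (R) unless the next residue is proline. This is a minimal illustration under that assumption; the function names and the toy sequence are hypothetical, and the abstract does not state which rule or software the authors applied.

```python
def tryptic_sites(sequence):
    """Return 1-based residue positions after which trypsin is expected to cut.

    Assumed rule: cleave C-terminal to K or R, except when followed by P.
    The last residue is skipped because there is no bond after it.
    """
    sites = []
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    return sites


def digest(sequence):
    """Split a sequence into the fragments implied by the predicted sites."""
    fragments, start = [], 0
    for site in tryptic_sites(sequence):
        fragments.append(sequence[start:site])
        start = site
    fragments.append(sequence[start:])
    return fragments


# Hypothetical toy sequence, not one of the study's target proteins.
toy_sequence = "MAKRPLLKGGR"
sites = tryptic_sites(toy_sequence)
fragments = digest(toy_sequence)
```

    Predicted sites only list where cleavage is chemically possible. In limited proteolysis, the observed rate also depends on how accessible and flexible each site is, which is what makes the assay informative about disorder.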

  13. Crystal structure of the YDR533c S. cerevisiae protein, a class II member of the Hsp31 family.

    PubMed

    Graille, Marc; Quevillon-Cheruel, Sophie; Leulliot, Nicolas; Zhou, Cong-Zhao; Li de la Sierra Gallay, Ines; Jacquamet, Lilian; Ferrer, Jean-Luc; Liger, Dominique; Poupon, Anne; Janin, Joel; van Tilbeurgh, Herman

    2004-05-01

    The ORF YDR533c from Saccharomyces cerevisiae codes for a 25.5 kDa protein of unknown biochemical function. Transcriptome analysis of yeast has shown that this gene is activated in response to various stress conditions together with proteins belonging to the heat shock family. In order to clarify its biochemical function, we determined the crystal structure of YDR533c to 1.85 A resolution by the single anomalous diffraction method. The protein possesses an alpha/beta hydrolase fold and a putative Cys-His-Glu catalytic triad common to a large enzyme family containing proteases, amidotransferases, lipases, and esterases. The protein has strong structural resemblance with the E. coli Hsp31 protein and the intracellular protease I from Pyrococcus horikoshii, which are considered class I and class III members of the Hsp31 family, respectively. Detailed structural analysis strongly suggests that the YDR533c protein crystal structure is the first one of a class II member of the Hsp31 family.

  14. European Science Notes. Volume 40, Number 3.

    DTIC Science & Technology

    1986-03-01

    to protein structures analysis and the UK Institute in Protein Engineering are discussed. Material Sciences École des Mines de Paris--France’s Premier...ellipsometry and for network analysis tation a.v.); (4) development of a meth- based on a microcomputer. A current R&D od for the rapid production of monoclon...Engineering, Cornell University, Ithaca, New York. Structure Analysis in Protein Engineering, K.M. Ulmer, University of Maryland, Adelphi, Maryland

  15. Zebra: a web server for bioinformatic analysis of diverse protein families.

    PubMed

    Suplatov, Dmitry; Kirilin, Evgeny; Takhaveev, Vakil; Svedas, Vytas

    2014-01-01

    During evolution of proteins from a common ancestor, one functional property can be preserved while others can vary leading to functional diversity. A systematic study of the corresponding adaptive mutations provides a key to one of the most challenging problems of modern structural biology - understanding the impact of amino acid substitutions on protein function. The subfamily-specific positions (SSPs) are conserved within functional subfamilies but are different between them and, therefore, seem to be responsible for functional diversity in protein superfamilies. Consequently, a corresponding method to perform the bioinformatic analysis of sequence and structural data has to be implemented in the common laboratory practice to study the structure-function relationship in proteins and develop novel protein engineering strategies. This paper describes Zebra web server - a powerful remote platform that implements a novel bioinformatic analysis algorithm to study diverse protein families. It is the first application that provides specificity determinants at different levels of functional classification, therefore addressing complex functional diversity of large superfamilies. Statistical analysis is implemented to automatically select a set of highly significant SSPs to be used as hotspots for directed evolution or rational design experiments and analyzed studying the structure-function relationship. Zebra results are provided in two ways - (1) as a single all-in-one parsable text file and (2) as PyMol sessions with structural representation of SSPs. Zebra web server is available at http://biokinet.belozersky.msu.ru/zebra .
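
    The definition of subfamily-specific positions given above (conserved within functional subfamilies but different between them) can be sketched as a simple column filter over an alignment whose sequences are already grouped into subfamilies. This is a deliberately strict toy version under stated assumptions: it requires perfect conservation within each subfamily and does no statistics, so it is not the Zebra algorithm. The function name and the toy data are hypothetical.

```python
def subfamily_specific_positions(subfamilies):
    """Return 0-based alignment columns that look subfamily-specific.

    `subfamilies` maps a subfamily label to its aligned sequences
    (all sequences must have the same length). A column qualifies when
    every subfamily has a single residue there and the residues are not
    identical across all subfamilies.
    """
    length = len(next(iter(subfamilies.values()))[0])
    positions = []
    for col in range(length):
        residues_per_family = [
            {seq[col] for seq in seqs} for seqs in subfamilies.values()
        ]
        conserved_within = all(len(r) == 1 for r in residues_per_family)
        differs_between = len(set().union(*residues_per_family)) > 1
        if conserved_within and differs_between:
            positions.append(col)
    return positions


# Hypothetical toy alignment with two subfamilies.
toy = {"A": ["GSHDL", "GSHEL"], "B": ["GTHDL", "GTHKL"]}
ssp_columns = subfamily_specific_positions(toy)
```

    In the toy data only column 1 qualifies (S in subfamily A, T in subfamily B); column 3 varies inside each subfamily and the remaining columns are conserved across the whole family.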

  16. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system.

    PubMed

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    2015-11-19

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
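
    For readers unfamiliar with position weight matrices, the sketch below builds a log-odds matrix from a set of aligned binding sites and scores a candidate site against it. The count-based construction, the pseudocount, the uniform background, and the toy sites are all assumptions chosen for illustration; the abstract does not describe how the database derives its matrices from the binding affinity data.

```python
import math
from collections import Counter


def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from equal-length DNA binding sites.

    Each column becomes a dict mapping A/C/G/T to
    log2(smoothed frequency / background frequency).
    """
    matrix = []
    for column in zip(*sites):
        counts = Counter(column)
        total = len(column) + 4 * pseudocount
        matrix.append({
            base: math.log2((counts[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return matrix


def score(matrix, site):
    """Sum the per-position log-odds for one candidate site."""
    return sum(col[base] for col, base in zip(matrix, site))


# Hypothetical toy binding sites, not taken from the database.
toy_sites = ["TTGACA", "TTGACT", "TTTACA", "TAGACA"]
pwm = position_weight_matrix(toy_sites)
```

    Because each matrix column corresponds to one base pair of the site, a matrix of this kind is naturally indexed at the single-basepair level, which is the level at which the abstract says the database relates affinity data to structure.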

  17. Structural domains and main-chain flexibility in prion proteins.

    PubMed

    Blinov, N; Berjanskii, M; Wishart, D S; Stepanova, M

    2009-02-24

    In this study we describe a novel approach to define structural domains and to characterize the local flexibility in both human and chicken prion proteins. The approach we use is based on a comprehensive theory of collective dynamics in proteins that was recently developed. This method determines the essential collective coordinates, which can be found from molecular dynamics trajectories via principal component analysis. Under this particular framework, we are able to identify the domains where atoms move coherently while at the same time determining the local main-chain flexibility for each residue. We have verified this approach by comparing our results for the predicted dynamic domain systems with the computed main-chain flexibility profiles and the NMR-derived random coil indexes for human and chicken prion proteins. The three sets of data show excellent agreement. Additionally, we demonstrate that the dynamic domains calculated in this fashion provide a highly sensitive measure of protein collective structure and dynamics. Furthermore, such an analysis is capable of revealing structural and dynamic properties of proteins that are inaccessible to the conventional assessment of secondary structure. Using the collective dynamic simulation approach described here along with high-temperature simulations of unfolding of human prion protein, we have explored whether locations of relatively low stability could be identified where the unfolding process could potentially be facilitated. According to our analysis, the locations of relatively low stability may be associated with the beta-sheet formed by strands S1 and S2 and the adjacent loops, whereas helix HC appears to be a relatively stable part of the protein. We suggest that this kind of structural analysis may provide a useful background for a more quantitative assessment of potential routes of spontaneous misfolding in prion proteins.

  18. A General Method for Targeted Quantitative Cross-Linking Mass Spectrometry.

    PubMed

    Chavez, Juan D; Eng, Jimmy K; Schweppe, Devin K; Cilia, Michelle; Rivera, Keith; Zhong, Xuefei; Wu, Xia; Allen, Terrence; Khurgel, Moshe; Kumar, Akhilesh; Lampropoulos, Athanasios; Larsson, Mårten; Maity, Shuvadeep; Morozov, Yaroslav; Pathmasiri, Wimal; Perez-Neut, Mathew; Pineyro-Ruiz, Coriness; Polina, Elizabeth; Post, Stephanie; Rider, Mark; Tokmina-Roszyk, Dorota; Tyson, Katherine; Vieira Parrine Sant'Ana, Debora; Bruce, James E

    2016-01-01

    Chemical cross-linking mass spectrometry (XL-MS) provides protein structural information by identifying covalently linked proximal amino acid residues on protein surfaces. The information gained by this technique is complementary to other structural biology methods such as x-ray crystallography, NMR and cryo-electron microscopy[1]. The extension of traditional quantitative proteomics methods with chemical cross-linking can provide information on the structural dynamics of protein structures and protein complexes. The identification and quantitation of cross-linked peptides remains challenging for the general community, requiring specialized expertise ultimately limiting more widespread adoption of the technique. We describe a general method for targeted quantitative mass spectrometric analysis of cross-linked peptide pairs. We report the adaptation of the widely used, open source software package Skyline, for the analysis of quantitative XL-MS data as a means for data analysis and sharing of methods. We demonstrate the utility and robustness of the method with a cross-laboratory study and present data that is supported by and validates previously published data on quantified cross-linked peptide pairs. This advance provides an easy to use resource so that any lab with access to a LC-MS system capable of performing targeted quantitative analysis can quickly and accurately measure dynamic changes in protein structure and protein interactions.

  19. A generalized analysis of hydrophobic and loop clusters within globular protein sequences

    PubMed Central

    Eudes, Richard; Le Tuan, Khanh; Delettré, Jean; Mornon, Jean-Paul; Callebaut, Isabelle

    2007-01-01

    Background: Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need for user expertise. In order to help the analysis of HCA plots, we report here the structural preferences of hydrophobic cluster species, which are frequently encountered in globular domains of proteins. These species are characterized only by their hydrophobic/non-hydrophobic dichotomy. This analysis has been extended to loop-forming clusters, using an appropriate loop alphabet. Results: The structural behavior of hydrophobic cluster species, which are typical of protein globular domains, was investigated within banks of experimental structures, considered at different levels of sequence redundancy. The 294 most frequent hydrophobic cluster species were analyzed with regard to their association with the different secondary structures (frequencies of association with secondary structures and secondary structure propensities). Hydrophobic cluster species are predominantly associated with regular secondary structures, and a large part (60 %) reveals preferences for α-helices or β-strands. Moreover, the analysis of the hydrophobic cluster amino acid composition generally allows for finer prediction of the regular secondary structure associated with the considered cluster within a cluster species. We also investigated the behavior of loop forming clusters, using a "PGDNS" alphabet. These loop clusters do not overlap with hydrophobic clusters and are highly associated with coils. Finally, the structural information contained in the hydrophobic structural words, as deduced from experimental structures, was compared to the PSI-PRED predictions, revealing that β-strands and especially α-helices are generally over-predicted within the limits of typical β and α hydrophobic clusters. 
Conclusion: The dictionary of hydrophobic clusters described here can help the HCA user to interpret and compare the HCA plots of globular protein sequences, and provides an original fundamental insight into the structural bricks of protein folds. Moreover, the novel loop cluster analysis brings additional information for secondary structure prediction on the whole sequence through a generalized cluster analysis (GCA), and not only on regular secondary structures. Such information lays the foundations for developing a new and original tool for secondary structure prediction. PMID:17210072
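
    The hydrophobic/non-hydrophobic dichotomy that defines a cluster species can be sketched as a binary encoding followed by a segmentation step. The sketch assumes the hydrophobic alphabet VILFMYW, treats proline as a cluster breaker, and separates clusters at runs of four or more non-hydrophobic residues. These are the conventions usually associated with HCA but are not stated in the abstract, so treat them, the function names, and the toy sequence as assumptions.

```python
import re

HYDROPHOBIC = set("VILFMYW")  # assumed HCA hydrophobic alphabet


def binary_code(segment):
    """Encode residues as 1 (hydrophobic) or 0 (non-hydrophobic)."""
    return "".join("1" if r in HYDROPHOBIC else "0" for r in segment)


def hydrophobic_clusters(sequence, min_gap=4):
    """Return (0-based start, binary pattern) for each hydrophobic cluster.

    A cluster starts and ends with a hydrophobic residue and contains no
    run of `min_gap` or more non-hydrophobic residues. Prolines split the
    sequence before clusters are searched (assumed breaker rule).
    """
    pattern = re.compile(r"1(?:0{0,%d}1)*" % (min_gap - 1))
    clusters = []
    offset = 0
    for segment in sequence.split("P"):
        for match in pattern.finditer(binary_code(segment)):
            clusters.append((offset + match.start(), match.group()))
        offset += len(segment) + 1  # +1 accounts for the removed proline
    return clusters


# Hypothetical toy sequence, not from the analyzed structure banks.
toy_sequence = "AVLKELGSGSSDFWPMK"
clusters = hydrophobic_clusters(toy_sequence)
```

    Each distinct binary pattern (for example `11001`) corresponds to one cluster species in the sense used above. A loop-cluster pass could reuse the same segmentation with the "PGDNS" alphabet in place of the hydrophobic one, although the abstract does not give the exact rules for that step.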

  20. Combining protein sequence, structure, and dynamics: A novel approach for functional evolution analysis of PAS domain superfamily.

    PubMed

    Dong, Zheng; Zhou, Hongyu; Tao, Peng

    2018-02-01

    PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore the functional evolutionary relationship among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for members of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build a phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.

  1. Structural changes in gluten protein structure after addition of emulsifier. A Raman spectroscopy study

    NASA Astrophysics Data System (ADS)

    Ferrer, Evelina G.; Gómez, Analía V.; Añón, María C.; Puppo, María C.

    2011-06-01

    A food protein product, gluten protein, was chemically modified by varying levels of sodium stearoyl lactylate (SSL), and the extent of modifications (secondary and tertiary structures) of this protein was analyzed by using Raman spectroscopy. Analysis of the Amide I band showed an increase in its intensity mainly after the addition of 0.25% SSL to wheat flour to produce modified gluten protein, pointing to the formation of a more ordered structure. Side chain vibrations also confirmed the observed changes.

  2. Local-global alignment for finding 3D similarities in protein structures

    DOEpatents

    Zemla, Adam T [Brentwood, CA]

    2011-09-20

    A method of finding 3D similarities in protein structures of a first molecule and a second molecule. The method comprises providing preselected information regarding the first molecule and the second molecule; comparing the first molecule and the second molecule using Longest Continuous Segments (LCS) analysis; comparing the first molecule and the second molecule using Global Distance Test (GDT) analysis; comparing the first molecule and the second molecule using Local Global Alignment Scoring function (LGA_S) analysis; and verifying the constructed alignment and repeating the steps to find the regions of 3D similarities in protein structures.
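
    To make the GDT comparison step concrete, the following is a minimal Python sketch of a GDT_TS-style score. It assumes the two structures are already superimposed and matched residue by residue, and it uses the commonly quoted 1, 2, 4 and 8 Å cutoffs; a full GDT procedure also searches over many superpositions, which is not attempted here. The function name and inputs are illustrative, not taken from the patent.

```python
import math


def gdt_ts_fixed(coords_a, coords_b, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS-style score for two pre-superimposed, residue-matched
    coordinate lists (one (x, y, z) tuple per residue, in angstroms).

    Simplification (assumption): the superposition is taken as given;
    no search for a better superposition is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")

    # Per-residue distance between equivalent positions.
    distances = [math.dist(p, q) for p, q in zip(coords_a, coords_b)]

    # Fraction of residues within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]

    # Average the fractions and express the result as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    model = [(0.5, 0.0, 0.0), (2.5, 0.0, 0.0), (5.0, 0.0, 0.0), (13.0, 0.0, 0.0)]
    print(gdt_ts_fixed(reference, model))
```

    Averaging over several cutoffs keeps the score from being dominated by a few badly placed residues, which is the main practical difference from a single RMSD value.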

  3. POLYVIEW-MM: web-based platform for animation and analysis of molecular simulations

    PubMed Central

    Porollo, Aleksey; Meller, Jaroslaw

    2010-01-01

    Molecular simulations offer important mechanistic and functional clues in studies of proteins and other macromolecules. However, interpreting the results of such simulations increasingly requires tools that can combine information from multiple structural databases and other web resources, and provide highly integrated and versatile analysis tools. Here, we present a new web server that integrates high-quality animation of molecular motion (MM) with structural and functional analysis of macromolecules. The new tool, dubbed POLYVIEW-MM, enables animation of trajectories generated by molecular dynamics and related simulation techniques, as well as visualization of alternative conformers, e.g. obtained as a result of protein structure prediction methods or small molecule docking. To facilitate structural analysis, POLYVIEW-MM combines interactive view and analysis of conformational changes using Jmol and its tailored extensions, publication quality animation using PyMol, and customizable 2D summary plots that provide an overview of MM, e.g. in terms of changes in secondary structure states and relative solvent accessibility of individual residues in proteins. Furthermore, POLYVIEW-MM integrates visualization with various structural annotations, including automated mapping of known interaction sites from structural homologs, mapping of cavities and ligand binding sites, transmembrane regions and protein domains. URL: http://polyview.cchmc.org/conform.html. PMID:20504857

  4. In silico analysis of fragile histidine triad involved in regression of carcinoma.

    PubMed

    Rasheed, Muhammad Asif; Tariq, Fatima; Afzal, Sara; Mannanv, Shazia

    2017-04-01

    Hepatocellular carcinoma (HCCa) is a primary malignancy of the liver. Many different proteins are involved in HCCa including insulin growth factor (IGF) II, signal transducers and activators of transcription (STAT) 3, STAT4, mothers against decapentaplegic homolog 4 (SMAD 4), fragile histidine triad (FHIT) and selective internal radiation therapy (SIRT) etc. The present study is based on the bioinformatics analysis of FHIT protein in order to understand the proteomics aspect and improvement of the diagnosis of the disease based on the protein. Information related to the protein was gathered from different databases, including National Centre for Biotechnology Information (NCBI) Gene, Protein and Online Mendelian Inheritance in Man (OMIM) databases, Uniprot database, String database and Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Moreover, the structure of the protein and evaluation of the quality of the structure were included from Easy modeler programme. Hence, this analysis not only helped to gather information related to the protein at one place, but also analysed the structure and quality of the protein to conclude that the protein has a role in carcinoma.

  5. Analysis of sequence repeats of proteins in the PDB.

    PubMed

    Mary Rajathei, David; Selvaraj, Samuel

    2013-12-01

    Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain a similar overall fold. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single and two domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
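
    The 20-40% identity range quoted above depends on how identity is counted. As a hedged illustration only, the Python sketch below computes percent identity over the ungapped columns of two already-aligned repeat copies; the function name, the gap character and the choice of denominator are assumptions, and tools such as RADAR may define identity differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap are skipped, so the
    denominator is the number of ungapped columns (an assumption; other
    conventions divide by alignment length or by the shorter sequence).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # ignore gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1

    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Two hypothetical aligned repeat copies.
    print(round(percent_identity("ACDE-FGH", "ACDQWFGK"), 1))
```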

  6. Proteomic analysis of bovine nucleolus.

    PubMed

    Patel, Amrutlal K; Olson, Doug; Tikoo, Suresh K

    2010-09-01

    Nucleolus is the most prominent subnuclear structure, which performs a wide variety of functions in the eukaryotic cellular processes. In order to understand the structural and functional role of the nucleoli in bovine cells, we analyzed the proteomic composition of the bovine nucleoli. The nucleoli were isolated from Madin Darby bovine kidney cells and subjected to proteomic analysis by LC-MS/MS after fractionation by SDS-PAGE and strong cation exchange chromatography. Analysis of the data using the Mascot database search and the GPM database search identified 311 proteins in the bovine nucleoli, which contained 22 proteins previously not identified in the proteomic analysis of human nucleoli. Analysis of the identified proteins using the GoMiner software suggested that the bovine nucleoli contained proteins involved in ribosomal biogenesis, cell cycle control, transcriptional, translational and post-translational regulation, transport, and structural organization. Copyright © 2010 Beijing Genomics Institute. Published by Elsevier Ltd. All rights reserved.

  7. Structural analysis of herpes simplex virus by optical super-resolution imaging

    NASA Astrophysics Data System (ADS)

    Laine, Romain F.; Albecka, Anna; van de Linde, Sebastian; Rees, Eric J.; Crump, Colin M.; Kaminski, Clemens F.

    2015-01-01

    Herpes simplex virus type-1 (HSV-1) is one of the most widespread pathogens among humans. Although the structure of HSV-1 has been extensively investigated, the precise organization of tegument and envelope proteins remains elusive. Here we use super-resolution imaging by direct stochastic optical reconstruction microscopy (dSTORM) in combination with a model-based analysis of single-molecule localization data, to determine the position of protein layers within virus particles. We resolve different protein layers within individual HSV-1 particles using multi-colour dSTORM imaging and discriminate envelope-anchored glycoproteins from tegument proteins, both in purified virions and in virions present in infected cells. Precise characterization of HSV-1 structure was achieved by particle averaging of purified viruses and model-based analysis of the radial distribution of the tegument proteins VP16, VP1/2 and pUL37, and envelope protein gD. From this data, we propose a model of the protein organization inside the tegument.

  8. WONKA: objective novel complex analysis for ensembles of protein-ligand structures.

    PubMed

    Bradley, A R; Wall, I D; von Delft, F; Green, D V S; Deane, C M; Marsden, B D

    2015-10-01

    WONKA is a tool for the systematic analysis of an ensemble of protein-ligand structures. It makes the identification of conserved and unusual features within such an ensemble straightforward. WONKA uses an intuitive workflow to process structural co-ordinates. Ligand and protein features are summarised and then presented within an interactive web application. WONKA's power in consolidating and summarising large amounts of data is described through the analysis of three bromodomain datasets. Furthermore, and in contrast to many current methods, WONKA relates analysis to individual ligands, from which we find unusual and erroneous binding modes. Finally the use of WONKA as an annotation tool to share observations about structures is demonstrated. WONKA is freely available to download and install locally or can be used online at http://wonka.sgc.ox.ac.uk.

  9. Exploring Human Diseases and Biological Mechanisms by Protein Structure Prediction and Modeling.

    PubMed

    Wang, Juexin; Luttrell, Joseph; Zhang, Ning; Khan, Saad; Shi, NianQing; Wang, Michael X; Kang, Jing-Qiong; Wang, Zheng; Xu, Dong

    2016-01-01

    Protein structure prediction and modeling provide a tool for understanding protein functions by computationally constructing protein structures from amino acid sequences and analyzing them. With help from protein prediction tools and web servers, users can obtain the three-dimensional protein structure models and gain knowledge of functions from the proteins. In this chapter, we will provide several examples of such studies. As an example, structure modeling methods were used to investigate the relation between mutation-caused misfolding of protein and human diseases including epilepsy and leukemia. Protein structure prediction and modeling were also applied in nucleotide-gated channels and their interaction interfaces to investigate their roles in brain and heart cells. In molecular mechanism studies of plants, rice salinity tolerance mechanism was studied via structure modeling on crucial proteins identified by systems biology analysis; trait-associated protein-protein interactions were modeled, which sheds some light on the roles of mutations in soybean oil/protein content. In the age of precision medicine, we believe protein structure prediction and modeling will play more and more important roles in investigating biomedical mechanism of diseases and drug design.

  10. Functional correlation of bacterial LuxS with their quaternary associations: interface analysis of the structure networks

    PubMed Central

    Bhattacharyya, Moitrayee; Vishveshwara, Saraswathi

    2009-01-01

    Background: The genome of a wide variety of prokaryotes contains the luxS gene homologue, which encodes for the protein S-ribosylhomocysteinelyase (LuxS). This protein is responsible for the production of the quorum sensing molecule, AI-2 and has been implicated in a variety of functions such as flagellar motility, metabolic regulation, toxin production and even in pathogenicity. A high structural similarity is present in the LuxS structures determined from a few species. In this study, we have modelled the structures from several other species and have investigated their dimer interfaces. We have attempted to correlate the interface features of LuxS with the phenotypic nature of the organisms. Results: The protein structure networks (PSN) are constructed and graph theoretical analysis is performed on the structures obtained from X-ray crystallography and on the modelled ones. The interfaces, which are known to contain the active site, are characterized from the PSNs of these homodimeric proteins. The key features presented by the protein interfaces are investigated for the classification of the proteins in relation to their function. From our analysis, structural interface motifs are identified for each class in our dataset, which showed distinctly different patterns at the interface of LuxS for the probiotics and some extremophiles. Our analysis also reveals potential sites of mutation and geometric patterns at the interface that were not evident from conventional sequence alignment studies. Conclusion: The structure network approach employed in this study for the analysis of dimeric interfaces in LuxS has brought out certain structural details at the side-chain interaction level, which were elusive from the conventional structure comparison methods. The results from this study provide a better understanding of the relation between the luxS gene and its functional role in the prokaryotes. This study also makes it possible to explore the potential direction towards the design of inhibitors of LuxS and thus towards a wide range of antimicrobials. PMID:19243584
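
    The idea of a protein structure network and its dimer interface can be sketched in a few lines of Python. This is a simplified stand-in, not the authors' PSN construction: residues become nodes, an edge is drawn when two representative atoms (for example Cα) lie within a distance cutoff, and interface edges are those joining different chains. The 6.5 Å cutoff, the function names and the input layout are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def contact_network(residues, cutoff=6.5):
    """Build an undirected residue contact graph.

    residues: list of (chain_id, residue_number, (x, y, z)) tuples, one
    representative atom per residue. The distance cutoff in angstroms is
    an illustrative assumption.
    """
    graph = defaultdict(set)
    for i, (chain_a, num_a, pos_a) in enumerate(residues):
        for chain_b, num_b, pos_b in residues[i + 1:]:
            if math.dist(pos_a, pos_b) <= cutoff:
                graph[(chain_a, num_a)].add((chain_b, num_b))
                graph[(chain_b, num_b)].add((chain_a, num_a))
    return graph


def interface_edges(graph):
    """Return the sorted list of edges whose endpoints lie on different chains."""
    edges = set()
    for node, neighbours in graph.items():
        for other in neighbours:
            if node[0] != other[0]:
                edges.add(tuple(sorted((node, other))))
    return sorted(edges)


if __name__ == "__main__":
    # Toy homodimer with two residues per chain (coordinates are made up).
    toy = [
        ("A", 1, (0.0, 0.0, 0.0)),
        ("A", 2, (3.8, 0.0, 0.0)),
        ("B", 1, (0.0, 5.0, 0.0)),
        ("B", 2, (30.0, 0.0, 0.0)),
    ]
    for edge in interface_edges(contact_network(toy)):
        print(edge)
```

    Once the interface is expressed as a set of edges, graph measures such as node degree or cliques can be compared across species, which is the kind of analysis the abstract describes.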

  11. Computational and Statistical Analyses of Amino Acid Usage and Physico-Chemical Properties of the Twelve Late Embryogenesis Abundant Protein Classes

    PubMed Central

    Jaspard, Emmanuel; Macherel, David; Hunault, Gilles

    2012-01-01

    Late Embryogenesis Abundant Proteins (LEAPs) are ubiquitous proteins expected to play major roles in desiccation tolerance. Little is known about their structure - function relationships because of the scarcity of 3-D structures for LEAPs. The previous building of LEAPdb, a database dedicated to LEAPs from plants and other organisms, led to the classification of 710 LEAPs into 12 non-overlapping classes with distinct properties. Using this resource, numerous physico-chemical properties of LEAPs and amino acid usage by LEAPs have been computed and statistically analyzed, revealing distinctive features for each class. This unprecedented analysis allowed a rigorous characterization of the 12 LEAP classes, which differed also in multiple structural and physico-chemical features. Although most LEAPs can be predicted as intrinsically disordered proteins, the analysis indicates that LEAP class 7 (PF03168) and probably LEAP class 11 (PF04927) are natively folded proteins. This study thus provides a detailed description of the structural properties of this protein family opening the path toward further LEAP structure - function analysis. Finally, since each LEAP class can be clearly characterized by a unique set of physico-chemical properties, this will allow development of software to predict proteins as LEAPs. PMID:22615859

  12. Role of indirect readout mechanism in TATA box binding protein-DNA interaction.

    PubMed

    Mondal, Manas; Choudhury, Devapriya; Chakrabarti, Jaydeb; Bhattacharyya, Dhananjay

    2015-03-01

    Gene expression generally initiates with the binding of TATA-box binding protein (TBP) to the minor groove of DNA at the TATA box sequence, where the DNA structure is significantly different from B-DNA. We have carried out molecular dynamics simulation studies of the TBP-DNA system to understand how the DNA structure alters for efficient binding. We observed a rigid nature of the protein, while the DNA of the TATA box sequence has an inherent flexibility in terms of bending and minor groove widening. The bending analysis of the free DNA and the TBP-bound DNA systems indicates the presence of some similar structures. Principal coordinate ordination analysis also indicates that some structural features of the protein-bound and free DNA are similar. Thus we suggest that the DNA of the TATA box sequence regularly oscillates between several alternate structures, and the one suitable for TBP binding is induced further by the protein for proper complex formation.

  13. PROFESS: a PROtein Function, Evolution, Structure and Sequence database

    PubMed Central

    Triplet, Thomas; Shortridge, Matthew D.; Griep, Mark A.; Stark, Jaime L.; Powers, Robert; Revesz, Peter

    2010-01-01

    The proliferation of biological databases and the easy access enabled by the Internet is having a beneficial impact on biological sciences and transforming the way research is conducted. There are ∼1100 molecular biology databases dispersed throughout the Internet. To assist in the functional, structural and evolutionary analysis of the abundant number of novel proteins continually identified from whole-genome sequencing, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database. Our database is designed to be versatile and expandable and will not confine analysis to a pre-existing set of data relationships. A fundamental component of this approach is the development of an intuitive query system that incorporates a variety of similarity functions capable of generating data relationships not conceived during the creation of the database. The utility of PROFESS is demonstrated by the analysis of the structural drift of homologous proteins and the identification of potential pancreatic cancer therapeutic targets based on the observation of protein–protein interaction networks. Database URL: http://cse.unl.edu/∼profess/ PMID:20624718

  14. Tertiary structural propensities reveal fundamental sequence/structure relationships.

    PubMed

    Zheng, Fan; Zhang, Jian; Grigoryan, Gevorg

    2015-05-05

    Extracting useful generalizations from the continually growing Protein Data Bank (PDB) is of central importance. We hypothesize that the PDB contains valuable quantitative information on the level of local tertiary structural motifs (TERMs). We show that by breaking a protein structure into its constituent TERMs, and querying the PDB to characterize the natural ensemble matching each, we can estimate the compatibility of the structure with a given amino acid sequence through a metric we term "structure score." Considering submissions from recent Critical Assessment of Structure Prediction (CASP) experiments, we found a strong correlation (R = 0.69) between structure score and model accuracy, with poorly predicted regions readily identifiable. This performance exceeds that of leading atomistic statistical energy functions. Furthermore, TERM-based analysis of two prototypical multi-state proteins rapidly produced structural insights fully consistent with prior extensive experimental studies. We thus find that TERM-based analysis should have considerable utility for protein structural biology. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Phylogenetic continuum indicates "galaxies" in the protein universe: preliminary results on the natural group structures of proteins.

    PubMed

    Ladunga, I

    1992-04-01

    The markedly nonuniform, even systematic distribution of sequences in the protein "universe" has been analyzed by methods of protein taxonomy. Mapping of the natural hierarchical system of proteins has revealed some dense cores, i.e., well-defined clusterings of proteins that seem to be natural structural groupings, possibly seeds for a future protein taxonomy. The aim was not to force proteins into more or less man-made categories by discriminant analysis, but to find structurally similar groups, possibly of common evolutionary origin. Single-valued distance measures between pairs of superfamilies from the Protein Identification Resource were defined by two χ²-like methods on tripeptide frequencies and the variable-length subsequence identity method derived from dot-matrix comparisons. Distance matrices were processed by several methods of cluster analysis to detect phylogenetic continuum between highly divergent proteins. Only well-defined clusters characterized by relatively unique structural, intracellular environmental, organismal, and functional attribute states were selected as major protein groups, including subsets of viral and Escherichia coli proteins, hormones, inhibitors, plant, ribosomal, serum and structural proteins, amino acid synthases, and clusters dominated by certain oxidoreductases and apolar and DNA-associated enzymes. The limited repertoire of functional patterns due to small genome size, the high rate of recombination, specific features of the bacterial membranes, or of the virus cycle canalize certain proteins of viruses and Gram-negative bacteria, respectively, to organismal groups.
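
    The abstract does not give the exact form of its chi-square-like distances, so the Python sketch below shows only one plausible variant: overlapping tripeptide frequencies are tabulated for each sequence, and the distance sums (fa - fb)² / (fa + fb) over all tripeptides seen in either sequence. The formula and the function names are assumptions for illustration and should not be read as the method used in the study.

```python
from collections import Counter


def tripeptide_frequencies(sequence):
    """Relative frequencies of overlapping tripeptides in a sequence."""
    counts = Counter(sequence[i:i + 3] for i in range(len(sequence) - 2))
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()} if total else {}


def chi2_like_distance(freq_a, freq_b):
    """Symmetric chi-square-like distance between two frequency tables.

    Assumed form: sum over tripeptides of (fa - fb)^2 / (fa + fb).
    Every tripeptide in the union has a positive frequency in at least
    one table, so the denominator is never zero.
    """
    distance = 0.0
    for tri in set(freq_a) | set(freq_b):
        fa = freq_a.get(tri, 0.0)
        fb = freq_b.get(tri, 0.0)
        distance += (fa - fb) ** 2 / (fa + fb)
    return distance


if __name__ == "__main__":
    # Two short made-up sequences.
    a = tripeptide_frequencies("MKTAYIAKQR")
    b = tripeptide_frequencies("MKTAYLAKQR")
    print(chi2_like_distance(a, b))
```

    With this form the distance is 0 for identical frequency tables and reaches 2 when the two sequences share no tripeptide at all, which gives a bounded single-valued measure suitable as input to cluster analysis.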

  16. A comprehensive analysis of the Omp85/TpsB protein superfamily structural diversity, taxonomic occurrence, and evolution

    PubMed Central

    Heinz, Eva; Lithgow, Trevor

    2014-01-01

    Members of the Omp85/TpsB protein superfamily are ubiquitously distributed in Gram-negative bacteria, and function in protein translocation (e.g., FhaC) or the assembly of outer membrane proteins (e.g., BamA). Several recent findings are suggestive of a further level of variation in the superfamily, including the identification of the novel membrane protein assembly factor TamA and protein translocase PlpD. To investigate the diversity and the causal evolutionary events, we undertook a comprehensive comparative sequence analysis of the Omp85/TpsB proteins. A total of 10 protein subfamilies were apparent, distinguished in their domain structure and sequence signatures. In addition to the proteins FhaC, BamA, and TamA, for which structural and functional information is available, are families of proteins with so far undescribed domain architectures linked to the Omp85 β-barrel domain. This study brings a classification structure to a dynamic protein superfamily of high interest given its essential function for Gram-negative bacteria as well as its diverse domain architecture, and we discuss several scenarios of putative functions of these so far undescribed proteins. PMID:25101071

  17. Introduction to bioinformatics.

    PubMed

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.

  18. Decomposition of Proteins into Dynamic Units from Atomic Cross-Correlation Functions.

    PubMed

    Calligari, Paolo; Gerolin, Marco; Abergel, Daniel; Polimeno, Antonino

    2017-01-10

    In this article, we present a clustering method of atoms in proteins based on the analysis of the correlation times of interatomic distance correlation functions computed from MD simulations. The goal is to provide a coarse-grained description of the protein in terms of fewer elements that can be treated as dynamically independent subunits. Importantly, this domain decomposition method does not take into account structural properties of the protein. Instead, the clustering of protein residues in terms of networks of dynamically correlated domains is defined on the basis of the effective correlation times of the pair distance correlation functions. For these properties, our method stands as a complementary analysis to the customary protein decomposition in terms of quasi-rigid, structure-based domains. Results obtained for a prototypal protein structure illustrate the approach proposed.

  19. Characterization of the Structural Gene Promoter of Aedes aegypti Densovirus

    PubMed Central

    Ward, Todd W.; Kimmick, Michael W.; Afanasiev, Boris N.; Carlson, Jonathan O.

    2001-01-01

    Aedes aegypti densonucleosis virus (AeDNV) has two promoters that have been shown to be active by reporter gene expression analysis (B. N. Afanasiev, Y. V. Koslov, J. O. Carlson, and B. J. Beaty, Exp. Parasitol. 79:322–339, 1994). Northern blot analysis of cells infected with AeDNV revealed two transcripts 1,200 and 3,500 nucleotides in length that are assumed to express the structural protein (VP) gene and nonstructural protein genes, respectively. Primer extension was used to map the transcriptional start site of the structural protein gene. Surprisingly, the structural protein gene transcript began at an initiator consensus sequence, CAGT, 60 nucleotides upstream from the map unit 61 TATAA sequence previously thought to define the promoter. Constructs with the β-galactosidase gene fused to the structural protein gene were used to determine elements necessary for promoter function. Deletion or mutation of the initiator sequence, CAGT, reduced protein expression by 93%, whereas mutation of the TATAA sequence at map unit 61 had little effect. An additional open reading frame was observed upstream of the structural protein gene that can express β-galactosidase at a low level (20% of that of VP fusions). Expression of the AeDNV structural protein gene was shown to be stimulated by the major nonstructural protein NS1 (Afanasiev et al., Exp. parasitol., 1994). To determine the sequences required for transactivation, expression of structural protein gene–β-galactosidase gene fusion constructs differing in AeDNV genome content was measured with and without NS1. The presence of NS1 led to an 8- to 10-fold increase in expression when either genomic end was present, compared to a 2-fold increase with a construct lacking the genomic ends. An even higher (37-fold) increase in expression occurred with both genomic ends present; however, this was in part due to template replication as shown by Southern blot analysis. 
    These data indicate the location and importance of various elements necessary for efficient protein expression and transactivation from the structural protein gene promoter of AeDNV. PMID:11152505

  20. Structure-functional prediction and analysis of cancer mutation effects in protein kinases.

    PubMed

    Dixit, Anshuman; Verkhivker, Gennady M

    2014-01-01

    A central goal of cancer research is to discover and characterize the functional effects of mutated genes that contribute to tumorigenesis. In this study, we provide a detailed structural classification and analysis of functional dynamics for members of protein kinase families that are known to harbor cancer mutations. We also present a systematic computational analysis that combines sequence and structure-based prediction models to characterize the effect of cancer mutations in protein kinases. We focus on the differential effects of activating point mutations that increase protein kinase activity and kinase-inactivating mutations that decrease activity. Mapping of cancer mutations onto the conformational mobility profiles of known crystal structures demonstrated that activating mutations could reduce a steric barrier for the movement from the basal "low" activity state to the "active" state. According to our analysis, the mechanism of activating mutations reflects a combined effect of partial destabilization of the kinase in its inactive state and a concomitant stabilization of its active-like form, which is likely to drive tumorigenesis at some level. Ultimately, the analysis of the evolutionary and structural features of the major cancer-causing mutational hotspot in kinases can also aid in the correlation of kinase mutation effects with clinical outcomes.

  1. Identification of three critical regions within mouse interleukin 2 by fine structural deletion analysis.

    PubMed Central

    Zurawski, S M; Zurawski, G

    1988-01-01

    We have analyzed structure-function relationships of the protein hormone murine interleukin 2 by fine structural deletion mapping. A total of 130 deletion mutant proteins, together with some substitution and insertion mutant proteins, was expressed in Escherichia coli and analyzed for their ability to sustain the proliferation of a cloned murine T cell line. This analysis has permitted a functional map of the protein to be drawn and classifies five segments of the protein, which together contain 48% of the sequence, as unessential to the biological activity of the protein. A further 26% of the protein is classified as important, but not crucial, for the activity. Three regions, consisting of amino acids 32-35, 66-77 and 119-141 contain the remaining 26% of the protein and are critical to the biological activity of the protein. The functional map is discussed in the context of the possible role of the identified critical regions in the structure of the hormone and its binding to the interleukin 2 receptor complex. PMID:3261239

  2. Computational investigation of the HIV-1 Rev multimerization using molecular dynamics simulations and binding free energy calculations.

    PubMed

    Venken, Tom; Daelemans, Dirk; De Maeyer, Marc; Voet, Arnout

    2012-06-01

    The HIV Rev protein mediates the nuclear export of viral mRNA, and is thereby essential for the production of late viral proteins in the replication cycle. Rev forms a large organized multimeric protein-protein complex for proper functioning. Recently, the three-dimensional structures of a Rev dimer and tetramer have been resolved and provide the basis for a thorough structural analysis of the binding interaction. Here, molecular dynamics (MD) and binding free energy calculations were performed to elucidate the forces driving dimerization and higher order multimerization of the Rev protein. It is found that despite the structural differences between each crystal structure, both display a similar behavior according to our calculations. Our analysis based on a molecular mechanics-generalized Born surface area (MM/GBSA) and a configurational entropy approach demonstrates that the higher order multimerization site is much weaker than the dimerization site. In addition, a quantitative hot spot analysis combined with a mutational analysis reveals the most contributing amino acid residues for protein interactions in agreement with experimental results. Additional residues were found in each interface, which are important for the protein interaction. The investigation of the thermodynamics of the Rev multimerization interactions performed here could be a further step in the development of novel antiretrovirals using structure based drug design. Moreover, the variability of the angle between each Rev monomer as measured during the MD simulations suggests a role of the Rev protein in allowing flexibility of the arginine rich domain (ARM) to accommodate RNA binding. Copyright © 2012 Wiley Periodicals, Inc.

  3. Structure-related statistical singularities along protein sequences: a correlation study.

    PubMed

    Colafranceschi, Mauro; Colosimo, Alfredo; Zbilut, Joseph P; Uversky, Vladimir N; Giuliani, Alessandro

    2005-01-01

    A data set composed of 1141 proteins representative of all eukaryotic protein sequences in the Swiss-Prot Protein Knowledge base was coded by seven physicochemical properties of amino acid residues. The resulting numerical profiles were submitted to correlation analysis after the application of a linear (simple mean) and a nonlinear (Recurrence Quantification Analysis, RQA) filter. The main RQA variables, Recurrence and Determinism, were subsequently analyzed by Principal Component Analysis. The RQA descriptors showed that (i) within protein sequences is embedded specific information neither present in the codes nor in the amino acid composition and (ii) the most sensitive code for detecting ordered recurrent (deterministic) patterns of residues in protein sequences is the Miyazawa-Jernigan hydrophobicity scale. The most deterministic proteins in terms of autocorrelation properties of primary structures were found (i) to be involved in protein-protein and protein-DNA interactions and (ii) to display a significantly higher proportion of structural disorder with respect to the average data set. A study of the scaling behavior of the average determinism with the setting parameters of RQA (embedding dimension and radius) allows for the identification of patterns of minimal length (six residues) as possible markers of zones specifically prone to inter- and intramolecular interactions.
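
    The two RQA variables named above, Recurrence and Determinism, can be illustrated with a short Python sketch. It is deliberately simplified: the numerical profile is compared point by point (embedding dimension 1), two positions recur when their values differ by at most the radius, and Determinism is the share of recurrent points that sit on diagonal lines of at least `min_line` points. The function names and these parameter choices are assumptions; the study itself varies both the embedding dimension and the radius.

```python
def recurrence_matrix(series, radius):
    """Boolean matrix: entry [i][j] is True when |x_i - x_j| <= radius."""
    n = len(series)
    return [
        [abs(series[i] - series[j]) <= radius for j in range(n)]
        for i in range(n)
    ]


def recurrence_and_determinism(series, radius, min_line=2):
    """Return (recurrence, determinism) for a 1-D numerical profile.

    recurrence:  fraction of distinct pairs (i < j) that are recurrent.
    determinism: fraction of recurrent points lying on diagonal lines
                 of length >= min_line.
    Only the upper triangle is scanned because the matrix is symmetric.
    """
    n = len(series)
    rec = recurrence_matrix(series, radius)
    recurrent_points = 0
    points_in_lines = 0

    for offset in range(1, n):  # each diagonal above the main one
        run = 0
        for i in range(n - offset):
            if rec[i][i + offset]:
                recurrent_points += 1
                run += 1
            else:
                if run >= min_line:
                    points_in_lines += run
                run = 0
        if run >= min_line:  # close a line that reaches the matrix edge
            points_in_lines += run

    pairs = n * (n - 1) // 2
    recurrence = recurrent_points / pairs if pairs else 0.0
    determinism = points_in_lines / recurrent_points if recurrent_points else 0.0
    return recurrence, determinism


if __name__ == "__main__":
    # A strictly alternating made-up profile: every recurrence lies on a line.
    print(recurrence_and_determinism([0, 1, 0, 1, 0, 1], radius=0))
```

    In a real application the series would be a per-residue property profile, such as a hydrophobicity scale applied along the sequence.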

  4. Hot spot of structural ambivalence in prion protein revealed by secondary structure principal component analysis.

    PubMed

    Yamamoto, Norifumi

    2014-08-21

    The conformational conversion of proteins into an aggregation-prone form is a common feature of various neurodegenerative disorders including Alzheimer's, Huntington's, Parkinson's, and prion diseases. In the early stage of prion diseases, secondary structure conversion in prion protein (PrP) causing β-sheet expansion facilitates the formation of a pathogenic isoform with a high content of β-sheets and strong aggregation tendency to form amyloid fibrils. Herein, we propose a straightforward method to extract essential information regarding the secondary structure conversion of proteins from molecular simulations, named secondary structure principal component analysis (SSPCA). The definite existence of a PrP isoform with an increased β-sheet structure was confirmed in a free-energy landscape constructed by mapping protein structural data into a reduced space according to the principal components determined by the SSPCA. We suggest a "spot" of structural ambivalence in PrP, the C-terminal part of helix 2, that lacks a strong intrinsic secondary structure, thus promoting a partial α-helix-to-β-sheet conversion. This result is important to understand how the pathogenic conformational conversion of PrP is initiated in prion diseases. The SSPCA has great potential to solve various challenges in studying highly flexible molecular systems, such as intrinsically disordered proteins, structurally ambivalent peptides, and chameleon sequences.
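
    The general idea of running PCA on secondary structure data can be sketched as follows. The abstract does not specify how secondary structure is encoded for SSPCA, so this example assumes a simple one-hot encoding of a three-state (H, E, C) assignment per residue and per simulation frame; the function names and the toy frames are hypothetical. PCA is done with a plain singular value decomposition of the centered data matrix.

```python
import numpy as np

STATES = "HEC"  # helix, strand, coil: an assumed three-state alphabet


def one_hot_frames(ss_strings):
    """Encode each frame's secondary structure string as a flat one-hot vector."""
    n_res = len(ss_strings[0])
    X = np.zeros((len(ss_strings), n_res * len(STATES)))
    for f, ss in enumerate(ss_strings):
        for r, state in enumerate(ss):
            X[f, r * len(STATES) + STATES.index(state)] = 1.0
    return X


def pca_project(X, n_components=2):
    """Project frames onto the leading principal components.

    Returns (projections, explained_variance), where explained_variance
    holds the variance along every component, largest first.
    """
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    projections = Xc @ Vt[:n_components].T
    explained_variance = S ** 2 / (len(X) - 1)
    return projections, explained_variance


if __name__ == "__main__":
    # Toy trajectory: two helical frames, then two frames where the
    # last two residues have converted from helix to strand.
    frames = ["HHHH", "HHHH", "HHEE", "HHEE"]
    proj, var = pca_project(one_hot_frames(frames))
    print(proj[:, 0], var)
```

    With this toy input the first component separates the all-helix frames from the partially converted ones; a free-energy landscape would then be estimated from the density of frames in the projected space.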

  5. Domain Hierarchy and closed Loops (DHcL): a server for exploring hierarchy of protein domain structure

    PubMed Central

    Koczyk, Grzegorz; Berezovsky, Igor N.

    2008-01-01

    Domain hierarchy and closed loops (DHcL) (http://sitron.bccs.uib.no/dhcl/) is a web server that delineates energy hierarchy of protein domain structure and detects domains at different levels of this hierarchy. The server also identifies closed loops and van der Waals locks, which constitute a structural basis for the protein domain hierarchy. The DHcL can be a useful tool for an express analysis of protein structures and their alternative domain decompositions. The user submits a PDB identifier(s) or uploads a 3D protein structure in a PDB format. The results of the analysis are the location of domains at different levels of hierarchy, closed loops, van der Waals locks and their interactive visualization. The server maintains a regularly updated database of domains, closed loop and van der Waals locks for all X-ray structures in PDB. DHcL server is available at: http://sitron.bccs.uib.no/dhcl. PMID:18502776

  6. PredictProtein—an open resource for online prediction of protein structural and functional features

    PubMed Central

    Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard

    2014-01-01

    PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431

  7. Algorithm, applications and evaluation for protein comparison by Ramanujan Fourier transform.

    PubMed

    Zhao, Jian; Wang, Jiasong; Hua, Wei; Ouyang, Pingkai

    2015-12-01

    The amino acid sequence of a protein determines its chemical properties, chain conformation and biological functions. Protein sequence comparison is of great importance to identify similarities of protein structures and infer their functions. Many properties of a protein correspond to the low-frequency signals within the sequence. Low frequency modes in protein sequences are linked to the secondary structures, membrane protein types, and sub-cellular localizations of the proteins. In this paper, we present Ramanujan Fourier transform (RFT) with a fast algorithm to analyze the low-frequency signals of protein sequences. The RFT method is applied to similarity analysis of protein sequences with the Resonant Recognition Model (RRM). The results show that the proposed fast RFT method on protein comparison is more efficient than commonly used discrete Fourier transform (DFT). RFT can detect common frequencies as significant feature for specific protein families, and the RFT spectrum heat-map of protein sequences demonstrates the information conservation in the sequence comparison. The proposed method offers a new tool for pattern recognition, feature extraction and structural analysis on protein sequences. Copyright © 2015 Elsevier Ltd. All rights reserved.
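
    The transform named in the abstract above is built on Ramanujan sums. The sketch below computes them directly from their definition and uses them to estimate Ramanujan Fourier coefficients of a finite numeric signal. It is a naive reference version, not the fast algorithm the paper proposes, and it assumes the common finite-length estimate in which the coefficient for period q is the mean of x(n)·c_q(n) divided by Euler's totient of q. The function names are illustrative.

```python
from math import cos, gcd, pi


def ramanujan_sum(q, n):
    """c_q(n): sum of cos(2*pi*k*n/q) over 1 <= k <= q with gcd(k, q) == 1.

    The exact value is always an integer, so the floating-point sum is rounded.
    """
    return round(sum(cos(2 * pi * k * n / q)
                     for k in range(1, q + 1) if gcd(k, q) == 1))


def totient(q):
    """Euler's totient: how many k in 1..q are coprime to q."""
    return sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)


def rft_coefficients(signal, max_q):
    """Estimate Ramanujan Fourier coefficients X_q for q = 1..max_q.

    Samples are indexed n = 1..N. The estimate is most meaningful when
    N is a multiple of the periods q being examined.
    """
    n_samples = len(signal)
    coeffs = {}
    for q in range(1, max_q + 1):
        acc = sum(x * ramanujan_sum(q, n) for n, x in enumerate(signal, start=1))
        coeffs[q] = acc / (totient(q) * n_samples)
    return coeffs


if __name__ == "__main__":
    # A toy signal with exact period 3 (hypothetical data, not a protein profile).
    signal = [-1, -1, 2, -1, -1, 2]
    print(rft_coefficients(signal, max_q=3))
```

    For protein comparison the input would be a numeric encoding of the sequence, as in the Resonant Recognition Model; that encoding is not reproduced here.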

  8. External cavity-quantum cascade laser infrared spectroscopy for secondary structure analysis of proteins at low concentrations

    PubMed Central

    Schwaighofer, Andreas; Alcaráz, Mirta R.; Araman, Can; Goicoechea, Héctor; Lendl, Bernhard

    2016-01-01

    Fourier transform infrared (FTIR) and circular dichroism (CD) spectroscopy are analytical techniques employed for the analysis of protein secondary structure. The use of CD spectroscopy is limited to low protein concentrations (<2 mg ml−1), while FTIR spectroscopy is commonly used in a higher concentration range (>5 mg ml−1). Here we introduce a quantum cascade laser (QCL)-based IR transmission setup for analysis of protein and polypeptide secondary structure at concentrations as low as 0.25 mg ml−1 in deuterated buffer solution. We present dynamic QCL-IR spectra of the temperature-induced α-helix to β-sheet transition of poly-L-lysine. The concentration dependence of the α-β transition temperature between 0.25 and 10 mg ml−1 was investigated by QCL-IR, FTIR and CD spectroscopy. By using QCL-IR spectroscopy it is possible to perform IR spectroscopic analysis in the same concentration range as CD spectroscopy, thus enabling a combined analysis of biomolecules secondary structure by CD and IR spectroscopy. PMID:27633337

  9. External cavity-quantum cascade laser infrared spectroscopy for secondary structure analysis of proteins at low concentrations.

    PubMed

    Schwaighofer, Andreas; Alcaráz, Mirta R; Araman, Can; Goicoechea, Héctor; Lendl, Bernhard

    2016-09-16

    Fourier transform infrared (FTIR) and circular dichroism (CD) spectroscopy are analytical techniques employed for the analysis of protein secondary structure. The use of CD spectroscopy is limited to low protein concentrations (<2 mg ml−1), while FTIR spectroscopy is commonly used in a higher concentration range (>5 mg ml−1). Here we introduce a quantum cascade laser (QCL)-based IR transmission setup for analysis of protein and polypeptide secondary structure at concentrations as low as 0.25 mg ml−1 in deuterated buffer solution. We present dynamic QCL-IR spectra of the temperature-induced α-helix to β-sheet transition of poly-L-lysine. The concentration dependence of the α-β transition temperature between 0.25 and 10 mg ml−1 was investigated by QCL-IR, FTIR and CD spectroscopy. By using QCL-IR spectroscopy it is possible to perform IR spectroscopic analysis in the same concentration range as CD spectroscopy, thus enabling a combined analysis of biomolecules secondary structure by CD and IR spectroscopy.

  10. Thermostability of In Vitro Evolved Bacillus subtilis Lipase A: A Network and Dynamics Perspective

    PubMed Central

    Srivastava, Ashutosh; Sinha, Somdatta

    2014-01-01

    Proteins in thermophilic organisms remain stable and function optimally at high temperatures. Owing to their important applicability in many industrial processes, such thermostable proteins have been studied extensively, and several structural factors have been attributed to their enhanced stability. How these factors render the emergent property of thermostability to proteins, even in situations where no significant changes occur in their three-dimensional structures in comparison to their mesophilic counterparts, has remained an intriguing question. In this study we treat Lipase A from Bacillus subtilis and its six thermostable mutants in a unified manner and address the problem with a combined complex network-based analysis and molecular dynamics studies to find commonality in their properties. The Protein Contact Networks (PCN) of the wild-type and six mutant Lipase A structures developed at a mesoscopic scale were analyzed at global network and local node (residue) level using network parameters and community structure analysis. The comparative PCN analysis of all proteins pointed towards an important role of specific residues in the enhanced thermostability. Network analysis results were corroborated with finer-scale molecular dynamics simulations at both room and high temperatures. Our results show that this combined approach at two scales can uncover small but important changes in the local conformations that add up to stabilize the protein structure in thermostable mutants, even when overall conformation differences among them are negligible. Our analysis not only supports the experimentally determined stabilizing factors, but also unveils the important role of contacts, distributed throughout the protein, that lead to thermostability. We propose that this combined mesoscopic-network and fine-grained molecular dynamics approach is a convenient and useful scheme not only to study allosteric changes leading to protein stability in the face of negligible overall conformational changes due to mutations, but also in other molecular networks where change in function does not accompany significant change in the network structure. PMID:25122499
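
    A protein contact network of the kind described in the abstract above treats residues as nodes and spatial proximity as edges. The following Python sketch shows one common construction under stated assumptions: one representative atom per residue (for example Cα), a 7 Å distance cutoff, and exclusion of self-contacts only. The abstract does not give the authors' cutoff or atom choice, so these values and the function names are illustrative.

```python
import numpy as np


def contact_network(coords, cutoff=7.0, min_seq_sep=1):
    """Boolean adjacency matrix of a residue contact network.

    coords      : (n_residues, 3) array, one representative atom per residue
    cutoff      : distance threshold in angstroms (assumed value)
    min_seq_sep : minimum sequence separation |i - j| for a pair to count
    """
    xyz = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(xyz[:, None, :] - xyz[None, :, :], axis=2)
    idx = np.arange(len(xyz))
    seq_sep = np.abs(idx[:, None] - idx[None, :])
    return (dist <= cutoff) & (seq_sep >= min_seq_sep)


def node_degrees(adjacency):
    """Number of contacts per residue, a basic local network parameter."""
    return adjacency.sum(axis=1)


if __name__ == "__main__":
    # Toy chain of four residues spaced 3.8 angstroms apart on a line
    # (hypothetical coordinates, not taken from a Lipase A structure).
    chain = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0], [11.4, 0.0, 0.0]]
    adj = contact_network(chain)
    print(node_degrees(adj))
```

    Comparing such adjacency matrices between a wild-type and a mutant structure is one simple way to list contacts gained or lost; community detection and other global parameters would be computed on the same matrix.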

  11. Protein flexibility in the light of structural alphabets

    PubMed Central

    Craveur, Pierrick; Joseph, Agnel P.; Esque, Jeremy; Narwani, Tarun J.; Noël, Floriane; Shinada, Nicolas; Goguet, Matthieu; Leonard, Sylvain; Poulain, Pierre; Bertrand, Olivier; Faure, Guilhem; Rebehmed, Joseph; Ghozlane, Amine; Swapna, Lakshmipuram S.; Bhaskara, Ramachandra M.; Barnoud, Jonathan; Téletchéa, Stéphane; Jallu, Vincent; Cerny, Jiri; Schneider, Bohdan; Etchebest, Catherine; Srinivasan, Narayanaswamy; Gelly, Jean-Christophe; de Brevern, Alexandre G.

    2015-01-01

    Protein structures are valuable tools to understand protein function. Nonetheless, proteins are often considered as rigid macromolecules while their structures exhibit specific flexibility, which is essential to complete their functions. Analyses of protein structures and dynamics are often performed with a simplified three-state description, i.e., the classical secondary structures. More precise and complete description of protein backbone conformation can be obtained using libraries of small protein fragments that are able to approximate every part of protein structures. These libraries, called structural alphabets (SAs), have been widely used in structure analysis field, from definition of ligand binding sites to superimposition of protein structures. SAs are also well suited to analyze the dynamics of protein structures. Here, we review innovative approaches that investigate protein flexibility based on SAs description. Coupled to various sources of experimental data (e.g., B-factor) and computational methodology (e.g., Molecular Dynamic simulation), SAs turn out to be powerful tools to analyze protein dynamics, e.g., to examine allosteric mechanisms in large set of structures in complexes, to identify order/disorder transition. SAs were also shown to be quite efficient to predict protein flexibility from amino-acid sequence. Finally, in this review, we exemplify the interest of SAs for studying flexibility with different cases of proteins implicated in pathologies and diseases. PMID:26075209

  12. Analysis of Structural Features Contributing to Weak Affinities of Ubiquitin/Protein Interactions.

    PubMed

    Cohen, Ariel; Rosenthal, Eran; Shifman, Julia M

    2017-11-10

    Ubiquitin is a small protein that enables one of the most common post-translational modifications, where the whole ubiquitin molecule is attached to various target proteins, forming mono- or polyubiquitin conjugations. As a prototypical multispecific protein, ubiquitin interacts non-covalently with a variety of proteins in the cell, including ubiquitin-modifying enzymes and ubiquitin receptors that recognize signals from ubiquitin-conjugated substrates. To enable recognition of multiple targets and to support fast dissociation from the ubiquitin modifying enzymes, ubiquitin/protein interactions are characterized with low affinities, frequently in the higher μM and lower mM range. To determine how structure encodes low binding affinity of ubiquitin/protein complexes, we analyzed structures of more than a hundred such complexes compiled in the Ubiquitin Structural Relational Database. We calculated various structure-based features of ubiquitin/protein binding interfaces and compared them to the same features of general protein-protein interactions (PPIs) with various functions and generally higher affinities. Our analysis shows that ubiquitin/protein binding interfaces on average do not differ in size and shape complementarity from interfaces of higher-affinity PPIs. However, they contain fewer favorable hydrogen bonds and more unfavorable hydrophobic/charge interactions. We further analyzed how binding interfaces change upon affinity maturation of ubiquitin toward its target proteins. We demonstrate that while different features are improved in different experiments, the majority of the evolved complexes exhibit better shape complementarity and hydrogen bond pattern compared to wild-type complexes. Our analysis helps to understand how low-affinity PPIs have evolved and how they could be converted into high-affinity PPIs. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. Small-angle X-Ray analysis of macromolecular structure: the structure of protein NS2 (NEP) in solution

    NASA Astrophysics Data System (ADS)

    Shtykova, E. V.; Bogacheva, E. N.; Dadinova, L. A.; Jeffries, C. M.; Fedorova, N. V.; Golovko, A. O.; Baratova, L. A.; Batishchev, O. V.

    2017-11-01

    A complex structural analysis of nuclear export protein NS2 (NEP) of influenza virus A has been performed using bioinformatics predictive methods and small-angle X-ray scattering data. The behavior of NEP molecules in a solution (their aggregation, oligomerization, and dissociation, depending on the buffer composition) has been investigated. It was shown that stable associates are formed even in a conventional aqueous salt solution at physiological pH value. For the first time we have managed to get NEP dimers in solution, to analyze their structure, and to compare the models obtained using the method of the molecular tectonics with the spatial protein structure predicted by us using the bioinformatics methods. The results of the study provide a new insight into the structural features of nuclear export protein NS2 (NEP) of the influenza virus A, which is very important for viral infection development.

  14. Rapid analysis of protein backbone resonance assignments using cryogenic probes, a distributed Linux-based computing architecture, and an integrated set of spectral analysis tools.

    PubMed

    Monleón, Daniel; Colson, Kimberly; Moseley, Hunter N B; Anklin, Clemens; Oswald, Robert; Szyperski, Thomas; Montelione, Gaetano T

    2002-01-01

    Rapid data collection, spectral referencing, processing by time domain deconvolution, peak picking and editing, and assignment of NMR spectra are necessary components of any efficient integrated system for protein NMR structure analysis. We have developed a set of software tools designated AutoProc, AutoPeak, and AutoAssign, which function together with the data processing and peak-picking programs NMRPipe and Sparky, to provide an integrated software system for rapid analysis of protein backbone resonance assignments. In this paper we demonstrate that these tools, together with high-sensitivity triple resonance NMR cryoprobes for data collection and a Linux-based computer cluster architecture, can be combined to provide nearly complete backbone resonance assignments and secondary structures (based on chemical shift data) for a 59-residue protein in less than 30 hours of data collection and processing time. In this optimum case of a small protein providing excellent spectra, extensive backbone resonance assignments could also be obtained using less than 6 hours of data collection and processing time. These results demonstrate the feasibility of high throughput triple resonance NMR for determining resonance assignments and secondary structures of small proteins, and the potential for applying NMR in large scale structural proteomics projects.

  15. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.

  16. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

    DOE PAGES

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    2015-11-19

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.

  17. G23D: Online tool for mapping and visualization of genomic variants on 3D protein structures.

    PubMed

    Solomon, Oz; Kunik, Vered; Simon, Amos; Kol, Nitzan; Barel, Ortal; Lev, Atar; Amariglio, Ninette; Somech, Raz; Rechavi, Gidi; Eyal, Eran

    2016-08-26

    Evaluation of the possible implications of genomic variants is an increasingly important task in the current high throughput sequencing era. Structural information however is still not routinely exploited during this evaluation process. The main reasons can be attributed to the partial structural coverage of the human proteome and the lack of tools which conveniently convert genomic positions, which are the frequent output of genomic pipelines, to proteins and structure coordinates. We present G23D, a tool for conversion of human genomic coordinates to protein coordinates and protein structures. G23D allows mapping of genomic positions/variants on evolutionary related (and not only identical) protein three dimensional (3D) structures as well as on theoretical models. By doing so it significantly extends the space of variants for which structural insight is feasible. To facilitate interpretation of the variant consequence, pathogenic variants, functional sites and polymorphism sites are displayed on protein sequence and structure diagrams alongside the input variants. G23D also provides modeling of the mutant structure, analysis of intra-protein contacts and instant access to functional predictions and predictions of thermo-stability changes. G23D is available at http://www.sheba-cancer.org.il/G23D . G23D extends the fraction of variants for which structural analysis is applicable and provides better and faster accessibility for structural data to biologists and geneticists who routinely work with genomic information.

  18. Hekate: Software Suite for the Mass Spectrometric Analysis and Three-Dimensional Visualization of Cross-Linked Protein Samples

    PubMed Central

    2013-01-01

    Chemical cross-linking of proteins combined with mass spectrometry provides an attractive and novel method for the analysis of native protein structures and protein complexes. Analysis of the data however is complex. Only a small number of cross-linked peptides are produced during sample preparation and must be identified against a background of more abundant native peptides. To facilitate the search and identification of cross-linked peptides, we have developed a novel software suite, named Hekate. Hekate is a suite of tools that address the challenges involved in analyzing protein cross-linking experiments when combined with mass spectrometry. The software is an integrated pipeline for the automation of the data analysis workflow and provides a novel scoring system based on principles of linear peptide analysis. In addition, it provides a tool for the visualization of identified cross-links using three-dimensional models, which is particularly useful when combining chemical cross-linking with other structural techniques. Hekate was validated by the comparative analysis of cytochrome c (bovine heart) against previously reported data. Further validation was carried out on known structural elements of DNA polymerase III, the catalytic α-subunit of the Escherichia coli DNA replisome along with new insight into the previously uncharacterized C-terminal domain of the protein. PMID:24010795

  19. Key Structures and Interactions for Binding of Mycobacterium tuberculosis Protein Kinase B Inhibitors from Molecular Dynamics Simulation.

    PubMed

    Punkvang, Auradee; Kamsri, Pharit; Saparpakorn, Patchreenart; Hannongbua, Supa; Wolschann, Peter; Irle, Stephan; Pungpo, Pornpan

    2015-07-01

    Substituted aminopyrimidine inhibitors have recently been introduced as antituberculosis agents. These inhibitors show impressive activity against protein kinase B, a Ser/Thr protein kinase that is essential for cell growth of M. tuberculosis. However, X-ray structures of the protein kinase B enzyme in complex with the substituted aminopyrimidine inhibitors are currently unavailable. Consequently, structural details of their binding modes are uncertain, which hinders the structure-based design of more potent protein kinase B inhibitors. Here, molecular dynamics simulations, in conjunction with molecular mechanics/Poisson-Boltzmann surface area binding free-energy analysis, were employed to gain insight into the complex structures of the protein kinase B inhibitors and their binding energetics. The complex structures obtained by the molecular dynamics simulations show binding free energies in good agreement with experiment. The detailed analysis of molecular dynamics results shows that Glu93, Val95, and Leu17 are key residues responsible for the binding of the protein kinase B inhibitors. The aminopyrazole group and the pyrimidine core are the crucial moieties of substituted aminopyrimidine inhibitors for interaction with the key residues. Our results provide a structural concept that can be used as a guide for the future design of protein kinase B inhibitors with highly increased antagonistic activity. © 2014 John Wiley & Sons A/S.

  20. Computational investigation of the human SOD1 mutant, Cys146Arg, that directs familial amyotrophic lateral sclerosis.

    PubMed

    Srinivasan, E; Rajasekaran, R

    2017-07-25

    The genetic substitution mutation of Cys146Arg in the SOD1 protein is predominantly found in the Japanese population suffering from familial amyotrophic lateral sclerosis (FALS). A complete study of the biophysical aspects of this particular missense mutation through conformational analysis and producing free energy landscapes could provide an insight into the pathogenic mechanism of ALS disease. In this study, we utilized general molecular dynamics simulations along with computational predictions to assess the structural characterization of the protein as well as the conformational preferences of monomeric wild type and mutant SOD1. Our static analysis, accomplished through multiple programs, predicted the deleterious and destabilizing effect of mutant SOD1. Subsequently, comparative molecular dynamic studies performed on the wild type and mutant SOD1 indicated a loss in the protein conformational stability and flexibility. We observed the mutational consequences not only in local but also in long-range variations in the structural properties of the SOD1 protein. Long-range intramolecular protein interactions decrease upon mutation, resulting in less compact structures in the mutant protein rather than in the wild type, suggesting that the mutant structures are less stable than the wild type SOD1. We also presented the free energy landscape to study the collective motion of protein conformations through principal component analysis for the wild type and mutant SOD1. Overall, the study assisted in revealing the cause of the structural destabilization and protein misfolding via structural characterization, secondary structure composition and free energy landscapes. Hence, the computational framework in our study provides a valuable direction for the search for the cure against fatal FALS.

  1. Molecular details of secretory phospholipase A2 from flax (Linum usitatissimum L.) provide insight into its structure and function.

    PubMed

    Gupta, Payal; Dash, Prasanta K

    2017-09-11

    Secretory phospholipases A2 (sPLA2) are low molecular weight proteins (12-18 kDa) involved in a suite of plant cellular processes imparting growth and development. Despite the myriad roles of sPLA2 in physiological and biochemical processes in plants, detailed analysis of sPLA2 in flax/linseed is meagre. The present work, the first in flax, embodies cloning, expression, purification and molecular characterisation of two distinct sPLA2s (I and II) from flax. PLA2 activity of the cloned sPLA2s was biochemically assayed, authenticating them as bona fide phospholipases A2. Physicochemical properties of both sPLA2s revealed that they are thermostable proteins requiring divalent cations for optimum activity. While structural analysis of both proteins revealed deviations in the amino acid sequence at the C- and N-terminal regions, a hydropathic study revealed LusPLA2 I as a hydrophobic protein and LusPLA2 II as a hydrophilic protein. Structural analysis of flax sPLA2s revealed that the secondary structure of both proteins is dominated by α-helix followed by random coils. Modular superimposition of LusPLA2 isoforms with rice sPLA2 confirmed monomeric structural preservation among plant phospholipases A2 and provided insight into the structure of folded flax sPLA2s.

  2. Discovery of an Unexplored Protein Structural Scaffold of Serine Protease from Big Blue Octopus (Octopus cyanea): A New Prospective Lead Molecule.

    PubMed

    Panda, Subhamay; Kumari, Leena

    2017-01-01

    Serine proteases are a group of enzymes that hydrolyze the peptide bonds in proteins. In mammals, these enzymes help regulate several major physiological functions, such as digestion, blood clotting, immune responses, reproductive functions and the complement system. Serine proteases from the venom of the Octopodidae family are a relatively unexplored area of research. In the present work, we applied a comparative composite molecular modeling technique. Our key aim was to propose the first molecular model structure of the unexplored serine protease 5 derived from the big blue octopus. The other objective of this study was to analyze the distribution of negatively and positively charged amino acids over the modeled structure, the distribution of secondary structural elements, the hydrophobicity of the molecular surface and the electrostatic potential with the aid of different bioinformatic tools. The molecular model was generated with the I-TASSER suite, and the refined structural model was then validated with standard methods. For functional annotation of the protein we used the Protein Information Resource (PIR) database. Serine protease 5 of the big blue octopus was analyzed with different bioinformatic algorithms for the properties listed above. The functionally critical amino acids and ligand-binding site (LBS) of the modeled protein were determined using the COACH program. The molecular model data, together with other pertinent post-modeling analyses, provide molecular insight into the proteolytic activity of serine protease 5, which helps in understanding the procoagulant and anticoagulant characteristics of this natural lead molecule. Our approach was to investigate the octopus venom protein, as a whole or in part, as a possible basis for developing new lead molecules.
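
    The charged-residue analysis mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' tool chain: the sequence is a made-up placeholder, the function name is hypothetical, and histidine is left out of the positive set as a simplifying assumption.

```python
# Residues conventionally treated as charged at neutral pH.
# Assumption: histidine is ignored because its charge depends strongly on pH.
NEGATIVE = set("DE")  # aspartate, glutamate
POSITIVE = set("KR")  # lysine, arginine


def charge_summary(seq):
    """Count negatively and positively charged residues in a one-letter sequence."""
    neg = sum(1 for r in seq if r in NEGATIVE)
    pos = sum(1 for r in seq if r in POSITIVE)
    return {"negative": neg, "positive": pos, "net": pos - neg}


# Hypothetical sequence, not the octopus serine protease 5.
seq = "MKDERKAAD"
summary = charge_summary(seq)
```

    Mapping these counts onto a 3D model would additionally require residue coordinates, which this sketch does not cover.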

  3. Prediction of protein structural classes by Chou's pseudo amino acid composition: approached using continuous wavelet transform and principal component analysis.

    PubMed

    Li, Zhan-Chao; Zhou, Xi-Bin; Dai, Zong; Zou, Xiao-Yong

    2009-07-01

    Prior knowledge of a protein's structural class can provide useful information about its overall structure, so quick and accurate computational determination of protein structural class is very important in protein science. One key to such computational methods is accurate representation of the protein sample. Here, based on the concept of Chou's pseudo-amino acid composition (AAC, Chou, Proteins: structure, function, and genetics, 43:246-255, 2001), a novel method of feature extraction that combines continuous wavelet transform (CWT) with principal component analysis (PCA) was introduced for the prediction of protein structural classes. Firstly, a digital signal was obtained by mapping each amino acid according to various physicochemical properties. Secondly, CWT was used to extract a new feature vector based on the wavelet power spectrum (WPS), which contains more abundant sequence-order information in the frequency and time domains, and PCA was then used to reorganize the feature vector to decrease information redundancy and computational complexity. Finally, a pseudo-amino acid composition feature vector was formed to represent the primary sequence by coupling the AAC vector with a set of new WPS feature vectors in an orthogonal space obtained by PCA. As a showcase, the rigorous jackknife cross-validation test was performed on the working datasets. The results indicated that prediction quality was improved, and the current approach to protein representation may serve as a useful complementary vehicle in classifying other attributes of proteins, such as enzyme family class, subcellular localization, membrane protein types and protein secondary structure.
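
    The first step described above (an amino acid composition vector plus a numeric signal derived from a physicochemical property) might look roughly like the sketch below. The Kyte-Doolittle hydropathy scale is used only as one example property and is an assumption, as are the function names and the sequence. The CWT and PCA steps are not shown.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, chosen here as one example property.
# The abstract does not say which properties the authors used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aac_vector(seq):
    """Return the 20-dimensional amino acid composition (residue frequencies)."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def to_signal(seq, scale=KD):
    """Map each residue to its property value, giving a numeric signal."""
    return [scale[r] for r in seq]


seq = "MKVLAAGIVG"  # hypothetical sequence for illustration
aac = aac_vector(seq)
signal = to_signal(seq)
```

    A wavelet transform would then be applied to `signal`, and the resulting features reduced with PCA before being combined with `aac`.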

  4. Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone.

    PubMed

    Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

    2016-12-27

    Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.

  5. Quantitative Protein Topography Analysis and High-Resolution Structure Prediction Using Hydroxyl Radical Labeling and Tandem-Ion Mass Spectrometry (MS)

    PubMed Central

    Kaur, Parminder; Kiselar, Janna; Yang, Sichun; Chance, Mark R.

    2015-01-01

    Hydroxyl radical footprinting based MS for protein structure assessment has the goal of understanding ligand induced conformational changes and macromolecular interactions, for example, protein tertiary and quaternary structure, but the structural resolution provided by typical peptide-level quantification is limiting. In this work, we present experimental strategies using tandem-MS fragmentation to increase the spatial resolution of the technique to the single residue level to provide a high precision tool for molecular biophysics research. Overall, in this study we demonstrated an eightfold increase in structural resolution compared with peptide level assessments. In addition, to provide a quantitative analysis of residue based solvent accessibility and protein topography as a basis for high-resolution structure prediction, we illustrate strategies of data transformation using the relative reactivity of side chains as a normalization strategy and predict side-chain surface area from the footprinting data. We tested the methods by examination of Ca2+-calmodulin showing highly significant correlations between surface area and side-chain contact predictions for individual side chains and the crystal structure. Tandem ion based hydroxyl radical footprinting-MS provides quantitative high-resolution protein topology information in solution that can fill existing gaps in structure determination for large proteins and macromolecular complexes. PMID:25687570

  6. 3Drefine: an interactive web server for efficient protein structure refinement

    PubMed Central

    Bhattacharya, Debswapna; Nowotny, Jackson; Cao, Renzhi; Cheng, Jianlin

    2016-01-01

    3Drefine is an interactive web server for consistent and computationally efficient protein structure refinement with the capability to perform web-based statistical and visual analysis. The 3Drefine refinement protocol utilizes iterative optimization of the hydrogen bonding network combined with atomic-level energy minimization on the optimized model using composite physics- and knowledge-based force fields for efficient protein structure refinement. The method has been extensively evaluated on blind CASP experiments as well as on large-scale and diverse benchmark datasets and exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine web server allows for convenient protein structure refinement through a text or file input submission, email notification, and a provided example submission, and is freely available without any registration requirement. The server also provides comprehensive analysis of submissions through various energy and statistical feedback and interactive visualization of multiple refined models through the JSmol applet, which is equipped with numerous protein model analysis tools. The web server has been extensively tested and used by many users. As a result, the 3Drefine web server conveniently provides a useful tool easily accessible to the community. The 3Drefine web server has been made publicly available at the URL: http://sysbio.rnet.missouri.edu/3Drefine/. PMID:27131371

  7. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis

    PubMed Central

    Du, Yushen; Wu, Nicholas C.; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting

    2016-01-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. PMID:27803181

  8. Hydrophobic cluster analysis of G protein-coupled receptors: a powerful tool to derive structural and functional information from 2D-representation of protein sequences.

    PubMed

    Lentes, K U; Mathieu, E; Bischoff, R; Rasmussen, U B; Pavirani, A

    1993-01-01

    Current methods for comparative analyses of protein sequences are 1D-alignments of amino acid sequences based on the maximization of amino acid identity (homology) and the prediction of secondary structure elements. This method has a major drawback once the amino acid identity drops below 20-25%, since maximization of a homology score does not take into account any structural information. A new technique called Hydrophobic Cluster Analysis (HCA) has been developed by Lemesle-Varloot et al. (Biochimie 72, 555-574, 1990). This consists of comparing several sequences simultaneously and combining homology detection with secondary structure analysis. HCA is primarily based on the detection and comparison of structural segments constituting the hydrophobic core of globular protein domains, with or without transmembrane domains. We have applied HCA to the analysis of different families of G-protein coupled receptors, such as catecholamine receptors as well as peptide hormone receptors. Using HCA, the thrombin receptor, a new and as yet unique member of the family of G-protein coupled receptors, can be clearly classified as being closely related to the family of neuropeptide receptors rather than to the catecholamine receptors, for which the shape of the hydrophobic clusters and the length of their third cytoplasmic loop are very different. Furthermore, the potential of HCA to predict relationships between new putative and already characterized members of this family of receptors will be presented.

  9. Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS

    PubMed Central

    Li, Bi-Qing; Feng, Kai-Yan; Chen, Lei; Huang, Tao; Cai, Yu-Dong

    2012-01-01

    Prediction of protein-protein interaction (PPI) sites is one of the most challenging problems in computational biology. Although great progress has been made by employing various machine learning approaches with numerous characteristic features, the problem is still far from being solved. In this study, we developed a novel predictor based on Random Forest (RF) algorithm with the Minimum Redundancy Maximal Relevance (mRMR) method followed by incremental feature selection (IFS). We incorporated features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure and solvent accessibility. We also included five 3D structural features to predict protein-protein interaction sites and achieved an overall accuracy of 0.672997 and MCC of 0.347977. Feature analysis showed that 3D structural features such as Depth Index (DPX) and surface curvature (SC) contributed most to the prediction of protein-protein interaction sites. It was also shown via site-specific feature analysis that the features of individual residues from PPI sites contribute most to the determination of protein-protein interaction sites. It is anticipated that our prediction method will become a useful tool for identifying PPI sites, and that the feature analysis described in this paper will provide useful insights into the mechanisms of interaction. PMID:22937126
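
    The two figures of merit quoted in this abstract, overall accuracy and the Matthews correlation coefficient (MCC), are both computed from a binary confusion matrix. A minimal sketch follows; the function names are illustrative, the counts are invented, and nothing here reproduces the authors' data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Invented counts: a perfect predictor and a chance-level predictor.
perfect = (5, 5, 0, 0)
chance = (1, 1, 1, 1)
```

    MCC ranges from -1 to 1 and stays informative when interface and non-interface residues are unbalanced, which is a common reason for reporting it alongside accuracy.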

  10. Structure-Functional Prediction and Analysis of Cancer Mutation Effects in Protein Kinases

    PubMed Central

    Dixit, Anshuman; Verkhivker, Gennady M.

    2014-01-01

    A central goal of cancer research is to discover and characterize the functional effects of mutated genes that contribute to tumorigenesis. In this study, we provide a detailed structural classification and analysis of functional dynamics for members of protein kinase families that are known to harbor cancer mutations. We also present a systematic computational analysis that combines sequence and structure-based prediction models to characterize the effect of cancer mutations in protein kinases. We focus on the differential effects of activating point mutations that increase protein kinase activity and kinase-inactivating mutations that decrease activity. Mapping of cancer mutations onto the conformational mobility profiles of known crystal structures demonstrated that activating mutations could reduce a steric barrier for the movement from the basal “low” activity state to the “active” state. According to our analysis, the mechanism of activating mutations reflects a combined effect of partial destabilization of the kinase in its inactive state and a concomitant stabilization of its active-like form, which is likely to drive tumorigenesis at some level. Ultimately, the analysis of the evolutionary and structural features of the major cancer-causing mutational hotspot in kinases can also aid in the correlation of kinase mutation effects with clinical outcomes. PMID:24817905

  11. Comprehensive comparative analysis and identification of RNA-binding protein domains: multi-class classification and feature selection.

    PubMed

    Jahandideh, Samad; Srinivasasainagendra, Vinodh; Zhi, Degui

    2012-11-07

    RNA-protein interaction plays an important role in various cellular processes, such as protein synthesis, gene regulation, post-transcriptional gene regulation, alternative splicing, and infections by RNA viruses. In this study, using the Gene Ontology Annotated (GOA) and Structural Classification of Proteins (SCOP) databases, an automatic procedure was designed to capture structurally solved RNA-binding protein domains in different subclasses. Subsequently, we applied tuned multi-class SVM (TMCSVM), Random Forest (RF), and multi-class ℓ1/ℓq-regularized logistic regression (MCRLR) for analyzing and classifying RNA-binding protein domains based on a comprehensive set of sequence and structural features. In this study, we compared the prediction accuracy of three different state-of-the-art predictor methods. In our results, TMCSVM outperforms the other methods, which suggests the potential of TMCSVM as a useful tool for facilitating the multi-class prediction of RNA-binding protein domains. On the other hand, MCRLR, by elucidating the importance of features for their contribution to the predictive accuracy of RNA-binding protein domain subclasses, helps us to provide some biological insights into the roles of sequences and structures in protein-RNA interactions.

  12. Identifying protein domains by global analysis of soluble fragment data.

    PubMed

    Bulloch, Esther M M; Kingston, Richard L

    2014-11-15

    The production and analysis of individual structural domains is a common strategy for studying large or complex proteins, which may be experimentally intractable in their full-length form. However, identifying domain boundaries is challenging if there is little structural information concerning the protein target. One experimental procedure for mapping domains is to screen a library of random protein fragments for solubility, since truncation of a domain will typically expose hydrophobic groups, leading to poor fragment solubility. We have coupled fragment solubility screening with global data analysis to develop an effective method for identifying structural domains within a protein. A gene fragment library is generated using mechanical shearing, or by uracil doping of the gene and a uracil-specific enzymatic digest. A split green fluorescent protein (GFP) assay is used to screen the corresponding protein fragments for solubility when expressed in Escherichia coli. The soluble fragment data are then analyzed using two complementary approaches. Fragmentation "hotspots" indicate possible interdomain regions. Clustering algorithms are used to group related fragments, and concomitantly predict domain location. The effectiveness of this Domain Seeking procedure is demonstrated by application to the well-characterized human protein p85α.
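
    One simple way to picture the "hotspot" idea is to count how often each residue position occurs as the start or end of a soluble fragment and flag positions that recur. The sketch below is a simplified reading of the abstract, not the published Domain Seeking algorithm. The fragment coordinates, the function names and the threshold are all hypothetical.

```python
from collections import Counter


def boundary_counts(fragments):
    """Count how often each residue position is a fragment start or end."""
    counts = Counter()
    for start, end in fragments:
        counts[start] += 1
        counts[end] += 1
    return counts


def hotspots(fragments, min_count):
    """Return positions that occur as a fragment boundary at least min_count times."""
    counts = boundary_counts(fragments)
    return sorted(pos for pos, c in counts.items() if c >= min_count)


# Hypothetical soluble fragments as (first residue, last residue) pairs.
fragments = [(1, 100), (5, 100), (101, 220), (100, 230), (1, 98)]
candidate_boundaries = hotspots(fragments, min_count=2)
```

    A real analysis would likely smooth the counts over a window of neighbouring positions, since boundaries from random fragmentation rarely coincide exactly.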

  13. Predicted secondary structure similarity in the absence of primary amino acid sequence homology: hepatitis B virus open reading frames.

    PubMed Central

    Schaeffer, E; Sninsky, J J

    1984-01-01

    Proteins that are related evolutionarily may have diverged at the level of primary amino acid sequence while maintaining similar secondary structures. Computer analysis has been used to compare the open reading frames of the hepatitis B virus to those of the woodchuck hepatitis virus at the level of amino acid sequence, and to predict the relative hydrophilic character and the secondary structure of putative polypeptides. Similarity is seen at the levels of relative hydrophilicity and secondary structure, in the absence of sequence homology. These data reinforce the proposal that these open reading frames encode viral proteins. Computer analysis of this type can be more generally used to establish structural similarities between proteins that do not share obvious sequence homology as well as to assess whether an open reading frame is fortuitous or codes for a protein. PMID:6585835

  14. Crystal structure of Bacillus subtilis YdaF protein: a putative ribosomal N-acetyltransferase.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brunzelle, J. S.; Wu, R.; Korolev, S. V.

    2004-12-01

    Comparative sequence analysis suggests that the ydaF gene encodes a protein (YdaF) that functions as an N-acetyltransferase, more specifically, a ribosomal N-acetyltransferase. Sequence analysis using basic local alignment search tool (BLAST) suggests that YdaF belongs to a large family of proteins (199 proteins found in 88 unique species of bacteria, archaea, and eukaryotes). YdaF also belongs to the COG1670, which includes the Escherichia coli RimL protein that is known to acetylate ribosomal protein L12. N-acetylation (NAT) has been found in all kingdoms. NAT enzymes catalyze the transfer of an acetyl group from acetyl-CoA (AcCoA) to a primary amino group. For example, NATs can acetylate the N-terminal α-amino group, the ε-amino group of lysine residues, aminoglycoside antibiotics, spermine/spermidine, or arylalkylamines such as serotonin. The crystal structure of the alleged ribosomal NAT protein, YdaF, from Bacillus subtilis presented here was determined as a part of the Midwest Center for Structural Genomics. The structure maintains the conserved tertiary structure of other known NATs and a high sequence similarity in the presumed AcCoA binding pocket in spite of a very low overall level of sequence identity to other NATs of known structure.

  15. Network representation of protein interactions: Theory of graph description and analysis.

    PubMed

    Kurzbach, Dennis

    2016-09-01

    A methodological framework is presented for the graph theoretical interpretation of NMR data of protein interactions. The proposed analysis generalizes the idea of network representations of protein structures by expanding it to protein interactions. This approach is based on regularization of residue-resolved NMR relaxation times and chemical shift data and subsequent construction of an adjacency matrix that represents the underlying protein interaction as a graph or network. The network nodes represent protein residues. Two nodes are connected if two residues are functionally correlated during the protein interaction event. The analysis of the resulting network enables the quantification of the importance of each amino acid of a protein for its interactions. Furthermore, the determination of the pattern of correlations between residues yields insights into the functional architecture of an interaction. This is of special interest for intrinsically disordered proteins, since the structural (three-dimensional) architecture of these proteins and their complexes is difficult to determine. The power of the proposed methodology is demonstrated using the example of the interaction between the intrinsically disordered protein osteopontin and its natural ligand heparin.
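
    The graph construction described above can be sketched as follows, under stated assumptions: a residue-by-residue correlation matrix is already available, two residues count as "functionally correlated" when the absolute correlation reaches a cutoff, and node degree stands in for residue importance. The cutoff, the degree measure, the function names and the matrix values are illustrative choices, not details taken from the paper.

```python
def adjacency(corr, threshold):
    """Build a 0/1 adjacency matrix from a symmetric correlation matrix.

    Residues i and j are connected when |corr[i][j]| >= threshold (i != j).
    """
    n = len(corr)
    return [
        [1 if i != j and abs(corr[i][j]) >= threshold else 0 for j in range(n)]
        for i in range(n)
    ]


def degree(adj):
    """Number of neighbours of each node, a simple importance measure."""
    return [sum(row) for row in adj]


# Hypothetical correlations between three residues.
corr = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.8],
    [0.2, 0.8, 1.0],
]
adj = adjacency(corr, threshold=0.7)
deg = degree(adj)
```

    Here the middle residue has the highest degree, so under this simple measure it would rank as the most important node for the interaction.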

  16. ProTSAV: A protein tertiary structure analysis and validation server.

    PubMed

    Singh, Ankita; Kaushik, Rahul; Mishra, Avinash; Shanker, Asheesh; Jayaram, B

    2016-01-01

    Quality assessment of predicted model structures of proteins is as important as the protein tertiary structure prediction. A highly efficient quality assessment of predicted model structures directs further research on function. Here we present a new server, ProTSAV, capable of evaluating predicted model structures based on some popular online servers and standalone tools. ProTSAV furnishes the user with a single quality score for an individual protein structure, along with a graphical representation and ranking when multiple protein structures are assessed. The server is validated on ~64,446 protein structures including experimental structures from RCSB and predicted model structures for CASP targets and from public decoy sets. ProTSAV succeeds in predicting quality of protein structures with a specificity of 100% and a sensitivity of 98% on experimentally solved structures and achieves a specificity of 88% and a sensitivity of 91% on predicted protein structures of CASP11 targets under 2 Å. The server overcomes the limitations of any single server/method and is seen to be robust in helping in quality assessment. ProTSAV is freely available at http://www.scfbio-iitd.res.in/software/proteomics/protsav.jsp.

  17. Protein Folding—How and Why: By Hydrogen Exchange, Fragment Separation, and Mass Spectrometry

    PubMed Central

    Englander, S. Walter; Mayne, Leland; Kan, Zhong-Yuan; Hu, Wenbing

    2017-01-01

    Advanced hydrogen exchange (HX) methodology can now determine the structure of protein folding intermediates and their progression in folding pathways. Key developments over time include the HX pulse labeling method with nuclear magnetic resonance analysis, development of the fragment separation method, the addition to it of mass spectrometric (MS) analysis, and recent improvements in the HX MS technique and data analysis. Also, the discovery of protein foldons and their role supplies an essential interpretive link. Recent work using HX pulse labeling with HX MS analysis finds that a number of proteins fold by stepping through a reproducible sequence of native-like intermediates in an ordered pathway. The stepwise nature of the pathway is dictated by the cooperative foldon unit construction of the protein. The pathway order is determined by a sequential stabilization principle; prior native-like structure guides the formation of adjacent native-like structure. This view does not match the funneled energy landscape paradigm of a very large number of folding tracks, which was framed before foldons were known. PMID:27145881

  18. Sequence-structure relationship study in all-α transmembrane proteins using an unsupervised learning approach.

    PubMed

    Esque, Jérémy; Urbain, Aurélie; Etchebest, Catherine; de Brevern, Alexandre G

    2015-11-01

    Transmembrane proteins (TMPs) are major drug targets, but the knowledge of their precise topology structure remains highly limited compared with globular proteins. In spite of the difficulties in obtaining their structures, an important effort has been made in recent years to increase their number from an experimental and computational point of view. In view of this emerging challenge, the development of computational methods to extract knowledge from these data is crucial for the better understanding of their functions and in improving the quality of structural models. Here, we revisit an efficient unsupervised learning procedure, called Hybrid Protein Model (HPM), which is applied to the analysis of transmembrane proteins belonging to the all-α structural class. The HPM method is an original classification procedure that efficiently combines sequence and structure learning. The procedure was initially applied to the analysis of globular proteins. In the present case, HPM classifies a set of overlapping protein fragments, extracted from a non-redundant databank of TMP 3D structures. After fine-tuning of the learning parameters, the optimal classification results in 65 clusters. They represent at best similar relationships between sequence and local structure properties of TMPs. Interestingly, HPM distinguishes among the resulting clusters two helical regions with distinct hydrophobic patterns. This underlines the complexity of the topology of these proteins. The HPM classification highlights unusual relationships between amino acids in TMP fragments, which can be useful for elaborating new amino acid substitution matrices. Finally, two challenging applications are described: the first one aims at annotating protein functions (channel or not), the second one intends to assess the quality of the structures (X-ray or models) via a new scoring function deduced from the HPM classification.

  19. Raman Spectroscopy Adds Complementary Detail to the High-Resolution X-Ray Crystal Structure of Photosynthetic PsbP from Spinacia oleracea

    PubMed Central

    Lapkouski, Mikalai; Hofbauerova, Katerina; Sovova, Zofie; Ettrichova, Olga; González-Pérez, Sergio; Dulebo, Alexander; Kaftan, David; Kuta Smatanova, Ivana; Revuelta, Jose L.; Arellano, Juan B.; Carey, Jannette; Ettrich, Rüdiger

    2012-01-01

    Raman microscopy permits structural analysis of protein crystals in situ in hanging drops, allowing for comparison with Raman measurements in solution. Nevertheless, the two methods sometimes reveal subtle differences in structure that are often ascribed to the water layer surrounding the protein. The novel method of drop-coating deposition Raman spectroscopy (DCDR) exploits an intermediate phase that, although nominally “dry,” has been shown to preserve protein structural features present in solution. The potential of this new approach to bridge the structural gap between proteins in solution and in crystals is explored here with extrinsic protein PsbP of photosystem II from Spinacia oleracea. In the high-resolution (1.98 Å) x-ray crystal structure of PsbP reported here, several segments of the protein chain are present but unresolved. Analysis of the three kinds of Raman spectra of PsbP suggests that most of the subtle differences can indeed be attributed to the water envelope, which is shown here to have a similar Raman intensity in glassy and crystal states. Using molecular dynamics simulations cross-validated by Raman solution data, two unresolved segments of the PsbP crystal structure were modeled as loops, and the amino terminus was inferred to contain an additional beta segment. The complete PsbP structure was compared with that of the PsbP-like protein CyanoP, which plays a more peripheral role in photosystem II function. The comparison suggests possible interaction surfaces of PsbP with higher-plant photosystem II. This work provides the first complete structural picture of this key protein, and it represents the first systematic comparison of Raman data from solution, glassy, and crystalline states of a protein. PMID:23071614

  20. Lessons on RNA Silencing Mechanisms in Plants from Eukaryotic Argonaute Structures

    PubMed Central

    Poulsen, Christian; Vaucheret, Hervé; Brodersen, Peter

    2013-01-01

    RNA silencing refers to a collection of gene regulatory mechanisms that use small RNAs for sequence specific repression. These mechanisms rely on ARGONAUTE (AGO) proteins that directly bind small RNAs and thereby constitute the central component of the RNA-induced silencing complex (RISC). AGO protein function has been probed extensively by mutational analyses, particularly in plants where large allelic series of several AGO proteins have been isolated. Structures of entire human and yeast AGO proteins have only very recently been obtained, and they allow more precise analyses of functional consequences of mutations obtained by forward genetics. To a large extent, these analyses support current models of regions of particular functional importance of AGO proteins. Interestingly, they also identify previously unrecognized parts of AGO proteins with profound structural and functional importance and provide the first hints at structural elements that have important functions specific to individual AGO family members. A particularly important outcome of the analysis concerns the evidence for existence of Gly-Trp (GW) repeat interactors of AGO proteins acting in the plant microRNA pathway. The parallel analysis of AGO structures and plant AGO mutations also suggests that such interactions with GW proteins may be a determinant of whether an endonucleolytically competent RISC is formed. PMID:23303917

  1. Lessons on RNA silencing mechanisms in plants from eukaryotic argonaute structures.

    PubMed

    Poulsen, Christian; Vaucheret, Hervé; Brodersen, Peter

    2013-01-01

    RNA silencing refers to a collection of gene regulatory mechanisms that use small RNAs for sequence specific repression. These mechanisms rely on ARGONAUTE (AGO) proteins that directly bind small RNAs and thereby constitute the central component of the RNA-induced silencing complex (RISC). AGO protein function has been probed extensively by mutational analyses, particularly in plants where large allelic series of several AGO proteins have been isolated. Structures of entire human and yeast AGO proteins have only very recently been obtained, and they allow more precise analyses of functional consequences of mutations obtained by forward genetics. To a large extent, these analyses support current models of regions of particular functional importance of AGO proteins. Interestingly, they also identify previously unrecognized parts of AGO proteins with profound structural and functional importance and provide the first hints at structural elements that have important functions specific to individual AGO family members. A particularly important outcome of the analysis concerns the evidence for existence of Gly-Trp (GW) repeat interactors of AGO proteins acting in the plant microRNA pathway. The parallel analysis of AGO structures and plant AGO mutations also suggests that such interactions with GW proteins may be a determinant of whether an endonucleolytically competent RISC is formed.

  2. Toward structure prediction of cyclic peptides.

    PubMed

    Yu, Hongtao; Lin, Yu-Shan

    2015-02-14

    Cyclic peptides are a promising class of molecules that can be used to target specific protein-protein interactions. A computational method to accurately predict their structures would substantially advance the development of cyclic peptides as modulators of protein-protein interactions. Here, we develop a computational method that integrates bias-exchange metadynamics simulations, a Boltzmann reweighting scheme, dihedral principal component analysis and a modified density peak-based cluster analysis to provide a converged structural description for cyclic peptides. Using this method, we evaluate the performance of a number of popular protein force fields on a model cyclic peptide. All the tested force fields seem to over-stabilize the α-helix and PPII/β regions in the Ramachandran plot, commonly populated by linear peptides and proteins. Our findings suggest that re-parameterization of a force field that well describes the full Ramachandran plot is necessary to accurately model cyclic peptides.

  3. KFC Server: interactive forecasting of protein interaction hot spots.

    PubMed

    Darnell, Steven J; LeGault, Laura; Mitchell, Julie C

    2008-07-01

    The KFC Server is a web-based implementation of the KFC (Knowledge-based FADE and Contacts) model, a machine learning approach for the prediction of binding hot spots, or the subset of residues that account for most of a protein interface's binding free energy. The server facilitates the automated analysis of a user submitted protein-protein or protein-DNA interface and the visualization of its hot spot predictions. For each residue in the interface, the KFC Server characterizes its local structural environment, compares that environment to the environments of experimentally determined hot spots and predicts if the interface residue is a hot spot. After the computational analysis, the user can visualize the results using an interactive job viewer able to quickly highlight predicted hot spots and surrounding structural features within the protein structure. The KFC Server is accessible at http://kfc.mitchell-lab.org.

  4. Text Mining Improves Prediction of Protein Functional Sites

    PubMed Central

    Cohn, Judith D.; Ravikumar, Komandur E.

    2012-01-01

    We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions. PMID:22393388

  5. Right- and left-handed three-helix proteins. I. Experimental and simulation analysis of differences in folding and structure.

    PubMed

    Glyakina, Anna V; Pereyaslavets, Leonid B; Galzitskaya, Oxana V

    2013-09-01

    Despite the large number of publications on three-helix protein folding, there is no study devoted to the influence of handedness on the rate of three-helix protein folding. From the experimental studies, we make a conclusion that the left-handed three-helix proteins fold faster than the right-handed ones. What may explain this difference? An important question arising in this paper is whether the modeling of protein folding can catch the difference between the protein folding rates of proteins with similar structures but with different folding mechanisms. To answer this question, the folding of eight three-helix proteins (four right-handed and four left-handed), which are similar in size, was modeled using the Monte Carlo and dynamic programming methods. The studies allowed us to determine the orders of folding of the secondary-structure elements in these domains and amino acid residues which are important for the folding. The obtained data are in good correlation with each other and with the experimental data. Structural analysis of these proteins demonstrated that the left-handed domains have a lesser number of contacts per residue and a smaller radius of cross section than the right-handed domains. This may be one of the explanations of the observed fact. The same tendency is observed for the large dataset consisting of 332 three-helix proteins (238 right- and 94 left-handed). From our analysis, we found that the left-handed three-helix proteins have some less-dense packing that should result in faster folding for some proteins as compared to the case of right-handed proteins. Copyright © 2013 Wiley Periodicals, Inc.

  6. Structure and orientation of interfacial proteins determined by sum frequency generation vibrational spectroscopy: method and application.

    PubMed

    Ye, Shuji; Wei, Feng; Li, Hongchun; Tian, Kangzhen; Luo, Yi

    2013-01-01

    In situ and real-time characterization of molecular structures and orientation of proteins at interfaces is essential to understand the nature of interfacial protein interaction. Such work will undoubtedly provide important clues to control biointerface in a desired manner. Sum frequency generation vibrational spectroscopy (SFG-VS) has been demonstrated to be a powerful technique to study the interfacial structures and interactions at the molecular level. This paper first systematically introduces the methods for the calculation of the Raman polarizability tensor, infrared transition dipole moment, and SFG molecular hyperpolarizability tensor elements of proteins/peptides with the secondary structures of α-helix, 3₁₀-helix, antiparallel β-sheet, and parallel β-sheet, as well as the methodology to determine the orientation of interfacial protein secondary structures using SFG amide I spectra. Recent progress on the determination of protein structure and orientation at different interfaces by SFG-VS is then reviewed, which provides a molecular-level understanding of the structures and interactions of interfacial proteins, especially the nature of the driving force behind such interactions. Although this review focuses on analysis of amide I spectra, it is expected to offer a basic idea for the spectral analysis of amide III SFG signals and other complicated molecular systems such as RNA and DNA. Copyright © 2013 Elsevier Inc. All rights reserved.

  7. Vaccinia Virus Immunomodulator A46: A Lipid and Protein-Binding Scaffold for Sequestering Host TIR-Domain Proteins

    PubMed Central

    Radakovics, Katharina; Smith, Terry K.; Bobik, Nina; Round, Adam; Djinović-Carugo, Kristina; Usón, Isabel

    2016-01-01

    Vaccinia virus interferes with early events of the activation pathway of the transcriptional factor NF-κB by binding to numerous host TIR-domain containing adaptor proteins. We have previously determined the X-ray structure of the A46 C-terminal domain; however, the structure and function of the A46 N-terminal domain and its relationship to the C-terminal domain have remained unclear. Here, we biophysically characterize residues 1–83 of the N-terminal domain of A46 and present the X-ray structure at 1.55 Å. Crystallographic phases were obtained by a recently developed ab initio method entitled ARCIMBOLDO_BORGES that employs tertiary structure libraries extracted from the Protein Data Bank; data analysis revealed an all β-sheet structure. This is the first such structure solved by this method which should be applicable to any protein composed entirely of β-sheets. The A46(1–83) structure itself is a β-sandwich containing a co-purified molecule of myristic acid inside a hydrophobic pocket and represents a previously unknown lipid-binding fold. Mass spectrometry analysis confirmed the presence of long-chain fatty acids in both N-terminal and full-length A46; mutation of the hydrophobic pocket reduced the lipid content. Using a combination of high resolution X-ray structures of the N- and C-terminal domains and SAXS analysis of full-length protein A46(1–240), we present here a structural model of A46 in a tetrameric assembly. Integrating affinity measurements and structural data, we propose how A46 simultaneously interferes with several TIR-domain containing proteins to inhibit NF-κB activation and postulate that A46 employs a bipartite binding arrangement to sequester the host immune adaptors TRAM and MyD88. PMID:27973613

  8. Protein classification using sequential pattern mining.

    PubMed

    Exarchos, Themis P; Papaloukas, Costas; Lampros, Christos; Fotiadis, Dimitrios I

    2006-01-01

    Protein classification in terms of fold recognition can be employed to determine the structural and functional properties of a newly discovered protein. In this work sequential pattern mining (SPM) is utilized for sequence-based fold recognition. One of the most efficient SPM algorithms, cSPADE, is employed for protein primary structure analysis. Then a classifier uses the extracted sequential patterns for classifying proteins of unknown structure in the appropriate fold category. The proposed methodology exhibited an overall accuracy of 36% in a multi-class problem of 17 candidate categories. The classification performance reaches up to 65% when the three most probable protein folds are considered.
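
    The two figures quoted above (36% overall, up to 65% when the three most probable folds are considered) correspond to what is usually called top-1 and top-k accuracy. A minimal sketch of that metric follows; it is not the authors' code, the function name `top_k_accuracy` is hypothetical, and the fold labels are invented toy data rather than results from the paper.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of samples whose true label is among the k highest-ranked predictions."""
    if len(ranked_predictions) != len(true_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy data (assumption): each inner list ranks hypothetical fold labels,
# most probable first, for one query protein.
ranked = [
    ["a.1", "b.1", "c.1"],
    ["b.1", "a.1", "c.1"],
    ["c.1", "b.1", "a.1"],
    ["b.1", "c.1", "a.1"],
]
truth = ["a.1", "a.1", "a.1", "c.1"]

top1 = top_k_accuracy(ranked, truth, k=1)
top3 = top_k_accuracy(ranked, truth, k=3)
```

    Widening k can only keep or raise the score, which is why the top-3 figure in the abstract is higher than the overall accuracy.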

  9. Prediction of protein structural classes by recurrence quantification analysis based on chaos game representation.

    PubMed

    Yang, Jian-Yi; Peng, Zhen-Ling; Yu, Zu-Guo; Zhang, Rui-Jie; Anh, Vo; Wang, Desheng

    2009-04-21

    In this paper, we intend to predict protein structural classes (alpha, beta, alpha+beta, or alpha/beta) for low-homology data sets. Two data sets were used widely, 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins) with sequence homology being 40% and 25%, respectively. We propose to decompose the chaos game representation of proteins into two kinds of time series. Then, a novel and powerful nonlinear analysis technique, recurrence quantification analysis (RQA), is applied to analyze these time series. For a given protein sequence, a total of 16 characteristic parameters can be calculated with RQA, which are treated as feature representation of protein sequences. Based on such feature representation, the structural class for each protein is predicted with Fisher's linear discriminant algorithm. The jackknife test is used to test and compare our method with other existing methods. The overall accuracies with step-by-step procedure are 65.8% and 64.2% for 1189 and 25PDB data sets, respectively. With one-against-others procedure used widely, we compare our method with five other existing methods. Especially, the overall accuracies of our method are 6.3% and 4.1% higher for the two data sets, respectively. Furthermore, only 16 parameters are used in our method, which is less than that used by other methods. This suggests that the current method may play a complementary role to the existing methods and is promising to perform the prediction of protein structural classes.

  10. Energetically Unfavorable Amide Conformations for N6-Acetyllysine Side Chains in Refined Protein Structures

    PubMed Central

    Genshaft, Alexander; Moser, Joe-Ann S.; D'Antonio, Edward L.; Bowman, Christine M.; Christianson, David W.

    2013-01-01

    The reversible acetylation of lysine to form N6-acetyllysine in the regulation of protein function is a hallmark of epigenetics. Acetylation of the positively charged amino group of the lysine side chain generates a neutral N-alkylacetamide moiety that serves as a molecular “switch” for the modulation of protein function and protein-protein interactions. We now report the analysis of 381 N6-acetyllysine side chain amide conformations as found in 79 protein crystal structures and 11 protein NMR structures deposited in the Protein Data Bank (PDB) of the Research Collaboratory for Structural Bioinformatics. We find that only 74.3% of N6-acetyllysine residues in protein crystal structures and 46.5% in protein NMR structures contain amide groups with energetically preferred trans or generously trans conformations. Surprisingly, 17.6% of N6-acetyllysine residues in protein crystal structures and 5.3% in protein NMR structures contain amide groups with energetically unfavorable cis or generously cis conformations. Even more surprisingly, 8.1% of N6-acetyllysine residues in protein crystal structures and 48.2% in NMR structures contain amide groups with energetically prohibitive twisted conformations that approach the transition state structure for cis-trans isomerization. In contrast, 109 unique N-alkylacetamide groups contained in 84 highly-accurate small molecule crystal structures retrieved from the Cambridge Structural Database exclusively adopt energetically preferred trans conformations. Therefore, we conclude that cis and twisted N6-acetyllysine amides in protein structures deposited in the PDB are erroneously modeled due to their energetically unfavorable or prohibitive conformations. PMID:23401043
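
    The trans, cis and twisted categories above are defined by the amide torsion angle. The sketch below shows how such a torsion could be computed from four atom positions and binned; it is an illustration only, not the authors' procedure. The 30° tolerance is an assumed placeholder (the abstract does not give the cutoffs for "generously" trans or cis), the function names are hypothetical, and the coordinates are toy values rather than PDB data.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n2 = _cross(b2, b3)
    x = _dot(_cross(b1, b2), n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def classify_amide(omega_deg, tolerance=30.0):
    """Bin a torsion as trans, cis or twisted; the tolerance is an assumption."""
    magnitude = abs(omega_deg)
    if magnitude >= 180.0 - tolerance:
        return "trans"
    if magnitude <= tolerance:
        return "cis"
    return "twisted"


# Toy planar and perpendicular geometries (not real side-chain coordinates).
trans_like = dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
cis_like = dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
twisted_like = dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
labels = [classify_amide(w) for w in (trans_like, cis_like, twisted_like)]
```

    For an N6-acetyllysine residue the four points would be the atoms flanking the side-chain amide bond; which atom names to pass in depends on the structure file and is left to the reader.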

  11. Implication of the cause of differences in 3D structures of proteins with high sequence identity based on analyses of amino acid sequences and 3D structures.

    PubMed

    Matsuoka, Masanari; Sugita, Masatake; Kikuchi, Takeshi

    2014-09-18

    Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. It was hard to distinguish which structure a given sequence would take only with the results of ordinary analyses like BLAST and conservation analyses. However, in addition to these analyses, with the analysis based on the inter-residue average distance statistics and our sequence tendency analysis, we could infer which part would play an important role in its structural formation. The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.

  12. Structure of the Newcastle disease virus F protein in the post-fusion conformation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Swanson, Kurt; Wen, Xiaolin; Leser, George P.

    2010-11-17

    The paramyxovirus F protein is a class I viral membrane fusion protein which undergoes a significant refolding transition during virus entry. Previous studies of the Newcastle disease virus, human parainfluenza virus 3 and parainfluenza virus 5 F proteins revealed differences in the pre- and post-fusion structures. The NDV Queensland (Q) F structure lacked structural elements observed in the other two structures, which are key to the refolding and fusogenic activity of F. Here we present the NDV Australia-Victoria (AV) F protein post-fusion structure and provide EM evidence for its folding to a pre-fusion form. The NDV AV F structure contains heptad repeat elements missing in the previous NDV Q F structure, forming a post-fusion six-helix bundle (6HB) similar to the post-fusion hPIV3 F structure. Electrostatic and temperature factor analysis of the F structures points to regions of these proteins that may be functionally important in their membrane fusion activity.

  13. In situ structural analysis of the Yersinia enterocolitica injectisome

    PubMed Central

    Kudryashev, Mikhail; Stenta, Marco; Schmelz, Stefan; Amstutz, Marlise; Wiesand, Ulrich; Castaño-Díez, Daniel; Degiacomi, Matteo T; Münnich, Stefan; Bleck, Christopher KE; Kowal, Julia; Diepold, Andreas; Heinz, Dirk W; Dal Peraro, Matteo; Cornelis, Guy R; Stahlberg, Henning

    2013-01-01

    Injectisomes are multi-protein transmembrane machines allowing pathogenic bacteria to inject effector proteins into eukaryotic host cells, a process called type III secretion. Here we present the first three-dimensional structure of Yersinia enterocolitica and Shigella flexneri injectisomes in situ and the first structural analysis of the Yersinia injectisome. Unexpectedly, basal bodies of injectisomes inside the bacterial cells showed length variations of 20%. The in situ structures of the Y. enterocolitica and S. flexneri injectisomes had similar dimensions and were significantly longer than the isolated structures of related injectisomes. The crystal structure of the inner membrane injectisome component YscD appeared elongated compared to a homologous protein, and molecular dynamics simulations documented its elongation elasticity. The ring-shaped secretin YscC at the outer membrane was stretched by 30–40% in situ, compared to its isolated liposome-embedded conformation. We suggest that elasticity is critical for some two-membrane spanning protein complexes to cope with variations in the intermembrane distance. DOI: http://dx.doi.org/10.7554/eLife.00792.001 PMID:23908767

  14. Functional and Structural Analysis of the Conserved EFhd2 Protein

    PubMed Central

    Acosta, Yancy Ferrer; Rodríguez Cruz, Eva N.; Vaquer, Ana del C.; Vega, Irving E.

    2013-01-01

    EFhd2 is a novel protein conserved from C. elegans to H. sapiens. This novel protein was originally identified in cells of the immune and central nervous systems. However, it is most abundant in the central nervous system, where it has been found associated with pathological forms of the microtubule-associated protein tau. The physiological or pathological roles of EFhd2 are poorly understood. In this study, a functional and structural analysis was carried out to characterize the molecular requirements for EFhd2’s calcium binding activity. The results showed that mutations of a conserved aspartate on either EF-hand motif disrupted the calcium binding activity, indicating that these motifs work as a pair to form a functional calcium binding domain. Furthermore, characterization of an identified single-nucleotide polymorphism (SNP) that introduced a missense mutation indicates the importance of a conserved phenylalanine for EFhd2 calcium binding activity. Structural analysis revealed that EFhd2 is predominantly composed of alpha helix and random coil structures and that this novel protein is thermostable. EFhd2’s thermostability depends on its N-terminus. In the absence of the N-terminus, calcium binding restored EFhd2’s thermal stability. Overall, these studies contribute to our understanding of EFhd2 functional and structural properties, and introduce it into the family of canonical EF-hand domain containing proteins. PMID:22973849

  15. Prediction of Spontaneous Protein Deamidation from Sequence-Derived Secondary Structure and Intrinsic Disorder.

    PubMed

    Lorenzo, J Ramiro; Alonso, Leonardo G; Sánchez, Ignacio E

    2015-01-01

    Asparagine residues in proteins undergo spontaneous deamidation, a post-translational modification that may act as a molecular clock for the regulation of protein function and turnover. Asparagine deamidation is modulated by protein local sequence, secondary structure and hydrogen bonding. We present NGOME, an algorithm able to predict non-enzymatic deamidation of internal asparagine residues in proteins in the absence of structural data, using sequence-based predictions of secondary structure and intrinsic disorder. Compared to previous algorithms, NGOME does not require three-dimensional structures yet yields better predictions than available sequence-only methods. Four case studies of specific proteins show how NGOME may help the user identify deamidation-prone asparagine residues, often related to protein gain of function, protein degradation or protein misfolding in pathological processes. A fifth case study applies NGOME at a proteomic scale and unveils a correlation between asparagine deamidation and protein degradation in yeast. NGOME is freely available as a webserver at the National EMBnet node Argentina, URL: http://www.embnet.qb.fcen.uba.ar/ in the subpage "Protein and nucleic acid structure and sequence analysis".

  16. Engineering Proteins for Thermostability with iRDP Web Server

    PubMed Central

    Ghanate, Avinash; Ramasamy, Sureshkumar; Suresh, C. G.

    2015-01-01

    Engineering protein molecules with desired structure and biological functions has been an elusive goal. Development of industrially viable proteins with improved properties such as stability, catalytic activity and altered specificity by modifying the structure of an existing protein has widely been targeted through rational protein engineering. Although a range of factors contributing to thermal stability have been identified and widely researched, the in silico implementation of these as strategies directed towards enhancement of protein stability has not yet been explored extensively. A wide range of structural analysis tools is currently available for in silico protein engineering. However these tools concentrate on only a limited number of factors or individual protein structures, resulting in cumbersome and time-consuming analysis. The iRDP web server presented here provides a unified platform comprising of iCAPS, iStability and iMutants modules. Each module addresses different facets of effective rational engineering of proteins aiming towards enhanced stability. While iCAPS aids in selection of target protein based on factors contributing to structural stability, iStability uniquely offers in silico implementation of known thermostabilization strategies in proteins for identification and stability prediction of potential stabilizing mutation sites. iMutants aims to assess mutants based on changes in local interaction network and degree of residue conservation at the mutation sites. Each module was validated using an extensively diverse dataset. The server is freely accessible at http://irdp.ncl.res.in and has no login requirements. PMID:26436543

  17. Engineering Proteins for Thermostability with iRDP Web Server.

    PubMed

    Panigrahi, Priyabrata; Sule, Manas; Ghanate, Avinash; Ramasamy, Sureshkumar; Suresh, C G

    2015-01-01

    Engineering protein molecules with desired structure and biological functions has been an elusive goal. Development of industrially viable proteins with improved properties such as stability, catalytic activity and altered specificity by modifying the structure of an existing protein has widely been targeted through rational protein engineering. Although a range of factors contributing to thermal stability have been identified and widely researched, the in silico implementation of these as strategies directed towards enhancement of protein stability has not yet been explored extensively. A wide range of structural analysis tools is currently available for in silico protein engineering. However these tools concentrate on only a limited number of factors or individual protein structures, resulting in cumbersome and time-consuming analysis. The iRDP web server presented here provides a unified platform comprising of iCAPS, iStability and iMutants modules. Each module addresses different facets of effective rational engineering of proteins aiming towards enhanced stability. While iCAPS aids in selection of target protein based on factors contributing to structural stability, iStability uniquely offers in silico implementation of known thermostabilization strategies in proteins for identification and stability prediction of potential stabilizing mutation sites. iMutants aims to assess mutants based on changes in local interaction network and degree of residue conservation at the mutation sites. Each module was validated using an extensively diverse dataset. The server is freely accessible at http://irdp.ncl.res.in and has no login requirements.

  18. Minimalist design of water-soluble cross-[beta] architecture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Biancalana, Matthew; Makabe, Koki; Koide, Shohei

    Demonstrated successes of protein design and engineering suggest significant potential to produce diverse protein architectures and assemblies beyond those found in nature. Here, we describe a new class of synthetic protein architecture through the successful design and atomic structures of water-soluble cross-β proteins. The cross-β motif is formed from the lamination of successive β-sheet layers, and it is abundantly observed in the core of insoluble amyloid fibrils associated with protein-misfolding diseases. Despite its prominence, cross-β has been designed only in the context of insoluble aggregates of peptides or proteins. Cross-β's recalcitrance to protein engineering and conspicuous absence among the known atomic structures of natural proteins thus makes it a challenging target for design in a water-soluble form. Through comparative analysis of the cross-β structures of fibril-forming peptides, we identified rows of hydrophobic residues ('ladders') running across β-strands of each β-sheet layer as a minimal component of the cross-β motif. Grafting a single ladder of hydrophobic residues designed from the Alzheimer's amyloid-β peptide onto a large β-sheet protein formed a dimeric protein with a cross-β architecture that remained water-soluble, as revealed by solution analysis and x-ray crystal structures. These results demonstrate that the cross-β motif is a stable architecture in water-soluble polypeptides and can be readily designed. Our results provide a new route for accessing the cross-β structure and expanding the scope of protein design.

  19. Minimalist design of water-soluble cross-beta architecture.

    PubMed

    Biancalana, Matthew; Makabe, Koki; Koide, Shohei

    2010-02-23

    Demonstrated successes of protein design and engineering suggest significant potential to produce diverse protein architectures and assemblies beyond those found in nature. Here, we describe a new class of synthetic protein architecture through the successful design and atomic structures of water-soluble cross-beta proteins. The cross-beta motif is formed from the lamination of successive beta-sheet layers, and it is abundantly observed in the core of insoluble amyloid fibrils associated with protein-misfolding diseases. Despite its prominence, cross-beta has been designed only in the context of insoluble aggregates of peptides or proteins. Cross-beta's recalcitrance to protein engineering and conspicuous absence among the known atomic structures of natural proteins thus makes it a challenging target for design in a water-soluble form. Through comparative analysis of the cross-beta structures of fibril-forming peptides, we identified rows of hydrophobic residues ("ladders") running across beta-strands of each beta-sheet layer as a minimal component of the cross-beta motif. Grafting a single ladder of hydrophobic residues designed from the Alzheimer's amyloid-beta peptide onto a large beta-sheet protein formed a dimeric protein with a cross-beta architecture that remained water-soluble, as revealed by solution analysis and x-ray crystal structures. These results demonstrate that the cross-beta motif is a stable architecture in water-soluble polypeptides and can be readily designed. Our results provide a new route for accessing the cross-beta structure and expanding the scope of protein design.

  20. In silico characterization and analysis of RTBP1 and NgTRF1 protein through MD simulation and molecular docking - A comparative study.

    PubMed

    Mukherjee, Koel; Pandey, Dev Mani; Vidyarthi, Ambarish Saran

    2015-02-06

    Gaining access to sequence and structure information of telomere binding proteins helps in understanding the essential biological processes involved in conserved sequence specific interaction between DNA and the proteins. Rice telomere binding protein (RTBP1) and Nicotiana glutinosa telomere repeat binding factor (NgTRF1) are helix turn helix motif type of proteins that plays role in telomeric DNA protection and length regulation. Both the proteins share same type of domain but till now there is very less communication on the in silico studies of these complete proteins. Here we intend to do a comparative study between two proteins through modeling of the complete proteins, physiochemical characterization, MD simulation and DNA-protein docking. I-TASSER and CLC protein work bench was performed to find out the protein 3D structure as well as the different parameters to characterize the proteins. MD simulation was completed by GROMOS forcefield of GROMACS for 10 ns of time stretch. The simulated 3D structures were docked with template DNA (3D DNA modeled through 3D-DART) of TTTAGGG conserved sequence motif using HADDOCK web server. Digging up all the facts about the proteins it was revealed that around 120 amino acids in the tail part was showing a good sequence similarity between the proteins. Molecular modeling, sequence characterization and secondary structure prediction also indicates the similarity between the protein's structure and sequence. The result of MD simulation highlights on the RMSD, RMSF, Rg, PCA and Energy plots which also conveys the similar type of motional behavior between them. The best complex formation for both the proteins in docking result also indicates for the first interaction site which is mainly the helix3 region of the DNA binding domain. The overall computational analysis reveals that RTBP1 and NgTRF1 proteins display good amount of similarity in their physicochemical properties, structure, dynamics and binding mode.

  1. In Silico Characterization and Analysis of RTBP1 and NgTRF1 Protein Through MD Simulation and Molecular Docking: A Comparative Study.

    PubMed

    Mukherjee, Koel; Pandey, Dev Mani; Vidyarthi, Ambarish Saran

    2015-09-01

    Gaining access to sequence and structure information of telomere-binding proteins helps in understanding the essential biological processes involve in conserved sequence-specific interaction between DNA and the proteins. Rice telomere-binding protein (RTBP1) and Nicotiana glutinosa telomere repeat binding factor (NgTRF1) are helix-turn-helix motif type of proteins that plays role in telomeric DNA protection and length regulation. Both the proteins share same type of domain, but till now there is very less communication on the in silico studies of these complete proteins. Here we intend to do a comparative study between two proteins through modeling of the complete proteins, physiochemical characterization, MD simulation and DNA-protein docking. I-TASSER and CLC protein work bench was performed to find out the protein 3D structure as well as the different parameters to characterize the proteins. MD simulation was completed by GROMOS forcefield of GROMACS for 10 ns of time stretch. The simulated 3D structures were docked with template DNA (3D DNA modeled through 3D-DART) of TTTAGGG conserved sequence motif using HADDOCK Web server. By digging up all the facts about the proteins, it was revealed that around 120 amino acids in the tail part were showing a good sequence similarity between the proteins. Molecular modeling, sequence characterization and secondary structure prediction also indicate the similarity between the protein's structure and sequence. The result of MD simulation highlights on the RMSD, RMSF, Rg, PCA and energy plots which also conveys the similar type of motional behavior between them. The best complex formation for both the proteins in docking result also indicates for the first interaction site which is mainly the helix3 region of the DNA-binding domain. The overall computational analysis reveals that RTBP1 and NgTRF1 proteins display good amount of similarity in their physicochemical properties, structure, dynamics and binding mode.

  2. PBxplore: a tool to analyze local protein structure and deformability with Protein Blocks

    PubMed Central

    Craveur, Pierrick; Joseph, Agnel Praveen; Jallu, Vincent

    2017-01-01

    This paper describes the development and application of a suite of tools, called PBxplore, to analyze the dynamics and deformability of protein structures using Protein Blocks (PBs). Proteins are highly dynamic macromolecules, and a classical way to analyze their inherent flexibility is to perform molecular dynamics simulations. The advantage of using small structural prototypes such as PBs is that they give a good approximation of the local structure of the protein backbone. More importantly, by reducing the conformational complexity of protein structures, PBs allow analysis of local protein deformability that cannot be done with other methods, and they have been used efficiently in different applications. PBxplore is able to process large amounts of data such as those produced by molecular dynamics simulations. It produces frequencies, entropy and information logo outputs as text and graphics. PBxplore is available at https://github.com/pierrepo/PBxplore and is released under the open-source MIT license. PMID:29177113
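
    The frequency and entropy outputs mentioned in this abstract can be sketched in a few lines of Python. The example assumes each simulation frame has already been translated into a PB string of equal length, and it uses the equivalent number of PBs, Neq = exp(-sum f ln f), as the entropy-derived measure. The function names are illustrative and do not reproduce the PBxplore API.

```python
import math
from collections import Counter


def pb_frequencies(pb_sequences):
    """Per-position PB frequencies over a set of equally long PB strings."""
    length = len(pb_sequences[0])
    if any(len(seq) != length for seq in pb_sequences):
        raise ValueError("all PB sequences must have the same length")
    n_frames = len(pb_sequences)
    frequencies = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in pb_sequences)
        frequencies.append({pb: c / n_frames for pb, c in counts.items()})
    return frequencies


def neq(position_freq):
    """Equivalent number of PBs at one position: exp of the Shannon entropy."""
    entropy = -sum(f * math.log(f) for f in position_freq.values())
    return math.exp(entropy)


if __name__ == "__main__":
    # Hypothetical PB strings for four frames of a three-residue stretch.
    frames = ["mmd", "mmd", "mnd", "mod"]
    for pos, freq in enumerate(pb_frequencies(frames)):
        print(pos, freq, round(neq(freq), 3))
```

    A position that always adopts the same PB gives Neq = 1, while a position that samples several PBs evenly gives a value close to the number of PBs sampled.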

  3. Evaluating the efficacy of a structure-derived amino acid substitution matrix in detecting protein homologs by BLAST and PSI-BLAST.

    PubMed

    Goonesekere, Nalin Cw

    2009-01-01

    The large numbers of protein sequences generated by whole genome sequencing projects require rapid and accurate methods of annotation. The detection of homology through computational sequence analysis is a powerful tool in determining the complex evolutionary and functional relationships that exist between proteins. Homology search algorithms employ amino acid substitution matrices to detect similarity between protein sequences. The substitution matrices in common use today are constructed using sequences aligned without reference to protein structure. Here we present amino acid substitution matrices constructed from the alignment of a large number of protein domain structures from the Structural Classification of Proteins (SCOP) database. We show that when incorporated into the homology search algorithms BLAST and PSI-BLAST, the structure-based substitution matrices enhance the efficacy of detecting remote homologs.
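
    To make the idea of a substitution matrix concrete, here is a minimal Python sketch of generic log-odds scoring from aligned residue pairs. It is not the procedure used in the study: the input pairs are invented, no sequence clustering or pseudocounts are applied, and the function name is hypothetical.

```python
import math
from collections import Counter


def log_odds_matrix(aligned_pairs):
    """Log2-odds scores from a list of aligned residue pairs (a, b).

    Each pair is counted in both orders so the result is symmetric.
    score(a, b) = log2( observed pair frequency / (p_a * p_b) ).
    """
    pair_counts = Counter()
    for a, b in aligned_pairs:
        pair_counts[(a, b)] += 1
        pair_counts[(b, a)] += 1
    total = sum(pair_counts.values())

    # Background frequency of each residue, taken from the pair table.
    marginals = Counter()
    for (a, _), count in pair_counts.items():
        marginals[a] += count

    scores = {}
    for (a, b), count in pair_counts.items():
        observed = count / total
        expected = (marginals[a] / total) * (marginals[b] / total)
        scores[(a, b)] = math.log2(observed / expected)
    return scores


if __name__ == "__main__":
    # Invented toy alignment columns, e.g. taken from structure-based alignments.
    pairs = [("A", "A")] * 3 + [("G", "G")] * 3 + [("A", "G")] * 2
    for key, value in sorted(log_odds_matrix(pairs).items()):
        print(key, round(value, 3))
```

    Pairs that are aligned more often than expected by chance receive positive scores and rarer pairs receive negative scores, which is the property that homology search tools rely on.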

  4. Structural modelling and comparative analysis of homologous, analogous and specific proteins from Trypanosoma cruzi versus Homo sapiens: putative drug targets for chagas' disease treatment

    PubMed Central

    2010-01-01

    Background: Trypanosoma cruzi is the etiological agent of Chagas' disease, an endemic infection that causes thousands of deaths every year in Latin America. Therapeutic options remain inefficient, demanding the search for new drugs and/or new molecular targets. Such efforts can focus on proteins that are specific to the parasite, but analogous enzymes and enzymes with a three-dimensional (3D) structure sufficiently different from the corresponding host proteins may represent equally interesting targets. In order to find these targets we used the workflows MHOLline and AnEnΠ obtaining 3D models from homologous, analogous and specific proteins of Trypanosoma cruzi versus Homo sapiens. Results: We applied genome wide comparative modelling techniques to obtain 3D models for 3,286 predicted proteins of T. cruzi. In combination with comparative genome analysis to Homo sapiens, we were able to identify a subset of 397 enzyme sequences, of which 356 are homologous, 3 analogous and 38 specific to the parasite. Conclusions: In this work, we present a set of 397 enzyme models of T. cruzi that can constitute potential structure-based drug targets to be investigated for the development of new strategies to fight Chagas' disease. The strategies presented here support the concept of structural analysis in conjunction with protein functional analysis as an interesting computational methodology to detect potential targets for structure-based rational drug design. For example, 2,4-dienoyl-CoA reductase (EC 1.3.1.34) and triacylglycerol lipase (EC 3.1.1.3), classified as analogous proteins in relation to H. sapiens enzymes, were identified as new potential molecular targets. PMID:21034488

  5. Structural modelling and comparative analysis of homologous, analogous and specific proteins from Trypanosoma cruzi versus Homo sapiens: putative drug targets for chagas' disease treatment.

    PubMed

    Capriles, Priscila V S Z; Guimarães, Ana C R; Otto, Thomas D; Miranda, Antonio B; Dardenne, Laurent E; Degrave, Wim M

    2010-10-29

    Trypanosoma cruzi is the etiological agent of Chagas' disease, an endemic infection that causes thousands of deaths every year in Latin America. Therapeutic options remain inefficient, demanding the search for new drugs and/or new molecular targets. Such efforts can focus on proteins that are specific to the parasite, but analogous enzymes and enzymes with a three-dimensional (3D) structure sufficiently different from the corresponding host proteins may represent equally interesting targets. In order to find these targets we used the workflows MHOLline and AnEnΠ obtaining 3D models from homologous, analogous and specific proteins of Trypanosoma cruzi versus Homo sapiens. We applied genome wide comparative modelling techniques to obtain 3D models for 3,286 predicted proteins of T. cruzi. In combination with comparative genome analysis to Homo sapiens, we were able to identify a subset of 397 enzyme sequences, of which 356 are homologous, 3 analogous and 38 specific to the parasite. In this work, we present a set of 397 enzyme models of T. cruzi that can constitute potential structure-based drug targets to be investigated for the development of new strategies to fight Chagas' disease. The strategies presented here support the concept of structural analysis in conjunction with protein functional analysis as an interesting computational methodology to detect potential targets for structure-based rational drug design. For example, 2,4-dienoyl-CoA reductase (EC 1.3.1.34) and triacylglycerol lipase (EC 3.1.1.3), classified as analogous proteins in relation to H. sapiens enzymes, were identified as new potential molecular targets.

  6. Building protein-protein interaction networks for Leishmania species through protein structural information.

    PubMed

    Dos Santos Vasconcelos, Crhisllane Rafaele; de Lima Campos, Túlio; Rezende, Antonio Mauro

    2018-03-06

    Systematic analysis of a parasite interactome is a key approach to understanding different biological processes. It makes it possible to elucidate disease mechanisms, to predict protein functions and to select promising targets for drug development. Currently, several approaches for protein interaction prediction for non-model species incorporate only small fractions of the entire proteomes and their interactions. Based on this perspective, this study presents an integration of computational methodologies, protein network predictions and comparative analysis of the protozoan species Leishmania braziliensis and Leishmania infantum. These parasites cause Leishmaniasis, a neglected disease distributed worldwide, with limited treatment options using currently available drugs. The predicted interactions were obtained from a meta-approach, applying rigid body docking tests and template-based docking on protein structures predicted by different comparative modeling techniques. In addition, we trained a machine-learning algorithm (Gradient Boosting) using docking information performed on a curated set of positive and negative protein interaction data. Our final model obtained an AUC = 0.88, with recall = 0.69, specificity = 0.88 and precision = 0.83. Using this approach, it was possible to confidently predict 681 protein structures and 6198 protein interactions for L. braziliensis, and 708 protein structures and 7391 protein interactions for L. infantum. The predicted networks were integrated with protein interaction data already available, analyzed using several topological features and used to classify proteins as essential for network stability. The present study demonstrates the importance of integrating different methodologies of interaction prediction to increase the coverage of the protein interaction of the studied protocols; in addition, it makes available protein structures and interactions not previously reported.
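
    For readers unfamiliar with the evaluation figures quoted in this abstract (AUC, recall, specificity, precision), the following Python sketch shows how such metrics are commonly defined. The counts and scores are invented for illustration and are not data from the study; the function names are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Recall, specificity and precision from confusion-matrix counts."""
    return {
        "recall": tp / (tp + fn),        # fraction of true interactions recovered
        "specificity": tn / (tn + fp),   # fraction of non-interactions rejected
        "precision": tp / (tp + fp),     # fraction of predictions that are correct
    }


def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Invented example values, not taken from the study.
    print(classification_metrics(tp=8, fp=1, tn=9, fn=2))
    print(auc([0.9, 0.8, 0.4], [0.7, 0.3]))
```

    The pairwise AUC is quadratic in the number of examples, which is fine for a demonstration; a rank-based formula would be preferable for large data sets.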

  7. Coevolution analysis of Hepatitis C virus genome to identify the structural and functional dependency network of viral proteins

    NASA Astrophysics Data System (ADS)

    Champeimont, Raphaël; Laine, Elodie; Hu, Shuang-Wei; Penin, Francois; Carbone, Alessandra

    2016-05-01

    A novel computational approach of coevolution analysis allowed us to reconstruct the protein-protein interaction network of the Hepatitis C Virus (HCV) at the residue resolution. For the first time, coevolution analysis of an entire viral genome was realized, based on a limited set of protein sequences with high sequence identity within genotypes. The identified coevolving residues constitute highly relevant predictions of protein-protein interactions for further experimental identification of HCV protein complexes. The method can be used to analyse other viral genomes and to predict the associated protein interaction networks.

  8. The 15-K neutron structure of saccharide-free concanavalin A.

    PubMed

    Blakeley, M P; Kalb, A J; Helliwell, J R; Myles, D A A

    2004-11-23

    The positions of the ordered hydrogen isotopes of a protein and its bound solvent can be determined by using neutron crystallography. Furthermore, by collecting neutron data at cryo temperatures, the dynamic disorder within a protein crystal is reduced, which may lead to improved definition of the nuclear density. It has proved possible to cryo-cool very large Con A protein crystals (>1.5 mm³) suitable for high-resolution neutron and x-ray structure analysis. We can thereby report the neutron crystal structure of the saccharide-free form of Con A and its bound water, including 167 intact D2O molecules and 60 oxygen atoms at 15 K to 2.5-Å resolution, along with the 1.65-Å x-ray structure of an identical crystal at 100 K. Comparison with the 293-K neutron structure shows that the bound water molecules are better ordered and have lower average B factors than those at room temperature. Overall, twice as many bound waters (as D2O) are identified at 15 K than at 293 K. We note that alteration of bound water orientations occurs between 293 and 15 K; such changes, as illustrated here with this example, could be important more generally in protein crystal structure analysis and ligand design. Methodologically, this successful neutron cryo protein structure refinement opens up categories of neutron protein crystallography, including freeze-trapped structures and cryo to room temperature comparisons.

  9. Molecular dynamics simulations and statistical coupling analysis reveal functional coevolution network of oncogenic mutations in the CDKN2A-CDK6 complex.

    PubMed

    Wang, Jingwen; Zhao, Yuqi; Wang, Yanjie; Huang, Jingfei

    2013-01-16

    Coevolution between proteins is crucial for understanding protein-protein interaction. Simultaneous changes allow a protein complex to maintain its overall structural-functional integrity. In this study, we combined statistical coupling analysis (SCA) and molecular dynamics simulations on the CDK6-CDKN2A protein complex to evaluate coevolution between proteins. We reconstructed an inter-protein residue coevolution network, consisting of 37 residues and 37 interactions. It shows that most of the coevolved residue pairs are spatially proximal. When the mutations happened, the stable local structures were broken up and thus the protein interaction was decreased or inhibited, with a following increased risk of melanoma. The identification of inter-protein coevolved residues in the CDK6-CDKN2A complex can be helpful for designing protein engineering experiments. Copyright © 2012 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  10. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

    PubMed

    Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren

    2016-11-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available. Copyright © 2016 Du et al.

  11. Computer analysis of protein functional sites projection on exon structure of genes in Metazoa.

    PubMed

    Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A

    2015-01-01

    Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications in designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are located far from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of functional sites, both at the protein level and at the level of the exon-intron structure of the coding gene, remains a relevant problem. One approach to this analysis is the projection of amino acid residue positions of the functional sites, along with the exon boundaries, onto the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. We observed a tendency for the exons that code functional sites to cluster, which could be considered as the unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residue functional site, which may be evidence of the existence of evolutionary limitations to exon shuffling. These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity.
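
    The projection of functional-site residues onto exon structure, and the notion of exon boundary phase used in this abstract, can be sketched as follows. This is a simplified illustration that assumes a list of coding exon lengths in transcript order and 1-based residue numbering; the function names and example lengths are hypothetical and do not reproduce the authors' pipeline.

```python
def intron_phases(cds_exon_lengths):
    """Phase of each exon-exon boundary in a coding sequence.

    Phase 0: the boundary falls between two codons.
    Phase 1 or 2: the boundary falls after the first or second codon base.
    """
    phases = []
    cumulative = 0
    for length in cds_exon_lengths[:-1]:  # no boundary after the last exon
        cumulative += length
        phases.append(cumulative % 3)
    return phases


def exons_for_residue(cds_exon_lengths, residue_number):
    """Indices (0-based) of the exons that encode a 1-based residue number."""
    first_nt = 3 * (residue_number - 1)  # 0-based position of codon base 1
    last_nt = first_nt + 2               # 0-based position of codon base 3
    hits = []
    start = 0
    for index, length in enumerate(cds_exon_lengths):
        end = start + length  # exclusive end of this exon in CDS coordinates
        if first_nt < end and last_nt >= start:
            hits.append(index)
        start = end
    return hits


if __name__ == "__main__":
    exons = [90, 100, 50]  # hypothetical coding exon lengths in nucleotides
    print(intron_phases(exons))
    for residue in (30, 31, 64):  # hypothetical functional-site residues
        print(residue, exons_for_residue(exons, residue))
```

    A residue whose codon is split by a phase 1 or phase 2 boundary is reported in two exons, which is the case a projection of this kind has to handle explicitly.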

  12. A method of searching for related literature on protein structure analysis by considering a user's intention

    PubMed Central

    2015-01-01

    Background: In recent years, with advances in techniques for protein structure analysis, the knowledge about protein structure and function has been published in a vast number of articles. A method to search for specific publications from such a large pool of articles is needed. In this paper, we propose a method to search for related articles on protein structure analysis by using an article itself as a query. Results: Each article is represented as a set of concepts in the proposed method. Then, by using similarities among concepts formulated from databases such as Gene Ontology, similarities between articles are evaluated. In this framework, the desired search results vary depending on the user's search intention because a variety of information is included in a single article. Therefore, the proposed method provides not only one input article (primary article) but also additional articles related to it as an input query to determine the search intention of the user, based on the relationship between two query articles. In other words, based on the concepts contained in the input article and additional articles, we actualize a relevant literature search that considers user intention by varying the degree of attention given to each concept and modifying the concept hierarchy graph. Conclusions: We performed an experiment to retrieve relevant papers from articles on protein structure analysis registered in the Protein Data Bank by using three query datasets. The experimental results yielded search results with better accuracy than when user intention was not considered, confirming the effectiveness of the proposed method. PMID:25952498

  13. 3Drefine: an interactive web server for efficient protein structure refinement.

    PubMed

    Bhattacharya, Debswapna; Nowotny, Jackson; Cao, Renzhi; Cheng, Jianlin

    2016-07-08

    3Drefine is an interactive web server for consistent and computationally efficient protein structure refinement with the capability to perform web-based statistical and visual analysis. The 3Drefine refinement protocol utilizes iterative optimization of the hydrogen bonding network combined with atomic-level energy minimization on the optimized model using composite physics- and knowledge-based force fields for efficient protein structure refinement. The method has been extensively evaluated on blind CASP experiments as well as on large-scale and diverse benchmark datasets and exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine web server allows for convenient protein structure refinement through a text or file input submission, email notification, and a provided example submission, and is freely available without any registration requirement. The server also provides comprehensive analysis of submissions through various energy and statistical feedback and interactive visualization of multiple refined models through the JSmol applet that is equipped with numerous protein model analysis tools. The web server has been extensively tested and used by many users. As a result, the 3Drefine web server conveniently provides a useful tool easily accessible to the community. The 3Drefine web server has been made publicly available at the URL: http://sysbio.rnet.missouri.edu/3Drefine/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Expression, refolding and bio-structural analysis of a tetravalent recombinant dengue envelope domain III protein for serological diagnosis.

    PubMed

    Combe, Maxime; Lacoux, Xavier; Martinez, Jérôme; Méjan, Odile; Luciani, Françoise; Daniel, Soizic

    2017-05-01

    Dengue is a mosquito-borne disease caused by four genetically and serologically related viruses that affect several millions of people. Envelope domain III (EDIII) of the viral envelope protein contains dengue virus (DENV) type-specific and DENV complex-reactive antigenic sites. Here, we describe the expression in Escherichia coli, the refolding and bio-structural analysis of envelope domain III of the four dengue serotypes as a tetravalent dengue protein (EDIIIT2), generating an attractive diagnostic candidate. In vitro refolding of denatured EDIIIT2 was performed by successive dialysis with decreasing concentrations of chaotropic reagent and in the presence of oxidized glutathione. The efficiency of refolding was demonstrated by protein mobility shifting and fluorescent visualization of labeled cysteine in non-reducing SDS-PAGE. The identity and the fully oxidized state of the protein were verified by mass spectrometry. Analysis of the structure by fluorescence, differential scanning calorimetry and circular dichroism showed a well-formed structural conformation mainly composed of β-strands. A label-free immunoassay based on biolayer interferometry technology was subsequently used to evaluate antigenic properties of folded EDIIIT2 protein using a panel of dengue IgM positive and negative human sera. Our data collectively support the use of an oxidatively refolded EDIIIT2 recombinant chimeric protein as a promising antigen in the serological diagnosis of dengue virus infections. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. Local backbone structure prediction of proteins

    PubMed Central

    De Brevern, Alexandre G.; Benros, Cristina; Gautier, Romain; Valadié, Hélène; Hazout, Serge; Etchebest, Catherine

    2004-01-01

    Summary: A statistical analysis of the PDB structures has led us to define a new set of small 3D structural prototypes called Protein Blocks (PBs). This structural alphabet includes 16 PBs, each defined by the (φ, ψ) dihedral angles of 5 consecutive residues. The amino acid distributions observed in sequence windows encompassing these PBs are used to predict, by a Bayesian approach, the local 3D structure of proteins from the sole knowledge of their sequences. LocPred is software that allows users to submit a protein sequence and performs a prediction in terms of PBs. The prediction results are given both textually and graphically. PMID:15724288
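
    Since each PB is described by dihedral angles over a five-residue window, assigning a window to a prototype can be sketched as a nearest-neighbour search in angle space. The Python example below assumes assignment by the smallest angular root-mean-square deviation over eight dihedral angles; the two prototypes are placeholder values built from textbook helix-like and strand-like angles, not the published 16 PB definitions, and all names are hypothetical.

```python
import math


def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped into (-180, 180]."""
    d = (a - b) % 360.0
    return d - 360.0 if d > 180.0 else d


def rmsda(angles_a, angles_b):
    """Root-mean-square deviation between two lists of dihedral angles."""
    if len(angles_a) != len(angles_b):
        raise ValueError("angle lists must have the same length")
    sq = sum(angle_diff(a, b) ** 2 for a, b in zip(angles_a, angles_b))
    return math.sqrt(sq / len(angles_a))


def assign_block(window_angles, prototypes):
    """Name of the prototype with the smallest angular RMSD to the window."""
    return min(prototypes, key=lambda name: rmsda(window_angles, prototypes[name]))


# Placeholder prototypes (alternating psi, phi values); NOT the real PB table.
TOY_PROTOTYPES = {
    "helix_like": [-45.0, -60.0] * 4,
    "strand_like": [135.0, -120.0] * 4,
}

if __name__ == "__main__":
    window = [-40.0, -65.0, -50.0, -58.0, -44.0, -62.0, -47.0, -59.0]
    print(assign_block(window, TOY_PROTOTYPES))
```

    Wrapping the angle difference matters because dihedral angles are periodic: 179° and -179° differ by 2°, not 358°.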

  16. Structural Analysis Of CD59 Of Chinese Tree Shrew: A New Reference Molecule For Human Immune System Specific CD59 Drug Discovery.

    PubMed

    Panda, Subhamay; Kumari, Leena; Panda, Santamay

    2017-11-17

    Chinese tree shrews (Tupaia belangeri chinensis) have several characteristics that are considered crucial for their use as animal experimental models in biomedical research. Following the identification of key aspects and signaling pathways in the nervous and immune systems, it has been revealed that tree shrews possess shared as well as unique characteristics, which offers a genetic basis for employing this animal as a prospective model for biomedical research. CD59 glycoprotein, commonly referred to as MAC-inhibitory protein (MAC-IP), membrane inhibitor of reactive lysis (MIRL), or protectin, is encoded by the CD59 gene in human beings. It is a member of the LY6/uPAR/alpha-neurotoxin protein family. From this starting point, the objective of this study was to determine a comparative composite-based structure of CD59 of the Chinese tree shrew. An additional objective was to examine the distribution of negatively and positively charged amino acids over the modeled molecular structure, the distribution of secondary structural elements, hydrophobicity molecular surface analysis and electrostatic potential analysis with the assistance of several bioinformatics tools. The CD59 amino acid sequence of the Chinese tree shrew was collected from the online database system of the National Center for Biotechnology Information. The SignalP 4.0 online server was employed to detect signal peptides within the protein sequence of CD59. The molecular model structure of the CD59 protein was generated by the Iterative Threading ASSEmbly Refinement (I-TASSER) suite. The three-dimensional structural model was evaluated with structure validation tools. The location of negatively and positively charged amino acids over the modeled molecular structure, the distribution of secondary structural elements, and hydrophobicity molecular surface analysis were performed with the Chimera tool. Electrostatic potential analysis was carried out with the Adaptive Poisson-Boltzmann Solver package. The validated model was subsequently used to predict the functionally critical amino acids and the active site. The functionally critical amino acids and ligand-binding site (LBS) of the modeled protein were determined using the COACH program. Analysis of the Ramachandran plot for the Chinese tree shrew showed that, overall, 100% of the residues in the homology model were in allowed and favored regions, which in turn validates the quality of the generated protein structural model. For CD59 of the Chinese tree shrew, the total G-factor score was found to be -0.66, which was generally larger than the acceptable value. This suggests the significance and acceptability of the modeled structure of CD59 of the Chinese tree shrew. The molecular model data, together with other relevant post-model analysis data, provide molecular insight into the protective activity of the CD59 protein of the Chinese tree shrew. In the present study, we have proposed the first molecular model structure of the uncharted CD59 of the Chinese tree shrew using the comparative composite modeling approach. Therefore, the development of a structural model of the CD59 protein was carried out and analyzed further for deducing molecular enrichment technique. The combined molecular model and other relevant post-model analysis data advance the molecular understanding of the protective activity of CD59 and give better insight into the features of this natural lead compound. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  17. Methods for the visualization and analysis of extracellular matrix protein structure and degradation.

    PubMed

    Leonard, Annemarie K; Loughran, Elizabeth A; Klymenko, Yuliya; Liu, Yueying; Kim, Oleg; Asem, Marwa; McAbee, Kevin; Ravosa, Matthew J; Stack, M Sharon

    2018-01-01

    This chapter highlights methods for visualization and analysis of extracellular matrix (ECM) proteins, with particular emphasis on collagen type I, the most abundant protein in mammals. Protocols described range from advanced imaging of complex in vivo matrices to simple biochemical analysis of individual ECM proteins. The first section of this chapter describes common methods to image ECM components and includes protocols for second harmonic generation, scanning electron microscopy, and several histological methods of ECM localization and degradation analysis, including immunohistochemistry, Trichrome staining, and in situ zymography. The second section of this chapter details both a common transwell invasion assay and a novel live imaging method to investigate cellular behavior with respect to collagen and other ECM proteins of interest. The final section consists of common electrophoresis-based biochemical methods that are used in analysis of ECM proteins. Use of the methods described herein will enable researchers to gain a greater understanding of the role of ECM structure and degradation in development and matrix-related diseases such as cancer and connective tissue disorders. © 2018 Elsevier Inc. All rights reserved.

  18. Structure of Dimeric and Tetrameric Complexes of the BAR Domain Protein PICK1 Determined by Small-Angle X-Ray Scattering.

    PubMed

    Karlsen, Morten L; Thorsen, Thor S; Johner, Niklaus; Ammendrup-Johnsen, Ina; Erlendsson, Simon; Tian, Xinsheng; Simonsen, Jens B; Høiberg-Nielsen, Rasmus; Christensen, Nikolaj M; Khelashvili, George; Streicher, Werner; Teilum, Kaare; Vestergaard, Bente; Weinstein, Harel; Gether, Ulrik; Arleth, Lise; Madsen, Kenneth L

    2015-07-07

    PICK1 is a neuronal scaffolding protein containing a PDZ domain and an auto-inhibited BAR domain. BAR domains are membrane-sculpting protein modules generating membrane curvature and promoting membrane fission. Previous data suggest that BAR domains are organized in lattice-like arrangements when stabilizing membranes but little is known about structural organization of BAR domains in solution. Through a small-angle X-ray scattering (SAXS) analysis, we determine the structure of dimeric and tetrameric complexes of PICK1 in solution. SAXS and biochemical data reveal a strong propensity of PICK1 to form higher-order structures, and SAXS analysis suggests an offset, parallel mode of BAR-BAR oligomerization. Furthermore, unlike accessory domains in other BAR domain proteins, the positioning of the PDZ domains is flexible, enabling PICK1 to perform long-range, dynamic scaffolding of membrane-associated proteins. Together with functional data, these structural findings are compatible with a model in which oligomerization governs auto-inhibition of BAR domain function. Copyright © 2015 Elsevier Ltd. All rights reserved.

  19. A proteome view of structural, functional, and taxonomic characteristics of major protein domain clusters.

    PubMed

    Sun, Chia-Tsen; Chiang, Austin W T; Hwang, Ming-Jing

    2017-10-27

    Proteome-scale bioinformatics research is increasingly conducted as the number of completely sequenced genomes increases, but analysis of protein domains (PDs) usually relies on similarity in their amino acid sequences and/or three-dimensional structures. Here, we present results from a bi-clustering analysis on presence/absence data for 6,580 unique PDs in 2,134 species with a sequenced genome, thus covering a complete set of proteins, for the three superkingdoms of life, Bacteria, Archaea, and Eukarya. Our analysis revealed eight distinctive PD clusters, which, following an analysis of enrichment of Gene Ontology functions and CATH classification of protein structures, were shown to exhibit structural and functional properties that are taxa-characteristic. For example, the largest cluster is ubiquitous in all three superkingdoms, constituting a set of 1,472 persistent domains created early in evolution and retained in living organisms and characterized by basic cellular functions and ancient structural architectures, while an Archaea and Eukarya bi-superkingdom cluster suggests its PDs may have existed in the ancestor of the two superkingdoms, and others are single superkingdom- or taxa (e.g. Fungi)-specific. These results help increase our appreciation of PD diversity and our knowledge of how PDs are used in species, with implications for species evolution.

  20. Purification and Crystallization Reveal Two Types of Interactions of the Fusion Protein Homotrimer of Semliki Forest Virus

    PubMed Central

    Gibbons, Don L.; Reilly, Brigid; Ahn, Anna; Vaney, Marie-Christine; Vigouroux, Armelle; Rey, Felix A.; Kielian, Margaret

    2004-01-01

    The fusion proteins of the alphaviruses and flaviviruses have a similar native structure and convert to a highly stable homotrimer conformation during the fusion of the viral and target membranes. The properties of the alpha- and flavivirus fusion proteins distinguish them from the class I viral fusion proteins, such as influenza virus hemagglutinin, and establish them as the first members of the class II fusion proteins. Understanding how this new class carries out membrane fusion will require analysis of the structural basis for both the interaction of the protein subunits within the homotrimer and their interaction with the viral and target membranes. To this end we report a purification method for the E1 ectodomain homotrimer from the alphavirus Semliki Forest virus. The purified protein is trimeric, detergent soluble, retains the characteristic stability of the starting homotrimer, and is free of lipid and other contaminants. In contrast to the postfusion structures that have been determined for the class I proteins, the E1 homotrimer contains the fusion peptide region responsible for interaction with target membranes. This E1 trimer preparation is an excellent candidate for structural studies of the class II viral fusion proteins, and we report conditions that generate three-dimensional crystals suitable for analysis by X-ray diffraction. Determination of the structure will provide our first high-resolution views of both the low-pH-induced trimeric conformation and the target membrane-interacting region of the alphavirus fusion protein. PMID:15016874

  1. Accurate Structural Correlations from Maximum Likelihood Superpositions

    PubMed Central

    Theobald, Douglas L; Wuttke, Deborah S

    2008-01-01

    The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method (“PCA plots”) for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology. PMID:18282091
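
The PCA step described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it uses the ordinary sample correlation matrix instead of the maximum likelihood estimate, and extracts only the leading mode by power iteration. The function names and the toy data are assumptions made for illustration.

```python
import math

def correlation_matrix(samples):
    """Sample correlation matrix of the variables (columns) in `samples`.

    `samples` is a list of observations, e.g. one row per superposed
    structure and one column per coordinate. Assumes every variable varies
    (non-zero variance). This is the plain sample estimate, NOT the
    maximum likelihood estimate used in the paper.
    """
    m, n = len(samples), len(samples[0])
    means = [sum(s[k] for s in samples) / m for k in range(n)]
    dev = [[s[k] - means[k] for k in range(n)] for s in samples]
    cov = [[sum(d[a] * d[b] for d in dev) / (m - 1) for b in range(n)]
           for a in range(n)]
    sd = [math.sqrt(cov[k][k]) for k in range(n)]
    return [[cov[a][b] / (sd[a] * sd[b]) for b in range(n)] for a in range(n)]

def leading_mode(matrix, iterations=200):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n          # uniform, unit-length start vector
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for a unit vector v.
    value = sum(v[a] * sum(matrix[a][b] * v[b] for b in range(n))
                for a in range(n))
    return value, v

# Toy ensemble: variable 2 moves with variable 1, variable 3 moves against it.
samples = [[1, 2, -1], [2, 4, -2], [3, 6, -3], [4, 8, -4]]
value, mode = leading_mode(correlation_matrix(samples))
print(value, mode)
```

The signs of the eigenvector components show which positions move together and which move in opposition, which is the information the abstract's "PCA plots" color-code onto a structure.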

  2. Unfolding stabilities of two structurally similar proteins as probed by temperature-induced and force-induced molecular dynamics simulations.

    PubMed

    Gorai, Biswajit; Prabhavadhni, Arasu; Sivaraman, Thirunavukkarasu

    2015-09-01

    Unfolding stabilities of two homologous proteins, cardiotoxin III and short-neurotoxin (SNTX) belonging to three-finger toxin (TFT) superfamily, have been probed by means of molecular dynamics (MD) simulations. Combined analysis of data obtained from steered MD and all-atom MD simulations at various temperatures in near physiological conditions on the proteins suggested that overall structural stabilities of the two proteins were different from each other and the MD results are consistent with experimental data of the proteins reported in the literature. Rationalization for the differential structural stabilities of the structurally similar proteins has been chiefly attributed to the differences in the structural contacts between C- and N-termini regions in their three-dimensional structures, and the findings endorse the 'CN network' hypothesis proposed to qualitatively analyse the thermodynamic stabilities of proteins belonging to TFT superfamily of snake venoms. Moreover, the 'CN network' hypothesis has been revisited and the present study suggested that 'CN network' should be accounted in terms of 'structural contacts' and 'structural strengths' in order to precisely describe order of structural stabilities of TFTs.

  3. Crystal structure of AFV1-102, a protein from the acidianus filamentous virus 1

    PubMed Central

    Keller, Jenny; Leulliot, Nicolas; Collinet, Bruno; Campanacci, Valerie; Cambillau, Christian; Prangishvili, David; van Tilbeurgh, Herman

    2009-01-01

    Viruses infecting hyperthermophilic archaea have intriguing morphologies and genomic properties. The vast majority of their genes do not have homologs other than in other hyperthermophilic viruses, and the biology of these viruses is poorly understood. As part of a structural genomics project on the proteins of these viruses, we present here the structure of a 102 amino acid protein from acidianus filamentous virus 1 (AFV1-102). The structure shows that it is made of two identical motifs that have poor sequence similarity. Although no function can be proposed from structural analysis, tight binding of the gateway tag peptide in a groove between the two motifs suggests AFV1-102 is involved in protein-protein interactions. PMID:19319936

  4. Structural bioinformatics of the human spliceosomal proteome

    PubMed Central

    Korneta, Iga; Magnus, Marcin; Bujnicki, Janusz M.

    2012-01-01

    In this work, we describe the results of a comprehensive structural bioinformatics analysis of the spliceosomal proteome. We used fold recognition analysis to complement prior data on the ordered domains of 252 human splicing proteins. Examples of newly identified domains include a PWI domain in the U5 snRNP protein 200K (hBrr2, residues 258–338), while examples of previously known domains with a newly determined fold include the DUF1115 domain of the U4/U6 di-snRNP protein 90K (hPrp3, residues 540–683). We also established a non-redundant set of experimental models of spliceosomal proteins, as well as constructed in silico models for regions without an experimental structure. The combined set of structural models is available for download. Altogether, over 90% of the ordered regions of the spliceosomal proteome can be represented structurally with a high degree of confidence. We analyzed the reduced spliceosomal proteome of the intron-poor organism Giardia lamblia, and as a result, we proposed a candidate set of ordered structural regions necessary for a functional spliceosome. The results of this work will aid experimental and structural analyses of the spliceosomal proteins and complexes, and can serve as a starting point for multiscale modeling of the structure of the entire spliceosome. PMID:22573172

  5. The Leptospiral Antigen Lp49 is a Two-Domain Protein with Putative Protein Binding Function

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oliveira Giuseppe, P.; Oliveira Neves, F.; Nascimento, A.

    2008-01-01

    Pathogenic Leptospira is the etiological agent of leptospirosis, a life-threatening disease that affects populations worldwide. Currently available vaccines have limited effectiveness and therapeutic interventions are complicated by the difficulty in making an early diagnosis of leptospirosis. The genome of Leptospira interrogans was recently sequenced and comparative genomic analysis contributed to the identification of surface antigens, potential candidates for development of new vaccines and serodiagnosis. Lp49 is a membrane-associated protein recognized by antibodies present in sera from early and convalescent phases of leptospirosis patients. Its crystal structure was determined by single-wavelength anomalous diffraction using selenomethionine-labelled crystals and refined at 2.0 Å resolution. Lp49 is composed of two domains and belongs to the all-beta-proteins class. The N-terminal domain folds in an immunoglobulin-like beta-sandwich structure, whereas the C-terminal domain presents a seven-bladed beta-propeller fold. Structural analysis of Lp49 indicates putative protein-protein binding sites, suggesting a role in Leptospira-host interaction. This is the first crystal structure of a leptospiral antigen described to date.

  6. Crystal Structure Analysis and the Identification of Distinctive Functional Regions of the Protein Elicitor MoHrip2.

    PubMed

    Liu, Mengjie; Duan, Liangwei; Wang, Meifang; Zeng, Hongmei; Liu, Xinqi; Qiu, Dewen

    2016-01-01

    The protein elicitor MoHrip2, which was extracted from Magnaporthe oryzae as an exocrine protein, triggers the tobacco immune system and enhances blast resistance in rice. However, the detailed mechanisms by which MoHrip2 acts as an elicitor remain unclear. Here, we investigated the structure of MoHrip2 to elucidate its functions based on molecular structure. The three-dimensional structure of MoHrip2 was obtained. Overall, the crystal structure formed a β-barrel structure and showed high similarity to the pathogenesis-related (PR) thaumatin superfamily protein thaumatin-like xylanase inhibitor (TL-XI). To investigate the functional regions responsible for MoHrip2 elicitor activities, the full-length protein and eight truncated proteins were expressed in Escherichia coli and were evaluated for elicitor activity in tobacco. Biological function analysis showed that MoHrip2 triggered the defense system against Botrytis cinerea in tobacco. Moreover, only MoHrip2M14 and other fragments containing the 14 amino acid residues in the middle region of the protein showed the elicitor activity of inducing a hypersensitive response and resistance-related pathways, which was similar to that of full-length MoHrip2. These results revealed that the central 14 amino acid residues were essential for anti-pathogenic activity.

  7. Predicting the helix packing of globular proteins by self-correcting distance geometry.

    PubMed

    Mumenthaler, C; Braun, W

    1995-05-01

    A new self-correcting distance geometry method for predicting the three-dimensional structure of small globular proteins was assessed with a test set of 8 helical proteins. With the knowledge of the amino acid sequence and the helical segments, our completely automated method calculated the correct backbone topology of six proteins. The accuracy of the predicted structures ranged from 2.3 Å to 3.1 Å for the helical segments compared to the experimentally determined structures. For two proteins, the predicted constraints were not restrictive enough to yield a conclusive prediction. The method can be applied to all small globular proteins, provided the secondary structure is known from NMR analysis or can be predicted with high reliability.

  8. Structure-based analysis of catalysis and substrate definition in the HIT protein family.

    PubMed

    Lima, C D; Klein, M G; Hendrickson, W A

    1997-10-10

    The histidine triad (HIT) protein family is among the most ubiquitous and highly conserved in nature, but a biological activity has not yet been identified for any member of the HIT family. Fragile histidine triad protein (FHIT) and protein kinase C interacting protein (PKCI) were used in a structure-based approach to elucidate characteristics of in vivo ligands and reactions. Crystallographic structures of apo, substrate analog, pentacovalent transition-state analog, and product states of both enzymes reveal a catalytic mechanism and define substrate characteristics required for catalysis, thus unifying the HIT family as nucleotidyl hydrolases, transferases, or both. The approach described here may be useful in identifying structure-function relations between protein families identified through genomics.

  9. Structure and function of seed storage proteins in faba bean (Vicia faba L.).

    PubMed

    Liu, Yujiao; Wu, Xuexia; Hou, Wanwei; Li, Ping; Sha, Weichao; Tian, Yingying

    2017-05-01

    The protein subunit is the most important basic unit of protein, and its study can unravel the structure and function of seed storage proteins in faba bean. In this study, we identified six specific protein subunits in faba bean (cv. Qinghai 13) by combining liquid chromatography (LC), liquid chromatography-electrospray ionization tandem mass spectrometry (LC-ESI-MS/MS) and bioinformatics. The results suggested a diversity of seed storage proteins in faba bean, and a total of 16 proteins (four GroEL molecular chaperones and 12 plant-specific proteins) were identified from 97-, 96-, 64-, 47-, 42-, and 38-kD-specific protein subunits in faba bean based on the peptide sequence. We also analyzed the composition and abundance of the amino acids, the physicochemical characteristics, secondary structure, three-dimensional structure, transmembrane domain, and possible subcellular localization of these identified proteins in faba bean seed, and finally predicted function and structure. The three-dimensional structures were generated by homology modeling, and the protein function was analyzed based on the annotation from the non-redundant protein database (NR database, NCBI) and function analysis of optimal modeling. The objective of this study was to identify the seed storage proteins in faba bean and confirm the structure and function of these proteins. Our results can be useful for the study of protein nutrition and for achieving breeding goals for optimal protein quality in faba bean.

  10. Molecular Analysis of Protein Assembly in Muscle Development.

    ERIC Educational Resources Information Center

    Epstein, Henry F.; Fischman, Donald A.

    1991-01-01

    Advances in the genetics and cell biology of muscle development are discussed. In-vitro analysis of the renaturation, polymerization, and three-dimensional structure of the purified proteins involved is described. (CW)

  11. Determination and Quantification of Molecular Interactions in Protein Films: A Review.

    PubMed

    Hammann, Felicia; Schmid, Markus

    2014-12-10

    Protein based films are nowadays also prepared with the aim of replacing expensive, crude oil-based polymers as environmentally friendly and renewable alternatives. The protein structure determines the ability of protein chains to form intra- and intermolecular bonds, whereas the degree of cross-linking depends on the amino acid composition and molecular weight of the protein, besides the conditions used in film preparation and processing. The functionality varies significantly depending on the type of protein and affects the resulting film quality and properties. This paper reviews the methods used in examination of molecular interactions in protein films and discusses how these intermolecular interactions can be quantified. The qualitative determination methods can be distinguished by structural analysis of solutions (electrophoretic analysis, size exclusion chromatography) and analysis of solid films (spectroscopy techniques, X-ray scattering methods). To quantify molecular interactions involved, two methods were found to be the most suitable: protein film swelling and solubility. The importance of non-covalent and covalent interactions in protein films can be investigated using different solvents. The research was focused on whey protein, whereas soy protein and wheat gluten were included as further examples of proteins.

  12. Determination and Quantification of Molecular Interactions in Protein Films: A Review

    PubMed Central

    Hammann, Felicia; Schmid, Markus

    2014-01-01

    Protein based films are nowadays also prepared with the aim of replacing expensive, crude oil-based polymers as environmentally friendly and renewable alternatives. The protein structure determines the ability of protein chains to form intra- and intermolecular bonds, whereas the degree of cross-linking depends on the amino acid composition and molecular weight of the protein, besides the conditions used in film preparation and processing. The functionality varies significantly depending on the type of protein and affects the resulting film quality and properties. This paper reviews the methods used in examination of molecular interactions in protein films and discusses how these intermolecular interactions can be quantified. The qualitative determination methods can be distinguished by structural analysis of solutions (electrophoretic analysis, size exclusion chromatography) and analysis of solid films (spectroscopy techniques, X-ray scattering methods). To quantify molecular interactions involved, two methods were found to be the most suitable: protein film swelling and solubility. The importance of non-covalent and covalent interactions in protein films can be investigated using different solvents. The research was focused on whey protein, whereas soy protein and wheat gluten were included as further examples of proteins. PMID:28788285

  13. Sequence co-evolution gives 3D contacts and structures of protein complexes

    PubMed Central

    Hopf, Thomas A; Schärfe, Charlotta P I; Rodrigues, João P G L M; Green, Anna G; Kohlbacher, Oliver; Sander, Chris; Bonvin, Alexandre M J J; Marks, Debora S

    2014-01-01

    Protein–protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions, and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein–protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequences, we expect that the method can be generalized to genome-wide elucidation of protein–protein interaction networks and used for interaction predictions at residue resolution. DOI: http://dx.doi.org/10.7554/eLife.03430.001 PMID:25255213

  14. Geometrical analysis of Cys-Cys bridges in proteins and their prediction from incomplete structural information

    NASA Technical Reports Server (NTRS)

    Goldblum, A.; Rein, R.

    1987-01-01

    Analysis of C-alpha atom positions from cysteines involved in disulphide bridges in protein crystals shows that their geometric characteristics are unique with respect to other Cys-Cys, non-bridging pairs. They may be used for predicting disulphide connections in incompletely determined protein structures, such as low resolution crystallography or theoretical folding experiments. The basic unit for analysis and prediction is the 3 x 3 distance matrix for C-alpha positions of residues (i - 1), Cys(i), (i + 1) with (j - 1), Cys(j), (j + 1). In each of its column, row and diagonal vectors, the outer distances are larger than the central distance. This analysis is compared with some analytical models.
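
The 3 x 3 distance matrix in this abstract can be sketched as follows. This is an illustrative reading, not the authors' code: the function names and toy coordinates are assumptions, and the check takes the abstract's criterion to mean that in every row, column and diagonal the middle entry is smaller than both outer entries.

```python
import math

def distance_matrix_3x3(ca_coords, i, j):
    """C-alpha distances between residues (i-1, i, i+1) and (j-1, j, j+1).

    `ca_coords` is a list of (x, y, z) tuples indexed by residue position.
    Rows follow the residues around i, columns the residues around j.
    """
    rows = (i - 1, i, i + 1)
    cols = (j - 1, j, j + 1)
    return [[math.dist(ca_coords[r], ca_coords[c]) for c in cols] for r in rows]

def outer_exceeds_central(m):
    """True if, in each row, column and diagonal, both outer distances
    are larger than the central one (assumed reading of the criterion)."""
    vectors = [tuple(row) for row in m]                                # rows
    vectors += [tuple(m[r][c] for r in range(3)) for c in range(3)]    # columns
    vectors += [(m[0][0], m[1][1], m[2][2]), (m[0][2], m[1][1], m[2][0])]
    return all(v[0] > v[1] and v[2] > v[1] for v in vectors)

# Toy coordinates (in Å): two three-residue segments crossing at right angles.
crossing = [(-3.8, 0, 0), (0, 0, 0), (3.8, 0, 0),
            (0, -3.8, 5), (0, 0, 5), (0, 3.8, 5)]
# Same first segment, second segment running parallel to it.
parallel = [(-3.8, 0, 0), (0, 0, 0), (3.8, 0, 0),
            (-3.8, 5, 0), (0, 5, 0), (3.8, 5, 0)]

print(outer_exceeds_central(distance_matrix_3x3(crossing, 1, 4)))
print(outer_exceeds_central(distance_matrix_3x3(parallel, 1, 4)))
```

In the toy data the crossing segments satisfy the pattern and the parallel segments do not, which shows how the criterion separates geometries even when the central distance is identical.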

  15. SCOWLP classification: Structural comparison and analysis of protein binding regions

    PubMed Central

    Teyra, Joan; Paszkowski-Rogacz, Maciej; Anders, Gerd; Pisabarro, M Teresa

    2008-01-01

    Background: Detailed information about protein interactions is critical for our understanding of the principles governing protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands bound either in the same or different binding regions. Thus, the structural interactome requires the development of tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others and also facilitate a detailed comparative analysis of the interacting information for specific protein binding regions at atomic level. Such classification might be of potential use for deciphering protein interaction networks, understanding protein function, rational engineering and design. Description: Protein binding regions (PBRs) might be ideally described as well-defined separated regions that share no interacting residues with one another. However, PBRs are often irregular, discontinuous and can share a wide range of interacting residues among them. The criteria to define an individual binding region can be often arbitrary and may differ from other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed. We extract detailed interaction information of protein domains, peptides and interfacial solvent from the SCOWLP database and we classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlapping of interacting residues mapped in pair-wise structural alignments. We perform our classification with agglomerative hierarchical clustering using the complete-linkage method. Our classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature especially interesting for those protein families with conflictive binding regions. The hierarchical classification of PBRs is implemented into the SCOWLP database and extends the SCOP classification with three additional family sub-levels: Binding Region, Interface and Contacting Domains. SCOWLP contains 9,334 binding regions distributed within 2,561 families. In 65% of the cases we observe families containing more than one binding region. Besides, 22% of the regions are forming complex with more than one different protein family. Conclusion: The current SCOWLP classification and its web application represent a framework for the study of protein interfaces and comparative analysis of protein family binding regions. This comparison can be performed at atomic level and allows the user to study interactome conservation and variability. The new SCOWLP classification may be of great utility for reconstruction of protein complexes, understanding protein networks and ligand design. SCOWLP will be updated with every SCOP release. The web application is available at . PMID:18182098
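
The clustering procedure named in this abstract (an overlap-based similarity index, then complete-linkage agglomerative clustering at a chosen cut-off) can be sketched as below. The abstract does not give the formula for the similarity index, so the Jaccard index is used here as a stand-in assumption; the region names and residue positions are invented for illustration.

```python
def overlap_similarity(a, b):
    """Jaccard overlap of two sets of aligned interacting-residue positions.

    Stand-in for the SCOWLP similarity index, whose exact definition is
    not stated in the abstract.
    """
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def complete_linkage(regions, cutoff):
    """Agglomerative clustering with complete linkage.

    The similarity of two clusters is that of their LEAST similar pair of
    members. The most similar pair of clusters is merged repeatedly until
    no pair reaches `cutoff`.
    """
    clusters = [[name] for name in regions]

    def linkage(c1, c2):
        return min(overlap_similarity(regions[x], regions[y])
                   for x in c1 for y in c2)

    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                s = linkage(clusters[p], clusters[q])
                if best is None or s > best[0]:
                    best = (s, p, q)
        s, p, q = best
        if s < cutoff:
            break
        clusters[p] = clusters[p] + clusters[q]   # p < q, so index p is stable
        del clusters[q]
    return [sorted(c) for c in clusters]

# Hypothetical binding regions as sets of aligned residue positions.
regions = {
    "A": {10, 11, 12, 13},
    "B": {11, 12, 13, 14},
    "C": {40, 41, 42},
    "D": {41, 42, 43},
}
print(complete_linkage(regions, 0.5))    # [['A', 'B'], ['C', 'D']]
print(complete_linkage(regions, 0.55))   # [['A', 'B'], ['C'], ['D']]
```

Running the same data at two cut-offs mirrors the abstract's point that the classification is calculated at several similarity thresholds: a stricter cut-off keeps regions C and D apart, a looser one merges them.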

  16. Structural deformation upon protein-protein interaction: A structural alphabet approach

    PubMed Central

    Martin, Juliette; Regad, Leslie; Lecornet, Hélène; Camproux, Anne-Claude

    2008-01-01

    Background: In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. Results: In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described using a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Conclusion: Our study provides qualitative information about induced fit. These results could be of help for flexible docking. PMID:18307769
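
The central quantity in this abstract, the fraction of structural letters modified upon binding, can be sketched as a position-by-position comparison of two encoded backbones. The function name, the letter strings and the interface positions below are invented for illustration; a real analysis would use the 27-letter structural alphabet encoding of the bound and unbound structures.

```python
def fraction_modified(unbound, bound, positions=None):
    """Fraction of positions whose structural letter differs between forms.

    `unbound` and `bound` are equal-length strings of structural letters
    for the same protein. `positions` optionally restricts the comparison
    to a subset of indices, e.g. the interface residues.
    """
    if len(unbound) != len(bound):
        raise ValueError("encoded sequences must have the same length")
    idx = list(range(len(unbound)) if positions is None else positions)
    if not idx:
        raise ValueError("no positions to compare")
    changed = sum(1 for k in idx if unbound[k] != bound[k])
    return changed / len(idx)

# Made-up encodings of one protein in its unbound and bound forms.
unbound = "AABBCCDDEE"
bound = "AABXCCDYEZ"
interface = [3, 7, 8, 9]      # hypothetical interface positions

print(fraction_modified(unbound, bound))             # 0.3
print(fraction_modified(unbound, bound, interface))  # 0.75
```

Comparing the whole-chain fraction with the interface-only fraction is the same contrast the abstract reports (36% overall versus 41% at the interface), and running the function on a control pair would give the baseline against which induced fit is judged.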

  17. Structural deformation upon protein-protein interaction: a structural alphabet approach.

    PubMed

    Martin, Juliette; Regad, Leslie; Lecornet, Hélène; Camproux, Anne-Claude

    2008-02-28

    In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described using a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Our study provides qualitative information about induced fit. These results could be of help for flexible docking.

  18. Nonketotic hyperglycinemia: Functional assessment of missense variants in GLDC to understand phenotypes of the disease.

    PubMed

    Bravo-Alonso, Irene; Navarrete, Rosa; Arribas-Carreira, Laura; Perona, Almudena; Abia, David; Couce, María Luz; García-Cazorla, Angels; Morais, Ana; Domingo, Rosario; Ramos, María Antonia; Swanson, Michael A; Van Hove, Johan L K; Ugarte, Magdalena; Pérez, Belén; Pérez-Cerdá, Celia; Rodríguez-Pombo, Pilar

    2017-06-01

    The rapid analysis of genomic data is providing effective mutational confirmation in patients with clinical and biochemical hallmarks of a specific disease. This is the case for nonketotic hyperglycinemia (NKH), a Mendelian disorder causing seizures in neonates and early-infants, primarily due to mutations in the GLDC gene. However, understanding the impact of missense variants identified in this gene is a major challenge for the application of genomics into clinical practice. Herein, a comprehensive functional and structural analysis of 19 GLDC missense variants identified in a cohort of 26 NKH patients was performed. Mutant cDNA constructs were expressed in COS7 cells followed by enzymatic assays and Western blot analysis of the GCS P-protein to assess the residual activity and mutant protein stability. Structural analysis, based on molecular modeling of the 3D structure of GCS P-protein, was also performed. We identify hypomorphic variants that produce attenuated phenotypes with improved prognosis of the disease. Structural analysis allows us to interpret the effects of mutations on protein stability and catalytic activity, providing molecular evidence for clinical outcome and disease severity. Moreover, we identify an important number of mutants whose loss-of-functionality is associated with instability and, thus, are potential targets for rescue using folding therapeutic approaches. © 2017 Wiley Periodicals, Inc.

  19. There is Diversity in Disorder-"In all Chaos there is a Cosmos, in all Disorder a Secret Order".

    PubMed

    Nielsen, Jakob T; Mulder, Frans A A

    2016-01-01

    The protein universe consists of a continuum of structures ranging from full order to complete disorder. As the structured part of the proteome has been intensively studied, stably folded proteins are increasingly well documented and understood. However, proteins that are fully, or in large part, disordered are much less well characterized. Here we collected NMR chemical shifts in a small database for 117 protein sequences that are known to contain disorder. We demonstrate that NMR chemical shift data can be brought to bear as an exquisite judge of protein disorder at the residue level, and help in validation. With the help of secondary chemical shift analysis we demonstrate that the proteins in the database span the full spectrum of disorder, but still, largely segregate into two classes; disordered with small segments of order scattered along the sequence, and structured with small segments of disorder inserted between the different structured regions. A detailed analysis reveals that the distribution of order/disorder along the sequence shows a complex and asymmetric distribution, that is highly protein-dependent. Access to ratified training data further suggests an avenue to improving prediction of disorder from sequence.

  20. Proteomic analysis of skeletal organic matrix from the stony coral Stylophora pistillata

    PubMed Central

    Drake, Jeana L.; Mass, Tali; Haramaty, Liti; Zelzion, Ehud; Bhattacharya, Debashish; Falkowski, Paul G.

    2013-01-01

    It has long been recognized that a suite of proteins exists in coral skeletons that is critical for the oriented precipitation of calcium carbonate crystals, yet these proteins remain poorly characterized. Using liquid chromatography-tandem mass spectrometry analysis of proteins extracted from the cell-free skeleton of the hermatypic coral, Stylophora pistillata, combined with a draft genome assembly from the cnidarian host cells of the same species, we identified 36 coral skeletal organic matrix proteins. The proteome of the coral skeleton contains an assemblage of adhesion and structural proteins as well as two highly acidic proteins that may constitute a unique coral skeletal organic matrix protein subfamily. We compared the 36 skeletal organic matrix protein sequences to genome and transcriptome data from three other corals, three additional invertebrates, one vertebrate, and three single-celled organisms. This work represents a unique extensive proteomic analysis of biomineralization-related proteins in corals from which we identify a biomineralization “toolkit,” an organic scaffold upon which aragonite crystals can be deposited in specific orientations to form a phenotypically identifiable structure. PMID:23431140

  1. SCOPe: Manual Curation and Artifact Removal in the Structural Classification of Proteins - extended Database.

    PubMed

    Chandonia, John-Marc; Fox, Naomi K; Brenner, Steven E

    2017-02-03

    SCOPe (Structural Classification of Proteins-extended, http://scop.berkeley.edu) is a database of relationships between protein structures that extends the Structural Classification of Proteins (SCOP) database. SCOP is an expert-curated ordering of domains from the majority of proteins of known structure in a hierarchy according to structural and evolutionary relationships. SCOPe classifies the majority of protein structures released since SCOP development concluded in 2009, using a combination of manual curation and highly precise automated tools, aiming to have the same accuracy as fully hand-curated SCOP releases. SCOPe also incorporates and updates the ASTRAL compendium, which provides several databases and tools to aid in the analysis of the sequences and structures of proteins classified in SCOPe. SCOPe continues high-quality manual classification of new superfamilies, a key feature of SCOP. Artifacts such as expression tags are now separated into their own class, in order to distinguish them from the homology-based annotations in the remainder of the SCOPe hierarchy. SCOPe 2.06 contains 77,439 Protein Data Bank entries, double the 38,221 structures classified in SCOP. Copyright © 2016 The Author(s). Published by Elsevier Ltd. All rights reserved.

  2. Analysis of the structural organization and thermal stability of two spermadhesins. Calorimetric, circular dichroic and Fourier-transform infrared spectroscopic studies.

    PubMed

    Menéndez, M; Gasset, M; Laynez, J; López-Zumel, C; Usobiaga, P; Töpfer-Petersen, E; Calvete, J J

    1995-12-15

    The CUB domain is a widespread 110-amino-acid module found in functionally diverse, often developmentally regulated proteins, for which an antiparallel beta-barrel topology similar to that in immunoglobulin V domains has been predicted. Spermadhesins have been proposed as a subgroup of this protein family built up by a single CUB domain architecture. To test the proposed structural model, we have analyzed the structural organization of two members of the spermadhesin protein family, porcine seminal plasma proteins I/II (PSP-I/PSP-II) heterodimer and bovine acidic seminal fluid protein (aSFP) homodimer, using differential scanning calorimetry, far-ultraviolet circular dichroism and Fourier-transform infrared spectroscopy. Thermal unfolding of PSP-I/PSP-II and aSFP was irreversible and followed a one-step process with transition temperatures (Tm) of 60.5 °C and 78.6 °C, respectively. The calorimetric enthalpy changes (delta Hcat) of thermal denaturation were 439 kJ/mol for PSP-I/PSP-II and 660 kJ/mol for aSFP dimer. Analysis of the calorimetric curves of PSP-I/PSP-II showed that the entire dimer constituted the cooperative unfolding unit. Fourier-transform infrared spectroscopy and deconvolution of circular dichroic spectra using a convex constraint analysis indicated that beta-structure and turns are the major structural elements of both PSP-I/PSP-II (53% of beta-sheet, 21% of turns) and aSFP (44% of beta-sheet, 36% of turns), and that the porcine and the bovine proteins contain little, if any, alpha-helical structure. Taken together, our results indicate that the porcine and the bovine spermadhesin molecules are probably all-beta-structure proteins, and would support a beta-barrel topology like that predicted for the CUB domain. Other beta-structure folds, such as the Greek-key pattern characteristic of many carbohydrate-binding protein domains, cannot be eliminated. Finally, the same combination of biophysical techniques was used to characterize the residual secondary structure of thermally denatured forms of PSP-I/PSP-II and aSFP, and to emphasize the aggregation tendency of these forms.

  3. Frequent side chain methyl carbon-oxygen hydrogen bonding in proteins revealed by computational and stereochemical analysis of neutron structures.

    PubMed

    Yesselman, Joseph D; Horowitz, Scott; Brooks, Charles L; Trievel, Raymond C

    2015-03-01

    The propensity of backbone Cα atoms to engage in carbon-oxygen (CH···O) hydrogen bonding is well-appreciated in protein structure, but side chain CH···O hydrogen bonding remains largely uncharacterized. The extent to which side chain methyl groups in proteins participate in CH···O hydrogen bonding is examined through a survey of neutron crystal structures, quantum chemistry calculations, and molecular dynamics simulations. Using these approaches, methyl groups were observed to form stabilizing CH···O hydrogen bonds within protein structure that are maintained through protein dynamics and participate in correlated motion. Collectively, these findings illustrate that side chain methyl CH···O hydrogen bonding contributes to the energetics of protein structure and folding. © 2014 Wiley Periodicals, Inc.

  4. Rebelling for a Reason: Protein Structural “Outliers”

    PubMed Central

    Arumugam, Gandhimathi; Nair, Anu G.; Hariharaputran, Sridhar; Ramanathan, Sowdhamini

    2013-01-01

    Analysis of structural variation in domain superfamilies can reveal constraints in protein evolution which aids protein structure prediction and classification. Structure-based sequence alignment of distantly related proteins, organized in PASS2 database, provides clues about structurally conserved regions among different functional families. Some superfamily members show large structural differences which are functionally relevant. This paper analyses the impact of structural divergence on function for multi-member superfamilies, selected from the PASS2 superfamily alignment database. Functional annotations within superfamilies, with structural outliers or ‘rebels’, are discussed in the context of structural variations. Overall, these data reinforce the idea that functional similarities cannot be extrapolated from mere structural conservation. The implication for fold-function prediction is that the functional annotations can only be inherited with very careful consideration, especially at low sequence identities. PMID:24073209

  5. Structure and Self-Assembly of the Calcium Binding Matrix Protein of Human Metapneumovirus

    PubMed Central

    Leyrat, Cedric; Renner, Max; Harlos, Karl; Huiskonen, Juha T.; Grimes, Jonathan M.

    2014-01-01

    The matrix protein (M) of paramyxoviruses plays a key role in determining virion morphology by directing viral assembly and budding. Here, we report the crystal structure of the human metapneumovirus M at 2.8 Å resolution in its native dimeric state. The structure reveals the presence of a high-affinity Ca2+ binding site. Molecular dynamics simulations (MDS) predict a secondary lower-affinity site that correlates well with data from fluorescence-based thermal shift assays. By combining small-angle X-ray scattering with MDS and ensemble analysis, we captured the structure and dynamics of M in solution. Our analysis reveals a large positively charged patch on the protein surface that is involved in membrane interaction. Structural analysis of DOPC-induced polymerization of M into helical filaments using electron microscopy leads to a model of M self-assembly. The conservation of the Ca2+ binding sites suggests a role for calcium in the replication and morphogenesis of pneumoviruses. PMID:24316400

  6. Molecular dynamics simulations and structure-based network analysis reveal structural and functional aspects of G-protein coupled receptor dimer interactions.

    PubMed

    Baltoumas, Fotis A; Theodoropoulou, Margarita C; Hamodrakas, Stavros J

    2016-06-01

    A significant amount of experimental evidence suggests that G-protein coupled receptors (GPCRs) do not act exclusively as monomers but also form biologically relevant dimers and oligomers. However, the structural determinants, stoichiometry and functional importance of GPCR oligomerization remain topics of intense speculation. In this study we attempted to evaluate the nature and dynamics of GPCR oligomeric interactions. A representative set of GPCR homodimers were studied through Coarse-Grained Molecular Dynamics simulations, combined with interface analysis and concepts from network theory for the construction and analysis of dynamic structural networks. Our results highlight important structural determinants that seem to govern receptor dimer interactions. A conserved dynamic behavior was observed among different GPCRs, including receptors belonging in different GPCR classes. Specific GPCR regions were highlighted as the core of the interfaces. Finally, correlations of motion were observed between parts of the dimer interface and GPCR segments participating in ligand binding and receptor activation, suggesting the existence of mechanisms through which dimer formation may affect GPCR function. The results of this study can be used to drive experiments aimed at exploring GPCR oligomerization, as well as in the study of transmembrane protein-protein interactions in general.

  7. Molecular dynamics simulations and structure-based network analysis reveal structural and functional aspects of G-protein coupled receptor dimer interactions

    NASA Astrophysics Data System (ADS)

    Baltoumas, Fotis A.; Theodoropoulou, Margarita C.; Hamodrakas, Stavros J.

    2016-06-01

    A significant amount of experimental evidence suggests that G-protein coupled receptors (GPCRs) do not act exclusively as monomers but also form biologically relevant dimers and oligomers. However, the structural determinants, stoichiometry and functional importance of GPCR oligomerization remain topics of intense speculation. In this study we attempted to evaluate the nature and dynamics of GPCR oligomeric interactions. A representative set of GPCR homodimers were studied through Coarse-Grained Molecular Dynamics simulations, combined with interface analysis and concepts from network theory for the construction and analysis of dynamic structural networks. Our results highlight important structural determinants that seem to govern receptor dimer interactions. A conserved dynamic behavior was observed among different GPCRs, including receptors belonging in different GPCR classes. Specific GPCR regions were highlighted as the core of the interfaces. Finally, correlations of motion were observed between parts of the dimer interface and GPCR segments participating in ligand binding and receptor activation, suggesting the existence of mechanisms through which dimer formation may affect GPCR function. The results of this study can be used to drive experiments aimed at exploring GPCR oligomerization, as well as in the study of transmembrane protein-protein interactions in general.

  8. GALT protein database, a bioinformatics resource for the management and analysis of structural features of a galactosemia-related protein and its mutants.

    PubMed

    d'Acierno, Antonio; Facchiano, Angelo; Marabotti, Anna

    2009-06-01

    We describe the GALT-Prot database and its related web-based application that have been developed to collect information about the structural and functional effects of mutations on the human enzyme galactose-1-phosphate uridyltransferase (GALT) involved in the genetic disease named galactosemia type I. Besides a list of missense mutations at gene and protein sequence levels, GALT-Prot reports the analysis results of mutant GALT structures. In addition to the structural information about the wild-type enzyme, the database also includes structures of over 100 single point mutants simulated by means of a computational procedure; each mutant was analyzed with several bioinformatics programs in order to investigate the effect of the mutation. The web-based interface allows querying of the database, and several links are also provided in order to guarantee a high integration with other resources already present on the web. Moreover, the architecture of the database and the web application is flexible and can be easily adapted to store data related to other proteins with point mutations. GALT-Prot is freely available at http://bioinformatica.isa.cnr.it/GALT/.

  9. Conservation of coevolving protein interfaces bridges prokaryote–eukaryote homologies in the twilight zone

    PubMed Central

    Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

    2016-01-01

    Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach. PMID:27965389

  10. Structural motif screening reveals a novel, conserved carbohydrate-binding surface in the pathogenesis-related protein PR-5d.

    PubMed

    Doxey, Andrew C; Cheng, Zhenyu; Moffatt, Barbara A; McConkey, Brendan J

    2010-08-03

    Aromatic amino acids play a critical role in protein-glycan interactions. Clusters of surface aromatic residues and their features may therefore be useful in distinguishing glycan-binding sites as well as predicting novel glycan-binding proteins. In this work, a structural bioinformatics approach was used to screen the Protein Data Bank (PDB) for coplanar aromatic motifs similar to those found in known glycan-binding proteins. The proteins identified in the screen were significantly associated with carbohydrate-related functions according to gene ontology (GO) enrichment analysis, and predicted motifs were found frequently within novel folds and glycan-binding sites not included in the training set. In addition to numerous binding sites predicted in structural genomics proteins of unknown function, one novel prediction was a surface motif (W34/W36/W192) in the tobacco pathogenesis-related protein, PR-5d. Phylogenetic analysis revealed that the surface motif is exclusive to a subfamily of PR-5 proteins from the Solanaceae family of plants, and is absent completely in more distant homologs. To confirm PR-5d's insoluble-polysaccharide binding activity, a cellulose-pulldown assay of tobacco proteins was performed and PR-5d was identified in the cellulose-binding fraction by mass spectrometry. Based on the combined results, we propose that the putative binding site in PR-5d may be an evolutionary adaptation of Solanaceae plants including potato, tomato, and tobacco, towards defense against cellulose-containing pathogens such as species of the deadly oomycete genus, Phytophthora. More generally, the results demonstrate that coplanar aromatic clusters on protein surfaces are a structural signature of glycan-binding proteins, and can be used to computationally predict novel glycan-binding proteins from 3D structure.

  11. Automated structure determination of proteins with the SAIL-FLYA NMR method.

    PubMed

    Takeda, Mitsuhiro; Ikeya, Teppei; Güntert, Peter; Kainosho, Masatsune

    2007-01-01

    The labeling of proteins with stable isotopes enhances the NMR method for the determination of 3D protein structures in solution. Stereo-array isotope labeling (SAIL) provides an optimal stereospecific and regiospecific pattern of stable isotopes that yields sharpened lines, spectral simplification without loss of information, and the ability to collect rapidly and evaluate fully automatically the structural restraints required to solve a high-quality solution structure for proteins up to twice as large as those that can be analyzed using conventional methods. Here, we describe a protocol for the preparation of SAIL proteins by cell-free methods, including the preparation of S30 extract and their automated structure analysis using the FLYA algorithm and the program CYANA. Once efficient cell-free expression of the unlabeled or uniformly labeled target protein has been achieved, the NMR sample preparation of a SAIL protein can be accomplished in 3 d. A fully automated FLYA structure calculation can be completed in 1 d on a powerful computer system.

  12. Structural characterization and physicochemical properties of protein extracted from soybean meal assisted by steam flash-explosion with dilute acid soaking.

    PubMed

    Zhang, Yanpeng; Yang, Ruijin; Zhang, Weinong; Hu, Zhixiong; Zhao, Wei

    2017-03-15

    The aim of this work was to analyze the influence of steam flash-explosion (SFE) with dilute acid soaking pretreatment on the structural characteristics and physicochemical properties of protein from soybean meal (SBM). The pretreatment led to depolymerisation of soy protein isolate (SPI) and formation of new protein aggregates through non-disulfide covalent bonds, which resulted in a broader MW distribution of SPI. CD spectroscopy showed that the SFE treatment induced minor changes in secondary structure; however, the intrinsic tryptophan fluorescence revealed that acid soaking and SFE treatment pronouncedly altered the tertiary structure of SPI. The protein zeta potential was shown to increase after SFE treatment, which was attributed to the changes in protein structure and the covalent coupling between carbohydrate and protein. These results contribute to clarifying the mechanisms of the effect of pretreatment on SPI structure, thus moving further toward implementing SFE in the processing chain of SPI. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Proteomic analysis of the crayfish gastrolith chitinous extracellular matrix reveals putative protein complexes and a central role for GAP 65.

    PubMed

    Glazer, Lilah; Roth, Ziv; Weil, Simy; Aflalo, Eliahu D; Khalaila, Isam; Sagi, Amir

    2015-10-14

    Chitin is a major component of arthropod cuticles, where it forms a three-dimensional network that constitutes the scaffold upon which cuticles form. The chitin fibers that form this network are closely associated with specific structural proteins, while the cuticular matrix contains many additional structural, enzymatic and other proteins. We study the crayfish gastrolith as a simple model for the assembly of calcified cuticular structures, with particular focus on the proteins involved in this process. The present study integrates a gastrolith-forming epithelium transcriptomic library with data from mass spectrometry analysis of proteins extracted from the gastrolith matrix to obtain a near-complete picture of gastrolith protein content. Using native protein separation we identified 24 matrix proteins, of which 14 are novel. Further analysis led to discovery of three putative protein complexes, all containing GAP 65, the most abundant gastrolith structural protein. Using immunological methods we further studied the role of GAP 65 in the gastrolith matrix and forming epithelium, as well as in the newly identified protein complexes. We propose that gastrolith matrix construction is a sequential process in which protein complexes are dynamically assembled and disassembled around GAP 65, thus changing their functional properties to perform each step in the construction process. The scientific interest on which this study is based arises from three main features of gastroliths: (1) Gastroliths possess partial analogy to cuticles both in structural and molecular properties, and may be regarded, with the appropriate reservations (see Introduction), as simple models for cuticle assembly. At the same time, gastroliths are terminally assembled during a well-defined period, which can be controlled in the laboratory, making them significantly easier to study than cuticles. (2) Gastroliths, like the crayfish exoskeleton, contain stable amorphous calcium carbonate (ACC) rather than crystalline calcite. The biological mechanism for the stabilization of a naturally unstable, but at the same time biologically highly available, calcium carbonate polymorph is of great interest from the pharmaceutical point of view. (3) The gastrolith organic matrix is based on a highly structured chitin network that interacts with a variety of substances. This biologically manipulated, biodegradable structure is in itself of biotechnological and pharmaceutical potential. A growing body of evidence indicates that proteins play central roles in all above aspects of gastrolith construction. This study offers the first comprehensive screening of gastrolith proteins, and we believe that the analysis presented in this work can not only help reveal basic biological questions regarding assembly of mineralized and non-mineralized cuticular structures, but may also serve as basis for applied research in the fields of agriculture (e.g. cuticle-based pest management), health (e.g. bioavailable calcium supplements and biodegradable drug carriers) and materials science (e.g. non-toxic scaffolds for water purification). Copyright © 2015. Published by Elsevier B.V.

  14. Insights into Bacteriophage T5 Structure from Analysis of Its Morphogenesis Genes and Protein Components

    PubMed Central

    Zivanovic, Yvan; Confalonieri, Fabrice; Ponchon, Luc; Lurz, Rudi; Chami, Mohamed; Flayhan, Ali; Renouard, Madalena; Huet, Alexis; Decottignies, Paulette; Davidson, Alan R.; Breyton, Cécile

    2014-01-01

    Bacteriophage T5 represents a large family of lytic Siphoviridae infecting Gram-negative bacteria. The low-resolution structure of T5 showed the T=13 geometry of the capsid and the unusual trimeric organization of the tail tube, and the assembly pathway of the capsid was established. Although major structural proteins of T5 have been identified in these studies, most of the genes encoding the morphogenesis proteins remained to be identified. Here, we combine a proteomic analysis of T5 particles with a bioinformatic study and electron microscopic immunolocalization to assign function to the genes encoding the structural proteins, the packaging proteins, and other nonstructural components required for T5 assembly. A head maturation protease that likely accounts for the cleavage of the different capsid proteins is identified. Two other proteins involved in capsid maturation add originality to the T5 capsid assembly mechanism: the single head-to-tail joining protein, which closes the T5 capsid after DNA packaging, and the nicking endonuclease responsible for the single-strand interruptions in the T5 genome. We localize most of the tail proteins that were hitherto uncharacterized and provide a detailed description of the tail tip composition. Our findings highlight novel variations of viral assembly strategies and of virion particle architecture. They further recommend T5 for exploring phage structure and assembly and for deciphering conformational rearrangements that accompany DNA transfer from the capsid to the host cytoplasm. PMID:24198424

  15. Comparison of intrinsic dynamics of cytochrome p450 proteins using normal mode analysis

    PubMed Central

    Dorner, Mariah E; McMunn, Ryan D; Bartholow, Thomas G; Calhoon, Brecken E; Conlon, Michelle R; Dulli, Jessica M; Fehling, Samuel C; Fisher, Cody R; Hodgson, Shane W; Keenan, Shawn W; Kruger, Alyssa N; Mabin, Justin W; Mazula, Daniel L; Monte, Christopher A; Olthafer, Augustus; Sexton, Ashley E; Soderholm, Beatrice R; Strom, Alexander M; Hati, Sanchita

    2015-01-01

    Cytochrome P450 enzymes are hemeproteins that catalyze the monooxygenation of a wide range of structurally diverse substrates of endogenous and exogenous origin. These heme monooxygenases receive electrons from NADH/NADPH via electron transfer proteins. The cytochrome P450 enzymes, which constitute a diverse superfamily of more than 8,700 proteins, share a common tertiary fold but < 25% sequence identity. Based on their electron transfer protein partner, cytochrome P450 proteins are classified into six broad classes. Traditional methods of pro are based on the canonical paradigm that attributes proteins' function to their three-dimensional structure, which is determined by their primary structure, that is, the amino acid sequence. It is increasingly recognized that protein dynamics play an important role in molecular recognition and catalytic activity. As the mobility of a protein is an intrinsic property that is encrypted in its primary structure, we examined whether different classes of cytochrome P450 enzymes display any unique patterns of intrinsic mobility. Normal mode analysis was performed to characterize the intrinsic dynamics of five classes of cytochrome P450 proteins. The present study revealed that cytochrome P450 enzymes share a strong dynamic similarity (root mean squared inner product > 55% and Bhattacharyya coefficient > 80%), despite the low sequence identity (< 25%) and sequence similarity (< 50%) across the cytochrome P450 superfamily. Noticeable differences in Cα atom fluctuations of structural elements responsible for substrate binding were observed. These differences in residue fluctuations might be crucial for substrate selectivity in these enzymes. PMID:26130403
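
    The root mean squared inner product (RMSIP) quoted in the abstract above is commonly defined as the square root of the summed squared overlaps between two sets of normal modes, divided by the number of modes in each set. The following is a minimal Python sketch of that definition, not the authors' code. It assumes each protein's low-frequency modes are supplied as equal-sized lists of orthonormal vectors over matched atoms; the function name `rmsip` and the toy vectors are hypothetical, and the Bhattacharyya coefficient is not covered.

```python
import math


def rmsip(modes_a, modes_b):
    """Root mean squared inner product between two sets of mode vectors.

    Assumption: both sets contain the same number of orthonormal vectors
    of equal length (e.g. 3 x number of matched C-alpha atoms).
    """
    if len(modes_a) != len(modes_b):
        raise ValueError("both mode sets must contain the same number of modes")
    n_modes = len(modes_a)
    # Sum of squared dot products over every pair (u from A, v from B).
    total = 0.0
    for u in modes_a:
        for v in modes_b:
            overlap = sum(x * y for x, y in zip(u, v))
            total += overlap * overlap
    return math.sqrt(total / n_modes)


# Toy example: identical subspaces overlap fully, orthogonal ones not at all.
set_1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
set_2 = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
print(rmsip(set_1, set_1))  # 1.0
print(rmsip(set_1, set_2))  # 0.0
```

    A value near 1 means the two mode sets span nearly the same subspace, so a threshold such as the > 55% reported above is read on this 0 to 1 scale.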

  16. Effects of power ultrasound on oxidation and structure of beef proteins during curing processing.

    PubMed

    Kang, Da-Cheng; Zou, Yun-He; Cheng, Yu-Ping; Xing, Lu-Juan; Zhou, Guang-Hong; Zhang, Wan-Gang

    2016-11-01

    The aim of this study was to evaluate the effects of power ultrasound intensity (PUS, 2.39, 6.23, 11.32 and 20.96 W cm⁻²) and treatment time (30, 60, 90 and 120 min) on the oxidation and structure of beef proteins during the brining procedure with 6% NaCl concentration. The investigation was conducted with an ultrasonic generator with the frequency of 20 kHz and fresh beef at 48 h after slaughter. Analysis of TBARS (Thiobarbituric acid reactive substances) contents showed that PUS treatment significantly increased the extent of lipid oxidation compared to static brining (P<0.05). As indicators of protein oxidation, the carbonyl contents were significantly affected by PUS (P<0.05). SDS-PAGE analysis showed that PUS treatment increased protein aggregation through disulfide cross-linking, indicated by the decreasing content of total sulfhydryl groups which would contribute to protein oxidation. In addition, changes in protein structure after PUS treatment are suggested by the increases in free sulfhydryl residues and protein surface hydrophobicity. Fourier transformed infrared spectroscopy (FTIR) provided further information about the changes in protein secondary structures with increases in β-sheet and decreases in α-helix contents after PUS processing. These results indicate that PUS leads to changes in structures and oxidation of beef proteins caused by mechanical effects of cavitation and the resultant generation of free radicals. Copyright © 2016 Elsevier B.V. All rights reserved.

  17. A series of PDB related databases for everyday needs.

    PubMed

    Joosten, Robbie P; te Beek, Tim A H; Krieger, Elmar; Hekkelman, Maarten L; Hooft, Rob W W; Schneider, Reinhard; Sander, Chris; Vriend, Gert

    2011-01-01

    The Protein Data Bank (PDB) is the world-wide repository of macromolecular structure information. We present a series of databases that run parallel to the PDB. Each database holds one entry, if possible, for each PDB entry. DSSP holds the secondary structure of the proteins. PDBREPORT holds reports on the structure quality and lists errors. HSSP holds a multiple sequence alignment for all proteins. The PDBFINDER holds easy to parse summaries of the PDB file content, augmented with essentials from the other systems. PDB_REDO holds re-refined, and often improved, copies of all structures solved by X-ray. WHY_NOT summarizes why certain files could not be produced. All these systems are updated weekly. The data sets can be used for the analysis of properties of protein structures in areas ranging from structural genomics, to cancer biology and protein design.

  18. Non-Uniform Sampling and J-UNIO Automation for Efficient Protein NMR Structure Determination.

    PubMed

    Didenko, Tatiana; Proudfoot, Andrew; Dutta, Samit Kumar; Serrano, Pedro; Wüthrich, Kurt

    2015-08-24

    High-resolution structure determination of small proteins in solution is one of the big assets of NMR spectroscopy in structural biology. Improvements in the efficiency of NMR structure determination by advances in NMR experiments and automation of data handling therefore attract continued interest. Here, non-uniform sampling (NUS) of 3D heteronuclear-resolved [¹H,¹H]-NOESY data yielded two- to three-fold savings of instrument time for structure determinations of soluble proteins. With the 152-residue protein NP_372339.1 from Staphylococcus aureus and the 71-residue protein NP_346341.1 from Streptococcus pneumoniae we show that high-quality structures can be obtained with NUS NMR data, which are equally well amenable to robust automated analysis as the corresponding uniformly sampled data. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Structural and kinetic analysis of the unnatural fusion protein 4-coumaroyl-CoA ligase::stilbene synthase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Yechun; Yi, Hankuil; Wang, Melissa

    2012-10-24

    To increase the biochemical efficiency of biosynthetic systems, metabolic engineers have explored different approaches for organizing enzymes, including the generation of unnatural fusion proteins. Previous work aimed at improving the biosynthesis of resveratrol, a stilbene associated with a range of health-promoting activities, in yeast used an unnatural engineered fusion protein of Arabidopsis thaliana (thale cress) 4-coumaroyl-CoA ligase (At4CL1) and Vitis vinifera (grape) stilbene synthase (VvSTS) to increase resveratrol levels 15-fold relative to yeast expressing the individual enzymes. Here we present the crystallographic and biochemical analysis of the 4CL::STS fusion protein. Determination of the X-ray crystal structure of 4CL::STS provides the first molecular view of an artificial didomain adenylation/ketosynthase fusion protein. Comparison of the steady-state kinetic properties of At4CL1, VvSTS, and 4CL::STS demonstrates that the fusion protein improves catalytic efficiency of either reaction less than 3-fold. Structural and kinetic analysis suggests that colocalization of the two enzyme active sites within 70 Å of each other provides the basis for enhanced in vivo synthesis of resveratrol.

  20. Influence of the R823W mutation on the interaction of the ANKS6-ANKS3: insights from molecular dynamics simulation and free energy analysis.

    PubMed

    Kan, Wei; Fang, Fengqin; Chen, Lin; Wang, Ruige; Deng, Qigang

    2016-05-01

    The sterile alpha motif (SAM) domain of the protein ANKS6, a protein-protein interaction domain, is responsible for autosomal dominant polycystic kidney disease. Although the disease is the result of the R823W point mutation in the SAM domain of the protein ANKS6, the molecular details are still unclear. We applied molecular dynamics simulations, principal component analysis, and molecular mechanics Poisson-Boltzmann surface area binding free energy calculations to explore the structural and dynamic effects of the R823W point mutation on the complex ANKS6-ANKS3 (PDB ID: 4NL9) in comparison to the wild-type proteins. The energetic analysis shows that the wild type has a more stable structure than the mutant. The R823W point mutation not only disrupts the structure of the ANKS6 SAM domain but also negatively affects the ANKS6-ANKS3 interaction. These results further clarify the previous experiments and give a more comprehensive understanding of the ANKS6-ANKS3 interaction. In summary, this study provides useful suggestions for understanding the interaction of these proteins and their critical role in mediating kidney function.
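
    The binding free energy calculation named in the abstract above can be illustrated with the usual end-point bookkeeping: the estimate is the average free energy of the complex minus the averages for the two separated partners, taken over simulation snapshots. The sketch below shows only that arithmetic in Python. It assumes per-snapshot free energies (molecular mechanics plus solvation terms) have already been computed by an external tool; the function name and the numbers are hypothetical placeholders, not results from the study.

```python
from statistics import mean


def binding_free_energy(g_complex, g_receptor, g_ligand):
    """End-point estimate: <G_complex> - <G_receptor> - <G_ligand>.

    Each argument is a list of per-snapshot free energies in the same
    units (e.g. kcal/mol). How those values are obtained is outside the
    scope of this sketch.
    """
    return mean(g_complex) - mean(g_receptor) - mean(g_ligand)


# Placeholder energies for two snapshots (illustrative values only).
complex_g = [-100.0, -102.0]
receptor_g = [-60.0, -61.0]
ligand_g = [-30.0, -31.0]
print(binding_free_energy(complex_g, receptor_g, ligand_g))  # -10.0
```

    Comparing this quantity for the wild-type and mutant complexes is one way to express a statement such as "the wild type is more stable than the mutant" as a difference of two numbers.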

  1. Voroprot: an interactive tool for the analysis and visualization of complex geometric features of protein structure.

    PubMed

    Olechnovic, Kliment; Margelevicius, Mindaugas; Venclovas, Ceslovas

    2011-03-01

    We present Voroprot, an interactive cross-platform software tool that provides a unique set of capabilities for exploring geometric features of protein structure. Voroprot allows the construction and visualization of the Apollonius diagram (also known as the additively weighted Voronoi diagram), the Apollonius graph, protein alpha shapes, interatomic contact surfaces, solvent accessible surfaces, pockets and cavities inside protein structure. Voroprot is available for Windows, Linux and Mac OS X operating systems and can be downloaded from http://www.ibt.lt/bioinformatics/voroprot/.

  2. Impact of genetic variation on three dimensional structure and function of proteins

    PubMed Central

    Bhattacharya, Roshni; Rose, Peter W.; Burley, Stephen K.

    2017-01-01

    The Protein Data Bank (PDB; http://wwpdb.org) was established in 1971 as the first open access digital data resource in biology with seven protein structures as its initial holdings. The global PDB archive now contains more than 126,000 experimentally determined atomic level three-dimensional (3D) structures of biological macromolecules (proteins, DNA, RNA), all of which are freely accessible via the Internet. Knowledge of the 3D structure of the gene product can help in understanding its function and role in disease. Of particular interest in the PDB archive are proteins for which 3D structures of genetic variant proteins have been determined, thus revealing atomic-level structural differences caused by the variation at the DNA level. Herein, we present a systematic and qualitative analysis of such cases. We observe a wide range of structural and functional changes caused by single amino acid differences, including changes in enzyme activity, aggregation propensity, structural stability, binding, and dissociation, some in the context of large assemblies. Structural comparison of wild type and mutated proteins, when both are available, provides insights into atomic-level structural differences caused by the genetic variation. PMID:28296894

  3. Linking structural biology with genome research: Beamlines for the Berlin "Protein Structure Factory" initiative

    NASA Astrophysics Data System (ADS)

    Illing, Gerd; Saenger, Wolfram; Heinemann, Udo

    2000-06-01

    The Protein Structure Factory will be established to characterize proteins encoded by human genes or cDNAs, which will be selected by criteria of potential structural novelty or medical or biotechnological usefulness. It represents an integrative approach to structure analysis combining bioinformatics techniques, automated gene expression and purification of gene products, generation of a biophysical fingerprint of the proteins and the determination of their three-dimensional structures either by NMR spectroscopy or by X-ray diffraction. The use of synchrotron radiation will be crucial to the Protein Structure Factory: high brilliance and tunable wavelengths are prerequisites for fast data collection, the use of small crystals and multiwavelength anomalous diffraction (MAD) phasing. With the opening of BESSY II, direct access to a third-generation XUV storage ring source with excellent conditions is available nearby. An insertion device with two MAD beamlines and one constant energy station will be set up until 2001.

  4. Hydrogen Exchange Mass Spectrometry

    PubMed Central

    Mayne, Leland

    2018-01-01

    Hydrogen exchange (HX) methods can reveal much about the structure, energetics, and dynamics of proteins. The addition of mass spectrometry (MS) to an earlier fragmentation-separation HX analysis now extends HX studies to larger proteins at high structural resolution and can provide information not available before. This chapter discusses experimental aspects of HX labeling, especially with respect to the use of MS and the analysis of MS data. PMID:26791986

  5. Analysis of protein circular dichroism spectra for secondary structure using a simple matrix multiplication.

    PubMed

    Compton, L A; Johnson, W C

    1986-05-15

    Inverse circular dichroism (CD) spectra are presented for each of the five major secondary structures of proteins: alpha-helix, antiparallel and parallel beta-sheet, beta-turn, and other (random) structures. The fraction of each secondary structure in a protein is predicted by forming the dot product of the corresponding inverse CD spectrum, expressed as a vector, with the CD spectrum of the protein digitized in the same way. We show how this method is based on the construction of the generalized inverse from the singular value decomposition of a set of CD spectra corresponding to proteins whose secondary structures are known from X-ray crystallography. These inverse spectra compute secondary structure directly from protein CD spectra without resorting to least-squares fitting and standard matrix inversion techniques. In addition, spectra corresponding to the individual secondary structures, analogous to the CD spectra of synthetic polypeptides, are generated from the five most significant CD eigenvectors.

  6. qPIPSA: Relating enzymatic kinetic parameters and interaction fields

    PubMed Central

    Gabdoulline, Razif R; Stein, Matthias; Wade, Rebecca C

    2007-01-01

    Background: The simulation of metabolic networks in quantitative systems biology requires the assignment of enzymatic kinetic parameters. Experimentally determined values are often not available and therefore computational methods to estimate these parameters are needed. It is possible to use the three-dimensional structure of an enzyme to perform simulations of a reaction and derive kinetic parameters. However, this is computationally demanding and requires detailed knowledge of the enzyme mechanism. We have therefore sought to develop a general, simple and computationally efficient procedure to relate protein structural information to enzymatic kinetic parameters that allows consistency between the kinetic and structural information to be checked and estimation of kinetic constants for structurally and mechanistically similar enzymes. Results: We describe qPIPSA: quantitative Protein Interaction Property Similarity Analysis. In this analysis, molecular interaction fields, for example, electrostatic potentials, are computed from the enzyme structures. Differences in molecular interaction fields between enzymes are then related to the ratios of their kinetic parameters. This procedure can be used to estimate unknown kinetic parameters when enzyme structural information is available and kinetic parameters have been measured for related enzymes or were obtained under different conditions. The detailed interaction of the enzyme with substrate or cofactors is not modeled and is assumed to be similar for all the proteins compared. The protein structure modeling protocol employed ensures that differences between models reflect genuine differences between the protein sequences, rather than random fluctuations in protein structure. Conclusion: Provided that the experimental conditions and the protein structural models refer to the same protein state or conformation, correlations between interaction fields and kinetic parameters can be established for sets of related enzymes. Outliers may arise due to variation in the importance of different contributions to the kinetic parameters, such as protein stability and conformational changes. The qPIPSA approach can assist in the validation as well as estimation of kinetic parameters, and provide insights into enzyme mechanism. PMID:17919319

  7. The hypothetical protein Atu4866 from Agrobacterium tumefaciens adopts a streptavidin-like fold

    PubMed Central

    Ai, Xuanjun; Semesi, Anthony; Yee, Adelinda; Arrowsmith, Cheryl H.; Choy, Wing-Yiu; Li, Shawn S.C.

    2008-01-01

    Atu4866 is a 79-residue conserved hypothetical protein of unknown function from Agrobacterium tumefaciens. Protein sequence alignments show that it shares ≥60% sequence identity with 20 other hypothetical proteins of bacterial origin. However, the structures and functions of these proteins remain unknown so far. To gain insight into the function of this family of proteins, we have determined the structure of Atu4866 as a target of a structural genomics project using solution NMR spectroscopy. Our results reveal that Atu4866 adopts a streptavidin-like fold featuring a β-barrel/sandwich formed by eight antiparallel β-strands. Further structural analysis identified a continuous patch of conserved residues on the surface of Atu4866 that may constitute a potential ligand-binding site. PMID:18042676

  8. Universality and diversity of folding mechanics for three-helix bundle proteins.

    PubMed

    Yang, Jae Shick; Wallin, Stefan; Shakhnovich, Eugene I

    2008-01-22

    In this study we evaluate, at full atomic detail, the folding processes of two small helical proteins, the B domain of protein A and the Villin headpiece. Folding kinetics are studied by performing a large number of ab initio Monte Carlo folding simulations using a single transferable all-atom potential. Using these trajectories, we examine the relaxation behavior, secondary structure formation, and transition-state ensembles (TSEs) of the two proteins and compare our results with experimental data and previous computational studies. To obtain detailed structural information on the folding dynamics viewed as an ensemble process, we perform a clustering analysis procedure based on graph theory. Moreover, rigorous p(fold) analysis is used to obtain representative samples of the TSEs and a good quantitative agreement between experimental and simulated Phi values is obtained for protein A. Phi values for Villin also are obtained and left as predictions to be tested by future experiments. Our analysis shows that the two-helix hairpin is a common partially stable structural motif that gets formed before entering the TSE in the studied proteins. These results together with our earlier study of Engrailed Homeodomain and recent experimental studies provide a comprehensive, atomic-level picture of folding mechanics of three-helix bundle proteins.

  9. Cryo-EM of dynamic protein complexes in eukaryotic DNA replication.

    PubMed

    Sun, Jingchuan; Yuan, Zuanning; Bai, Lin; Li, Huilin

    2017-01-01

    DNA replication in eukaryotes is a highly dynamic process that involves several dozen proteins. Some of these proteins form stable complexes that are amenable to high-resolution structure determination by cryo-EM, thanks to the recent advent of the direct electron detector and powerful image analysis algorithms. But many of these proteins associate only transiently and flexibly, precluding traditional biochemical purification. We found that direct mixing of the component proteins followed by 2D and 3D image sorting can capture some very weakly interacting complexes. Even at 2D average level and at low resolution, EM images of these flexible complexes can provide important biological insights. It is often necessary to positively identify the feature-of-interest in a low resolution EM structure. We found that systematically fusing or inserting maltose binding protein (MBP) to selected proteins is highly effective in these situations. In this chapter, we describe the EM studies of several protein complexes involved in eukaryotic DNA replication over the past decade or so. We suggest that some of the approaches used in these studies may be applicable to structural analysis of other biological systems. © 2016 The Protein Society.

  10. Fast and Accurate Multivariate Gaussian Modeling of Protein Families: Predicting Residue Contacts and Protein-Interaction Partners

    PubMed Central

    Feinauer, Christoph; Procaccini, Andrea; Zecchina, Riccardo; Weigt, Martin; Pagnani, Andrea

    2014-01-01

    In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino-acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to the one achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partners in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code. PMID:24663061

  11. Stiffening of flexible SUMO1 protein upon peptide-binding: Analysis with anisotropic network model.

    PubMed

    Sarkar, Ranja

    2018-01-01

    SUMO (small ubiquitin-like modifier) proteins interact with a large number of target proteins via a key regulatory event called sumoylation that encompasses activation, conjugation and ligation of SUMO proteins through specific E1, E2, and E3-type enzymes respectively. Single-molecule atomic force microscopic (AFM) experiments performed to unravel bound SUMO1 along its NC termini direction reveal that E3-ligases (in the form of small peptides) increase mechanical stability (along the axis) of the flexible protein upon binding. The experimental results are expected to correlate with the intrinsic flexibility of bound SUMO1 protein in the native state i.e., the bound conformation of SUMO1 without the binding peptide. The native protein flexibility/stiffness can be measured as a spring constant by normal mode analysis. In the present study, protein normal modes are computed from the protein structural data (as input from protein databank) via a simple anisotropic network model (ANM). ANM is computationally inexpensive and hence, can be explored to investigate and compare the native conformational dynamics of unbound and bound (without the binding partner) structures, if the corresponding structural data (NMR/X-ray) are available. The paper illustrates that SUMO1 stiffens (native flexibility decreases) along the NC termini (end-to-end) direction of the protein upon binding to small peptides; however, the degree of stiffening is peptide sequence-specific. The theoretical results are demonstrated for NMR structures of unbound SUMO1 and that bound to two peptides having short amino acid motifs and of similar size, one being an M-IR2 peptide derived from RanBP2 protein and the other one derived from PIASX protein. The peptide derived from PIASX stiffens SUMO1 remarkably which is evident from an atomic-level normal mode analysis. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. Proteins without unique 3D structures: biotechnological applications of intrinsically unstable/disordered proteins.

    PubMed

    Uversky, Vladimir N

    2015-03-01

    Intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs) are functional proteins or regions that do not have unique 3D structures under functional conditions. Therefore, from the viewpoint of their lack of stable 3D structure, IDPs/IDPRs are inherently unstable. As much as structure and function of normal ordered globular proteins are determined by their amino acid sequences, the lack of unique 3D structure in IDPs/IDPRs and their disorder-based functionality are also encoded in the amino acid sequences. Because of their specific sequence features and distinctive conformational behavior, these intrinsically unstable proteins or regions have several applications in biotechnology. This review introduces some of the most characteristic features of IDPs/IDPRs (such as peculiarities of amino acid sequences of these proteins and regions, their major structural features, and peculiar responses to changes in their environment) and describes how these features can be used in the biotechnology, for example for the proteome-wide analysis of the abundance of extended IDPs, for recombinant protein isolation and purification, as polypeptide nanoparticles for drug delivery, as solubilization tools, and as thermally sensitive carriers of active peptides and proteins. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Generation of wavy structure on lipid membrane by peripheral proteins: a linear elastic analysis.

    PubMed

    Mahata, Paritosh; Das, Sovan Lal

    2017-05-01

    We carry out a linear elastic analysis to study wavy structure generation on lipid membranes by peripheral membrane proteins. We model the lipid membrane as a linearly elastic and anisotropic material. The hydrophobic insertion by proteins into the lipid membrane has been idealized as penetration of rigid rod-like inclusions into the membrane and the electrostatic interaction between protein and membrane has been modeled by a distributed surface traction acting on the membrane surface. With the proposed model we study curvature generation by several binding domains of peripheral membrane proteins containing BAR domains and amphipathic alpha-helices. It is observed that electrostatic interaction is essential for curvature generation by the BAR domains. © 2017 Federation of European Biochemical Societies.

  14. Study of structure-function relationships in proteins: Techniques and applications to cytochrome c: Final report January 15, 1988--January 14, 1989

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goldstein, D.A.; Rackovsky, S.R.

    1989-08-01

    During the initial period of this work we explored the differential geometry results which had been used to explain the structure-function relationships in the set of yeast iso-1-cytochrome c mutants studied under the initial contract. In addition we continued the development of techniques which would permit the structural characterization and comparison of proteins in a very efficient manner. We have expanded the studies based on the characterization of the structural preferences of various residues in a sample of twenty six globular proteins. It has been demonstrated that the overall structural preferences and the amino acid specific preferences seen in the analysis carried out at the five alpha carbon level cannot be explained by the results of the analysis carried out at the four alpha carbon level. Thus the structural preferences seen must be described by considering groups of five or more residues. We do not yet have enough data to extend the analysis to the six alpha carbon unit level. We have also verified that the yeast/tuna structural analogy which we used before was justified, and have performed a conformational energy minimization of the reduced yeast cytochrome c crystal data in order to have a baseline for the study of mutant proteins. 6 refs.

  15. Identification of a new protein in the centrosome-like "atractophore" of Trichomonas vaginalis.

    PubMed

    Bricheux, Geneviève; Coffe, Gérard; Brugerolle, Guy

    2007-06-01

    The human parasite Trichomonas vaginalis has specific structural bodies, atractophores, associated at one end to the kinetosomes and at the other to the spindle during division. A monoclonal antibody specific for a component of this structure was obtained. It recognizes a protein with a predicted molecular mass of 477 kDa. Sequence analysis of this protein shows that P477 belongs to the family of large coiled-coil proteins, sharing a highly versatile protein folding motif adaptable to many biological functions. P477 might act as an anchor to localize cellular activities and components to the Golgi centrosomal region. It may represent a new class of structural proteins, since similar proteins were found in many protozoans.

  16. SFG analysis of surface bound proteins: a route towards structure determination.

    PubMed

    Weidner, Tobias; Castner, David G

    2013-08-14

    The surface of a material is rapidly covered with proteins once that material is placed in a biological environment. The structure and function of these bound proteins play a key role in the interactions and communications of the material with the biological environment. Thus, it is crucial to gain a molecular level understanding of surface bound protein structure. While X-ray diffraction and solution phase NMR methods are well established for determining the structure of proteins in the crystalline or solution phase, there is not a corresponding single technique that can provide the same level of structural detail about proteins at surfaces or interfaces. However, recent advances in sum frequency generation (SFG) vibrational spectroscopy have significantly increased our ability to obtain structural information about surface bound proteins and peptides. A multi-technique approach of combining SFG with (1) protein engineering methods to selectively introduce mutations and isotopic labels, (2) other experimental methods such as time-of-flight secondary ion mass spectrometry (ToF-SIMS) and near edge X-ray absorption fine structure (NEXAFS) to provide complementary information, and (3) molecular dynamic (MD) simulations to extend the molecular level experimental results is a particularly promising route for structural characterization of surface bound proteins and peptides. By using model peptides and small proteins with well-defined structures, methods have been developed to determine the orientation of both backbone and side chains to the surface.

  17. SFG analysis of surface bound proteins: A route towards structure determination

    PubMed Central

    Weidner, Tobias; Castner, David G.

    2013-01-01

    The surface of a material is rapidly covered with proteins once that material is placed in a biological environment. The structure and function of these bound proteins play a key role in the interactions and communications of the material with the biological environment. Thus, it is crucial to gain a molecular level understanding of surface bound protein structure. While X-ray diffraction and solution phase NMR methods are well established for determining the structure of proteins in the crystalline or solution phase, there is not a corresponding single technique that can provide the same level of structural detail about proteins at surfaces or interfaces. However, recent advances in sum frequency generation (SFG) vibrational spectroscopy have significantly increased our ability to obtain structural information about surface bound proteins and peptides. A multi-technique approach of combining SFG with (1) protein engineering methods to selectively introduce mutations and isotopic labels, (2) other experimental methods such as time-of-flight secondary ion mass spectrometry (ToF-SIMS) and near edge x-ray absorption fine structure (NEXAFS) to provide complementary information, and (3) molecular dynamic (MD) simulations to extend the molecular level experimental results is a particularly promising route for structural characterization of surface bound proteins and peptides. By using model peptides and small proteins with well-defined structures, methods have been developed to determine the orientation of both backbone and side chains to the surface. PMID:23727992

  18. An Augmented Pocketome: Detection and Analysis of Small-Molecule Binding Pockets in Proteins of Known 3D Structure.

    PubMed

    Bhagavat, Raghu; Sankar, Santhosh; Srinivasan, Narayanaswamy; Chandra, Nagasuma

    2018-03-06

    Protein-ligand interactions form the basis of most cellular events. Identifying ligand binding pockets in proteins will greatly facilitate rationalizing and predicting protein function. Ligand binding sites are unknown for many proteins of known three-dimensional (3D) structure, creating a gap in our understanding of protein structure-function relationships. To bridge this gap, we detect pockets in proteins of known 3D structures, using computational techniques. This augmented pocketome (PocketDB) consists of 249,096 pockets, which is about seven times larger than what is currently known. We deduce possible ligand associations for about 46% of the newly identified pockets. The augmented pocketome, when subjected to clustering based on similarities among pockets, yielded 2,161 site types, which are associated with 1,037 ligand types, together providing fold-site-type-ligand-type associations. The PocketDB resource facilitates a structure-based function annotation, delineation of the structural basis of ligand recognition, and provides functional clues for domains of unknown functions, allosteric proteins, and druggable pockets. Copyright © 2018 Elsevier Ltd. All rights reserved.

  19. Phosphorylation of the budgerigar fledgling disease virus major capsid protein VP1

    NASA Technical Reports Server (NTRS)

    Haynes, J. I. 2nd; Consigli, R. A.; Spooner, B. S. (Principal Investigator)

    1992-01-01

    The structural proteins of the budgerigar fledgling disease virus, the first known nonmammalian polyomavirus, were analyzed by isoelectric focusing and sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). The major capsid protein VP1 was found to be composed of at least five distinct species having isoelectric points ranging from pH 6.45 to 5.85. By analogy with the murine polyomavirus, these species apparently result from different modifications of an initial translation product. Primary chicken embryo cells were infected in the presence of 32Pi to determine whether the virus structural proteins were modified by phosphorylation. SDS-PAGE of the purified virus structural proteins demonstrated that VP1 (along with both minor capsid proteins) was phosphorylated. Two-dimensional analysis of the radiolabeled virus showed phosphorylation of only the two most acidic isoelectric species of VP1, indicating that this posttranslational modification contributes to VP1 species heterogeneity. Phosphoamino acid analysis of 32P-labeled VP1 revealed that phosphoserine is the only phosphoamino acid present in the VP1 protein.

  20. Global disulfide bond profiling for crude snake venom using dimethyl labeling coupled with mass spectrometry and RADAR algorithm.

    PubMed

    Huang, Sheng Yu; Chen, Sung Fang; Chen, Chun Hao; Huang, Hsuan Wei; Wu, Wen Guey; Sung, Wang Chou

    2014-09-02

    Snake venom consists of toxin proteins with multiple disulfide linkages to generate unique structures and biological functions. Determination of these cysteine connections usually requires the purification of each protein followed by structural analysis. In this study, dimethyl labeling coupled with LC-MS/MS and the RADAR algorithm was developed to identify the disulfide bonds in crude snake venom. Without any protein separation, the disulfide linkages of several cytotoxins and PLA2 could be solved, including more than 20 disulfide bonds. The results show that this method is capable of analyzing protein mixtures. In addition, the approach was also used to compare native cytotoxin 3 (CTX III) and its scrambled isomer, another category of protein mixture, for unknown disulfide bonds. Two disulfide-linked peptides were observed in the native CTX III, and 10 in its scrambled form, X-CTX III. This is the first study that reports a platform for the global cysteine connection analysis on a protein mixture. The proposed method is simple and automatic, offering an efficient tool for structural and functional studies of venom proteins.

  1. Network Analysis of Protein Adaptation: Modeling the Functional Impact of Multiple Mutations

    PubMed Central

    Beleva Guthrie, Violeta; Masica, David L; Fraser, Andrew; Federico, Joseph; Fan, Yunfan; Camps, Manel; Karchin, Rachel

    2018-01-01

    The evolution of new biochemical activities frequently involves complex dependencies between mutations and rapid evolutionary radiation. Mutation co-occurrence and covariation have previously been used to identify compensating mutations that are the result of physical contacts and preserve protein function and fold. Here, we model pairwise functional dependencies and higher order interactions that enable evolution of new protein functions. We use a network model to find complex dependencies between mutations resulting from evolutionary trade-offs and pleiotropic effects. We present a method to construct these networks and to identify functionally interacting mutations in both extant and reconstructed ancestral sequences (Network Analysis of Protein Adaptation). The time ordering of mutations can be incorporated into the networks through phylogenetic reconstruction. We apply NAPA to three distantly homologous β-lactamase protein clusters (TEM, CTX-M-3, and OXA-51), each of which has experienced recent evolutionary radiation under substantially different selective pressures. By analyzing the network properties of each protein cluster, we identify key adaptive mutations, positive pairwise interactions, different adaptive solutions to the same selective pressure, and complex evolutionary trajectories likely to increase protein fitness. We also present evidence that incorporating information from phylogenetic reconstruction and ancestral sequence inference can reduce the number of spurious links in the network, while preserving overall network community structure. The analysis does not require structural or biochemical data. In contrast to function-preserving mutation dependencies, which are frequently from structural contacts, gain-of-function mutation dependencies are most commonly between residues distal in protein structure. PMID:29522102
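
    The network construction described in this record can be illustrated with a minimal sketch, assuming the simplest possible edge definition: two mutations are linked whenever they occur together in the same variant, and the edge weight is the number of variants carrying both. The function name `cooccurrence_network` and the mutation labels are invented for illustration; NAPA's actual edge weighting and its use of phylogenetic time ordering are not specified in the abstract and are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(variants):
    """Map each unordered mutation pair to the number of variants carrying both."""
    edges = Counter()
    for mutations in variants:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(mutations)), 2):
            edges[pair] += 1
    return edges


# Invented example data: each variant is the set of mutations it carries.
variants = [
    {"A10V", "G20S"},
    {"A10V", "G20S", "T30M"},
    {"T30M"},
]
network = cooccurrence_network(variants)

for (first, second), weight in sorted(network.items()):
    print(first, second, weight)
```

    A variant with a single mutation contributes no edges, so isolated mutations only appear in the network once they co-occur with something else.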

  2. The visualCMAT: A web-server to select and interpret correlated mutations/co-evolving residues in protein families.

    PubMed

    Suplatov, Dmitry; Sharapova, Yana; Timonina, Daria; Kopylov, Kirill; Švedas, Vytas

    2018-04-01

    The visualCMAT web-server was designed to assist experimental research in the fields of protein/enzyme biochemistry, protein engineering, and drug discovery by providing an intuitive and easy-to-use interface to the analysis of correlated mutations/co-evolving residues. Sequence and structural information describing homologous proteins is used to predict correlated substitutions by the mutual information-based CMAT approach; to classify them into spatially close co-evolving pairs, which either form a direct physical contact or interact with the same ligand (e.g. a substrate or a crystallographic water molecule), and long-range correlations; and to annotate and rank binding sites on the protein surface by the presence of statistically significant co-evolving positions. The results of visualCMAT are organized for convenient visual analysis and can be downloaded to a local computer as a content-rich all-in-one PyMol session file with multiple layers of annotation corresponding to bioinformatic, statistical and structural analyses of the predicted co-evolution, or further studied online using the built-in interactive analysis tools. The online interactivity is implemented in HTML5 and therefore neither plugins nor Java are required. The visualCMAT web-server is integrated with the Mustguseal web-server, which is capable of constructing large structure-guided sequence alignments of protein families and superfamilies using all available information about their structures and sequences in public databases. The visualCMAT web-server can be used to understand the relationship between structure and function in proteins, to select hotspots and compensatory mutations for rational design and directed evolution experiments aimed at producing novel enzymes with improved properties, and to study the mechanisms of selective ligand binding and allosteric communication between topologically independent sites in protein structures. The web-server is freely available at https://biokinet.belozersky.msu.ru/visualcmat and there are no login requirements.
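
    As a rough illustration of the mutual-information idea behind correlated-mutation analysis, the sketch below scores pairs of alignment columns. It is not the CMAT algorithm itself: the function name, the toy alignment, and the absence of any gap handling or statistical correction are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Each column is a sequence of residue letters, one per homolog.
    Hypothetical helper: no gap filtering or significance testing is done.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_joint = c / n
        # p_joint / (p_a * p_b), written with raw counts
        mi += p_joint * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi


# Toy alignment (invented): columns 0 and 1 co-vary perfectly, column 2 only weakly.
alignment = ["AKD", "AKD", "GRD", "GRE"]
columns = list(zip(*alignment))
mi_01 = mutual_information(columns[0], columns[1])
mi_02 = mutual_information(columns[0], columns[2])
```

    In this toy case the perfectly co-varying pair scores higher than the weakly coupled pair, which is the ranking signal a correlated-mutation method builds on.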

  3. Mechanical stability analysis of the protein L immunoglobulin-binding domain by full alanine screening using molecular dynamics simulations.

    PubMed

    Glyakina, Anna V; Likhachev, Ilya V; Balabaev, Nikolay K; Galzitskaya, Oxana V

    2015-03-01

    This article is the first to study the mechanical properties of the immunoglobulin-binding domain of protein L (referred to as protein L) and its mutants at the atomic level. In the structure of protein L, each amino acid residue (except for alanines and glycines) was replaced sequentially by alanine. Thus, 49 mutants of protein L were obtained. The proteins were stretched at their termini at constant velocity using molecular dynamics simulations in water, i.e. by forced unfolding. 19 out of 49 mutations resulted in a large decrease of mechanical protein stability. These amino acids were affecting either the secondary structure (11 mutations) or loop structures (8 mutations) of protein L. Analysis of mechanical unfolding of the generated protein that has the same topology as protein L but consists of only alanines and glycines allows us to suggest that the mechanical stability of proteins, and specifically protein L, is determined by interactions between certain amino acid residues, although the unfolding pathway depends on the protein topology. This insight can now be used to modulate the mechanical properties of proteins and their unfolding pathways in the desired direction for using them in various biochips, biosensors and biomaterials for medicine, industry, and household purposes. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
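
    The sequential substitution scheme described above (every residue except alanine and glycine replaced by alanine, one at a time) can be sketched as follows. The sequence used is an invented placeholder, not the protein L sequence, and the function name is hypothetical.

```python
def alanine_scan(sequence):
    """Return (label, mutant_sequence) pairs for a full alanine scan.

    Positions that are already Ala or Gly are skipped, as in the abstract.
    Labels use 1-based numbering, e.g. "K4A".
    """
    mutants = []
    for i, residue in enumerate(sequence):
        if residue in ("A", "G"):
            continue
        label = f"{residue}{i + 1}A"
        mutants.append((label, sequence[:i] + "A" + sequence[i + 1:]))
    return mutants


toy_sequence = "MAGKEV"  # invented placeholder sequence
mutants = alanine_scan(toy_sequence)
```

    Each mutant sequence would then be the starting point for its own stretching simulation; that part is outside the scope of this sketch.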

  4. Query3d: a new method for high-throughput analysis of functional residues in protein structures.

    PubMed

    Ausiello, Gabriele; Via, Allegra; Helmer-Citterich, Manuela

    2005-12-01

    The identification of local similarities between two protein structures can provide clues of a common function. Many different methods exist for searching for similar subsets of residues in proteins of known structure. However, the lack of functional and structural information on single residues, together with the low level of integration of this information in comparison methods, is a limitation that prevents these methods from being fully exploited in high-throughput analyses. Here we describe Query3d, a program that is both a structural DBMS (Database Management System) and a local comparison method. The method conserves a copy of all the residues of the Protein Data Bank annotated with a variety of functional and structural information. New annotations can be easily added from a variety of methods and known databases. The algorithm makes it possible to create complex queries based on the residues' function and then to compare only subsets of the selected residues. Functional information is also essential to speed up the comparison and the analysis of the results. With Query3d, users can easily obtain statistics on how many and which residues share certain properties in all proteins of known structure. At the same time, the method also finds their structural neighbours in the whole PDB. Programs and data can be accessed through the PdbFun web interface.

  5. Query3d: a new method for high-throughput analysis of functional residues in protein structures

    PubMed Central

    Ausiello, Gabriele; Via, Allegra; Helmer-Citterich, Manuela

    2005-01-01

    Background: The identification of local similarities between two protein structures can provide clues of a common function. Many different methods exist for searching for similar subsets of residues in proteins of known structure. However, the lack of functional and structural information on single residues, together with the low level of integration of this information in comparison methods, is a limitation that prevents these methods from being fully exploited in high-throughput analyses. Results: Here we describe Query3d, a program that is both a structural DBMS (Database Management System) and a local comparison method. The method conserves a copy of all the residues of the Protein Data Bank annotated with a variety of functional and structural information. New annotations can be easily added from a variety of methods and known databases. The algorithm makes it possible to create complex queries based on the residues' function and then to compare only subsets of the selected residues. Functional information is also essential to speed up the comparison and the analysis of the results. Conclusion: With Query3d, users can easily obtain statistics on how many and which residues share certain properties in all proteins of known structure. At the same time, the method also finds their structural neighbours in the whole PDB. Programs and data can be accessed through the PdbFun web interface. PMID:16351754

  6. Duck hepatitis A virus structural proteins expressed in insect cells self-assemble into virus-like particles with strong immunogenicity in ducklings.

    PubMed

    Wang, Anping; Gu, Lingling; Wu, Shuang; Zhu, Shanyuan

    2018-02-01

    Duck hepatitis A virus (DHAV), a non-enveloped ssRNA virus, can cause a highly contagious disease in young ducklings. The three capsid proteins VP0, VP1 and VP3 are translated within a single large open reading frame (ORF) and hydrolyzed by protease 3CD. However, little is known about whether the recombinant viral structural proteins (VPs) expressed in insect cells could spontaneously assemble into virus-like particles (VLPs) and whether these VLPs could induce protective immunity in young ducklings. To address these issues, the structural polyprotein precursor gene P1 and the protease gene 3CD were amplified by PCR, and the recombinant proteins were expressed in insect cells using a baculovirus expression system for the characterization of their structures and immunogenicity. The recombinant proteins expressed in Sf9 cells were detected by indirect immunofluorescence assay and Western blot analysis. Electron microscopy showed that the recombinant proteins spontaneously assembled into VLPs in insect cells. Western blot analysis of the purified VLPs revealed that the VLPs were composed of the three structural proteins. In addition, vaccination with the VLPs induced a high humoral immune response and provided strong protection. Therefore, our findings may provide a framework for development of new vaccines for the prevention of duck viral hepatitis. Copyright © 2018 Elsevier B.V. All rights reserved.

  7. Structural and functional analyses of genes encoding VQ proteins in apple.

    PubMed

    Dong, Qinglong; Zhao, Shuang; Duan, Dingyue; Tian, Yi; Wang, Yanpeng; Mao, Ke; Zhou, Zongshan; Ma, Fengwang

    2018-07-01

    Recent studies with Arabidopsis and soybean have shown that a class of valine-glutamine (VQ) motif-containing proteins interacts with some WRKY transcription factors. However, little is known about the evolution, structures, and functions of those proteins in apple. Here, we examined their features and identified 49 apple VQ genes. Our evolutionary analysis revealed that the proteins could be clustered into nine groups together with their homologues in 33 species. Historically, the main characteristics of proteins in Groups I, V, VI, VII, IX, and X were thought to have been generated before the monocot-dicot split, whereas those in Groups II, III + IV, and VIII were generated after that split. In the structural analysis, apple MdVQ proteins appeared to bind only with Group I and IIc MdWRKY proteins. Meanwhile, MdVQ1, MdVQ10, MdVQ15, and MdVQ36 interacted with multiple MdVQ proteins to form heterodimers but MdVQ15 formed a homodimer. The functional analysis indicated that overexpression of some apple MdVQs in Arabidopsis and tobacco plants affected their vegetative and reproductive growth. These results provide important information about the characteristics of apple MdVQ genes and can serve as a solid foundation for further studies about the role of WRKY-VQ interactions in regulating apple developmental and defense mechanisms. Copyright © 2018 Elsevier B.V. All rights reserved.

  8. The calcium binding properties and structure prediction of the Hax-1 protein.

    PubMed

    Balcerak, Anna; Rowinski, Sebastian; Szafron, Lukasz M; Grzybowska, Ewa A

    2017-01-01

    Hax-1 is a protein involved in regulation of different cellular processes, but its properties and exact mechanisms of action remain unknown. In this work, using purified, recombinant Hax-1 and by applying an in vitro autoradiography assay we have shown that this protein binds Ca2+. Additionally, we performed structure prediction analysis which shows that Hax-1 displays definitive structural features, such as two α-helices, short β-strands and four disordered segments.

  9. NMR in the SPINE Structural Proteomics project.

    PubMed

    Ab, E; Atkinson, A R; Banci, L; Bertini, I; Ciofi-Baffoni, S; Brunner, K; Diercks, T; Dötsch, V; Engelke, F; Folkers, G E; Griesinger, C; Gronwald, W; Günther, U; Habeck, M; de Jong, R N; Kalbitzer, H R; Kieffer, B; Leeflang, B R; Loss, S; Luchinat, C; Marquardsen, T; Moskau, D; Neidig, K P; Nilges, M; Piccioli, M; Pierattelli, R; Rieping, W; Schippmann, T; Schwalbe, H; Travé, G; Trenner, J; Wöhnert, J; Zweckstetter, M; Kaptein, R

    2006-10-01

    This paper describes the developments, role and contributions of the NMR spectroscopy groups in the Structural Proteomics In Europe (SPINE) consortium. Focusing on the development of high-throughput (HTP) pipelines for NMR structure determinations of proteins, all aspects from sample preparation, data acquisition, data processing, data analysis to structure determination have been improved with respect to sensitivity, automation, speed, robustness and validation. Specific highlights are protonless 13C direct-detection methods and inferential structure determinations (ISD). In addition to technological improvements, these methods have been applied to deliver over 60 NMR structures of proteins, among which are five that failed to crystallize. The inclusion of NMR spectroscopy in structural proteomics pipelines improves the success rate for protein structure determinations.

  10. Evolutionary-inspired probabilistic search for enhancing sampling of local minima in the protein energy surface

    PubMed Central

    2012-01-01

    Background: Despite computational challenges, elucidating conformations that a protein system assumes under physiologic conditions for the purpose of biological activity is a central problem in computational structural biology. While these conformations are associated with low energies in the energy surface that underlies the protein conformational space, few existing conformational search algorithms focus on explicitly sampling low-energy local minima in the protein energy surface. Methods: This work proposes a novel probabilistic search framework, PLOW, that explicitly samples low-energy local minima in the protein energy surface. The framework combines algorithmic ingredients from evolutionary computation and computational structural biology to effectively explore the subspace of local minima. A greedy local search maps a conformation sampled in conformational space to a nearby local minimum. A perturbation move jumps out of a local minimum to obtain a new starting conformation for the greedy local search. The process repeats in an iterative fashion, resulting in a trajectory-based exploration of the subspace of local minima. Results and conclusions: The analysis of PLOW's performance shows that, by navigating only the subspace of local minima, PLOW is able to sample conformations near a protein's native structure, either more effectively or as well as state-of-the-art methods that focus on reproducing the native structure for a protein system. Analysis of the actual subspace of local minima shows that PLOW samples this subspace more effectively than a naive sampling approach. Additional theoretical analysis reveals that the perturbation function employed by PLOW is key to its ability to sample a diverse set of low-energy conformations. This analysis also suggests directions for further research and novel applications for the proposed framework. PMID:22759582
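
    The loop described in the abstract (greedy local search to a nearby minimum, then a perturbation move to a new starting point, repeated) can be sketched on a toy one-dimensional landscape. This is not the PLOW implementation: the energy function, step sizes, and the choice to always accept the new minimum are assumptions made purely for illustration.

```python
import math
import random


def energy(x):
    """Toy rugged 1-D landscape standing in for a protein energy function."""
    return 0.1 * x * x + math.sin(3.0 * x)


def greedy_local_search(x, step=0.01, max_iter=10000):
    """Walk downhill in fixed steps until neither neighbour has lower energy."""
    e = energy(x)
    for _ in range(max_iter):
        best_x, best_e = x, e
        for candidate in (x - step, x + step):
            ce = energy(candidate)
            if ce < best_e:
                best_x, best_e = candidate, ce
        if best_x == x:  # no downhill neighbour: local minimum on this grid
            break
        x, e = best_x, best_e
    return x, e


def sample_minima(n_iter=50, jump=2.0, seed=0):
    """Trajectory over local minima: perturb, then relax, then repeat."""
    rng = random.Random(seed)
    x, e = greedy_local_search(rng.uniform(-5.0, 5.0))
    trajectory = [(x, e)]
    for _ in range(n_iter):
        start = x + rng.uniform(-jump, jump)  # perturbation move
        x, e = greedy_local_search(start)     # map to a nearby minimum
        trajectory.append((x, e))             # assumption: always accept
    return trajectory


trajectory = sample_minima()
best_x, best_e = min(trajectory, key=lambda point: point[1])
```

    Every point on the trajectory is a local minimum, so the search never spends samples on high-energy intermediate states; an acceptance rule or population-based selection could replace the always-accept step.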

  11. Computation-Guided Backbone Grafting of a Discontinuous Motif onto a Protein Scaffold

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Azoitei, Mihai L.; Correia, Bruno E.; Ban, Yih-En Andrew

    2012-02-07

    The manipulation of protein backbone structure to control interaction and function is a challenge for protein engineering. We integrated computational design with experimental selection for grafting the backbone and side chains of a two-segment HIV gp120 epitope, targeted by the cross-neutralizing antibody b12, onto an unrelated scaffold protein. The final scaffolds bound b12 with high specificity and with affinity similar to that of gp120, and crystallographic analysis of a scaffold bound to b12 revealed high structural mimicry of the gp120-b12 complex structure. The method can be generalized to design other functional proteins through backbone grafting.

  12. Ultratight crystal packing of a 10 kDa protein

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trillo-Muyo, Sergio; Jasilionis, Andrius; Domagalski, Marcin J.

    2013-03-01

    The crystal structure of the C-terminal domain of a putative U32 peptidase from G. thermoleovorans is reported; it is one of the most tightly packed protein structures reported to date. While small organic molecules generally crystallize forming tightly packed lattices with little solvent content, proteins form air-sensitive high-solvent-content crystals. Here, the crystallization and full structure analysis of a novel recombinant 10 kDa protein corresponding to the C-terminal domain of a putative U32 peptidase are reported. The orthorhombic crystal contained only 24.5% solvent and is therefore among the most tightly packed protein lattices ever reported.
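
    For context, crystal solvent content is conventionally estimated from the Matthews coefficient V_M (unit-cell volume per dalton of protein) as roughly 1 - 1.23/V_M. The sketch below applies that standard approximation; the abstract does not state the cell parameters, so only the reported 24.5% figure is used, and the function names are hypothetical.

```python
def matthews_coefficient(cell_volume_A3, molecules_per_cell, mw_da):
    """V_M in cubic angstroms per dalton."""
    return cell_volume_A3 / (molecules_per_cell * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M (conventional 1.23 constant)."""
    return 1.0 - 1.23 / vm


def vm_from_solvent_fraction(fraction):
    """Invert the approximation: V_M implied by a given solvent fraction."""
    return 1.23 / (1.0 - fraction)


# V_M implied by the 24.5% solvent content quoted in the abstract.
vm = vm_from_solvent_fraction(0.245)
```

    Under this approximation, 24.5% solvent corresponds to a V_M of about 1.63 cubic angstroms per dalton.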

  13. Navigating ligand protein binding free energy landscapes: universality and diversity of protein folding and molecular recognition mechanisms

    NASA Astrophysics Data System (ADS)

    Verkhivker, Gennady M.; Rejto, Paul A.; Bouzida, Djamal; Arthurs, Sandra; Colson, Anthony B.; Freer, Stephan T.; Gehlhaar, Daniel K.; Larson, Veda; Luty, Brock A.; Marrone, Tami; Rose, Peter W.

    2001-03-01

    Thermodynamic and kinetic aspects of ligand-protein binding are studied for the methotrexate-dihydrofolate reductase system from the binding free energy profile constructed as a function of the order parameter. Thermodynamic stability of the native complex and a cooperative transition to the unique native structure suggest the nucleation kinetic mechanism at the equilibrium transition temperature. Structural properties of the transition state ensemble and the ensemble of nucleation conformations are determined by kinetic simulations of the transmission coefficient and ligand-protein association pathways. Structural analysis of the transition states and the nucleation conformations reconciles different views on the nucleation mechanism in protein folding.

  14. PACSY, a relational database management system for protein structure and chemical shift analysis.

    PubMed

    Lee, Woonghee; Yu, Wookyung; Kim, Suhkmann; Chang, Iksoo; Lee, Weontae; Markley, John L

    2012-10-01

    PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.edu.

  15. Structural Basis of Interdomain Communication in the Hsc70 Chaperone

    PubMed Central

    Jiang, Jianwen; Prasad, Kondury; Lafer, Eileen M.; Sousa, Rui

    2015-01-01

    Summary: Hsp70 family proteins are highly conserved chaperones involved in protein folding, degradation, targeting and translocation, and protein complex remodeling. They are comprised of an N-terminal nucleotide binding domain (NBD) and a C-terminal protein substrate binding domain (SBD). ATP binding to the NBD alters SBD conformation and substrate binding kinetics, but an understanding of the mechanism of interdomain communication has been hampered by the lack of a crystal structure of an intact chaperone. We report here the 2.6 Å structure of a functionally intact bovine Hsc70 (bHsc70) and a mutational analysis of the observed interdomain interface and the immediately adjacent interdomain linker. This analysis identifies interdomain interactions critical for chaperone function and supports an allosteric mechanism in which the interdomain linker invades and disrupts the interdomain interface when ATP binds. PMID:16307916

  16. Stacking and T-shape competition in aromatic-aromatic amino acid interactions.

    PubMed

    Chelli, Riccardo; Gervasio, Francesco Luigi; Procacci, Piero; Schettino, Vincenzo

    2002-05-29

    The potential of mean force of interacting aromatic amino acids is calculated using molecular dynamics simulations. The free energy surface is determined in order to study stacking and T-shape competition for phenylalanine-phenylalanine (Phe-Phe), phenylalanine-tyrosine (Phe-Tyr), and tyrosine-tyrosine (Tyr-Tyr) complexes in vacuo, water, carbon tetrachloride, and methanol. Stacked structures are favored in all solvents with the exception of the Tyr-Tyr complex in carbon tetrachloride, where T-shaped structures are also important. The effect of anchoring the two alpha-carbons (Cα) at selected distances is investigated. We find that short and large Cα-Cα distances favor stacked and T-shaped structures, respectively. We analyze a set of 2396 protein structures resolved experimentally. Comparison of theoretical free energies for the complexes to the experimental analogue shows that Tyr-Tyr interaction occurs mainly at the protein surface, while Phe-Tyr and Phe-Phe interactions are more frequent in the hydrophobic protein core. This is confirmed by the Voronoi polyhedron analysis on the database protein structures. As found from the free energy calculation, analysis of the protein database has shown that proximal and distal interacting aromatic residues are predominantly stacked and T-shaped, respectively.

  17. Unraveling protein catalysis through neutron diffraction

    NASA Astrophysics Data System (ADS)

    Myles, Dean

    Neutron scattering and diffraction are exquisitely sensitive to the location, concentration and dynamics of hydrogen atoms in materials and provide a powerful tool for the characterization of structure-function and interfacial relationships in biological systems. Modern neutron scattering facilities offer access to a sophisticated, non-destructive suite of instruments for biophysical characterization that provide spatial and dynamic information spanning from Angstroms to microns and from picoseconds to microseconds, respectively. Applications range from atomic-resolution analysis of individual hydrogen atoms in enzymes, through to multi-scale analysis of hierarchical structures and assemblies in biological complexes, membranes and in living cells. Here we describe how the precise location of protein and water hydrogen atoms using neutron diffraction provides a more complete description of the atomic and electronic structures of proteins, enabling key questions concerning enzyme reaction mechanisms, molecular recognition and binding and protein-water interactions to be addressed. Current work is focused on understanding how molecular structure and dynamics control function in photosynthetic, cell signaling and DNA repair proteins. We will highlight recent studies that provide detailed understanding of the physiochemical mechanisms through which proteins recognize ligands and catalyze reactions, and help to define and understand the key principles involved.

  18. Multiscale weighted colored graphs for protein flexibility and rigidity analysis

    NASA Astrophysics Data System (ADS)

    Bramer, David; Wei, Guo-Wei

    2018-02-01

    Protein structural fluctuation, measured by Debye-Waller factors or B-factors, is known to correlate to protein flexibility and function. A variety of methods has been developed for protein Debye-Waller factor prediction and related applications to domain separation, docking pose ranking, entropy calculation, hinge detection, stability analysis, etc. Nevertheless, none of the current methodologies are able to deliver an accuracy of 0.7 in terms of the Pearson correlation coefficients averaged over a large set of proteins. In this work, we introduce a paradigm-shifting geometric graph model, multiscale weighted colored graph (MWCG), to provide a new generation of computational algorithms to significantly change the current status of protein structural fluctuation analysis. Our MWCG model divides a protein graph into multiple subgraphs based on interaction types between graph nodes and represents the protein rigidity by generalized centralities of subgraphs. MWCGs not only predict the B-factors of protein residues but also accurately analyze the flexibility of all atoms in a protein. The MWCG model is validated over a number of protein test sets and compared with many standard methods. An extensive numerical study indicates that the proposed MWCG offers an accuracy of over 0.8 and thus provides perhaps the first reliable method for estimating protein flexibility and B-factors. It also simultaneously predicts all-atom flexibility in a molecule.
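
    A much-simplified, single-scale sketch of the graph-centrality idea is given below: each atom's rigidity is a sum of distance-decaying kernel weights to all other atoms, and flexibility is taken as its reciprocal. The exponential kernel form, the parameter values, and the toy coordinates are assumptions; the actual MWCG model uses multiple scales and element-specific (colored) subgraphs, which are not reproduced here. A Pearson correlation helper is included because that is the accuracy measure quoted in the abstract.

```python
import math


def rigidity_index(coords, eta=8.0, kappa=1.0):
    """Sum of generalized-exponential kernel weights to all other atoms."""
    mu = []
    for i, ci in enumerate(coords):
        total = 0.0
        for j, cj in enumerate(coords):
            if i != j:
                total += math.exp(-((math.dist(ci, cj) / eta) ** kappa))
        mu.append(total)
    return mu


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented toy chain: five atoms spaced 3.8 angstroms apart on a line.
coords = [(3.8 * k, 0.0, 0.0) for k in range(5)]
mu = rigidity_index(coords)
flexibility = [1.0 / m for m in mu]  # chain ends come out more flexible
```

    In practice the predicted flexibilities would be regressed against experimental B-factors and scored with `pearson`; no experimental values are included here.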

  19. Molecular Dynamics Analysis of Lysozyme Protein in Ethanol-Water Mixed Solvent Environment

    NASA Astrophysics Data System (ADS)

    Ochije, Henry Ikechukwu

    The effect of protein-solvent interaction on protein structure is widely studied using both experimental and computational techniques. Despite such extensive studies, the molecular-level behavior of proteins in even some simple solvents is still not fully understood. This work uses detailed molecular dynamics simulations to study the solvent effect on lysozyme protein, using water, alcohol and different concentrations of water-alcohol mixtures as solvents. The lysozyme protein structure in water, alcohol and alcohol-water mixture (0-12% alcohol) was studied using the GROMACS molecular dynamics simulation code. Compared to the water environment, the lysozyme structure showed remarkable changes in solvents with increasing alcohol concentration. In particular, significant changes were observed in the protein secondary structure involving alpha helices. The influence of alcohol on the lysozyme protein was investigated by studying thermodynamic and structural properties. With increasing ethanol concentration we observed a systematic increase in total energy, enthalpy, root mean square deviation (RMSD), and radius of gyration. These trends were described using a polynomial interpolation approach. Using the resulting polynomial equation, we could determine the above quantities for any intermediate alcohol percentage. In order to validate this approach, we selected an intermediate ethanol percentage and carried out a full MD simulation. The results from the MD simulation were in reasonably good agreement with those obtained using the polynomial approach. Hence, the polynomial-based method proposed here eliminates the need for computationally intensive full MD analysis for the concentrations within the range (0-12%) studied in this work.
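
    The interpolation step can be sketched with a Lagrange polynomial through values obtained at a few simulated concentrations. The numbers below are invented placeholders, not results from the study, and the abstract does not say which polynomial form or degree was actually used.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the polynomial through (xs[i], ys[i]) at the point x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Placeholder data: ethanol percentage vs. an invented radius of gyration (nm).
ethanol_percent = [0.0, 4.0, 8.0, 12.0]
radius_of_gyration = [1.40, 1.42, 1.48, 1.58]

# Estimate the quantity at an intermediate concentration without a new simulation.
rg_at_6_percent = lagrange_interpolate(ethanol_percent, radius_of_gyration, 6.0)
```

    The validation described in the abstract corresponds to running one full simulation at such an intermediate concentration and comparing it with the interpolated value.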

  20. Permeabilization Activated Reduction in Fluorescence (PARF): a novel method to measure kinetics of protein interactions with intracellular structures

    PubMed Central

    Singh, Pali P.; Hawthorne, Jenci L.; Davis, Christie A.; Quintero, Omar A.

    2016-01-01

    Understanding kinetic information is fundamental in understanding biological function. Advanced imaging technologies have fostered the development of kinetic analyses in cells. We have developed Permeabilization Activated Reduction in Fluorescence (PARF) analysis for determination of apparent t1/2 and immobile fraction, describing the dissociation of a protein of interest from intracellular structures. To create conditions where dissociation events are observable, cells expressing a fluorescently-tagged protein are permeabilized with digitonin, diluting the unbound protein into the extracellular media. As the media volume is much larger than the cytosolic volume, the concentration of the unbound pool decreases drastically, shifting the system out of equilibrium--favoring dissociation events. Loss of bound protein is observed as loss of fluorescence from intracellular structures and can be fit to an exponential decay. We compared PARF dissociation kinetics with previously published equilibrium kinetics as determined by FRAP. PARF dissociation rates agreed with the equilibrium-based FRAP analysis predictions of the magnitude of those rates. When used to investigate binding kinetics of a panel of cytoskeletal proteins, PARF analysis revealed that filament stabilization resulted in slower fluorescence loss. Additionally, commonly used “general” F-actin labels display differences in kinetic properties, suggesting that not all fluorescently-tagged actin labels interact with the actin network in the same way. We also observed differential dissociation kinetics for GFP-VASP depending on which cellular structure was being labeled. These results demonstrate that PARF analysis of non-equilibrium systems reveals kinetic information without the infrastructure investment required for other quantitative approaches such as FRAP, photoactivation, or in vitro reconstitution assays. PMID:27126922
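
    A minimal sketch of extracting an apparent t1/2 and immobile fraction from a fluorescence-loss curve is shown below. It assumes a single-exponential model with a plateau, F(t) = I + (1 - I) exp(-k t), on data normalised to F(0) = 1; that model form, the grid-search fitting strategy, and the synthetic data are assumptions, not the published PARF analysis code.

```python
import math


def fit_decay(times, signal):
    """Fit F(t) = I + (1 - I) * exp(-k t); return (I, k, t_half).

    The plateau I (immobile fraction) is scanned on a 0.01 grid; for each
    candidate, k comes from a log-linear least-squares fit through the origin.
    """
    best = None
    floor = min(signal)
    for step in range(100):
        immobile = step / 100.0
        if immobile >= floor:  # log argument would be non-positive
            break
        num = den = 0.0
        for t, f in zip(times, signal):
            y = math.log((f - immobile) / (1.0 - immobile))
            num += t * y
            den += t * t
        k = -num / den
        sse = sum((immobile + (1.0 - immobile) * math.exp(-k * t) - f) ** 2
                  for t, f in zip(times, signal))
        if best is None or sse < best[0]:
            best = (sse, immobile, k)
    _, immobile, k = best
    return immobile, k, math.log(2.0) / k


# Synthetic, noise-free curve: immobile fraction 0.3, rate 0.05 per second.
times = [10.0 * i for i in range(11)]
signal = [0.3 + 0.7 * math.exp(-0.05 * t) for t in times]
immobile_fraction, rate, t_half = fit_decay(times, signal)
```

    With real, noisy data a nonlinear least-squares routine would be a more natural choice than this grid scan.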

  1. Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages.

    PubMed

    de Moraes, Fábio R; Neshich, Izabella A P; Mazoni, Ivan; Yano, Inácio H; Pereira, José G C; Salim, José A; Jardine, José G; Neshich, Goran

    2014-01-01

    Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. 
    The IFR classifier developed in this study is now integrated into the BlueStar STING suite of programs. Consequently, the prediction of protein-protein interfaces for all proteins available in the PDB is possible through the STING_interfaces module, accessible at the following website: (http://www.cbi.cnptia.embrapa.br/SMS/predictions/index.html).
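
    The ROC comparison against a random classifier can be illustrated with a rank-based area-under-curve calculation. The scores and labels below are invented; they do not come from the STING descriptors or the classifier described in the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count half).

    labels: 1 for interface-forming residues, 0 for free surface residues.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented classifier outputs for six residues.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

    An AUC of 0.5 corresponds to a random classifier, so values well above 0.5 indicate that the descriptors carry real signal.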

  2. Improving Predictions of Protein-Protein Interfaces by Combining Amino Acid-Specific Classifiers Based on Structural and Physicochemical Descriptors with Their Weighted Neighbor Averages

    PubMed Central

    de Moraes, Fábio R.; Neshich, Izabella A. P.; Mazoni, Ivan; Yano, Inácio H.; Pereira, José G. C.; Salim, José A.; Jardine, José G.; Neshich, Goran

    2014-01-01

    Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. 
    The IFR classifier developed in this study is now integrated into the BlueStar STING suite of programs. Consequently, the prediction of protein-protein interfaces for all proteins available in the PDB is possible through the STING_interfaces module, accessible at the following website: (http://www.cbi.cnptia.embrapa.br/SMS/predictions/index.html). PMID:24489849

  3. Using NMR chemical shifts to calculate the propensity for structural order and disorder in proteins.

    PubMed

    Tamiola, Kamil; Mulder, Frans A A

    2012-10-01

    NMR spectroscopy offers the unique possibility to relate the structural propensities of disordered proteins and loop segments of folded peptides to biological function and aggregation behaviour. Backbone chemical shifts are ideally suited for this task, provided that appropriate reference data are available and idiosyncratic sensitivity of backbone chemical shifts to structural information is treated in a sensible manner. In the present paper, we describe methods to detect structural protein changes from chemical shifts, and present an online tool [ncSPC (neighbour-corrected Structural Propensity Calculator)], which unites aspects of several current approaches. Examples of structural propensity calculations are given for two well-characterized systems, namely the binding of α-synuclein to micelles and light activation of photoactive yellow protein. These examples spotlight the great power of NMR chemical shift analysis for the quantitative assessment of protein disorder at the atomic level, and further our understanding of biologically important problems.

  4. Structure Prediction and Analysis of DNA Transposon and LINE Retrotransposon Proteins

    PubMed Central

    Abrusán, György; Zhang, Yang; Szilágyi, András

    2013-01-01

    Despite the considerable amount of research on transposable elements, no large-scale structural analyses of the TE proteome have been performed so far. We predicted the structures of hundreds of proteins from a representative set of DNA and LINE transposable elements and used the obtained structural data to provide the first general structural characterization of TE proteins and to estimate the frequency of TE domestication and horizontal transfer events. We show that 1) ORF1 and Gag proteins of retrotransposons contain high amounts of structural disorder; thus, despite their very low conservation, the presence of disordered regions and probably their chaperone function is conserved. 2) The distribution of SCOP classes in DNA transposons and LINEs indicates that the proteins of DNA transposons are more ancient, containing folds that already existed when the first cellular organisms appeared. 3) DNA transposon proteins have lower contact order than randomly selected reference proteins, indicating rapid folding, most likely to avoid protein aggregation. 4) Structure-based searches for TE homologs indicate that the overall frequency of TE domestication events is low, whereas we found a relatively high number of cases where horizontal transfer, frequently involving parasites, is the most likely explanation for the observed homology. PMID:23530042
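
    Point 3 of the abstract above relies on contact order. As a rough sketch, relative contact order is the average sequence separation of contacting residue pairs divided by chain length. The version below simplifies by treating two residues as in contact when their Cα atoms lie within a cutoff; the cutoff value, the Cα-only simplification, and the function name `relative_contact_order` are assumptions for illustration, not necessarily what the authors used.

```python
import math


def relative_contact_order(ca_coords, cutoff=8.0, min_separation=1):
    """Relative contact order from a list of C-alpha (x, y, z) coordinates.

    RCO = (sum of sequence separations over contacting pairs)
          / (chain length * number of contacts)
    """
    n = len(ca_coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                total_separation += j - i
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    return total_separation / (n * n_contacts)


# Toy chain: five residues on a straight line, 3.8 angstroms apart.
chain = [(3.8 * i, 0.0, 0.0) for i in range(5)]
rco = relative_contact_order(chain)
print(round(rco, 4))
```

    Lower values mean that contacts are mostly local in sequence, which is the property the abstract associates with rapid folding.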

  5. High-Resolution NMR Reveals Secondary Structure and Folding of Amino Acid Transporter from Outer Chloroplast Membrane

    PubMed Central

    Zook, James D.; Molugu, Trivikram R.; Jacobsen, Neil E.; Lin, Guangxin; Soll, Jürgen; Cherry, Brian R.; Brown, Michael F.; Fromme, Petra

    2013-01-01

    Solving high-resolution structures for membrane proteins continues to be a daunting challenge in the structural biology community. In this study we report our high-resolution NMR results for a transmembrane protein, outer envelope protein of molar mass 16 kDa (OEP16), an amino acid transporter from the outer membrane of chloroplasts. Three-dimensional, high-resolution NMR experiments on the 13C, 15N, 2H-triply-labeled protein were used to assign protein backbone resonances and to obtain secondary structure information. The results yield over 95% assignment of N, HN, CO, Cα, and Cβ chemical shifts, which is essential for obtaining a high-resolution structure from NMR data. Chemical shift analysis of the assignment data provides the first experimental evidence for the location of the secondary structure elements on a per-residue basis. In addition, T1Z and T2 relaxation experiments were performed in order to better understand the protein dynamics. Arginine titration experiments yield insight into the amino acid residues responsible for protein transporter function. The results provide the necessary basis for high-resolution structural determination of this important plant membrane protein. PMID:24205117

  6. Membrane remodeling by amyloidogenic and non-amyloidogenic proteins studied by EPR

    NASA Astrophysics Data System (ADS)

    Varkey, Jobin; Langen, Ralf

    2017-07-01

    Advances in site-directed spin labeling of proteins have enabled EPR studies to expand into newer research areas under the umbrella of protein-membrane interactions. Recently, membrane remodeling by amyloidogenic and non-amyloidogenic proteins has gained substantial interest in relation to driving and controlling vital cellular processes such as endocytosis, exocytosis, shaping of organelles like endoplasmic reticulum, Golgi and mitochondria, intracellular vesicular trafficking, formation of filopodia and multivesicular bodies, mitochondrial fusion and fission, and synaptic vesicle fusion and recycling in neurotransmission. Misregulation in any of these processes due to an aberrant protein (mutation or misfolding) or alteration of lipid metabolism can be detrimental to the cell and cause disease. Dissection of the structural basis of membrane remodeling by proteins is thus necessary for an understanding of the underlying mechanisms, but it remains a formidable task because common biophysical tools have difficulty monitoring the dynamic process of membrane binding and bending by proteins. This is largely because membranes generally complicate protein structure analysis, and the problem is amplified for structural analysis in the presence of different types of membrane curvature. Recent EPR studies on membrane remodeling by proteins show that significant structural information can be generated to delineate the role of different protein modules, domains and individual amino acids in the generation of membrane curvature. These studies also show how EPR can complement the data obtained by high-resolution techniques such as X-ray crystallography and NMR. This perspective covers the application of EPR in recent studies of membrane remodeling by amyloidogenic and non-amyloidogenic proteins and should be useful for researchers interested in using or complementing EPR to gain a better understanding of membrane remodeling. We also discuss how a single protein can generate different types of membrane curvature using specific conformations for specific membrane structures, and how EPR is a versatile tool well suited to analyzing subtle alterations in structure under such modifying conditions, which would otherwise be difficult to follow with other biophysical tools.

  7. Computer analysis of protein functional sites projection on exon structure of genes in Metazoa

    PubMed Central

    2015-01-01

    Background: Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications, such as designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are located far from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of functional sites, both at the protein level and at the level of the exon-intron structure of the coding gene, remains a pressing problem. Results: One approach to this analysis is the projection of the amino acid residue positions of the functional sites, along with the exon boundaries, onto the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional-site-encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons or in neighboring exons. We observed a tendency for the exons that code functional sites to cluster, and such clusters could be considered the unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residues of functional sites, which may be evidence of evolutionary limitations on exon shuffling. Conclusions: These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity. PMID:26693737

  8. Theoretical study of the partial molar volume change associated with the pressure-induced structural transition of ubiquitin

    PubMed Central

    Imai, Takashi; Ohyama, Shusaku; Kovalenko, Andriy; Hirata, Fumio

    2007-01-01

    The partial molar volume (PMV) change associated with the pressure-induced structural transition of ubiquitin is analyzed by the three-dimensional reference interaction site model (3D-RISM) theory of molecular solvation. The theory predicts that the PMV decreases upon the structural transition, which is consistent with the experimental observation. The volume decomposition analysis demonstrates that the PMV reduction is primarily caused by the decrease in the volume of structural voids in the protein, which is partially canceled by the volume expansion due to the hydration effects. It is found from further analysis that the PMV reduction is ascribed substantially to the penetration of water molecules into a specific part of the protein. Based on the thermodynamic relation, this result implies that the water penetration causes the pressure-induced structural transition. It supports the water penetration model of pressure denaturation of proteins proposed earlier. PMID:17660257

  9. Theoretical study of the partial molar volume change associated with the pressure-induced structural transition of ubiquitin.

    PubMed

    Imai, Takashi; Ohyama, Shusaku; Kovalenko, Andriy; Hirata, Fumio

    2007-09-01

    The partial molar volume (PMV) change associated with the pressure-induced structural transition of ubiquitin is analyzed by the three-dimensional reference interaction site model (3D-RISM) theory of molecular solvation. The theory predicts that the PMV decreases upon the structural transition, which is consistent with the experimental observation. The volume decomposition analysis demonstrates that the PMV reduction is primarily caused by the decrease in the volume of structural voids in the protein, which is partially canceled by the volume expansion due to the hydration effects. It is found from further analysis that the PMV reduction is ascribed substantially to the penetration of water molecules into a specific part of the protein. Based on the thermodynamic relation, this result implies that the water penetration causes the pressure-induced structural transition. It supports the water penetration model of pressure denaturation of proteins proposed earlier.

  10. Measuring changes in chemistry, composition, and molecular structure within hair fibers by infrared and Raman spectroscopic imaging.

    PubMed

    Zhang, Guojin; Senak, Laurence; Moore, David J

    2011-05-01

    Spatially resolved infrared (IR) and Raman images are acquired from human hair cross sections or intact hair fibers. The full informational content of these spectra is spatially correlated to hair chemistry, anatomy, and structural organization through univariate and multivariate data analysis. Specific IR and Raman images from untreated human hair describing the spatial dependence of lipid and protein distribution, protein secondary structure, lipid chain conformational order, and distribution of disulfide cross-links in hair protein are presented in this study. Factor analysis of the image plane acquired with IR microscopy in hair sections permits delineation of specific micro-regions within the hair. These data indicate that both IR and Raman imaging of molecular structural changes in a specific region of hair will prove to be valuable tools in the understanding of hair structure, physiology, and the effect of various stresses upon its integrity.

  11. X-ray laser diffraction for structure determination of the rhodopsin-arrestin complex

    NASA Astrophysics Data System (ADS)

    Zhou, X. Edward; Gao, Xiang; Barty, Anton; Kang, Yanyong; He, Yuanzheng; Liu, Wei; Ishchenko, Andrii; White, Thomas A.; Yefanov, Oleksandr; Han, Gye Won; Xu, Qingping; de Waal, Parker W.; Suino-Powell, Kelly M.; Boutet, Sébastien; Williams, Garth J.; Wang, Meitian; Li, Dianfan; Caffrey, Martin; Chapman, Henry N.; Spence, John C. H.; Fromme, Petra; Weierstall, Uwe; Stevens, Raymond C.; Cherezov, Vadim; Melcher, Karsten; Xu, H. Eric

    2016-04-01

    Serial femtosecond X-ray crystallography (SFX) using an X-ray free electron laser (XFEL) is a recent advancement in structural biology for solving crystal structures of challenging membrane proteins, including G-protein coupled receptors (GPCRs), which often only produce microcrystals. An XFEL delivers highly intense X-ray pulses of femtosecond duration short enough to enable the collection of single diffraction images before significant radiation damage to crystals sets in. Here we report the deposition of the XFEL data and provide further details on crystallization, XFEL data collection and analysis, structure determination, and the validation of the structural model. The rhodopsin-arrestin crystal structure solved with SFX represents the first near-atomic resolution structure of a GPCR-arrestin complex, provides structural insights into understanding of arrestin-mediated GPCR signaling, and demonstrates the great potential of this SFX-XFEL technology for accelerating crystal structure determination of challenging proteins and protein complexes.

  12. X-ray laser diffraction for structure determination of the rhodopsin-arrestin complex.

    PubMed

    Zhou, X Edward; Gao, Xiang; Barty, Anton; Kang, Yanyong; He, Yuanzheng; Liu, Wei; Ishchenko, Andrii; White, Thomas A; Yefanov, Oleksandr; Han, Gye Won; Xu, Qingping; de Waal, Parker W; Suino-Powell, Kelly M; Boutet, Sébastien; Williams, Garth J; Wang, Meitian; Li, Dianfan; Caffrey, Martin; Chapman, Henry N; Spence, John C H; Fromme, Petra; Weierstall, Uwe; Stevens, Raymond C; Cherezov, Vadim; Melcher, Karsten; Xu, H Eric

    2016-04-12

    Serial femtosecond X-ray crystallography (SFX) using an X-ray free electron laser (XFEL) is a recent advancement in structural biology for solving crystal structures of challenging membrane proteins, including G-protein coupled receptors (GPCRs), which often only produce microcrystals. An XFEL delivers highly intense X-ray pulses of femtosecond duration short enough to enable the collection of single diffraction images before significant radiation damage to crystals sets in. Here we report the deposition of the XFEL data and provide further details on crystallization, XFEL data collection and analysis, structure determination, and the validation of the structural model. The rhodopsin-arrestin crystal structure solved with SFX represents the first near-atomic resolution structure of a GPCR-arrestin complex, provides structural insights into understanding of arrestin-mediated GPCR signaling, and demonstrates the great potential of this SFX-XFEL technology for accelerating crystal structure determination of challenging proteins and protein complexes.

  13. X-ray laser diffraction for structure determination of the rhodopsin-arrestin complex

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, X. Edward; Gao, Xiang; Barty, Anton

    Here, serial femtosecond X-ray crystallography (SFX) using an X-ray free electron laser (XFEL) is a recent advancement in structural biology for solving crystal structures of challenging membrane proteins, including G-protein coupled receptors (GPCRs), which often only produce microcrystals. An XFEL delivers highly intense X-ray pulses of femtosecond duration short enough to enable the collection of single diffraction images before significant radiation damage to crystals sets in. Here we report the deposition of the XFEL data and provide further details on crystallization, XFEL data collection and analysis, structure determination, and the validation of the structural model. The rhodopsin-arrestin crystal structure solved with SFX represents the first near-atomic resolution structure of a GPCR-arrestin complex, provides structural insights into understanding of arrestin-mediated GPCR signaling, and demonstrates the great potential of this SFX-XFEL technology for accelerating crystal structure determination of challenging proteins and protein complexes.

  14. X-ray laser diffraction for structure determination of the rhodopsin-arrestin complex

    PubMed Central

    Zhou, X. Edward; Gao, Xiang; Barty, Anton; Kang, Yanyong; He, Yuanzheng; Liu, Wei; Ishchenko, Andrii; White, Thomas A.; Yefanov, Oleksandr; Han, Gye Won; Xu, Qingping; de Waal, Parker W.; Suino-Powell, Kelly M.; Boutet, Sébastien; Williams, Garth J.; Wang, Meitian; Li, Dianfan; Caffrey, Martin; Chapman, Henry N.; Spence, John C.H.; Fromme, Petra; Weierstall, Uwe; Stevens, Raymond C.; Cherezov, Vadim; Melcher, Karsten; Xu, H. Eric

    2016-01-01

    Serial femtosecond X-ray crystallography (SFX) using an X-ray free electron laser (XFEL) is a recent advancement in structural biology for solving crystal structures of challenging membrane proteins, including G-protein coupled receptors (GPCRs), which often only produce microcrystals. An XFEL delivers highly intense X-ray pulses of femtosecond duration short enough to enable the collection of single diffraction images before significant radiation damage to crystals sets in. Here we report the deposition of the XFEL data and provide further details on crystallization, XFEL data collection and analysis, structure determination, and the validation of the structural model. The rhodopsin-arrestin crystal structure solved with SFX represents the first near-atomic resolution structure of a GPCR-arrestin complex, provides structural insights into understanding of arrestin-mediated GPCR signaling, and demonstrates the great potential of this SFX-XFEL technology for accelerating crystal structure determination of challenging proteins and protein complexes. PMID:27070998

  15. X-ray laser diffraction for structure determination of the rhodopsin-arrestin complex

    DOE PAGES

    Zhou, X. Edward; Gao, Xiang; Barty, Anton; ...

    2016-04-12

    Here, serial femtosecond X-ray crystallography (SFX) using an X-ray free electron laser (XFEL) is a recent advancement in structural biology for solving crystal structures of challenging membrane proteins, including G-protein coupled receptors (GPCRs), which often only produce microcrystals. An XFEL delivers highly intense X-ray pulses of femtosecond duration short enough to enable the collection of single diffraction images before significant radiation damage to crystals sets in. Here we report the deposition of the XFEL data and provide further details on crystallization, XFEL data collection and analysis, structure determination, and the validation of the structural model. The rhodopsin-arrestin crystal structure solved with SFX represents the first near-atomic resolution structure of a GPCR-arrestin complex, provides structural insights into understanding of arrestin-mediated GPCR signaling, and demonstrates the great potential of this SFX-XFEL technology for accelerating crystal structure determination of challenging proteins and protein complexes.

  16. Polyphony: superposition independent methods for ensemble-based drug discovery.

    PubMed

    Pitt, William R; Montalvão, Rinaldo W; Blundell, Tom L

    2014-09-30

    Structure-based drug design is an iterative process, following cycles of structural biology, computer-aided design, synthetic chemistry and bioassay. In favorable circumstances, this process can lead to the structures of hundreds of protein-ligand crystal structures. In addition, molecular dynamics simulations are increasingly being used to further explore the conformational landscape of these complexes. Currently, methods capable of the analysis of ensembles of crystal structures and MD trajectories are limited and usually rely upon least squares superposition of coordinates. Novel methodologies are described for the analysis of multiple structures of a protein. Statistical approaches that rely upon residue equivalence, but not superposition, are developed. Tasks that can be performed include the identification of hinge regions, allosteric conformational changes and transient binding sites. The approaches are tested on crystal structures of CDK2 and other CMGC protein kinases and a simulation of p38α. Known interaction - conformational change relationships are highlighted but also new ones are revealed. A transient but druggable allosteric pocket in CDK2 is predicted to occur under the CMGC insert. Furthermore, an evolutionarily-conserved conformational link from the location of this pocket, via the αEF-αF loop, to phosphorylation sites on the activation loop is discovered. New methodologies are described and validated for the superimposition independent conformational analysis of large collections of structures or simulation snapshots of the same protein. The methodologies are encoded in a Python package called Polyphony, which is released as open source to accompany this paper [http://wrpitt.bitbucket.org/polyphony/].

  17. Systematic analysis of mutation distribution in three dimensional protein structures identifies cancer driver genes.

    PubMed

    Fujimoto, Akihiro; Okada, Yukinori; Boroevich, Keith A; Tsunoda, Tatsuhiko; Taniguchi, Hiroaki; Nakagawa, Hidewaki

    2016-05-26

    Protein tertiary structure determines molecular function, interaction, and stability of the protein; therefore, the distribution of mutations in the tertiary structure can facilitate the identification of new driver genes in cancer. To analyze mutation distribution in protein tertiary structures, we applied a novel three-dimensional permutation test to the mutation positions. We analyzed somatic mutation datasets of 21 types of cancers obtained from exome sequencing conducted by the TCGA project. Of the 3,622 genes that had ≥3 mutations in the regions with tertiary structure data, 106 genes showed significant skew in mutation distribution. Known tumor suppressors and oncogenes were significantly enriched in these identified cancer gene sets. Physical distances between mutations in known oncogenes were significantly smaller than those of tumor suppressors. Twenty-three genes were detected in multiple cancers. Candidate genes with significant skew of the 3D mutation distribution included kinases (MAPK1, EPHA5, ERBB3, and ERBB4), an apoptosis-related gene (APP), an RNA splicing factor (SF1), a miRNA processing factor (DICER1), an E3 ubiquitin ligase (CUL1) and transcription factors (KLF5 and EEF1B2). Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes.
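
    The abstract above does not spell out the test statistic, so the following is only a generic sketch of how a three-dimensional permutation test for mutation clustering might look, not the authors' procedure. It assumes one coordinate per residue, uses the mean pairwise distance between mutated residues as the statistic, and the names `mean_pairwise_distance` and `clustering_p_value` are hypothetical.

```python
import math
import random
from itertools import combinations


def mean_pairwise_distance(coords, positions):
    """Mean Euclidean distance over all pairs of the given residue positions."""
    pairs = list(combinations(sorted(positions), 2))
    return sum(math.dist(coords[a], coords[b]) for a, b in pairs) / len(pairs)


def clustering_p_value(coords, mutated, n_perm=2000, seed=0):
    """One-sided permutation p-value: are mutated residues closer than chance?

    coords  -- dict mapping residue index to an (x, y, z) tuple
    mutated -- residue indices carrying mutations (at least two)
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(coords, mutated)
    residues = list(coords)
    hits = 0
    for _ in range(n_perm):
        # Null model: the same number of mutations placed uniformly at random.
        sample = rng.sample(residues, len(mutated))
        if mean_pairwise_distance(coords, sample) <= observed + 1e-9:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Made-up toy structure: 30 residues on a line, 3.8 angstroms apart,
# with three mutations at adjacent positions.
toy_coords = {i: (3.8 * i, 0.0, 0.0) for i in range(30)}
p = clustering_p_value(toy_coords, [10, 11, 12])
print(p)
```

    A small p-value indicates that the mutations sit closer together in space than randomly placed mutations would, which is the kind of skew the abstract reports for 106 genes.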

  18. Systematic analysis of mutation distribution in three dimensional protein structures identifies cancer driver genes

    PubMed Central

    Fujimoto, Akihiro; Okada, Yukinori; Boroevich, Keith A.; Tsunoda, Tatsuhiko; Taniguchi, Hiroaki; Nakagawa, Hidewaki

    2016-01-01

    Protein tertiary structure determines molecular function, interaction, and stability of the protein; therefore, the distribution of mutations in the tertiary structure can facilitate the identification of new driver genes in cancer. To analyze mutation distribution in protein tertiary structures, we applied a novel three-dimensional permutation test to the mutation positions. We analyzed somatic mutation datasets of 21 types of cancers obtained from exome sequencing conducted by the TCGA project. Of the 3,622 genes that had ≥3 mutations in the regions with tertiary structure data, 106 genes showed significant skew in mutation distribution. Known tumor suppressors and oncogenes were significantly enriched in these identified cancer gene sets. Physical distances between mutations in known oncogenes were significantly smaller than those of tumor suppressors. Twenty-three genes were detected in multiple cancers. Candidate genes with significant skew of the 3D mutation distribution included kinases (MAPK1, EPHA5, ERBB3, and ERBB4), an apoptosis-related gene (APP), an RNA splicing factor (SF1), a miRNA processing factor (DICER1), an E3 ubiquitin ligase (CUL1) and transcription factors (KLF5 and EEF1B2). Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes. PMID:27225414

  19. Nanostructure and molecular mechanics of spider dragline silk protein assemblies

    PubMed Central

    Keten, Sinan; Buehler, Markus J.

    2010-01-01

    Spider silk is a self-assembling biopolymer that outperforms most known materials in terms of its mechanical performance, despite its underlying weak chemical bonding based on H-bonds. While experimental studies have shown that the molecular structure of silk proteins has a direct influence on the stiffness, toughness and failure strength of silk, no molecular-level analysis of the nanostructure and associated mechanical properties of silk assemblies has been reported. Here, we report atomic-level structures of MaSp1 and MaSp2 proteins from the Nephila clavipes spider dragline silk sequence, obtained using replica exchange molecular dynamics, and subject these structures to mechanical loading for a detailed nanomechanical analysis. The structural analysis reveals that poly-alanine regions in silk predominantly form distinct and orderly beta-sheet crystal domains, while disorderly regions are formed by glycine-rich repeats that consist of 3₁-helix type structures and beta-turns. Our structural predictions are validated against experimental data based on dihedral angle pair calculations presented in Ramachandran plots, alpha-carbon atomic distances, as well as secondary structure content. Mechanical shearing simulations on selected structures illustrate that the nanoscale behaviour of silk protein assemblies is controlled by the distinctly different secondary structure content and hydrogen bonding in the crystalline and semi-amorphous regions. Both structural and mechanical characterization results show excellent agreement with available experimental evidence. Our findings set the stage for extensive atomistic investigations of silk, which may contribute towards an improved understanding of the source of the strength and toughness of this biological superfibre. PMID:20519206

  20. Nanostructure and molecular mechanics of spider dragline silk protein assemblies.

    PubMed

    Keten, Sinan; Buehler, Markus J

    2010-12-06

    Spider silk is a self-assembling biopolymer that outperforms most known materials in terms of its mechanical performance, despite its underlying weak chemical bonding based on H-bonds. While experimental studies have shown that the molecular structure of silk proteins has a direct influence on the stiffness, toughness and failure strength of silk, no molecular-level analysis of the nanostructure and associated mechanical properties of silk assemblies has been reported. Here, we report atomic-level structures of MaSp1 and MaSp2 proteins from the Nephila clavipes spider dragline silk sequence, obtained using replica exchange molecular dynamics, and subject these structures to mechanical loading for a detailed nanomechanical analysis. The structural analysis reveals that poly-alanine regions in silk predominantly form distinct and orderly beta-sheet crystal domains, while disorderly regions are formed by glycine-rich repeats that consist of 3₁-helix type structures and beta-turns. Our structural predictions are validated against experimental data based on dihedral angle pair calculations presented in Ramachandran plots, alpha-carbon atomic distances, as well as secondary structure content. Mechanical shearing simulations on selected structures illustrate that the nanoscale behaviour of silk protein assemblies is controlled by the distinctly different secondary structure content and hydrogen bonding in the crystalline and semi-amorphous regions. Both structural and mechanical characterization results show excellent agreement with available experimental evidence. Our findings set the stage for extensive atomistic investigations of silk, which may contribute towards an improved understanding of the source of the strength and toughness of this biological superfibre.

  1. Elastic strain and twist analysis of protein structural data and allostery of the transmembrane channel KcsA

    NASA Astrophysics Data System (ADS)

    Mitchell, Michael R.; Leibler, Stanislas

    2018-05-01

    The abundance of available static protein structural data makes the more effective analysis and interpretation of this data a valuable tool to supplement the experimental study of protein mechanics. Structural displacements can be difficult to analyze and interpret. Previously, we showed that strains provide a more natural and interpretable representation of protein deformations, revealing mechanical coupling between spatially distinct sites of allosteric proteins. Here, we demonstrate that other transformations of displacements yield additional insights. We calculate the divergence and curl of deformations of the transmembrane channel KcsA. Additionally, we introduce quantities analogous to bend, splay, and twist deformation energies of nematic liquid crystals. These transformations enable the decomposition of displacements into different modes of deformation, helping to characterize the type of deformation a protein undergoes. We apply these calculations to study the filter and gating regions of KcsA. We observe a continuous path of rotational deformations physically coupling these two regions, and, we propose, underlying the allosteric interaction between these regions. Bend, splay, and twist distinguish KcsA gate opening, filter opening, and filter-gate coupling, respectively. In general, physically meaningful representations of deformations (like strain, curl, bend, splay, and twist) can make testable predictions and yield insights into protein mechanics, augmenting experimental methods and more fully exploiting available structural data.
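
    The abstract above gives no formulas. For orientation only, the standard vector-calculus definitions of the divergence and curl of a displacement field u, and the standard Frank free-energy density of a nematic director field n (with splay, twist, and bend constants K_1, K_2, K_3), are shown below. How the authors map protein displacements onto analogous quantities is not stated here, so this should be read as background rather than as their method.

```latex
\nabla \cdot \mathbf{u}
  = \frac{\partial u_x}{\partial x}
  + \frac{\partial u_y}{\partial y}
  + \frac{\partial u_z}{\partial z},
\qquad
\nabla \times \mathbf{u}
  = \left(
      \frac{\partial u_z}{\partial y} - \frac{\partial u_y}{\partial z},\;
      \frac{\partial u_x}{\partial z} - \frac{\partial u_z}{\partial x},\;
      \frac{\partial u_y}{\partial x} - \frac{\partial u_x}{\partial y}
    \right)

f_{\text{Frank}}
  = \tfrac{1}{2} K_1 \left( \nabla \cdot \mathbf{n} \right)^2
  + \tfrac{1}{2} K_2 \left( \mathbf{n} \cdot \nabla \times \mathbf{n} \right)^2
  + \tfrac{1}{2} K_3 \left| \mathbf{n} \times \left( \nabla \times \mathbf{n} \right) \right|^2
```

    The three terms of the Frank density are the splay, twist, and bend contributions, in that order.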

  2. Evolution of a protein folding nucleus.

    PubMed

    Xia, Xue; Longo, Liam M; Sutherland, Mason A; Blaber, Michael

    2016-07-01

    The folding nucleus (FN) is a cryptic element within protein primary structure that enables an efficient folding pathway and is the postulated heritable element in the evolution of protein architecture; however, almost nothing is known regarding how the FN structurally changes as complex protein architecture evolves from simpler peptide motifs. We report characterization of the FN of a designed purely symmetric β-trefoil protein by ϕ-value analysis. We compare the structure and folding properties of key foldable intermediates along the evolutionary trajectory of the β-trefoil. The results show structural acquisition of the FN during gene fusion events, incorporating novel turn structure created by gene fusion. Furthermore, the FN is adjusted by circular permutation in response to destabilizing functional mutation. FN plasticity by way of circular permutation is made possible by the intrinsic C3 cyclic symmetry of the β-trefoil architecture, identifying a possible selective advantage that helps explain the prevalence of cyclic structural symmetry in the proteome. © 2015 The Protein Society.

  3. A Viral-Human Interactome Based on Structural Motif-Domain Interactions Captures the Human Infectome

    PubMed Central

    Guo, Xianwu; Rodríguez-Pérez, Mario A.

    2013-01-01

    Protein interactions between a pathogen and its host are fundamental in the establishment of the pathogen and underlie the infection mechanism. In the present work, we developed a single predictive model for building a host-viral interactome based on the identification of structural descriptors from motif-domain interactions of protein complexes deposited in the Protein Data Bank (PDB). The structural descriptors were used to search a database of protein sequences of human and five clinically important viruses; viral and human proteins sharing a descriptor were therefore predicted as interacting proteins. The analysis of the host-viral interactome allowed us to identify a set of new interactions that further explain the molecular mechanisms associated with viral infections and showed that it was able to capture human proteins already associated with viral infections (human infectome) and non-infectious diseases (human diseasome). The analysis of human proteins targeted by viral proteins in the context of a human interactome showed that their neighbors are enriched in proteins reported with differential expression under infection and disease conditions. It is expected that the findings of this work will contribute to the development of systems biology for infectious diseases, and help guide the rational identification and prioritization of novel drug targets. PMID:23951184

  4. Low-temperature protein dynamics: a simulation analysis of interprotein vibrations and the boson peak at 150 K.

    PubMed

    Kurkal-Siebert, Vandana; Smith, Jeremy C

    2006-02-22

    An understanding of low-frequency, collective protein dynamics at low temperatures can furnish valuable information on functional protein energy landscapes, on the origins of the protein glass transition and on protein-protein interactions. Here, molecular dynamics (MD) simulations and normal-mode analyses are performed on various models of crystalline myoglobin in order to characterize intra- and interprotein vibrations at 150 K. Principal component analysis of the MD trajectories indicates that the Boson peak, a broad peak in the dynamic structure factor centered at approximately 2-2.5 meV, originates from approximately 10^2 collective, harmonic vibrations. An accurate description of the environment is found to be essential in reproducing the experimental Boson peak form and position. At lower energies other strong peaks are found in the calculated dynamic structure factor. Characterization of these peaks shows that they arise from harmonic vibrations of proteins relative to each other. These vibrations are likely to furnish valuable information on the physical nature of protein-protein interactions.

  5. SITEX 2.0: Projections of protein functional sites on eukaryotic genes. Extension with orthologous genes.

    PubMed

    Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A

    2017-04-01

    Functional sites define the diversity of protein functions and are the central object of research on the structural and functional organization of proteins. The mechanisms underlying the emergence of protein functional sites and their variability during evolution involve duplication, shuffling, insertion and deletion of exons in genes. The study of the correlation between site structure and exon structure serves as the basis for an in-depth understanding of site organization. In this regard, the development of programming resources that allow the mutual projection of the exon structure of genes and the primary and tertiary structures of the encoded proteins remains a pressing problem. Previously, we developed the SitEx system, which provides information about protein and gene sequences with mapped exon borders and the amino acid positions of protein functional sites. The database included information on proteins with known 3D structure. However, data on orthologs were not available. Therefore, we added the projection of site positions onto the exon structures of orthologs in SitEx 2.0. We implemented a database search using site conservation variability and site discontinuity across the exon structure. Inclusion of the information on orthologs expanded the possibilities of using SitEx to solve problems in the analysis of the structural and functional organization of proteins. Database URL: http://www-bionet.sscc.ru/sitex/ .

  6. Fast and anisotropic flexibility-rigidity index for protein flexibility and fluctuation analysis

    NASA Astrophysics Data System (ADS)

    Opron, Kristopher; Xia, Kelin; Wei, Guo-Wei

    2014-06-01

    Protein structural fluctuation, typically measured by Debye-Waller factors, or B-factors, is a manifestation of protein flexibility, which strongly correlates to protein function. The flexibility-rigidity index (FRI) is a newly proposed method for the construction of atomic rigidity functions required in the theory of continuum elasticity with atomic rigidity, which is a new multiscale formalism for describing excessively large biomolecular systems. The FRI method analyzes protein rigidity and flexibility and is capable of predicting protein B-factors without resorting to matrix diagonalization. A fundamental assumption used in the FRI is that protein structures are uniquely determined by various internal and external interactions, while the protein functions, such as stability and flexibility, are solely determined by the structure. As such, one can predict protein flexibility without resorting to the protein interaction Hamiltonian. Consequently, bypassing the matrix diagonalization, the original FRI has a computational complexity of O(N^2). This work introduces a fast FRI (fFRI) algorithm for the flexibility analysis of large macromolecules. The proposed fFRI further reduces the computational complexity to O(N). Additionally, we propose anisotropic FRI (aFRI) algorithms for the analysis of protein collective dynamics. The aFRI algorithms permit adaptive Hessian matrices, from a completely global 3N × 3N matrix to completely local 3 × 3 matrices. These 3 × 3 matrices, despite being calculated locally, also contain non-local correlation information. Eigenvectors obtained from the proposed aFRI algorithms are able to demonstrate collective motions. Moreover, we investigate the performance of FRI by employing four families of radial basis correlation functions. Both parameter optimized and parameter-free FRI methods are explored. 
Furthermore, we compare the accuracy and efficiency of FRI with some established approaches to flexibility analysis, namely, normal mode analysis and Gaussian network model (GNM). The accuracy of the FRI method is tested using four sets of proteins, three sets of relatively small-, medium-, and large-sized structures and an extended set of 365 proteins. A fifth set of proteins is used to compare the efficiency of the FRI, fFRI, aFRI, and GNM methods. Intensive validation and comparison indicate that the FRI, particularly the fFRI, is orders of magnitude more efficient and about 10% more accurate overall than some of the most popular methods in the field. The proposed fFRI is able to predict B-factors for α-carbons of the HIV virus capsid (313 236 residues) in less than 30 seconds on a single processor using only one core. Finally, we demonstrate the application of FRI and aFRI to protein domain analysis.
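
    The basic FRI idea summarised above (a rigidity index summed from pairwise distance kernels, with flexibility as its reciprocal and a linear fit to experimental B-factors) can be sketched as follows. This is the straightforward O(N^2) form, not the fFRI or aFRI algorithms. The generalized exponential kernel, the parameter values `eta` and `kappa`, unit atomic weights, the exclusion of the self term, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def flexibility_index(coords, eta=3.0, kappa=1.0):
    """Flexibility index f_i = 1 / mu_i for each atom (O(N^2) sketch).

    mu_i = sum over j != i of exp(-(r_ij / eta)**kappa) is the rigidity index.
    Kernel form, eta, kappa and unit weights are illustrative assumptions.
    Requires at least two atoms, otherwise mu_i is zero.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    r = np.linalg.norm(diff, axis=-1)          # pairwise distances, shape (N, N)
    phi = np.exp(-(r / eta) ** kappa)          # correlation kernel
    np.fill_diagonal(phi, 0.0)                 # drop the self term (a choice made here)
    mu = phi.sum(axis=1)                       # rigidity index per atom
    return 1.0 / mu

def fit_b_factors(flex, b_exp):
    """Least-squares linear map from flexibility index to B-factors."""
    a, b = np.polyfit(flex, b_exp, 1)
    return a * np.asarray(flex) + b

# Three clustered atoms and one isolated atom: the isolated atom is most flexible.
f = flexibility_index([[0, 0, 0], [1, 0, 0], [0, 1, 0], [10, 0, 0]])
```

    No matrix diagonalization is involved, which is the point the abstract makes; the dense distance matrix is what gives this version its quadratic cost.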

  7. Fast and anisotropic flexibility-rigidity index for protein flexibility and fluctuation analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Opron, Kristopher; Xia, Kelin; Wei, Guo-Wei, E-mail: wei@math.msu.edu

    Protein structural fluctuation, typically measured by Debye-Waller factors, or B-factors, is a manifestation of protein flexibility, which strongly correlates to protein function. The flexibility-rigidity index (FRI) is a newly proposed method for the construction of atomic rigidity functions required in the theory of continuum elasticity with atomic rigidity, which is a new multiscale formalism for describing excessively large biomolecular systems. The FRI method analyzes protein rigidity and flexibility and is capable of predicting protein B-factors without resorting to matrix diagonalization. A fundamental assumption used in the FRI is that protein structures are uniquely determined by various internal and external interactions, while the protein functions, such as stability and flexibility, are solely determined by the structure. As such, one can predict protein flexibility without resorting to the protein interaction Hamiltonian. Consequently, bypassing the matrix diagonalization, the original FRI has a computational complexity of O(N^2). This work introduces a fast FRI (fFRI) algorithm for the flexibility analysis of large macromolecules. The proposed fFRI further reduces the computational complexity to O(N). Additionally, we propose anisotropic FRI (aFRI) algorithms for the analysis of protein collective dynamics. The aFRI algorithms permit adaptive Hessian matrices, from a completely global 3N × 3N matrix to completely local 3 × 3 matrices. These 3 × 3 matrices, despite being calculated locally, also contain non-local correlation information. Eigenvectors obtained from the proposed aFRI algorithms are able to demonstrate collective motions. Moreover, we investigate the performance of FRI by employing four families of radial basis correlation functions. Both parameter optimized and parameter-free FRI methods are explored. 
Furthermore, we compare the accuracy and efficiency of FRI with some established approaches to flexibility analysis, namely, normal mode analysis and Gaussian network model (GNM). The accuracy of the FRI method is tested using four sets of proteins, three sets of relatively small-, medium-, and large-sized structures and an extended set of 365 proteins. A fifth set of proteins is used to compare the efficiency of the FRI, fFRI, aFRI, and GNM methods. Intensive validation and comparison indicate that the FRI, particularly the fFRI, is orders of magnitude more efficient and about 10% more accurate overall than some of the most popular methods in the field. The proposed fFRI is able to predict B-factors for α-carbons of the HIV virus capsid (313 236 residues) in less than 30 seconds on a single processor using only one core. Finally, we demonstrate the application of FRI and aFRI to protein domain analysis.

  8. The Differential Response of Proteins to Macromolecular Crowding

    PubMed Central

    Candotti, Michela; Orozco, Modesto

    2016-01-01

    The habitat in which proteins exert their function contains up to 400 g/L of macromolecules, most of which are proteins. The repercussions of this dense environment on protein behavior are often overlooked or addressed using synthetic agents such as poly(ethylene glycol), whose ability to mimic protein crowders has not been demonstrated. Here we performed a comprehensive atomistic molecular dynamic analysis of the effect of protein crowders on the structure and dynamics of three proteins, namely an intrinsically disordered protein (ACTR), a molten globule conformation (NCBD), and a one-fold structure (IRF-3) protein. We found that crowding does not stabilize the native compact structure, and, in fact, often prevents structural collapse. Poly(ethylene glycol) PEG500 failed to reproduce many aspects of the physiologically-relevant protein crowders, thus indicating its unsuitability to mimic the cell interior. Instead, the impact of protein crowding on the structure and dynamics of a protein depends on its degree of disorder and results from two competing effects: the excluded volume, which favors compact states, and quinary interactions, which favor extended conformers. Such a viscous environment slows down protein flexibility and restricts the conformational landscape, often biasing it towards bioactive conformations but hindering biologically relevant protein-protein contacts. Overall, the protein crowders used here act as unspecific chaperons that modulate the protein conformational space, thus having relevant consequences for disordered proteins. PMID:27471851

  9. A comprehensive structure-function analysis shed a new light on molecular mechanism by which a novel smart copolymer, NY-3-1, assists protein refolding.

    PubMed

    Ye, Chaohui; Ilghari, Dariush; Niu, Jianlou; Xie, Yaoyao; Wang, Yan; Wang, Chao; Li, Xiaokun; Liu, Bailin; Huang, Zhifeng

    2012-08-31

    An in-depth understanding of molecular basis by which smart polymers assist protein refolding can lead us to develop a more effective polymer for protein refolding. In this report, to investigate structure-function relationship of pH-sensitive smart polymers, a series of poly(methylacrylic acid (MAc)-acrylic acid (AA))s with different MAc/AA ratios and molecular weights were synthesized and then their abilities in refolding of denatured lysozyme were compared by measuring the lytic activity of the refolded lysozyme. Based on our analysis, there were optimal MAc/AA ratio (44% MAc), M(w) (1700 Da), and copolymer concentration (0.1%, w/v) at which the highest yield of protein refolding was achieved. Fluorescence, circular dichroism, and RP-HPLC analysis reported in this study demonstrated that the presence of P(MAc-AA)s in the refolding buffer significantly improved the refolding yield of denatured lysozyme without affecting the overall structure of the enzyme. Importantly, our bioseparation analysis, together with the analysis of zeta potential and particle size of the copolymer in refolding buffers with different copolymer concentrations, suggested that the polymer provided a negatively charged surface for an electrostatic interaction with the denatured lysozyme molecules and thereby minimized the hydrophobic-prone aggregation of unfolded proteins during the process of refolding. Copyright © 2012 Elsevier B.V. All rights reserved.

  10. Mixture models for protein structure ensembles.

    PubMed

    Hirsch, Michael; Habeck, Michael

    2008-10-01

    Protein structure ensembles provide important insight into the dynamics and function of a protein and contain information that is not captured with a single static structure. However, it is not clear a priori to what extent the variability within an ensemble is caused by internal structural changes. Additional variability results from overall translations and rotations of the molecule. And most experimental data do not provide information to relate the structures to a common reference frame. To report meaningful values of intrinsic dynamics, structural precision, conformational entropy, etc., it is therefore important to disentangle local from global conformational heterogeneity. We consider the task of disentangling local from global heterogeneity as an inference problem. We use probabilistic methods to infer from the protein ensemble missing information on reference frames and stable conformational sub-states. To this end, we model a protein ensemble as a mixture of Gaussian probability distributions of either entire conformations or structural segments. We learn these models from a protein ensemble using the expectation-maximization algorithm. Our first model can be used to find multiple conformers in a structure ensemble. The second model partitions the protein chain into locally stable structural segments or core elements and less structured regions typically found in loops. Both models are simple to implement and contain only a single free parameter: the number of conformers or structural segments. Our models can be used to analyse experimental ensembles, molecular dynamics trajectories and conformational change in proteins. The Python source code for protein ensemble analysis is available from the authors upon request.
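
    The first model described above (a mixture of Gaussians over entire conformations, fitted by expectation-maximization) can be sketched in simplified form. This is not the authors' code, which is available only on request. The sketch assumes each conformation has already been superimposed and flattened into a feature vector, so it omits the inference of reference frames that the abstract describes; the isotropic covariance, the farthest-point initialisation, and the name `fit_isotropic_gmm` are likewise assumptions.

```python
import numpy as np

def fit_isotropic_gmm(X, k, n_iter=50):
    """EM for a mixture of k isotropic Gaussians.

    X has shape (n_structures, n_features), e.g. flattened, pre-aligned
    coordinates (an assumption of this sketch). Returns mixture weights,
    means, per-component variances and responsibilities.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    # Deterministic farthest-point initialisation of the component means.
    means = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - m) ** 2, axis=1) for m in means], axis=0)
        means.append(X[np.argmax(dist)])
    means = np.array(means)
    var = np.full(k, X.var() + 1e-6)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities from log densities (stabilised by the row max).
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_p = np.log(weights) - 0.5 * d * np.log(2 * np.pi * var) - 0.5 * sq / var
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        var = (resp * sq).sum(axis=0) / (d * nk) + 1e-6
    return weights, means, var, resp
```

    As in the abstract, the only free parameter is the number of components `k`; each structure's most probable conformer is `resp.argmax(axis=1)`.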

  11. High-Density Proximity Mapping Reveals the Subcellular Organization of mRNA-Associated Granules and Bodies.

    PubMed

    Youn, Ji-Young; Dunham, Wade H; Hong, Seo Jung; Knight, James D R; Bashkurov, Mikhail; Chen, Ginny I; Bagci, Halil; Rathod, Bhavisha; MacLeod, Graham; Eng, Simon W M; Angers, Stéphane; Morris, Quaid; Fabian, Marc; Côté, Jean-François; Gingras, Anne-Claude

    2018-02-01

    mRNA processing, transport, translation, and ultimately degradation involve a series of dedicated protein complexes that often assemble into large membraneless structures such as stress granules (SGs) and processing bodies (PBs). Here, systematic in vivo proximity-dependent biotinylation (BioID) analysis of 119 human proteins associated with different aspects of mRNA biology uncovers 7,424 unique proximity interactions with 1,792 proteins. Classical bait-prey analysis reveals connections of hundreds of proteins to distinct mRNA-associated processes or complexes, including the splicing and transcriptional elongation machineries (protein phosphatase 4) and the CCR4-NOT deadenylase complex (CEP85, RNF219, and KIAA0355). Analysis of correlated patterns between endogenous preys uncovers the spatial organization of RNA regulatory structures and enables the definition of 144 core components of SGs and PBs. We report preexisting contacts between most core SG proteins under normal growth conditions and demonstrate that several core SG proteins (UBAP2L, CSDE1, and PRRC2C) are critical for the formation of microscopically visible SGs. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. LoopX: A Graphical User Interface-Based Database for Comprehensive Analysis and Comparative Evaluation of Loops from Protein Structures.

    PubMed

    Kadumuri, Rajashekar Varma; Vadrevu, Ramakrishna

    2017-10-01

    Due to their crucial role in function, folding, and stability, protein loops are being targeted for grafting/designing to create novel or alter existing functionality and improve stability and foldability. With a view to facilitating thorough analysis and effective search options for extracting and comparing loops for sequence and structural compatibility, we developed LoopX, a comprehensively compiled library of sequence and conformational features of ∼700,000 loops from protein structures. The database, equipped with a graphical user interface, offers diverse query tools and search algorithms, with various rendering options to visualize sequence- and structure-level information along with hydrogen bonding patterns and backbone φ, ψ dihedral angles of both the target and candidate loops. Two new features, (i) conservation of the polar/nonpolar environment and (ii) conservation of sequence and conformation of specific residues within the loops, have also been incorporated in the search and retrieval of compatible loops for a chosen target loop. Thus, the LoopX server not only serves as a database and visualization tool for sequence and structural analysis of protein loops but also aids in extracting and comparing candidate loops for a given target loop based on user-defined search options.

  13. A comparative analysis of human plasma and serum proteins by combining native PAGE, whole-gel slicing and quantitative LC-MS/MS: Utilizing native MS-electropherograms in proteomic analysis for discovering structure and interaction-correlated differences.

    PubMed

    Wen, Meiling; Jin, Ya; Manabe, Takashi; Chen, Shumin; Tan, Wen

    2017-12-01

    MS identification has long been used for PAGE-separated protein bands, but global and systematic quantitation utilizing MS after PAGE has remained rare and has not been reported for native PAGE. Here we report a new method combining native PAGE, whole-gel slicing and quantitative LC-MS/MS, aimed at comparative analysis of not only the abundance but also the structures and interactions of proteins. A pair of human plasma and serum samples were used as test samples and separated on a native PAGE gel. Six lanes of each sample were cut, each lane was further sliced into thirty-five 1.1 mm × 1.1 mm squares and all the squares were subjected to standardized procedures of in-gel digestion and quantitative LC-MS/MS. The results comprised 958 data rows that each contained abundance values of a protein detected in one square in eleven gel lanes (one plasma lane excluded). The data were evaluated to have satisfactory reproducibility of assignment and quantitation. In total, 315 proteins were assigned, with each protein assigned in 1-28 squares. The abundance distributions in the plasma and serum gel lanes were reconstructed for each protein and termed "native MS-electropherograms". Comparison of the electropherograms revealed significant plasma-versus-serum differences for 33 proteins in 87 squares (fold difference > 2 or < 0.5, p < 0.05). Many of the differences matched accumulated knowledge on protein interactions and proteolysis involved in blood coagulation, complement and wound healing processes. We expect this method to provide more comprehensive information in comparative proteomic analysis, on both quantities and structures/interactions. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. PhyreStorm: A Web Server for Fast Structural Searches Against the PDB.

    PubMed

    Mezulis, Stefans; Sternberg, Michael J E; Kelley, Lawrence A

    2016-02-22

    The identification of structurally similar proteins can provide a range of biological insights, and accordingly, the alignment of a query protein to a database of experimentally determined protein structures is a technique commonly used in the fields of structural and evolutionary biology. The PhyreStorm Web server has been designed to provide comprehensive, up-to-date and rapid structural comparisons against the Protein Data Bank (PDB) combined with a rich and intuitive user interface. It is intended that this facility will give biologists inexpert in bioinformatics access to a powerful tool for exploring protein structure relationships beyond what can be achieved by sequence analysis alone. By partitioning the PDB into similar structures, PhyreStorm is able to quickly discard the majority of structures that cannot possibly align well to a query protein, reducing the number of alignments required by an order of magnitude. PhyreStorm is capable of finding 93±2% of all highly similar (TM-score>0.7) structures in the PDB for each query structure, usually in less than 60 s. PhyreStorm is available at http://www.sbg.bio.ic.ac.uk/phyrestorm/. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
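
    For readers unfamiliar with the TM-score threshold quoted above, the standard TM-score formula can be sketched as follows. The sketch evaluates the score for one fixed superposition from precomputed distances between aligned residue pairs; the full TM-score maximises this quantity over superpositions, and nothing here reflects how PhyreStorm itself computes or searches. The function name and the lower clamp on d0 are assumptions of this example.

```python
import numpy as np

def tm_score(distances, target_length):
    """TM-score for a fixed superposition.

    distances: distances (in Å) between aligned residue pairs.
    target_length: number of residues in the target structure, which
    normalises the score so that unaligned residues contribute zero.
    """
    # Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8.
    d0 = 1.24 * np.cbrt(target_length - 15) - 1.8
    d0 = max(d0, 0.5)  # guard for short chains (an assumption of this sketch)
    d = np.asarray(distances, dtype=float)
    return float(np.sum(1.0 / (1.0 + (d / d0) ** 2)) / target_length)

# A perfect match over all 100 residues scores 1; aligning only half scores 0.5.
full = tm_score([0.0] * 100, 100)
half = tm_score([0.0] * 50, 100)
```

    Because each aligned pair contributes at most 1 and the sum is divided by the target length, the score lies between 0 and 1 and is far less sensitive to a few badly placed residues than RMSD.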

  15. Structure-Based Annotation of a Novel Sugar Isomerase from the Pathogenic E. coli O157:H7

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    van Staalduinen, L.; Park, C; Yeom, S

    2010-01-01

    Prokaryotes can use a variety of sugars as carbon sources in order to provide a selective survival advantage. The gene z5688 found in the pathogenic Escherichia coli O157:H7 encodes a 'hypothetical' protein of unknown function. Sequence analysis identified the gene product as a putative member of the cupin superfamily of proteins, but no other functional information was known. We have determined the crystal structure of the Z5688 protein at 1.6 Å resolution and identified the protein as a novel E. coli sugar isomerase (EcSI) through overall fold analysis and secondary-structure matching. Extensive substrate screening revealed that EcSI is capable of acting on D-lyxose and D-mannose. The complex structure of EcSI with fructose allowed the identification of key active-site residues, and mutagenesis confirmed their importance. The structure of EcSI also suggested a novel mechanism for substrate binding and product release in a cupin sugar isomerase. Supplementation of a nonpathogenic E. coli strain with EcSI enabled cell growth on the rare pentose D-lyxose.

  16. How Structure Defines Affinity in Protein-Protein Interactions

    PubMed Central

    Erijman, Ariel; Rosenthal, Eran; Shifman, Julia M.

    2014-01-01

    Protein-protein interactions (PPI) in nature are conveyed by a multitude of binding modes involving various surfaces, secondary structure elements and intermolecular interactions. This diversity results in PPI binding affinities that span more than nine orders of magnitude. Several early studies attempted to correlate PPI binding affinities to various structure-derived features with limited success. The growing number of high-resolution structures, the appearance of more precise methods for measuring binding affinities and the development of new computational algorithms enable more thorough investigations in this direction. Here, we use a large dataset of PPI structures with the documented binding affinities to calculate a number of structure-based features that could potentially define binding energetics. We explore how well each calculated biophysical feature alone correlates with binding affinity and determine the features that could be used to distinguish between high-, medium- and low- affinity PPIs. Furthermore, we test how various combinations of features could be applied to predict binding affinity and observe a slow improvement in correlation as more features are incorporated into the equation. In addition, we observe a considerable improvement in predictions if we exclude from our analysis low-resolution and NMR structures, revealing the importance of capturing exact intermolecular interactions in our calculations. Our analysis should facilitate prediction of new interactions on the genome scale, better characterization of signaling networks and design of novel binding partners for various target proteins. PMID:25329579

  17. A hidden Markov model derived structural alphabet for proteins.

    PubMed

    Camproux, A C; Gautier, R; Tufféry, P

    2004-06-04

    Understanding and predicting protein structures depends on the complexity and the accuracy of the models used to represent them. We have set up a hidden Markov model that discretizes protein backbone conformation as a series of overlapping fragments (states) four residues in length. This approach learns simultaneously the geometry of the states and their connections. We obtain, using a statistical criterion, an optimal systematic decomposition of the conformational variability of the protein peptidic chain in 27 states with strong connection logic. This result is stable over different protein sets. Our model fits well with previous knowledge of protein architecture organisation and seems able to capture some subtle details of protein organisation, such as helix sub-level organisation schemes. Taking into account the dependence between the states results in a description of local protein structure of low complexity. On average, the model makes use of only 8.3 states among 27 to describe each position of a protein structure. Although we use short fragments, the learning process on entire protein conformations captures the logic of the assembly on a larger scale. Using such a model, the structure of proteins can be reconstructed with an average accuracy close to 1.1 Å root-mean-square deviation and for a complexity of only 3. Finally, we also observe that sequence specificity increases with the number of states of the structural alphabet. Such models can constitute a very relevant approach to the analysis of protein architecture, in particular for protein structure prediction.
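
    The first step of the approach above, cutting a backbone into overlapping four-residue fragments, can be sketched as follows. Only the fragmentation is shown; the hidden Markov model and its 27 learned states are not reproduced. The sketch assumes a Cα trace as input, and the three-distance descriptor and both function names are hypothetical choices for illustration, not the descriptors used by the authors.

```python
import numpy as np

def four_residue_fragments(ca_coords, size=4):
    """Split a C-alpha trace into overlapping fragments of `size` residues.

    Consecutive fragments share size - 1 residues, so a chain of n residues
    yields n - size + 1 fragments.
    """
    ca = np.asarray(ca_coords, dtype=float)
    return [ca[i:i + size] for i in range(len(ca) - size + 1)]

def fragment_descriptor(frag):
    """Illustrative descriptor: three inter-C-alpha distances within a fragment."""
    def dist(a, b):
        return float(np.linalg.norm(frag[a] - frag[b]))
    return (dist(0, 2), dist(1, 3), dist(0, 3))

# A fully extended toy chain with 3.8 Å between consecutive C-alpha atoms.
chain = [[3.8 * i, 0.0, 0.0] for i in range(6)]
frags = four_residue_fragments(chain)
desc = fragment_descriptor(frags[0])
```

    A sequence of such descriptors, one per chain position, is the kind of observation series a hidden Markov model could then be trained on.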

  18. Structure and self-assembly of the calcium binding matrix protein of human metapneumovirus.

    PubMed

    Leyrat, Cedric; Renner, Max; Harlos, Karl; Huiskonen, Juha T; Grimes, Jonathan M

    2014-01-07

    The matrix protein (M) of paramyxoviruses plays a key role in determining virion morphology by directing viral assembly and budding. Here, we report the crystal structure of the human metapneumovirus M at 2.8 Å resolution in its native dimeric state. The structure reveals the presence of a high-affinity Ca²⁺ binding site. Molecular dynamics simulations (MDS) predict a secondary lower-affinity site that correlates well with data from fluorescence-based thermal shift assays. By combining small-angle X-ray scattering with MDS and ensemble analysis, we captured the structure and dynamics of M in solution. Our analysis reveals a large positively charged patch on the protein surface that is involved in membrane interaction. Structural analysis of DOPC-induced polymerization of M into helical filaments using electron microscopy leads to a model of M self-assembly. The conservation of the Ca²⁺ binding sites suggests a role for calcium in the replication and morphogenesis of pneumoviruses. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  19. The crystal structure of Erwinia amylovora AmyR, a member of the YbjN protein family, shows similarity to type III secretion chaperones but suggests different cellular functions

    PubMed Central

    Bartho, Joseph D.; Bellini, Dom; Wuerges, Jochen; Demitri, Nicola; Toccafondi, Mirco; Schmitt, Armin O.; Zhao, Youfu; Walsh, Martin A.

    2017-01-01

    AmyR is a stress and virulence associated protein from the plant pathogenic Enterobacteriaceae species Erwinia amylovora, and is a functionally conserved ortholog of YbjN from Escherichia coli. The crystal structure of E. amylovora AmyR reveals a class I type III secretion chaperone-like fold, despite the lack of sequence similarity between these two classes of protein and lacking any evidence of a secretion-associated role. The results indicate that AmyR, and YbjN proteins in general, function through protein-protein interactions without any enzymatic action. The YbjN proteins of Enterobacteriaceae show remarkably low sequence similarity with other members of the YbjN protein family in Eubacteria, yet a high level of structural conservation is observed. Across the YbjN protein family sequence conservation is limited to residues stabilising the protein core and dimerization interface, while interacting regions are only conserved between closely related species. This study presents the first structure of a YbjN protein from Enterobacteriaceae, the most highly divergent and well-studied subgroup of YbjN proteins, and an in-depth sequence and structural analysis of this important but poorly understood protein family. PMID:28426806

  20. The crystal structure of Erwinia amylovora AmyR, a member of the YbjN protein family, shows similarity to type III secretion chaperones but suggests different cellular functions.

    PubMed

    Bartho, Joseph D; Bellini, Dom; Wuerges, Jochen; Demitri, Nicola; Toccafondi, Mirco; Schmitt, Armin O; Zhao, Youfu; Walsh, Martin A; Benini, Stefano

    2017-01-01

    AmyR is a stress and virulence associated protein from the plant pathogenic Enterobacteriaceae species Erwinia amylovora, and is a functionally conserved ortholog of YbjN from Escherichia coli. The crystal structure of E. amylovora AmyR reveals a class I type III secretion chaperone-like fold, despite the lack of sequence similarity between these two classes of protein and lacking any evidence of a secretion-associated role. The results indicate that AmyR, and YbjN proteins in general, function through protein-protein interactions without any enzymatic action. The YbjN proteins of Enterobacteriaceae show remarkably low sequence similarity with other members of the YbjN protein family in Eubacteria, yet a high level of structural conservation is observed. Across the YbjN protein family sequence conservation is limited to residues stabilising the protein core and dimerization interface, while interacting regions are only conserved between closely related species. This study presents the first structure of a YbjN protein from Enterobacteriaceae, the most highly divergent and well-studied subgroup of YbjN proteins, and an in-depth sequence and structural analysis of this important but poorly understood protein family.

  1. SCit: web tools for protein side chain conformation analysis.

    PubMed

    Gautier, R; Camproux, A-C; Tufféry, P

    2004-07-01

    SCit is a web server providing services for protein side chain conformation analysis and side chain positioning. Specific services use the dependence of the side chain conformations on the local backbone conformation, which is described using a structural alphabet that describes the conformation of fragments of four-residue length in a limited library of structural prototypes. Based on this concept, SCit uses sets of rotameric conformations dependent on the local backbone conformation of each protein for side chain positioning and the identification of side chains with unlikely conformations. The SCit web server is accessible at http://bioserv.rpbs.jussieu.fr/SCit.

  2. Structural Interface Forms and Their Involvement in Stabilization of Multidomain Proteins or Protein Complexes.

    PubMed

    Dygut, Jacek; Kalinowska, Barbara; Banach, Mateusz; Piwowar, Monika; Konieczny, Leszek; Roterman, Irena

    2016-10-18

    The presented analysis concerns the inter-domain and inter-protein interface in protein complexes. We propose extending the traditional understanding of the protein domain as a function of local compactness with an additional criterion which refers to the presence of a well-defined hydrophobic core. Interface areas in selected homodimers vary with respect to their contribution to shared as well as individual (domain-specific) hydrophobic cores. The basic definition of a protein domain, i.e., a structural unit characterized by tighter packing than its immediate environment, is extended in order to acknowledge the role of a structured hydrophobic core, which includes the interface area. The hydrophobic properties of interfaces vary depending on the status of interacting domains. In this context we can distinguish: (1) Shared hydrophobic cores (spanning the whole dimer); (2) Individual hydrophobic cores present in each monomer irrespective of whether the dimer contains a shared core. Analysis of interfaces in dystrophin and utrophin indicates the presence of an additional quasi-domain with a prominent hydrophobic core, consisting of fragments contributed by both monomers. In addition, we have also attempted to determine the relationship between the type of interface (as categorized above) and the biological function of each complex. This analysis is entirely based on the fuzzy oil drop model.

  3. Molecular comparison of the structural proteins encoding gene clusters of two related Lactobacillus delbrueckii bacteriophages.

    PubMed Central

    Vasala, A; Dupont, L; Baumann, M; Ritzenthaler, P; Alatossava, T

    1993-01-01

    Virulent phage LL-H and temperate phage mv4 are two related bacteriophages of Lactobacillus delbrueckii. The gene clusters encoding structural proteins of these two phages have been sequenced and further analyzed. Six open reading frames (ORF-1 to ORF-6) were detected. Protein sequencing and Western immunoblotting experiments confirmed that ORF-3 (g34) encoded the main capsid protein Gp34. The presence of a putative late promoter in front of the phage LL-H g34 gene was suggested by primer extension experiments. Comparative sequence analysis between phage LL-H and phage mv4 revealed striking similarities in the structure and organization of this gene cluster, suggesting that the genes encoding phage structural proteins belong to a highly conservative module. PMID:8497043

  4. Predictive and comparative analysis of Ebolavirus proteins

    PubMed Central

    Cong, Qian; Pei, Jimin; Grishin, Nick V

    2015-01-01

    Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus. PMID:26158395

  5. Predictive and comparative analysis of Ebolavirus proteins.

    PubMed

    Cong, Qian; Pei, Jimin; Grishin, Nick V

    2015-01-01

    Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus.

  6. DNAproDB: an interactive tool for structural analysis of DNA–protein complexes

    PubMed Central

    Sagendorf, Jared M.

    2017-01-01

    Many biological processes are mediated by complex interactions between DNA and proteins. Transcription factors, various polymerases, nucleases and histones recognize and bind DNA with different levels of binding specificity. To understand the physical mechanisms that allow proteins to recognize DNA and achieve their biological functions, it is important to analyze structures of DNA–protein complexes in detail. DNAproDB is a web-based interactive tool designed to help researchers study these complexes. DNAproDB provides an automated structure-processing pipeline that extracts structural features from DNA–protein complexes. The extracted features are organized in structured data files, which are easily parsed with any programming language or viewed in a browser. We processed a large number of DNA–protein complexes retrieved from the Protein Data Bank and created the DNAproDB database to store this data. Users can search the database by combining features of the DNA, protein or DNA–protein interactions at the interface. Additionally, users can upload their own structures for processing privately and securely. DNAproDB provides several interactive and customizable tools for creating visualizations of the DNA–protein interface at different levels of abstraction that can be exported as high quality figures. All functionality is documented and freely accessible at http://dnaprodb.usc.edu. PMID:28431131

  7. The role of ion mobility spectrometry-mass spectrometry in the analysis of protein reference standards.

    PubMed

    Pritchard, Caroline; O'Connor, Gavin; Ashcroft, Alison E

    2013-08-06

    To achieve comparability of measurement results of protein amount of substance content between clinical laboratories, suitable reference materials are required. The impact on measurement comparability of potential differences in the tertiary and quaternary structure of protein reference standards is as yet not well understood. With the use of human growth hormone as a model protein, the potential of ion mobility spectrometry-mass spectrometry as a tool to assess differences in the structure of protein reference materials and their interactions with antibodies has been investigated here.

  8. Evolution driven structural changes in CENP-E motor domain.

    PubMed

    Kumar, Ambuj; Kamaraj, Balu; Sethumadhavan, Rao; Purohit, Rituraj

    2013-06-01

    Genetic evolution corresponds to various biochemical changes that are vital for the development of new functional traits. Phylogenetic analysis has provided an important insight into the genetic closeness among species and their evolutionary relationships. Centromere-associated protein E (CENP-E) is vital for maintaining the cell cycle, and checkpoint signal mechanisms are vital for the recruitment of other essential kinetochore proteins. In this study we have focussed on the evolution-driven structural changes in the CENP-E motor domain among the primate lineage. Through molecular dynamics simulation and computational chemistry approaches we examined the changes in ATP binding affinity and conformational deviations in the human CENP-E motor domain as compared to the other primates. Root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg) and principal component analysis (PCA) results together suggested a gain in stability level as we move from tarsier towards human. This study provides a significant insight into how the cell cycle proteins and their corresponding biochemical activities are evolving and illustrates the potency of a theoretical approach for assessing, in a single study, the structural, functional, and dynamical aspects of protein evolution.
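
    The three trajectory descriptors named in this abstract have simple textbook definitions. The sketch below is only an illustration of those definitions and is not taken from the study: the function names are hypothetical, coordinates are assumed to be already superimposed, and all atoms are given equal weight (a mass-weighted Rg would differ).

```python
import math

def rmsd(a, b):
    """RMSD between two equally sized lists of (x, y, z) points.

    Assumes the two structures are already superimposed.
    """
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

def radius_of_gyration(coords):
    """Unweighted Rg: RMS distance of the atoms from their centroid."""
    n = len(coords)
    centroid = [sum(c[i] for c in coords) / n for i in range(3)]
    return math.sqrt(sum(math.dist(c, centroid) ** 2 for c in coords) / n)

def rmsf(frames):
    """Per-atom RMSF over a list of frames, relative to each atom's mean position."""
    n_frames = len(frames)
    result = []
    for j in range(len(frames[0])):
        mean = [sum(f[j][i] for f in frames) / n_frames for i in range(3)]
        msd = sum(math.dist(f[j], mean) ** 2 for f in frames) / n_frames
        result.append(math.sqrt(msd))
    return result
```

    In this sketch, lower RMSD and RMSF values and a steadier Rg along a trajectory would be read as signs of a more stable structure, which is the kind of comparison the abstract describes.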

  9. The Structure of the Mouse Serotonin 5-HT3 Receptor in Lipid Vesicles.

    PubMed

    Kudryashev, Mikhail; Castaño-Díez, Daniel; Deluz, Cédric; Hassaine, Gherici; Grasso, Luigino; Graf-Meyer, Alexandra; Vogel, Horst; Stahlberg, Henning

    2016-01-05

    The function of membrane proteins is best understood if their structure in the lipid membrane is known. Here, we determined the structure of the mouse serotonin 5-HT3 receptor inserted in lipid bilayers to a resolution of 12 Å without stabilizing antibodies by cryo electron tomography and subtomogram averaging. The reconstruction reveals protein secondary structure elements in the transmembrane region, the extracellular pore, and the transmembrane channel pathway, showing an overall similarity to the available X-ray model of the truncated 5-HT3 receptor determined in the presence of a stabilizing nanobody. Structural analysis of the 5-HT3 receptor embedded in a lipid bilayer allowed the position of the membrane to be determined. Interactions between the densely packed receptors in lipids were visualized, revealing that the interactions were maintained by the short horizontal helices. In combination with methodological improvements, our approach enables the structural analysis of membrane proteins in response to voltage and ligand gating. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Progression of 3D Protein Structure and Dynamics Measurements

    NASA Astrophysics Data System (ADS)

    Sato-Tomita, Ayana; Sekiguchi, Hiroshi; Sasaki, Yuji C.

    2018-06-01

    New measurement methodologies have begun to be proposed with the recent progress in the life sciences. Here, we introduce two new methodologies, X-ray fluorescence holography for protein structural analysis and diffracted X-ray tracking (DXT), to observe the dynamic behaviors of individual single molecules.

  11. UbSRD: The Ubiquitin Structural Relational Database.

    PubMed

    Harrison, Joseph S; Jacobs, Tim M; Houlihan, Kevin; Van Doorslaer, Koenraad; Kuhlman, Brian

    2016-02-22

    The structurally defined ubiquitin-like homology fold (UBL) can engage in several unique protein-protein interactions and many of these complexes have been characterized with high-resolution techniques. Using Rosetta's structural classification tools, we have created the Ubiquitin Structural Relational Database (UbSRD), an SQL database of features for all 509 UBL-containing structures in the PDB, allowing users to browse these structures by protein-protein interaction and providing a platform for quantitative analysis of structural features. We used UbSRD to define the recognition features of ubiquitin (UBQ) and SUMO observed in the PDB and the orientation of the UBQ tail while interacting with certain types of proteins. While some of the interaction surfaces on UBQ and SUMO overlap, each molecule has distinct features that aid in molecular discrimination. Additionally, we find that the UBQ tail is malleable and can adopt a variety of conformations upon binding. UbSRD is accessible as an online resource at rosettadesign.med.unc.edu/ubsrd. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. PACSY, a relational database management system for protein structure and chemical shift analysis

    PubMed Central

    Lee, Woonghee; Yu, Wookyung; Kim, Suhkmann; Chang, Iksoo

    2012-01-01

    PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.edu. PMID:22903636

  13. KFC Server: interactive forecasting of protein interaction hot spots

    PubMed Central

    Darnell, Steven J.; LeGault, Laura; Mitchell, Julie C.

    2008-01-01

    The KFC Server is a web-based implementation of the KFC (Knowledge-based FADE and Contacts) model—a machine learning approach for the prediction of binding hot spots, or the subset of residues that account for most of a protein interface's binding free energy. The server facilitates the automated analysis of a user submitted protein–protein or protein–DNA interface and the visualization of its hot spot predictions. For each residue in the interface, the KFC Server characterizes its local structural environment, compares that environment to the environments of experimentally determined hot spots and predicts if the interface residue is a hot spot. After the computational analysis, the user can visualize the results using an interactive job viewer able to quickly highlight predicted hot spots and surrounding structural features within the protein structure. The KFC Server is accessible at http://kfc.mitchell-lab.org. PMID:18539611

  14. Watching proteins function with picosecond X-ray crystallography and molecular dynamics simulations.

    NASA Astrophysics Data System (ADS)

    Anfinrud, Philip

    2006-03-01

    Time-resolved electron density maps of myoglobin, a ligand-binding heme protein, have been stitched together into movies that unveil with < 2-Å spatial resolution and 150-ps time-resolution the correlated protein motions that accompany and/or mediate ligand migration within the hydrophobic interior of a protein. A joint analysis of all-atom molecular dynamics (MD) calculations and picosecond time-resolved X-ray structures provides single-molecule insights into mechanisms of protein function. Ensemble-averaged MD simulations of the L29F mutant of myoglobin following ligand dissociation reproduce the direction, amplitude, and timescales of crystallographically-determined structural changes. This close agreement with experiments at comparable resolution in space and time validates the individual MD trajectories, which identify and structurally characterize a conformational switch that directs dissociated ligands to one of two nearby protein cavities. This unique combination of simulation and experiment unveils functional protein motions and illustrates at an atomic level relationships among protein structure, dynamics, and function. In collaboration with Friedrich Schotte and Gerhard Hummer, NIH.

  15. Analysis of the interface variability in NMR structure ensembles of protein-protein complexes.

    PubMed

    Calvanese, Luisa; D'Auria, Gabriella; Vangone, Anna; Falcigno, Lucia; Oliva, Romina

    2016-06-01

    NMR structures consist of ensembles of conformers, all satisfying the experimental restraints, which exhibit a certain degree of structural variability. We analyzed here the interface in NMR ensembles of protein-protein heterodimeric complexes and found it to span a wide range of different conservations. The different exhibited conservations do not simply correlate with the size of the systems/interfaces, and are most probably the result of an interplay between different factors, including the quality of experimental data and the intrinsic complex flexibility. In any case, this information is not to be missed when NMR structures of protein-protein complexes are analyzed; especially considering that, as we also show here, the first NMR conformer is usually not the one which best reflects the overall interface. To quantify the interface conservation and to analyze it, we used an approach originally conceived for the analysis and ranking of ensembles of docking models, which has now been extended to directly deal with NMR ensembles. We propose this approach, based on the conservation of the inter-residue contacts at the interface, both for the analysis of the interface in whole ensembles of NMR complexes and for the possible selection of a single conformer as the best representative of the overall interface. In order to make the analyses automatic and fast, we made the protocol available as a web tool at: https://www.molnac.unisa.it/BioTools/consrank/consrank-nmr.html. Copyright © 2016 Elsevier Inc. All rights reserved.
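
    The idea of ranking conformers by the conservation of their interface contacts can be illustrated with a short sketch. This is a simplified reading of the abstract, not the authors' protocol or the web tool's code: the function names are hypothetical, each conformer is assumed to be already reduced to a set of inter-residue contact pairs, and scoring a conformer by the mean conservation rate of its contacts is an assumption made for illustration.

```python
from collections import Counter

def conservation_rates(models):
    """Map each contact to the fraction of conformers in which it occurs.

    `models` is a list of iterables of hashable contacts, e.g. ("A10", "B5").
    """
    counts = Counter(c for contacts in models for c in set(contacts))
    return {c: n / len(models) for c, n in counts.items()}

def rank_models(models):
    """Return (score, index) pairs, best first.

    The score of a conformer is the mean conservation rate of its contacts
    (an illustrative choice); ties are broken by the lower index.
    """
    rates = conservation_rates(models)
    scored = []
    for idx, contacts in enumerate(models):
        contacts = set(contacts)
        score = sum(rates[c] for c in contacts) / len(contacts) if contacts else 0.0
        scored.append((score, idx))
    return sorted(scored, key=lambda s: (-s[0], s[1]))

# Toy ensemble of three conformers; the contact labels are made up.
ensemble = [
    {("A10", "B5"), ("A11", "B7")},
    {("A10", "B5")},
    {("A10", "B5"), ("A12", "B9")},
]
best_score, best_index = rank_models(ensemble)[0]
```

    In this toy ensemble the second conformer (index 1) ranks first because its only contact is present in every conformer, which mirrors the abstract's point that the first conformer is not necessarily the best representative of the interface.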

  16. A generative, probabilistic model of local protein structure.

    PubMed

    Boomsma, Wouter; Mardia, Kanti V; Taylor, Charles C; Ferkinghoff-Borg, Jesper; Krogh, Anders; Hamelryck, Thomas

    2008-07-01

    Despite significant progress in recent years, protein structure prediction maintains its status as one of the prime unsolved problems in computational biology. One of the key remaining challenges is an efficient probabilistic exploration of the structural space that correctly reflects the relative conformational stabilities. Here, we present a fully probabilistic, continuous model of local protein structure in atomic detail. The generative model makes efficient conformational sampling possible and provides a framework for the rigorous analysis of local sequence-structure correlations in the native state. Our method represents a significant theoretical and practical improvement over the widely used fragment assembly technique by avoiding the drawbacks associated with a discrete and nonprobabilistic approach.

  17. Coevolutionary modeling of protein sequences: Predicting structure, function, and mutational landscapes

    NASA Astrophysics Data System (ADS)

    Weigt, Martin

    Over the last years, biological research has been revolutionized by experimental high-throughput techniques, in particular by next-generation sequencing technology. Unprecedented amounts of data are accumulating, and there is a growing demand for computational methods unveiling the information hidden in raw data, thereby increasing our understanding of complex biological systems. Statistical-physics models based on the maximum-entropy principle have, in the last few years, played an important role in this context. To give a specific example, proteins and many non-coding RNA show a remarkable degree of structural and functional conservation in the course of evolution, despite a large variability in amino acid sequences. We have developed a statistical-mechanics inspired inference approach - called Direct-Coupling Analysis - to link this sequence variability (easy to observe in sequence alignments, which are available in public sequence databases) to bio-molecular structure and function. In my presentation I will show how this methodology can be used (i) to infer contacts between residues and thus to guide tertiary and quaternary protein structure prediction and RNA structure prediction, (ii) to discriminate interacting from non-interacting protein families, and thus to infer conserved protein-protein interaction networks, and (iii) to reconstruct mutational landscapes and thus to predict the phenotypic effect of mutations. References [1] M. Figliuzzi, H. Jacquier, A. Schug, O. Tenaillon and M. Weigt ''Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1'', Mol. Biol. Evol. (2015), doi: 10.1093/molbev/msv211 [2] E. De Leonardis, B. Lutz, S. Ratz, S. Cocco, R. Monasson, A. Schug, M. Weigt ''Direct-Coupling Analysis of nucleotide coevolution facilitates RNA secondary and tertiary structure prediction'', Nucleic Acids Research (2015), doi: 10.1093/nar/gkv932 [3] F. Morcos, A. Pagnani, B. Lunt, A. Bertolino, D. Marks, C. Sander, R. Zecchina, J.N. Onuchic, T. Hwa, M. Weigt, ''Direct-coupling analysis of residue co-evolution captures native contacts across many protein families'', Proc. Natl. Acad. Sci. 108, E1293-E1301 (2011).

  18. Differential accumulation of nif structural gene mRNA in Azotobacter vinelandii.

    PubMed

    Hamilton, Trinity L; Jacobson, Marty; Ludwig, Marcus; Boyd, Eric S; Bryant, Donald A; Dean, Dennis R; Peters, John W

    2011-09-01

    Northern analysis was employed to investigate mRNA produced by mutant strains of Azotobacter vinelandii with defined deletions in the nif structural genes and in the intergenic noncoding regions. The results indicate that intergenic RNA secondary structures effect the differential accumulation of transcripts, supporting the high Fe protein-to-MoFe protein ratio required for optimal diazotrophic growth.

  19. Structure-based functional annotation of putative conserved proteins having lyase activity from Haemophilus influenzae.

    PubMed

    Shahbaaz, Mohd; Ahmad, Faizan; Imtaiyaz Hassan, Md

    2015-06-01

    Haemophilus influenzae is a small pleomorphic Gram-negative bacterium which causes several chronic diseases, including bacteremia, meningitis, cellulitis, epiglottitis, septic arthritis, pneumonia, and empyema. Here we extensively analyzed the sequenced genome of H. influenzae strain Rd KW20 using protein family databases, protein structure prediction, pathways and genome context methods to assign a precise function to proteins whose functions are unknown. These proteins are termed hypothetical proteins (HPs), for which no experimental information is available. Function prediction of these proteins would surely be supportive to precisely understand the biochemical pathways and mechanism of pathogenesis of Haemophilus influenzae. During the extensive analysis of the H. influenzae genome, we found the presence of eight HPs showing lyase activity. Subsequently, we modeled and analyzed the three-dimensional structures of all these HPs to determine their functions more precisely. We found these HPs possess cystathionine-β-synthase, cyclase, carboxymuconolactone decarboxylase, pseudouridine synthase A and C, D-tagatose-1,6-bisphosphate aldolase and aminodeoxychorismate lyase-like features, indicating their corresponding functions in H. influenzae. Lyases are actively involved in the regulation of biosynthesis of various hormones, metabolic pathways, signal transduction, and DNA repair. Lyases are also considered as a key player for various biological processes. These enzymes are critically essential for the survival and pathogenesis of H. influenzae and, therefore, these enzymes may be considered as a potential target for structure-based rational drug design. Our structure-function relationship analysis will be useful to search and design potential lead molecules based on the structure of these lyases, for drug design and discovery.

  20. Genetics of PCOS: A systematic bioinformatics approach to unveil the proteins responsible for PCOS.

    PubMed

    Panda, Pritam Kumar; Rane, Riya; Ravichandran, Rahul; Singh, Shrinkhla; Panchal, Hetalkumar

    2016-06-01

    Polycystic ovary syndrome (PCOS) is a hormonal imbalance in women, which causes problems during the menstrual cycle and in pregnancy that sometimes result in fatality. Though the genetics of PCOS is not fully understood, early diagnosis and treatment can prevent long-term effects. In this study, we have studied the proteins involved in PCOS and the structural aspects of the proteins that are taken into consideration using computational tools. The proteins involved are modeled using Modeller 9v14 and Ab-initio programs. All the 43 proteins responsible for PCOS were subjected to phylogenetic analysis to identify the relatedness of the proteins. Further, microarray data from PCOS datasets downloaded from GEO were analyzed to find the significant protein-coding genes responsible for PCOS, in addition to the reported protein-coding genes. Various statistical analyses were done using R programming to get an insight into the structural aspects of PCOS that can be used as drug targets to treat PCOS and other related reproductive diseases.

  1. Deciphering fine molecular details of proteins' structure and function with a Protein Surface Topography (PST) method.

    PubMed

    Koromyslova, Anna D; Chugunov, Anton O; Efremov, Roman G

    2014-04-28

    Molecular surfaces are the key players in biomolecular recognition and interactions. Nowadays, it is trivial to visualize a molecular surface and surface-distributed properties in three-dimensional space. However, such a representation tends to be biased and ambiguous in case of thorough analysis. We present a new method to create 2D spherical projection maps of entire protein surfaces and manipulate them: protein surface topography (PST). It permits visualization and thoughtful analysis of surface properties. PST helps to easily portray conformational transitions, analyze proteins' properties and their dynamic behavior, improve docking performance, and reveal common patterns and dissimilarities in molecular surfaces of related bioactive peptides. This paper describes basic usage of PST with an example of small G-proteins conformational transitions, mapping of caspase-1 intersubunit interface, and intrinsic "complementarity" in the conotoxin-acetylcholine binding protein complex. We suggest that PST is a beneficial approach for structure-function studies of bioactive peptides and small proteins.

  2. UNRES server for physics-based coarse-grained simulations and prediction of protein structure, dynamics and thermodynamics.

    PubMed

    Czaplewski, Cezary; Karczynska, Agnieszka; Sieradzan, Adam K; Liwo, Adam

    2018-04-30

    A server implementation of the UNRES package (http://www.unres.pl) for coarse-grained simulations of protein structures with the physics-based UNRES model, named the UNRES server, is presented. In contrast to most of the protein coarse-grained models, owing to its physics-based origin, the UNRES force field can be used in simulations, including those aimed at protein-structure prediction, without ancillary information from structural databases; however, the implementation includes the possibility of using restraints. Local energy minimization, canonical molecular dynamics simulations, replica exchange and multiplexed replica exchange molecular dynamics simulations can be run with the current UNRES server; the latter are suitable for protein-structure prediction. The user-supplied input includes protein sequence and, optionally, restraints from secondary-structure prediction or small-angle X-ray scattering data, and simulation type and parameters which are selected or typed in. Oligomeric proteins, as well as those containing D-amino-acid residues and disulfide links can be treated. The output is displayed graphically (minimized structures, trajectories, final models, analysis of trajectory/ensembles); however, all output files can be downloaded by the user. The UNRES server can be freely accessed at http://unres-server.chem.ug.edu.pl.

  3. An Algorithm for Protein Helix Assignment Using Helix Geometry

    PubMed Central

    Cao, Chen; Xu, Shutan; Wang, Lincong

    2015-01-01

    Helices are one of the most common and were among the earliest recognized secondary structure elements in proteins. The assignment of helices in a protein underlies the analysis of its structure and function. Though the mathematical expression for a helical curve is simple, no previous assignment programs have used a genuine helical curve as a model for helix assignment. In this paper we present a two-step assignment algorithm. The first step searches for a series of bona fide helical curves, each of which best fits the coordinates of four successive backbone Cα atoms. The second step uses the best-fit helical curves as input to make the helix assignment. The application to the protein structures in the PDB (Protein Data Bank) shows that the algorithm is able to assign accurately not only regular α-helices but also 3₁₀ and π helices as well as their left-handed versions. One salient feature of the algorithm is that the assigned helices are structurally more uniform than those by the previous programs. The structural uniformity should be useful for protein structure classification and prediction while the accurate assignment of a helix to a particular type underlies structure-function relationship in proteins. PMID:26132394

  4. Accurate secondary structure prediction and fold recognition for circular dichroism spectroscopy

    PubMed Central

    Micsonai, András; Wien, Frank; Kernya, Linda; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József

    2015-01-01

    Circular dichroism (CD) spectroscopy is a widely used technique for the study of protein structure. Numerous algorithms have been developed for the estimation of the secondary structure composition from the CD spectra. These methods often fail to provide acceptable results on α/β-mixed or β-structure–rich proteins. The problem arises from the spectral diversity of β-structures, which has hitherto been considered as an intrinsic limitation of the technique. The predictions are less reliable for proteins of unusual β-structures such as membrane proteins, protein aggregates, and amyloid fibrils. Here, we show that the parallel/antiparallel orientation and the twisting of the β-sheets account for the observed spectral diversity. We have developed a method called β-structure selection (BeStSel) for the secondary structure estimation that takes into account the twist of β-structures. This method can reliably distinguish parallel and antiparallel β-sheets and accurately estimates the secondary structure for a broad range of proteins. Moreover, the secondary structure components applied by the method are characteristic of the protein fold, and thus the fold can be predicted to the level of topology in the CATH classification from a single CD spectrum. By constructing a web server, we offer a general tool for a quick and reliable structure analysis using conventional CD or synchrotron radiation CD (SRCD) spectroscopy for the protein science research community. The method is especially useful when X-ray or NMR techniques fail. Using BeStSel on data collected by SRCD spectroscopy, we investigated the structure of amyloid fibrils of various disease-related proteins and peptides. PMID:26038575

  5. Maximally asymmetric transbilayer distribution of anionic lipids alters the structure and interaction with lipids of an amyloidogenic protein dimer bound to the membrane surface

    PubMed Central

    Cheng, Sara Y.; Chou, George; Buie, Creighton; Vaughn, Mark W.; Compton, Campbell; Cheng, Kwan H.

    2016-01-01

    We used molecular dynamics simulations to explore the effects of asymmetric transbilayer distribution of anionic phosphatidylserine (PS) lipids on the structure of a protein on the membrane surface and subsequent protein–lipid interactions. Our simulation systems consisted of an amyloidogenic, beta-sheet rich dimeric protein (D42) adsorbed to the phosphatidylcholine (PC) leaflet, or protein-contact PC leaflet, of two membrane systems: a single-component PC bilayer and double PC/PS bilayers. The latter comprised a stable but asymmetric transbilayer distribution of PS in the presence of counterions, with a 1-component PC leaflet coupled to a 1-component PS leaflet in each bilayer. The maximally asymmetric PC/PS bilayer had a non-zero transmembrane potential (TMP) difference and higher lipid order packing, whereas the symmetric PC bilayer had a zero TMP difference and lower lipid order packing under physiologically relevant conditions. Analysis of the adsorbed protein structures revealed weaker protein binding, more folding in the N-terminal domain, more aggregation of the N- and C-terminal domains and larger tilt angle of D42 on the PC leaflet surface of the PC/PS bilayer versus the PC bilayer. Also, analysis of protein-induced membrane structural disruption revealed more localized bilayer thinning in the PC/PS versus PC bilayer. Although the electric field profile in the non-protein-contact PS leaflet of the PC/PS bilayer differed significantly from that in the non-protein-contact PC leaflet of the PC bilayer, no significant difference in the electric field profile in the protein-contact PC leaflet of either bilayer was evident. We speculate that lipid packing has a larger effect on the surface adsorbed protein structure than the electric field for a maximally asymmetric PC/PS bilayer. Our results support the mechanism that the higher lipid packing in a lipid leaflet promotes stronger protein–protein but weaker protein–lipid interactions for a dimeric protein on membrane surfaces. PMID:26827904

  6. Analysis of sequencing data for probing RNA secondary structures and protein-RNA binding in studying posttranscriptional regulations.

    PubMed

    Hu, Xihao; Wu, Yang; Lu, Zhi John; Yip, Kevin Y

    2016-11-01

    High-throughput sequencing has been used to study posttranscriptional regulations, where the identification of protein-RNA binding is a major and fast-developing sub-area, which in turn benefits from the sequencing methods for whole-transcriptome probing of RNA secondary structures. In the study of RNA secondary structures using high-throughput sequencing, bases are modified or cleaved according to their structural features, which alters the resulting composition of sequencing reads. In the study of protein-RNA binding, methods have been proposed to immuno-precipitate (IP) protein-bound RNA transcripts in vitro or in vivo. By sequencing these transcripts, the protein-RNA interactions and the binding locations can be identified. For both types of data, read counts are affected by a combination of confounding factors, including expression levels of transcripts, sequence biases, mapping errors and the probing or IP efficiency of the experimental protocols. Careful processing of the sequencing data and proper extraction of important features are fundamentally important to a successful analysis. Here we review and compare different experimental methods for probing RNA secondary structures and binding sites of RNA-binding proteins (RBPs), and the computational methods proposed for analyzing the corresponding sequencing data. We suggest how these two types of data should be integrated to study the structural properties of RBP binding sites as a systematic way to better understand posttranscriptional regulations. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  7. The determinants of bond angle variability in protein/peptide backbones: A comprehensive statistical/quantum mechanics analysis.

    PubMed

    Improta, Roberto; Vitagliano, Luigi; Esposito, Luciana

    2015-11-01

    The elucidation of the mutual influence between peptide bond geometry and local conformation has important implications for protein structure refinement, validation, and prediction. To gain insights into the structural determinants and the energetic contributions associated with protein/peptide backbone plasticity, we here report an extensive analysis of the variability of the peptide bond angles by combining statistical analyses of protein structures and quantum mechanics calculations on small model peptide systems. Our analyses demonstrate that all the backbone bond angles strongly depend on the peptide conformation and unveil the existence of regular trends as function of ψ and/or φ. The excellent agreement of the quantum mechanics calculations with the statistical surveys of protein structures validates the computational scheme here employed and demonstrates that the valence geometry of protein/peptide backbone is primarily dictated by local interactions. Notably, for the first time we show that the position of the H(α) hydrogen atom, which is an important parameter in NMR structural studies, is also dependent on the local conformation. Most of the trends observed may be satisfactorily explained by invoking steric repulsive interactions; in some specific cases the valence bond variability is also influenced by hydrogen-bond like interactions. Moreover, we can provide a reliable estimate of the energies involved in the interplay between geometry and conformations. © 2015 Wiley Periodicals, Inc.
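
    The quantities discussed in this abstract, backbone bond angles and the φ/ψ torsions, are simple functions of atomic coordinates. The sketch below shows the standard geometry only; it is not the authors' statistical or quantum mechanical protocol, and the function names are chosen for this illustration. With backbone atoms as input, `bond_angle(N, CA, C)` would give the N–Cα–C angle, `dihedral(C_prev, N, CA, C)` gives φ, and `dihedral(N, CA, C, N_next)` gives ψ.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def bond_angle(a, b, c):
    """Angle a-b-c in degrees, with the vertex at b."""
    u, v = sub(a, b), sub(c, b)
    cos_angle = dot(u, v) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees, in (-180, 180].

    Positive when p3 is rotated clockwise relative to p0 as seen looking
    along the central bond from p1 to p2.
    """
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)    # normals of the two planes
    b2_unit = tuple(x / norm(b2) for x in b2)
    x = dot(n1, n2)
    y = dot(b2_unit, cross(n1, n2))
    return math.degrees(math.atan2(y, x))

if __name__ == "__main__":
    print(bond_angle((1, 0, 0), (0, 0, 0), (0, 1, 0)))
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

    Binning a bond angle by the (φ, ψ) values of the same residue across many structures is one way to look for the conformation-dependent trends the abstract describes.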

  8. Identification and analysis of host proteins that interact with the 3'-untranslated region of tick-borne encephalitis virus genomic RNA.

    PubMed

    Muto, Memi; Kamitani, Wataru; Sakai, Mizuki; Hirano, Minato; Kobayashi, Shintaro; Kariwa, Hiroaki; Yoshii, Kentaro

    2018-04-02

    Tick-borne encephalitis virus (TBEV) causes severe neurological disease, but the pathogenetic mechanism is unclear. The conformational structure of the 3'-untranslated region (UTR) of TBEV is associated with its virulence. We tried to identify host proteins interacting with the 3'-UTR of TBEV. Cellular proteins of HEK293T cells were co-precipitated with biotinylated RNAs of the 3'-UTR of low- and high-virulence TBEV strains and subjected to mass spectrometry analysis. Fifteen host proteins were found to bind to the 3'-UTR of TBEV, four of which-cold shock domain containing-E1 (CSDE1), spermatid perinuclear RNA binding protein (STRBP), fragile X mental retardation protein (FMRP), and interleukin enhancer binding factor 3 (ILF3)-bound specifically to that of the low-virulence strain. An RNA immunoprecipitation and pull-down assay confirmed the interactions of the complete 3'-UTRs of TBEV genomic RNA with CSDE1, FMRP, and ILF3. Partial deletion of the stem loop (SL) 3 to SL 5 structure of the variable region of the 3'-UTR did not affect interactions with the host proteins, but the interactions were markedly suppressed by deletion of the complete SL 3, 4, and 5 structures, as in the high-virulence TBEV strain. Further analysis of the roles of host proteins in the neurologic pathogenicity of TBEV is warranted. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. PDB@: an offline toolkit for exploration and analysis of PDB files.

    PubMed

    Mani, Udayakumar; Ravisankar, Sadhana; Ramakrishnan, Sai Mukund

    2013-12-01

    The Protein Data Bank (PDB) is a freely accessible archive of the 3-D structural data of biological molecules. Structure-based studies offer a unique vantage point for inferring the properties of a protein molecule from structural data. This is too big a task to be done manually. Moreover, there is no single tool, software or server that comprehensively analyses all structure-based properties. The objective of the present work is to develop an offline computational toolkit, PDB@, containing built-in algorithms that help categorize the structural properties of a protein molecule. The user can view and edit the PDB file as needed. Some features of the present work are unique and others are an improvement over existing tools. Also, the representation of protein properties in both graphical and textual formats helps in predicting all the necessary details of a protein molecule on a single platform.

  10. RosettaHoles: rapid assessment of protein core packing for structure prediction, refinement, design, and validation.

    PubMed

    Sheffler, Will; Baker, David

    2009-01-01

    We present a novel method called RosettaHoles for visual and quantitative assessment of underpacking in the protein core. RosettaHoles generates a set of spherical cavity balls that fill the empty volume between atoms in the protein interior. For visualization, the cavity balls are aggregated into contiguous overlapping clusters and small cavities are discarded, leaving an uncluttered representation of the unfilled regions of space in a structure. For quantitative analysis, the cavity ball data are used to estimate the probability of observing a given cavity in a high-resolution crystal structure. RosettaHoles provides excellent discrimination between real and computationally generated structures, is predictive of incorrect regions in models, identifies problematic structures in the Protein Data Bank, and promises to be a useful validation tool for newly solved experimental structures.

  11. RosettaHoles: Rapid assessment of protein core packing for structure prediction, refinement, design, and validation

    PubMed Central

    Sheffler, Will; Baker, David

    2009-01-01

    We present a novel method called RosettaHoles for visual and quantitative assessment of underpacking in the protein core. RosettaHoles generates a set of spherical cavity balls that fill the empty volume between atoms in the protein interior. For visualization, the cavity balls are aggregated into contiguous overlapping clusters and small cavities are discarded, leaving an uncluttered representation of the unfilled regions of space in a structure. For quantitative analysis, the cavity ball data are used to estimate the probability of observing a given cavity in a high-resolution crystal structure. RosettaHoles provides excellent discrimination between real and computationally generated structures, is predictive of incorrect regions in models, identifies problematic structures in the Protein Data Bank, and promises to be a useful validation tool for newly solved experimental structures. PMID:19177366

  12. Fourier transform infrared microspectroscopic analysis of the effects of cereal type and variety within a type of grain on structural makeup in relation to rumen degradation kinetics.

    PubMed

    Walker, Amanda M; Yu, Peiqiang; Christensen, Colleen R; Christensen, David A; McKinnon, John J

    2009-08-12

    The objectives of this study were to use Fourier transform infrared microspectroscopy (FTIRM) to determine structural makeup (features) of cereal grain endosperm tissue and to reveal and identify differences in protein and carbohydrate structural makeup between different cereal types (corn vs barley) and between different varieties within a grain (barley CDC Bold, CDC Dolly, Harrington, and Valier). Another objective was to investigate how these structural features relate to rumen degradation kinetics. The items assessed included (1) structural differences in protein amide I to nonstructural carbohydrate (NSC, starch) intensity and ratio within cellular dimensions; (2) molecular structural differences in the secondary structure profile of protein, alpha-helix, beta-sheet, and their ratio; (3) structural differences in NSC to amide I ratio profile. From the results, it was observed that (1) comparison between grain types [corn (cv. Pioneer 39P78) vs barley (cv. Harrington)] showed significant differences in structural makeup in terms of NSC, amide I to NSC ratio, and rumen degradation kinetics (degradation ratio, effective degradability of dry matter, protein and NSC) (P < 0.05); (2) comparison between varieties within a grain (barley varieties) also showed significant differences in structural makeup in terms of amide I, NSC, amide I to NSC ratio, alpha-helix and beta-sheet protein structures, and rumen degradation kinetics (effective degradability of dry matter, protein, and NSC) (P < 0.05); (3) correlation analysis showed that the amide I to NSC ratio was strongly correlated with rumen degradation kinetics in terms of the degradation rate (R = 0.91, P = 0.086) and effective degradability of dry matter (R = 0.93, P = 0.071). The results suggest that with the FTIRM technique, the structural makeup differences between cereal types and between different varieties within a type of grain could be revealed. 
These structural makeup differences were related to the rate and extent of rumen degradation.

  13. Terahertz mechanical vibrations in lysozyme: Raman spectroscopy vs modal analysis

    NASA Astrophysics Data System (ADS)

    Carpinteri, Alberto; Lacidogna, Giuseppe; Piana, Gianfranco; Bassani, Andrea

    2017-07-01

    The mechanical behaviour of proteins is receiving increasing attention from the scientific community. Recently it has been suggested that mechanical vibrations play a crucial role in controlling the structural configuration changes (folding) which govern proteins' biological function. The mechanism behind protein folding is still not completely understood, and many efforts are being made to investigate this phenomenon. Complex molecular dynamics simulations and sophisticated experimental measurements are conducted to investigate protein dynamics and to perform protein structure predictions; however, these are two related, although quite distinct, approaches. Here we investigate mechanical vibrations of lysozyme by Raman spectroscopy and linear normal mode calculations (modal analysis). The input mechanical parameters to the numerical computations are taken from the literature. We first give an estimate of the order of magnitude of protein vibration frequencies by considering both classical wave mechanics and structural dynamics formulas. Afterwards, we perform modal analyses of some relevant chemical groups and of the full lysozyme protein. The numerical results are compared to experimental data, obtained from both in-house and literature Raman measurements. In particular, attention is focused on a large peak at 0.84 THz (29.3 cm⁻¹) in the Raman spectrum obtained by analyzing a lyophilized powder sample.

  14. LC-MS and MS/MS in the analysis of recombinant proteins

    NASA Astrophysics Data System (ADS)

    Coulot, M.; Domon, B.; Grossenbacher, H.; Guenat, C.; Maerki, W.; Müller, D. R.; Richter, W. J.

    1993-03-01

    Applicability and performance of electrospray ionization mass spectrometry (ESIMS) are demonstrated for protein analysis. ESIMS is applied in conjunction with on-line HPLC (LC-ESIMS) and direct tandem mass spectrometry (positive and negative ion mode ESIMS/MS) to the structural characterization of a recombinant protein (r-hirudin variant 1) and a congener phosphorylated at threonine 45 (RP-1).

  15. Structural and energetic study of cation-π-cation interactions in proteins.

    PubMed

    Pinheiro, Silvana; Soteras, Ignacio; Gelpí, Josep Lluis; Dehez, François; Chipot, Christophe; Luque, F Javier; Curutchet, Carles

    2017-04-12

    Cation-π interactions of aromatic rings and positively charged groups are among the most important interactions in structural biology. The role and energetic characteristics of these interactions are well established. However, the occurrence of cation-π-cation interactions is an unexpected motif, which raises intriguing questions about its functional role in proteins. We present a statistical analysis of the occurrence, composition and geometrical preferences of cation-π-cation interactions identified in a set of non-redundant protein structures taken from the Protein Data Bank. Our results demonstrate that this structural motif is observed at a small, albeit non-negligible frequency in proteins, and suggest a preference to establish cation-π-cation motifs with Trp, followed by Tyr and Phe. Furthermore, we have found that cation-π-cation interactions tend to be highly conserved, which supports their structural or functional role. Finally, we have performed an energetic analysis of a representative subset of cation-π-cation complexes combining quantum-chemical and continuum solvation calculations. Our results point out that the protein environment can strongly screen the cation-cation repulsion, leading to an attractive interaction in 64% of the complexes analyzed. Together with the high degree of conservation observed, these results suggest a potential stabilizing role in the protein fold, as demonstrated recently for a miniature protein (Craven et al., J. Am. Chem. Soc. 2016, 138, 1543). From a computational point of view, the significant contribution of non-additive three-body terms challenges the suitability of standard additive force fields for describing cation-π-cation motifs in molecular simulations.

  16. Sequence/structural analysis of xylem proteome emphasizes pathogenesis-related proteins, chitinases and β-1, 3-glucanases as key players in grapevine defense against Xylella fastidiosa.

    PubMed

    Chakraborty, Sandeep; Nascimento, Rafael; Zaini, Paulo A; Gouran, Hossein; Rao, Basuthkar J; Goulart, Luiz R; Dandekar, Abhaya M

    2016-01-01

    Background. Xylella fastidiosa, the causative agent of various plant diseases including Pierce's disease in the US, and Citrus Variegated Chlorosis in Brazil, remains a continual source of concern and economic losses, especially since almost all commercial varieties are sensitive to this Gammaproteobacteria. Differential expression of proteins in infected tissue is an established methodology to identify key elements involved in plant defense pathways. Methods. In the current work, we developed a methodology named CHURNER that emphasizes relevant protein functions from proteomic data, based on identification of proteins with similar structures that do not necessarily have sequence homology. Such clustering emphasizes protein functions which have multiple copies that are up/down-regulated, and highlights similar proteins which are differentially regulated. As a working example we present proteomic data enumerating differentially expressed proteins in xylem sap from grapevines that were infected with X. fastidiosa. Results. Analysis of this data by CHURNER highlighted pathogenesis related PR-1 proteins, reinforcing this as the foremost protein function in xylem sap involved in the grapevine defense response to X. fastidiosa. β-1, 3-glucanase, which has both anti-microbial and anti-fungal activities, is also up-regulated. Simultaneously, chitinases are found to be both up and down-regulated by CHURNER, and thus the net gain of this protein function loses its significance in the defense response. Discussion. We demonstrate how structural data can be incorporated in the pipeline of proteomic data analysis prior to making inferences on the importance of individual proteins to plant defense mechanisms. We expect CHURNER to be applicable to any proteomic data set.

  17. A phylogenetic analysis of normal modes evolution in enzymes and its relationship to enzyme function

    PubMed Central

    Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A.

    2012-01-01

    Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that the functional evolution can be inferred from the changes in the protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced Cα representation of the protein structure while enzymatic function is described by Enzyme Commission (EC) numbers. Similarity of the binding pocket dynamics at each branch of the protein family’s phylogeny was analyzed in two ways: 1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and 2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the alpha-amylase, D-isomer specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal modes analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. PMID:22651983
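
    The "normal mode overlap" mentioned in this abstract is commonly computed as the absolute cosine between two mode displacement vectors. The sketch below uses that common definition as an assumption; the abstract does not state the exact formula, and the paper's measure may differ in detail. Each mode is a flat list of displacement components (x, y, z per Cα atom), and the function name is chosen for this illustration.

```python
import math

def mode_overlap(u, v):
    """Absolute cosine similarity between two normal mode vectors.

    Returns a value in [0, 1]; 1 means the modes describe the same motion.
    The absolute value is taken because the sign of a mode is arbitrary.
    """
    if len(u) != len(v):
        raise ValueError("modes must have the same number of components")
    dot_uv = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("zero-length mode vector")
    return abs(dot_uv) / (norm_u * norm_v)

if __name__ == "__main__":
    # Two hypothetical modes for a two-atom system (6 components each).
    mode_a = [1.0, 0.0, 0.0, -1.0, 0.0, 0.0]
    mode_b = [0.0, 1.0, 0.0, 0.0, -1.0, 0.0]
    print(mode_overlap(mode_a, mode_a), mode_overlap(mode_a, mode_b))
```

    Comparing modes between two proteins requires that the components refer to equivalent, structurally aligned residues, for example only those lining the binding pocket.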

  18. A phylogenetic analysis of normal modes evolution in enzymes and its relationship to enzyme function.

    PubMed

    Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A

    2012-09-21

    Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that functional evolution can be inferred from the changes in protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced C(α) representation of the protein structure while enzymatic function is described by Enzyme Commission numbers. Similarity of the binding pocket dynamics at each branch of the protein family's phylogeny was analyzed in two ways: (1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and (2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the α-amylase, D-isomer-specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal mode analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. Copyright © 2012 Elsevier Ltd. All rights reserved.

  19. Protein Structural Analysis via Mass Spectrometry-Based Proteomics

    PubMed Central

    Artigues, Antonio; Nadeau, Owen W.; Rimmer, Mary Ashley; Villar, Maria T.; Du, Xiuxia; Fenton, Aron W.; Carlson, Gerald M.

    2017-01-01

    Modern mass spectrometry (MS) technologies have provided a versatile platform that can be combined with a large number of techniques to analyze protein structure and dynamics. These techniques include the three detailed in this chapter: 1) hydrogen/deuterium exchange (HDX), 2) limited proteolysis, and 3) chemical crosslinking (CX). HDX relies on the change in mass of a protein upon its dilution into deuterated buffer, which results in varied deuterium content within its backbone amides. Structural information on surface exposed, flexible or disordered linker regions of proteins can be achieved through limited proteolysis, using a variety of proteases and only small extents of digestion. CX refers to the covalent coupling of distinct chemical species and has been used to analyze the structure, function and interactions of proteins by identifying crosslinking sites that are formed by small multi-functional reagents, termed crosslinkers. Each of these MS applications is capable of revealing structural information for proteins when used either with or without other typical high resolution techniques, including NMR and X-ray crystallography. PMID:27975228
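
    The mass change that HDX relies on is usually turned into a deuterium uptake value by normalising against undeuterated and fully deuterated controls. The sketch below shows that normalisation under stated assumptions: the function and parameter names are hypothetical, the masses in the usage line are made-up numbers, and the chapter's own data processing may differ.

```python
def deuterium_uptake(m_undeuterated, m_observed, m_fully_deuterated, n_exchangeable):
    """Estimate the number of deuterons incorporated at one time point.

    m_undeuterated     -- centroid mass of the undeuterated control
    m_observed         -- centroid mass after labelling for time t
    m_fully_deuterated -- centroid mass of the fully deuterated control
    n_exchangeable     -- number of exchangeable backbone amide hydrogens

    Normalising by the fully deuterated control corrects for back-exchange
    during analysis.
    """
    span = m_fully_deuterated - m_undeuterated
    if span <= 0:
        raise ValueError("fully deuterated mass must exceed undeuterated mass")
    fraction = (m_observed - m_undeuterated) / span
    return fraction * n_exchangeable

if __name__ == "__main__":
    # Hypothetical peptide with 10 exchangeable amides.
    print(deuterium_uptake(1000.0, 1004.0, 1008.0, 10))
```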

  20. Camps 2.0: exploring the sequence and structure space of prokaryotic, eukaryotic, and viral membrane proteins.

    PubMed

    Neumann, Sindy; Hartmann, Holger; Martin-Galiano, Antonio J; Fuchs, Angelika; Frishman, Dmitrij

    2012-03-01

    Structural bioinformatics of membrane proteins is still in its infancy, and the picture of their fold space is only beginning to emerge. Because only a handful of three-dimensional structures are available, sequence comparison and structure prediction remain the main tools for investigating sequence-structure relationships in membrane protein families. Here we present a comprehensive analysis of the structural families corresponding to α-helical membrane proteins with at least three transmembrane helices. The new version of our CAMPS database (CAMPS 2.0) covers nearly 1300 eukaryotic, prokaryotic, and viral genomes. Using an advanced classification procedure, which is based on high-order hidden Markov models and considers both sequence similarity as well as the number of transmembrane helices and loop lengths, we identified 1353 structurally homogeneous clusters roughly corresponding to membrane protein folds. Only 53 clusters are associated with experimentally determined three-dimensional structures, and for these clusters CAMPS is in reasonable agreement with structure-based classification approaches such as SCOP and CATH. We therefore estimate that ∼1300 structures would need to be determined to provide a sufficient structural coverage of polytopic membrane proteins. CAMPS 2.0 is available at http://webclu.bio.wzw.tum.de/CAMPS2.0/. Copyright © 2011 Wiley Periodicals, Inc.

  1. Recombinant Production, Reconstruction in Lipid-Protein Nanodiscs, and Electron Microscopy of Full-Length α-Subunit of Human Potassium Channel Kv7.1.

    PubMed

    Shenkarev, Z O; Karlova, M G; Kulbatskii, D S; Kirpichnikov, M P; Lyukmanova, E N; Sokolova, O S

    2018-05-01

    Voltage-gated potassium channel Kv7.1 plays an important role in the excitability of cardiac muscle. The α-subunit of Kv7.1 (KCNQ1) is the main structural element of this channel. Tetramerization of KCNQ1 in the membrane results in formation of an ion channel, which comprises a pore and four voltage-sensing domains. Mutations in the human KCNQ1 gene are one of the major causes of inherited arrhythmias, long QT syndrome in particular. The construct encoding full-length human KCNQ1 protein was synthesized in this work, and an expression system in the Pichia pastoris yeast cells was developed. The membrane fraction of the yeast cells containing the recombinant protein (rKCNQ1) was solubilized with CHAPS detergent. To better mimic the lipid environment of the channel, lipid-protein nanodiscs were formed using solubilized membrane fraction and MSP2N2 protein. The rKCNQ1/nanodisc and rKCNQ1/CHAPS samples were purified using the Rho1D4 tag introduced at the C-terminus of the protein. Protein samples were examined using transmission electron microscopy with negative staining. In both cases, homogeneous rKCNQ1 samples were observed based on image analysis. Statistical analysis of the images of individual protein particles solubilized in the detergent revealed the presence of a tetrameric structure confirming intact subunit assembly. A three-dimensional channel structure reconstructed at 2.5-nm resolution represents a compact density with diameter of the membrane part of ~9 nm and height ~11 nm. Analysis of the images of rKCNQ1 in nanodiscs revealed additional electron density corresponding to the lipid bilayer fragment and the MSP2N2 protein. These results indicate that the nanodiscs facilitate protein isolation, purification, and stabilization in solution and can be used for further structural studies of human Kv7.1.

  2. Computational Modeling of Allosteric Regulation in the Hsp90 Chaperones: A Statistical Ensemble Analysis of Protein Structure Networks and Allosteric Communications

    PubMed Central

    Blacklock, Kristin; Verkhivker, Gennady M.

    2014-01-01

    A fundamental role of the Hsp90 chaperone in regulating functional activity of diverse protein clients is essential for the integrity of signaling networks. In this work we have combined biophysical simulations of the Hsp90 crystal structures with the protein structure network analysis to characterize the statistical ensemble of allosteric interaction networks and communication pathways in the Hsp90 chaperones. We have found that principal structurally stable communities could be preserved during dynamic changes in the conformational ensemble. The dominant contribution of the inter-domain rigidity to the interaction networks has emerged as a common factor responsible for the thermodynamic stability of the active chaperone form during the ATPase cycle. Structural stability analysis using force constant profiling of the inter-residue fluctuation distances has identified a network of conserved structurally rigid residues that could serve as global mediating sites of allosteric communication. Mapping of the conformational landscape with the network centrality parameters has demonstrated that stable communities and mediating residues may act concertedly with the shifts in the conformational equilibrium and could describe the majority of functionally significant chaperone residues. The network analysis has revealed a relationship between structural stability, global centrality and functional significance of hotspot residues involved in chaperone regulation. We have found that allosteric interactions in the Hsp90 chaperone may be mediated by modules of structurally stable residues that display high betweenness in the global interaction network. The results of this study have suggested that allosteric interactions in the Hsp90 chaperone may operate via a mechanism that combines rapid and efficient communication by a single optimal pathway of structurally rigid residues and more robust signal transmission using an ensemble of suboptimal multiple communication routes. 
This may be a universal requirement encoded in protein structures to balance the inherent tension between resilience and efficiency of the residue interaction networks. PMID:24922508
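
    The betweenness of residues in an interaction network, which this abstract uses to identify mediating sites, can be sketched with a small self-contained example. It is not the authors' protocol: it assumes an unweighted network in which two residues are linked when their representative atoms lie within a distance cutoff (the cutoff value and function names are choices made for this illustration), and it computes shortest-path betweenness with Brandes' algorithm.

```python
import math
from collections import deque

def contact_graph(coords, cutoff):
    """Adjacency lists linking residues whose coordinates lie within cutoff."""
    n = len(coords)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def betweenness(adj):
    """Unnormalised shortest-path betweenness (Brandes) for an undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                              # nodes in order of distance from s
        preds = {v: [] for v in adj}            # predecessors on shortest paths
        sigma = dict.fromkeys(adj, 0.0)         # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1.0, 0
        queue = deque([s])
        while queue:                            # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):               # accumulate path dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2.0 for v, b in bc.items()}

if __name__ == "__main__":
    # Five hypothetical residues in a line, 3.8 apart; only neighbours touch.
    chain = [(3.8 * i, 0.0, 0.0) for i in range(5)]
    print(betweenness(contact_graph(chain, cutoff=4.0)))
```

    In an ensemble analysis, the same calculation would be repeated over simulation snapshots so that residues with consistently high betweenness can be distinguished from those that are central in only a few conformations.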

  3. Computational modeling of allosteric regulation in the hsp90 chaperones: a statistical ensemble analysis of protein structure networks and allosteric communications.

    PubMed

    Blacklock, Kristin; Verkhivker, Gennady M

    2014-06-01

    A fundamental role of the Hsp90 chaperone in regulating functional activity of diverse protein clients is essential for the integrity of signaling networks. In this work we have combined biophysical simulations of the Hsp90 crystal structures with the protein structure network analysis to characterize the statistical ensemble of allosteric interaction networks and communication pathways in the Hsp90 chaperones. We have found that principal structurally stable communities could be preserved during dynamic changes in the conformational ensemble. The dominant contribution of the inter-domain rigidity to the interaction networks has emerged as a common factor responsible for the thermodynamic stability of the active chaperone form during the ATPase cycle. Structural stability analysis using force constant profiling of the inter-residue fluctuation distances has identified a network of conserved structurally rigid residues that could serve as global mediating sites of allosteric communication. Mapping of the conformational landscape with the network centrality parameters has demonstrated that stable communities and mediating residues may act concertedly with the shifts in the conformational equilibrium and could describe the majority of functionally significant chaperone residues. The network analysis has revealed a relationship between structural stability, global centrality and functional significance of hotspot residues involved in chaperone regulation. We have found that allosteric interactions in the Hsp90 chaperone may be mediated by modules of structurally stable residues that display high betweenness in the global interaction network. The results of this study have suggested that allosteric interactions in the Hsp90 chaperone may operate via a mechanism that combines rapid and efficient communication by a single optimal pathway of structurally rigid residues and more robust signal transmission using an ensemble of suboptimal multiple communication routes. This may be a universal requirement encoded in protein structures to balance the inherent tension between resilience and efficiency of the residue interaction networks.

  4. Structure prediction and binding sites analysis of curcin protein of Jatropha curcas using computational approaches.

    PubMed

    Srivastava, Mugdha; Gupta, Shishir K; Abhilash, P C; Singh, Nandita

    2012-07-01

    Ribosome inactivating proteins (RIPs) are defense proteins in a number of higher-plant species that are directly targeted toward herbivores. Jatropha curcas is one of the biodiesel plants having RIPs. The Jatropha seed meal, after extraction of oil, is rich in curcin, a highly toxic RIP similar to ricin, which makes it unsuitable for animal feed. Although the toxicity of curcin is well documented in the literature, the detailed toxic properties and the 3D structure of curcin have not been determined by X-ray crystallography, NMR spectroscopy or any in silico techniques to date. In this pursuit, the structure of curcin was modeled by a composite approach of 3D structure prediction using threading and ab initio modeling. Model quality was assessed by methods that include Ramachandran plot analysis and Qmean score estimation. Further, we applied the protein-ligand docking approach to identify the r-RNA binding residue of curcin. The present work provides the first structural insight into the binding mode of r-RNA adenine to the curcin protein and forms the basis for designing future inhibitors of curcin. Cloning of a future peptide inhibitor within J. curcas can produce non-toxic varieties of J. curcas, which would make the seed-cake suitable as animal feed without curcin detoxification.

  5. A structural analysis of the AAA+ domains in Saccharomyces cerevisiae cytoplasmic dynein

    PubMed Central

    Gleave, Emma S.; Schmidt, Helgo; Carter, Andrew P.

    2014-01-01

    Dyneins are large protein complexes that act as microtubule based molecular motors. The dynein heavy chain contains a motor domain which is a member of the AAA+ protein family (ATPases Associated with diverse cellular Activities). Proteins of the AAA+ family show a diverse range of functionalities, but share a related core AAA+ domain, which often assembles into hexameric rings. Dynein is unusual because it has all six AAA+ domains linked together, in one long polypeptide. The dynein motor domain generates movement by coupling ATP driven conformational changes in the AAA+ ring to the swing of a motile element called the linker. Dynein binds to its microtubule track via a long antiparallel coiled-coil stalk that emanates from the AAA+ ring. Recently the first high resolution structures of the dynein motor domain were published. Here we provide a detailed structural analysis of the six AAA+ domains using our Saccharomyces cerevisiae crystal structure. We describe how structural similarities in the dynein AAA+ domains suggest they share a common evolutionary origin. We analyse how the different AAA+ domains have diverged from each other. We discuss how this is related to the function of dynein as a motor protein and how the AAA+ domains of dynein compare to those of other AAA+ proteins. PMID:24680784

  6. Distributions of experimental protein structures on coarse-grained free energy landscapes

    PubMed Central

    Liu, Jie; Jernigan, Robert L.

    2015-01-01

    Predicting conformational changes of proteins is needed in order to fully comprehend functional mechanisms. With the large number of available structures in sets of related proteins, it is now possible to directly visualize the clusters of conformations and their conformational transitions through the use of principal component analysis. The most striking observation about the distributions of the structures along the principal components is their highly non-uniform distributions. In this work, we use principal component analysis of experimental structures of 50 diverse proteins to extract the most important directions of their motions, sample structures along these directions, and estimate their free energy landscapes by combining knowledge-based potentials and entropy computed from elastic network models. When these resulting motions are visualized upon their coarse-grained free energy landscapes, the basis for conformational pathways becomes readily apparent. Using three well-studied proteins, T4 lysozyme, serum albumin, and sarco-endoplasmic reticular Ca²⁺ adenosine triphosphatase (SERCA), as examples, we show that such free energy landscapes of conformational changes provide meaningful insights into the functional dynamics and suggest transition pathways between different conformational states. As a further example, we also show that Monte Carlo simulations on the coarse-grained landscape of HIV-1 protease can directly yield pathways for force-driven conformational changes. PMID:26723638

  7. ProteMiner-SSM: a web server for efficient analysis of similar protein tertiary substructures.

    PubMed

    Chang, Darby Tien-Hau; Chen, Chien-Yu; Chung, Wen-Chin; Oyang, Yen-Jen; Juan, Hsueh-Fen; Huang, Hsuan-Cheng

    2004-07-01

    Analysis of protein-ligand interactions is a fundamental issue in drug design. As the detailed and accurate analysis of protein-ligand interactions involves calculation of binding free energy based on thermodynamics and even quantum mechanics, which is highly expensive in terms of computing time, conformational and structural analysis of proteins and ligands has been widely employed as a screening process in computer-aided drug design. In this paper, a web server called ProteMiner-SSM designed for efficient analysis of similar protein tertiary substructures is presented. In one experiment reported in this paper, the web server has been exploited to obtain some clues about a biochemical hypothesis. The main distinction in the software design of the web server is the filtering process incorporated to expedite the analysis. The filtering process extracts the residues located in the caves of the protein tertiary structure for analysis and operates with O(n log n) time complexity, where n is the number of residues in the protein. In comparison, the alpha-hull algorithm, which is a widely used algorithm in computer graphics for identifying those instances that are on the contour of a three-dimensional object, features O(n²) time complexity. Experimental results show that the filtering process presented in this paper is able to speed up the analysis by a factor of 3.15 to 9.37. The ProteMiner-SSM web server can be found at http://proteminer.csie.ntu.edu.tw/. There is a mirror site at http://p4.sbl.bc.sinica.edu.tw/proteminer/.

  8. Attenuated Total Reflection Fourier Transform Infrared (ATR FT-IR) Spectroscopy as an Analytical Method to Investigate the Secondary Structure of a Model Protein Embedded in Solid Lipid Matrices.

    PubMed

    Zeeshan, Farrukh; Tabbassum, Misbah; Jorgensen, Lene; Medlicott, Natalie J

    2018-02-01

    Protein drugs may encounter conformational perturbations during the formulation processing of lipid-based solid dosage forms. In aqueous protein solutions, attenuated total reflection Fourier transform infrared (ATR FT-IR) spectroscopy can investigate these conformational changes following the subtraction of spectral interference of solvent with protein amide I bands. However, in solid dosage forms, the possible spectral contribution of lipid carriers to the protein amide I band may be an obstacle to determining conformational alterations. The objective of this study was to develop an ATR FT-IR spectroscopic method for the analysis of the secondary structure of proteins embedded in solid lipid matrices. Bovine serum albumin (BSA) was chosen as a model protein, while Precirol ATO5 (glycerol palmitostearate, melting point 58 °C) was employed as the model lipid matrix. Bovine serum albumin was incorporated into lipid using physical mixing, melting and mixing, or wet granulation mixing methods. Attenuated total reflection FT-IR spectroscopy and size exclusion chromatography (SEC) were performed for the analysis of BSA secondary structure and its dissolution in aqueous media, respectively. The results showed significant interference of Precirol ATO5 with the BSA amide I band, which could be subtracted at lipid contents of up to 90% w/w to analyze BSA secondary structure. In addition, ATR FT-IR spectroscopy also detected thermally denatured BSA solid alone and in the presence of lipid matrix, indicating its suitability for the detection of denatured protein solids in lipid matrices. Despite being in the solid state, BSA underwent conformational changes upon incorporation into solid lipid matrices. However, the extent of these conformational alterations was found to be dependent on the mixing method employed as indicated by area overlap calculations. For instance, the melting and mixing method had a negligible effect on BSA secondary structure, whereas the wet granulation mixing method promoted more changes. Size exclusion chromatography analysis showed the complete dissolution of BSA in the aqueous media employed in the wet granulation method. In conclusion, an ATR FT-IR spectroscopic method was successfully developed to investigate BSA secondary structure in solid lipid matrices following the subtraction of lipid spectral interference. ATR FT-IR spectroscopy could further be applied to investigate the secondary structure perturbations of therapeutic proteins during their formulation development.

  9. Comparative structural analysis of human DEAD-box RNA helicases.

    PubMed

    Schütz, Patrick; Karlberg, Tobias; van den Berg, Susanne; Collins, Ruairi; Lehtiö, Lari; Högbom, Martin; Holmberg-Schiavone, Lovisa; Tempel, Wolfram; Park, Hee-Won; Hammarström, Martin; Moche, Martin; Thorsell, Ann-Gerd; Schüler, Herwig

    2010-09-30

    DEAD-box RNA helicases play various, often critical, roles in all processes where RNAs are involved. Members of this family of proteins are linked to human disease, including cancer and viral infections. DEAD-box proteins contain two conserved domains that both contribute to RNA and ATP binding. Despite recent advances, the molecular details of how these enzymes convert chemical energy into RNA remodeling are unknown. We present crystal structures of the isolated DEAD-domains of human DDX2A/eIF4A1, DDX2B/eIF4A2, DDX5, DDX10/DBP4, DDX18/myc-regulated DEAD-box protein, DDX20, DDX47, DDX52/ROK1, and DDX53/CAGE, and of the helicase domains of DDX25 and DDX41. Together with prior knowledge this enables a family-wide comparative structural analysis. We propose a general mechanism for opening of the RNA binding site. This analysis also provides insights into the diversity of DExD/H- proteins, with implications for understanding the functions of individual family members.

  10. Comparative Structural Analysis of Human DEAD-Box RNA Helicases

    PubMed Central

    Schütz, Patrick; Karlberg, Tobias; van den Berg, Susanne; Collins, Ruairi; Lehtiö, Lari; Högbom, Martin; Holmberg-Schiavone, Lovisa; Tempel, Wolfram; Park, Hee-Won; Hammarström, Martin; Moche, Martin; Thorsell, Ann-Gerd; Schüler, Herwig

    2010-01-01

    DEAD-box RNA helicases play various, often critical, roles in all processes where RNAs are involved. Members of this family of proteins are linked to human disease, including cancer and viral infections. DEAD-box proteins contain two conserved domains that both contribute to RNA and ATP binding. Despite recent advances, the molecular details of how these enzymes convert chemical energy into RNA remodeling are unknown. We present crystal structures of the isolated DEAD-domains of human DDX2A/eIF4A1, DDX2B/eIF4A2, DDX5, DDX10/DBP4, DDX18/myc-regulated DEAD-box protein, DDX20, DDX47, DDX52/ROK1, and DDX53/CAGE, and of the helicase domains of DDX25 and DDX41. Together with prior knowledge this enables a family-wide comparative structural analysis. We propose a general mechanism for opening of the RNA binding site. This analysis also provides insights into the diversity of DExD/H- proteins, with implications for understanding the functions of individual family members. PMID:20941364

  11. Slow dynamics of a protein backbone in molecular dynamics simulation revealed by time-structure based independent component analysis

    NASA Astrophysics Data System (ADS)

    Naritomi, Yusuke; Fuchigami, Sotaro

    2013-12-01

    We recently proposed the method of time-structure based independent component analysis (tICA) to examine the slow dynamics involved in conformational fluctuations of a protein as estimated by molecular dynamics (MD) simulation [Y. Naritomi and S. Fuchigami, J. Chem. Phys. 134, 065101 (2011)]. Our previous study focused on domain motions of the protein and examined its dynamics by using rigid-body domain analysis and tICA. However, the protein changes its conformation not only through domain motions but also by various types of motions involving its backbone and side chains. Some of these motions might occur on a slow time scale: we hypothesize that if so, we could effectively detect and characterize them using tICA. In the present study, we investigated slow dynamics of the protein backbone using MD simulation and tICA. The selected target protein was lysine-, arginine-, ornithine-binding protein (LAO), which comprises two domains and undergoes large domain motions. MD simulation of LAO in explicit water was performed for 1 μs, and the obtained trajectory of Cα atoms in the backbone was analyzed by tICA. This analysis successfully provided us with slow modes for LAO that represented either domain motions or local movements of the backbone. Further analysis elucidated the atomic details of the suggested local motions and confirmed that these motions truly occurred on the expected slow time scale.

  12. CAVER Analyst 1.0: graphic tool for interactive visualization and analysis of tunnels and channels in protein structures.

    PubMed

    Kozlikova, Barbora; Sebestova, Eva; Sustr, Vilem; Brezovsky, Jan; Strnad, Ondrej; Daniel, Lukas; Bednar, David; Pavelka, Antonin; Manak, Martin; Bezdeka, Martin; Benes, Petr; Kotry, Matus; Gora, Artur; Damborsky, Jiri; Sochor, Jiri

    2014-09-15

    The transport of ligands, ions or solvent molecules into proteins with buried binding sites or through the membrane is enabled by protein tunnels and channels. CAVER Analyst is a software tool for calculation, analysis and real-time visualization of access tunnels and channels in static and dynamic protein structures. It provides an intuitive graphic user interface for setting up the calculation and interactive exploration of identified tunnels/channels and their characteristics. CAVER Analyst is a multi-platform software written in JAVA. Binaries and documentation are freely available for non-commercial use at http://www.caver.cz.

  13. Structural and Functional Studies of H. seropedicae RecA Protein - Insights into the Polymerization of RecA Protein as Nucleoprotein Filament.

    PubMed

    Leite, Wellington C; Galvão, Carolina W; Saab, Sérgio C; Iulek, Jorge; Etto, Rafael M; Steffens, Maria B R; Chitteni-Pattu, Sindhu; Stanage, Tyler; Keck, James L; Cox, Michael M

    2016-01-01

    The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein. HsRecA also promotes DNA strand exchange more efficiently. The three dimensional structure of HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, that are similar to homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins is also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. Our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament.

  14. Physics and evolution of thermophilic adaptation.

    PubMed

    Berezovsky, Igor N; Shakhnovich, Eugene I

    2005-09-06

    Analysis of structures and sequences of several hyperthermostable proteins from various sources reveals two major physical mechanisms of their thermostabilization. The first mechanism is "structure-based," whereby some hyperthermostable proteins are significantly more compact than their mesophilic homologues, while no particular interaction type appears to cause stabilization; rather, the sheer number of interactions is responsible for thermostability. Other hyperthermostable proteins employ an alternative, "sequence-based" mechanism of their thermal stabilization. They do not show pronounced structural differences from mesophilic homologues. Rather, a small number of apparently strong interactions is responsible for high thermal stability of these proteins. High-throughput comparative analysis of structures and complete genomes of several hyperthermophilic archaea and bacteria revealed that organisms develop diverse strategies of thermophilic adaptation by using, to a varying degree, two fundamental physical mechanisms of thermostability. The choice of a particular strategy depends on the evolutionary history of an organism. Proteins from organisms that originated in an extreme environment, such as hyperthermophilic archaea (Pyrococcus furiosus), are significantly more compact and more hydrophobic than their mesophilic counterparts. Alternatively, organisms that evolved as mesophiles but later recolonized a hot environment (Thermotoga maritima) relied in their evolutionary strategy of thermophilic adaptation on the "sequence-based" mechanism of thermostability. We propose an evolutionary explanation of these differences based on physical concepts of protein designability.

  15. Top-Down Hydrogen-Deuterium Exchange Analysis of Protein Structures Using Ultraviolet Photodissociation.

    PubMed

    Brodie, Nicholas I; Huguet, Romain; Zhang, Terry; Viner, Rosa; Zabrouskov, Vlad; Pan, Jingxi; Petrotchenko, Evgeniy V; Borchers, Christoph H

    2018-03-06

    Top-down hydrogen-deuterium exchange (HDX) analysis using electron capture or transfer dissociation Fourier transform mass spectrometry (FTMS) is a powerful method for the analysis of secondary structure of proteins in solution. The resolution of the method is a function of the degree of fragmentation of backbone bonds in the proteins. While fragmentation is usually extensive near the N- and C-termini, electron capture (ECD) or electron transfer dissociation (ETD) fragmentation methods sometimes lack good coverage of certain regions of the protein, most often in the middle of the sequence. Ultraviolet photodissociation (UVPD) is a recently developed fast-fragmentation technique, which provides extensive backbone fragmentation that can be complementary in sequence coverage to the aforementioned electron-based fragmentation techniques. Here, we explore the application of electrospray ionization (ESI)-UVPD FTMS on an Orbitrap Fusion Lumos Tribrid mass spectrometer to top-down HDX analysis of proteins. We have incorporated UVPD-specific fragment-ion types and fragment-ion mixtures into our isotopic envelope fitting software (HDX Match) for the top-down HDX analysis. We have shown that UVPD data is complementary to ETD, thus improving the overall resolution when used as a combined approach.

  16. Neutron scattering for the analysis of biological structures. Brookhaven symposia in biology. Number 27

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schoenborn, B P

    1976-01-01

    Sessions were included on neutron scattering and biological structure analysis, protein crystallography, neutron scattering from oriented systems, solution scattering, preparation of deuterated specimens, inelastic scattering, data analysis, experimental techniques, and instrumentation. Separate entries were made for the individual papers.

  17. Application of linker technique to trap transiently interacting protein complexes for structural studies

    PubMed Central

    Reddy Chichili, Vishnu Priyanka; Kumar, Veerendra; Sivaraman, J.

    2016-01-01

    Protein-protein interactions are key events controlling several biological processes. We have developed and employed a method to trap transiently interacting protein complexes for structural studies using glycine-rich linkers to fuse interacting partners, one of which is unstructured. Initial steps involve isothermal titration calorimetry to identify the minimum binding region of the unstructured protein in its interaction with its stable binding partner. This is followed by computational analysis to identify the approximate site of the interaction and to design an appropriate linker length. Subsequently, fused constructs are generated and characterized using size exclusion chromatography and dynamic light scattering experiments. The structure of the chimeric protein is then solved by crystallization, and validated both in vitro and in vivo by substituting key interacting residues of the full length, unlinked proteins with alanine. This protocol offers the opportunity to study crucial and currently unattainable transient protein interactions involved in various biological processes. PMID:26985443

  18. Structure prediction, expression, and antigenicity of c-terminal of GRP78.

    PubMed

    Aghamollaei, Hossein; Mousavi Gargari, Seyed Latif; Ghanei, Mostafa; Rasaee, Mohamad Javad; Amani, Jafar; Bakherad, Hamid; Farnoosh, Gholamreza

    2017-01-01

    Glucose-regulated protein 78 (GRP78) is a typical endoplasmic reticulum luminal chaperone having a main role in the activation of the unfolded protein response. Because of hypoxia and nutrient deprivation in the tumor microenvironment, expression of GRP78 in tumor cells becomes higher than in native cells, which makes it a suitable candidate for cancer targeting. Suppression of survival signals by antibody production against the C-terminal domain of GRP78 (CGRP) can induce apoptosis of cancer cells. The aim of this study was in silico analysis, recombinant production, and characterization of CGRP in Escherichia coli. Structural prediction of CGRP by bioinformatics tools was performed and the construct containing the optimized sequence was transferred to E. coli T7 shuffle. Expression was induced by isopropyl-β-d-thiogalactoside, and recombinant protein was purified by Ni-NTA agarose resin. The content of secondary structures was obtained by circular dichroism (CD) spectrum. CGRP immunogenicity was evaluated from the immunized mouse sera. SDS-PAGE analysis showed CGRP expression in E. coli. CD spectrum also confirmed prediction of structures by bioinformatics tools. The enzyme-linked immunosorbent assay using sera from immunized mice revealed CGRP as a good immunogen. The results obtained in this study showed that the structure of truncated CGRP is very similar to its structure in the whole protein context. This protein can be used in cancer research.

  19. Application of the Ramanujan Fourier Transform for the analysis of secondary structure content in amino acid sequences.

    PubMed

    Mainardi, L T; Pattini, L; Cerutti, S

    2007-01-01

    A novel method is presented for the investigation of properties of protein sequences using the Ramanujan Fourier Transform (RFT). The new methodology involves the preprocessing of protein sequence data by numerically encoding it and then applying the RFT. The RFT is based on projecting the obtained numerical series on a set of basis functions constituted by Ramanujan sums (RS). In RS components, periodicities of finite integer length, rather than frequencies (as in classical harmonic analysis), are considered. The potential of the new approach is documented by a few examples in the analysis of hydrophobic profiles of proteins in two classes including abundance of alpha-helices (group A) or beta-strands (group B). Different patterns are provided as evidence. RFT can be used to characterize the structural properties of proteins and integrate complementary information provided by other signal processing transforms.
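
    The projection described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the binary hydrophobicity encoding, the function names, and the finite-length estimate of each coefficient (mean of x(n)·c_q(n) divided by Euler's totient φ(q)) are assumptions made for the example. Only the Ramanujan sum itself, c_q(n) = Σ cos(2πkn/q) over 1 ≤ k ≤ q with gcd(k, q) = 1, is the standard definition.

```python
import math

# Illustrative assumption: a crude binary hydrophobicity encoding,
# not the numerical scale used in the paper.
HYDROPHOBIC = set("AVLIMFWC")


def encode(sequence):
    """Map each residue to 1.0 (hydrophobic) or 0.0 (otherwise)."""
    return [1.0 if aa in HYDROPHOBIC else 0.0 for aa in sequence.upper()]


def ramanujan_sum(q, n):
    """Ramanujan sum c_q(n): sum of cos(2*pi*k*n/q) over k coprime to q."""
    total = sum(
        math.cos(2.0 * math.pi * k * n / q)
        for k in range(1, q + 1)
        if math.gcd(k, q) == 1
    )
    return round(total)  # c_q(n) is always an integer


def totient(q):
    """Euler's totient: how many k in 1..q are coprime to q."""
    return sum(1 for k in range(1, q + 1) if math.gcd(k, q) == 1)


def rft_coefficients(signal, max_q):
    """Finite-length estimate of the RFT coefficient for q = 1..max_q."""
    n_points = len(signal)
    coeffs = {}
    for q in range(1, max_q + 1):
        # Positions are numbered from 1, matching the usual definition.
        mean = sum(
            x * ramanujan_sum(q, n) for n, x in enumerate(signal, start=1)
        ) / n_points
        coeffs[q] = mean / totient(q)
    return coeffs


if __name__ == "__main__":
    # Toy sequence with one hydrophobic residue every third position.
    signal = encode("KKL" * 10)
    for q, value in rft_coefficients(signal, 6).items():
        print(q, round(value, 3))
```

    In the toy sequence the hydrophobic residue recurs every three positions, so the q = 3 component is clearly nonzero while the q = 2 component vanishes. Because the estimate uses a finite sequence, components at other periods need not be exactly zero.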

  20. Restricted N-glycan conformational space in the PDB and its implication in glycan structure modeling.

    PubMed

    Jo, Sunhwan; Lee, Hui Sun; Skolnick, Jeffrey; Im, Wonpil

    2013-01-01

    Understanding glycan structure and dynamics is central to understanding protein-carbohydrate recognition and its role in protein-protein interactions. Given the difficulties in obtaining the glycan's crystal structure in glycoconjugates due to its flexibility and heterogeneity, computational modeling could play an important role in providing glycosylated protein structure models. To address if glycan structures available in the PDB can be used as templates or fragments for glycan modeling, we present a survey of the N-glycan structures of 35 different sequences in the PDB. Our statistical analysis shows that the N-glycan structures found on homologous glycoproteins are significantly conserved compared to the random background, suggesting that N-glycan chains can be confidently modeled with template glycan structures whose parent glycoproteins share sequence similarity. On the other hand, N-glycan structures found on non-homologous glycoproteins do not show significant global structural similarity. Nonetheless, the internal substructures of these N-glycans, particularly, the substructures that are closer to the protein, show significantly similar structures, suggesting that such substructures can be used as fragments in glycan modeling. Increased interactions with protein might be responsible for the restricted conformational space of N-glycan chains. Our results suggest that structure prediction/modeling of N-glycans of glycoconjugates using structure database could be effective and different modeling approaches would be needed depending on the availability of template structures.

  1. Restricted N-glycan Conformational Space in the PDB and Its Implication in Glycan Structure Modeling

    PubMed Central

    Jo, Sunhwan; Lee, Hui Sun; Skolnick, Jeffrey; Im, Wonpil

    2013-01-01

    Understanding glycan structure and dynamics is central to understanding protein-carbohydrate recognition and its role in protein-protein interactions. Given the difficulties in obtaining the glycan's crystal structure in glycoconjugates due to its flexibility and heterogeneity, computational modeling could play an important role in providing glycosylated protein structure models. To address if glycan structures available in the PDB can be used as templates or fragments for glycan modeling, we present a survey of the N-glycan structures of 35 different sequences in the PDB. Our statistical analysis shows that the N-glycan structures found on homologous glycoproteins are significantly conserved compared to the random background, suggesting that N-glycan chains can be confidently modeled with template glycan structures whose parent glycoproteins share sequence similarity. On the other hand, N-glycan structures found on non-homologous glycoproteins do not show significant global structural similarity. Nonetheless, the internal substructures of these N-glycans, particularly, the substructures that are closer to the protein, show significantly similar structures, suggesting that such substructures can be used as fragments in glycan modeling. Increased interactions with protein might be responsible for the restricted conformational space of N-glycan chains. Our results suggest that structure prediction/modeling of N-glycans of glycoconjugates using structure database could be effective and different modeling approaches would be needed depending on the availability of template structures. PMID:23516343

  2. Structural analysis of a set of proteins resulting from a bacterial genomics project.

    PubMed

    Badger, J; Sauder, J M; Adams, J M; Antonysamy, S; Bain, K; Bergseid, M G; Buchanan, S G; Buchanan, M D; Batiyenko, Y; Christopher, J A; Emtage, S; Eroshkina, A; Feil, I; Furlong, E B; Gajiwala, K S; Gao, X; He, D; Hendle, J; Huber, A; Hoda, K; Kearins, P; Kissinger, C; Laubert, B; Lewis, H A; Lin, J; Loomis, K; Lorimer, D; Louie, G; Maletic, M; Marsh, C D; Miller, I; Molinari, J; Muller-Dieckmann, H J; Newman, J M; Noland, B W; Pagarigan, B; Park, F; Peat, T S; Post, K W; Radojicic, S; Ramos, A; Romero, R; Rutter, M E; Sanderson, W E; Schwinn, K D; Tresser, J; Winhoven, J; Wright, T A; Wu, L; Xu, J; Harris, T J R

    2005-09-01

    The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB.

  3. Subfamily-specific adaptations in the structures of two penicillin-binding proteins from Mycobacterium tuberculosis

    DOE PAGES

    Prigozhin, Daniil M.; Krieger, Inna V.; Huizar, John P.; ...

    2014-12-31

    Beta-lactam antibiotics target penicillin-binding proteins including several enzyme classes essential for bacterial cell-wall homeostasis. To better understand the functional and inhibitor-binding specificities of penicillin-binding proteins from the pathogen, Mycobacterium tuberculosis, we carried out structural and phylogenetic analysis of two predicted D,D-carboxypeptidases, Rv2911 and Rv3330. Optimization of Rv2911 for crystallization using directed evolution and the GFP folding reporter method yielded a soluble quadruple mutant. Structures of optimized Rv2911 bound to phenylmethylsulfonyl fluoride and Rv3330 bound to meropenem show that, in contrast to the nonspecific inhibitor, meropenem forms an extended interaction with the enzyme along a conserved surface. Phylogenetic analysis shows that Rv2911 and Rv3330 belong to different clades that emerged in Actinobacteria and are not represented in model organisms such as Escherichia coli and Bacillus subtilis. Clade-specific adaptations allow these enzymes to fulfill distinct physiological roles despite strict conservation of core catalytic residues. The characteristic differences include potential protein-protein interaction surfaces and specificity-determining residues surrounding the catalytic site. Overall, these structural insights lay the groundwork to develop improved beta-lactam therapeutics for tuberculosis.

  4. Subfamily-specific adaptations in the structures of two penicillin-binding proteins from Mycobacterium tuberculosis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Prigozhin, Daniil M.; Krieger, Inna V.; Huizar, John P.

    Beta-lactam antibiotics target penicillin-binding proteins including several enzyme classes essential for bacterial cell-wall homeostasis. To better understand the functional and inhibitor-binding specificities of penicillin-binding proteins from the pathogen, Mycobacterium tuberculosis, we carried out structural and phylogenetic analysis of two predicted D,D-carboxypeptidases, Rv2911 and Rv3330. Optimization of Rv2911 for crystallization using directed evolution and the GFP folding reporter method yielded a soluble quadruple mutant. Structures of optimized Rv2911 bound to phenylmethylsulfonyl fluoride and Rv3330 bound to meropenem show that, in contrast to the nonspecific inhibitor, meropenem forms an extended interaction with the enzyme along a conserved surface. Phylogenetic analysis shows that Rv2911 and Rv3330 belong to different clades that emerged in Actinobacteria and are not represented in model organisms such as Escherichia coli and Bacillus subtilis. Clade-specific adaptations allow these enzymes to fulfill distinct physiological roles despite strict conservation of core catalytic residues. The characteristic differences include potential protein-protein interaction surfaces and specificity-determining residues surrounding the catalytic site. Overall, these structural insights lay the groundwork to develop improved beta-lactam therapeutics for tuberculosis.

  5. New Techniques for Ancient Proteins: Direct Coupling Analysis Applied on Proteins Involved in Iron Sulfur Cluster Biogenesis

    PubMed Central

    Fantini, Marco; Malinverni, Duccio; De Los Rios, Paolo; Pastore, Annalisa

    2017-01-01

    Direct coupling analysis (DCA) is a powerful statistical inference tool used to study protein evolution. It was introduced to predict protein folds and protein-protein interactions, and has also been applied to the prediction of entire interactomes. Here, we have used it to analyze three proteins of the iron-sulfur biogenesis machine, an essential metabolic pathway conserved in all organisms. We show that DCA can correctly reproduce structural features of the CyaY/frataxin family (a protein involved in the human disease Friedreich's ataxia) despite being based on the relatively small number of sequences allowed by its genomic distribution. This result gives us confidence in the method. Its application to the iron-sulfur cluster scaffold protein IscU, which has been suggested to function both as an ordered and a disordered form, allows us to distinguish evolutionary traces of the structured species, suggesting that, if present in the cell, the disordered form has not left evolutionary imprinting. We observe instead, for the first time, direct indications of how the protein can dimerize head-to-head and bind 4Fe4S clusters. Analysis of the alternative scaffold protein IscA provides strong support to a coordination of the cluster by a dimeric form rather than a tetramer, as previously suggested. Our analysis also suggests the presence in solution of a mixture of monomeric and dimeric species, and guides us to the prevalent one. Finally, we used DCA to analyze interactions between some of these proteins, and discuss the potentials and limitations of the method. PMID:28664160

  6. The Snail-Induced Sulfonation Pathway in Breast Cancer Metastasis

    DTIC Science & Technology

    2014-09-01

    of the SNAIL protein with DNA The model of SNAIL, containing 4 Zn fingers bound to DNA, was created using PDB structures 1tf3 (TFIIIA protein, for...AutoDOCK (17) analysis of fragmented LIMD2 structure against that of the pdb structure 3kmw (ILK/a-Parvin), rethreading the LIMD2 structure through the top...Fig. 5E). We assessed the structural similarity between LIMD2 and other reported LIM structures present in the PDB. The superposition of LIMD2 onto the

  7. High-throughput crystallization screening.

    PubMed

    Skarina, Tatiana; Xu, Xiaohui; Evdokimova, Elena; Savchenko, Alexei

    2014-01-01

    Protein structure determination by X-ray crystallography is dependent on obtaining a single protein crystal suitable for diffraction data collection. Due to this requirement, protein crystallization represents a key step in protein structure determination. The conditions for protein crystallization have to be determined empirically for each protein, making this step also a bottleneck in the structure determination process. Typical protein crystallization practice involves parallel setup and monitoring of a considerable number of individual protein crystallization experiments (also called crystallization trials). In these trials the aliquots of purified protein are mixed with a range of solutions composed of a precipitating agent, buffer, and sometimes an additive that have been previously successful in prompting protein crystallization. The individual chemical conditions in which a particular protein shows signs of crystallization are used as a starting point for further crystallization experiments. The goal is optimizing the formation of individual protein crystals of sufficient size and quality to make them suitable for diffraction data collection. Thus the composition of the primary crystallization screen is critical for successful crystallization. Systematic analysis of crystallization experiments carried out on several hundred proteins as part of large-scale structural genomics efforts allowed the optimization of the protein crystallization protocol and identification of a minimal set of 96 crystallization solutions (the "TRAP" screen) that, in our experience, led to crystallization of the maximum number of proteins.

  8. Structural and biophysical properties of h-FANCI ARM repeat protein.

    PubMed

    Siddiqui, Mohd Quadir; Choudhary, Rajan Kumar; Thapa, Pankaj; Kulkarni, Neha; Rajpurohit, Yogendra S; Misra, Hari S; Gadewal, Nikhil; Kumar, Satish; Hasan, Syed K; Varma, Ashok K

    2017-11-01

    Fanconi anemia complementation group I (FANCI) protein facilitates DNA ICL (Inter-Cross-link) repair and plays a crucial role in genomic integrity. FANCI is a 1328-amino-acid protein that contains armadillo (ARM) repeats and an EDGE motif at the C-terminus. ARM repeats are a functionally diverse and evolutionarily conserved domain that plays a pivotal role in protein-protein and protein-DNA interactions. Considering the importance of ARM repeats, we used a comprehensive in silico and in vitro approach to examine their folding pattern. Size exclusion chromatography, dynamic light scattering (DLS) and glutaraldehyde crosslinking studies suggest that the FANCI ARM repeat exists as a monomer as well as in oligomeric forms. Circular dichroism (CD) and fluorescence spectroscopy results demonstrate that the protein has predominantly α-helices and a well-folded tertiary structure. DNA binding was analysed using electrophoretic mobility shift assay by autoradiography. Temperature-dependent CD, fluorescence spectroscopy and DLS studies showed that the protein unfolds and starts forming oligomers from 30°C. The existence of a stable portion within the FANCI ARM repeat was examined using limited proteolysis and mass spectrometry. Normal mode analysis, molecular dynamics and principal component analysis demonstrated that the helix-turn-helix (HTH) motif present in the ARM repeat is highly dynamic and has anti-correlated motion. Furthermore, the FANCI ARM repeat has an HTH structural motif which binds to double-stranded DNA.

  9. The impact of CRISPR repeat sequence on structures of a Cas6 protein-RNA complex

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Ruiying; Zheng, Han; Preamplume, Gan

    The repeat-associated mysterious proteins (RAMPs) comprise the most abundant family of proteins involved in prokaryotic immunity against invading genetic elements conferred by the clustered regularly interspaced short palindromic repeat (CRISPR) system. Cas6 is one of the first characterized RAMP proteins and is a key enzyme required for CRISPR RNA maturation. Despite a strong structural homology with other RAMP proteins that bind hairpin RNA, Cas6 distinctly recognizes single-stranded RNA. Previous structural and biochemical studies show that Cas6 captures the 5' end while cleaving the 3' end of the CRISPR RNA. Here, we describe three structures and complementary biochemical analysis of a noncatalytic Cas6 homolog from Pyrococcus horikoshii bound to CRISPR repeat RNA of different sequences. Our study confirms the specificity of the Cas6 protein for single-stranded RNA and further reveals the importance of the bases at Positions 5-7 in Cas6-RNA interactions. Substitutions of these bases result in structural changes in the protein-RNA complex including its oligomerization state.

  10. Recent Advances and Applications in Synchrotron X-Ray Protein Footprinting for Protein Structure and Dynamics Elucidation.

    PubMed

    Gupta, Sayan; Feng, Jun; Chance, Mark; Ralston, Corie

    2016-01-01

    Synchrotron X-ray Footprinting is a powerful in situ hydroxyl radical labeling method for analysis of protein structure, interactions, folding and conformation change in solution. In this method, water is ionized by high flux density broad band synchrotron X-rays to produce a steady-state concentration of hydroxyl radicals, which then react with solvent accessible side-chains. The resulting stable modification products are analyzed by liquid chromatography coupled to mass spectrometry. A comparative reactivity rate between known and unknown states of a protein provides local as well as global information on structural changes, which is then used to develop structural models for protein function and dynamics. In this review we describe the XF-MS method, its unique capabilities and its recent technical advances at the Advanced Light Source. We provide a comparison of other hydroxyl radical and mass spectrometry based methods with XF-MS. We also discuss some of the latest developments in its usage for studying bound water, transmembrane proteins and photosynthetic protein components, and the synergy of the method with other synchrotron based structural biology methods.

  11. LIBP-Pred: web server for lipid binding proteins using structural network parameters; PDB mining of human cancer biomarkers and drug targets in parasites and bacteria.

    PubMed

    González-Díaz, Humberto; Munteanu, Cristian R; Postelnicu, Lucian; Prado-Prado, Francisco; Gestal, Marcos; Pazos, Alejandro

    2012-03-01

    Lipid-Binding Proteins (LIBPs) or Fatty Acid-Binding Proteins (FABPs) play an important role in many diseases such as different types of cancer, kidney injury, atherosclerosis, diabetes, intestinal ischemia and parasitic infections. Thus, the computational methods that can predict LIBPs based on 3D structure parameters became a goal of major importance for drug-target discovery, vaccine design and biomarker selection. In addition, the Protein Data Bank (PDB) contains 3000+ protein 3D structures with unknown function. This list, as well as new experimental outcomes in proteomics research, is a very interesting source to discover relevant proteins, including LIBPs. However, to the best of our knowledge, there are no general models to predict new LIBPs based on 3D structures. We developed new Quantitative Structure-Activity Relationship (QSAR) models based on 3D electrostatic parameters of 1801 different proteins, including 801 LIBPs. We calculated these electrostatic parameters with the MARCH-INSIDE software and they correspond to the entire protein or to specific protein regions named core, inner, middle, and surface. We used these parameters as inputs to develop a simple Linear Discriminant Analysis (LDA) classifier to discriminate 3D structure of LIBPs from other proteins. We implemented this predictor in the web server named LIBP-Pred, freely available at , along with other important web servers of the Bio-AIMS portal. The users can carry out an automatic retrieval of protein structures from PDB or upload their custom protein structural models from their disk created with LOMETS server. We demonstrated the PDB mining option performing a predictive study of 2000+ proteins with unknown function. Interesting results regarding the discovery of new Cancer Biomarkers in humans or drug targets in parasites have been discussed here in this sense.

  12. A Circular Dichroism Reference Database for Membrane Proteins

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wallace, B.; Wien, F.; Stone, T.

    2006-01-01

    Membrane proteins are a major product of most genomes and the target of a large number of current pharmaceuticals, yet little information exists on their structures because of the difficulty of crystallising them; hence for the most part they have been excluded from structural genomics programme targets. Furthermore, even methods such as circular dichroism (CD) spectroscopy which seek to define secondary structure have not been fully exploited because of technical limitations to their interpretation for membrane embedded proteins. Empirical analyses of circular dichroism (CD) spectra are valuable for providing information on secondary structures of proteins. However, the accuracy of the results depends on the appropriateness of the reference databases used in the analyses. Membrane proteins have different spectral characteristics than do soluble proteins as a result of the low dielectric constants of membrane bilayers relative to those of aqueous solutions (Chen & Wallace (1997) Biophys. Chem. 65:65-74). To date, no CD reference database exists exclusively for the analysis of membrane proteins, and hence empirical analyses based on current reference databases derived from soluble proteins are not adequate for accurate analyses of membrane protein secondary structures (Wallace et al (2003) Prot. Sci. 12:875-884). We have therefore created a new reference database of CD spectra of integral membrane proteins whose crystal structures have been determined. To date it contains more than 20 proteins, and spans the range of secondary structures from mostly helical to mostly sheet proteins. This reference database should enable more accurate secondary structure determinations of membrane embedded proteins and will become one of the reference database options in the CD calculation server DICHROWEB (Whitmore & Wallace (2004) NAR 32:W668-673).

  13. Protein Structure Prediction with Evolutionary Algorithms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hart, W.E.; Krasnogor, N.; Pelta, D.A.

    1999-02-08

    Evolutionary algorithms have been successfully applied to a variety of molecular structure prediction problems. In this paper we reconsider the design of genetic algorithms that have been applied to a simple protein structure prediction problem. Our analysis considers the impact of several algorithmic factors for this problem: the conformational representation, the energy formulation and the way in which infeasible conformations are penalized. Further, we empirically evaluated the impact of these factors on a small set of polymer sequences. Our analysis leads to specific recommendations for both GAs as well as other heuristic methods for solving PSP on the HP model.

  14. On Ramachandran angles, closed strings and knots in protein structure

    NASA Astrophysics Data System (ADS)

    Chen, Si; Niemi, Antti J.

    2016-08-01

    The Ramachandran angles (φ, ψ) of a protein backbone form the vertices of a piecewise geodesic curve on the surface of a torus. When the ends of the curve are connected to each other similarly, by a geodesic, the result is a closed string that in general wraps around the torus a number of times both in the meridional and the longitudinal directions. The two wrapping numbers are global characteristics of the protein structure. A statistical analysis of the wrapping numbers in terms of crystallographic x-ray structures in the protein data bank (PDB) reveals that proteins have no net chirality in the φ direction but in the ψ direction, proteins prefer to display chirality. A comparison between the wrapping numbers and the concept of folding index discloses a non-linearity in their relationship. Thus these three integer valued invariants can be used in tandem, to scrutinize and classify the global loop structure of individual PDB proteins, in terms of the overall fold topology.
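
    One way to picture the wrapping numbers is the sketch below, which assumes a flat-torus metric so that the geodesic between two consecutive (φ, ψ) vertices takes the shortest signed step in each angle independently. The function names are hypothetical, angles are in degrees, and steps of exactly 180° are ambiguous and resolved arbitrarily; the paper's own construction may differ in such details.

```python
def signed_step(a, b):
    """Shortest signed angular step from a to b, in degrees, in [-180, 180)."""
    return (b - a + 180.0) % 360.0 - 180.0


def wrapping_numbers(angles):
    """Return (phi_wraps, psi_wraps) for a list of (phi, psi) pairs.

    The chain is closed by appending the first vertex, and the signed
    steps around the closed curve are summed in each direction; the
    total is a whole number of turns."""
    closed = list(angles) + [angles[0]]
    total_phi = sum(signed_step(p[0], q[0]) for p, q in zip(closed, closed[1:]))
    total_psi = sum(signed_step(p[1], q[1]) for p, q in zip(closed, closed[1:]))
    return round(total_phi / 360.0), round(total_psi / 360.0)
```

    For example, three vertices at φ = 0°, 120° and -120° with constant ψ go once around the φ circle, whereas a chain whose angles stay within a small region wraps zero times in both directions.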

  15. Maximally asymmetric transbilayer distribution of anionic lipids alters the structure and interaction with lipids of an amyloidogenic protein dimer bound to the membrane surface.

    PubMed

    Cheng, Sara Y; Chou, George; Buie, Creighton; Vaughn, Mark W; Compton, Campbell; Cheng, Kwan H

    2016-03-01

    We used molecular dynamics simulations to explore the effects of asymmetric transbilayer distribution of anionic phosphatidylserine (PS) lipids on the structure of a protein on the membrane surface and subsequent protein-lipid interactions. Our simulation systems consisted of an amyloidogenic, beta-sheet rich dimeric protein (D42) adsorbed to the phosphatidylcholine (PC) leaflet, or protein-contact PC leaflet, of two membrane systems: a single-component PC bilayer and double PC/PS bilayers. The latter comprised a stable but asymmetric transbilayer distribution of PS in the presence of counterions, with a 1-component PC leaflet coupled to a 1-component PS leaflet in each bilayer. The maximally asymmetric PC/PS bilayer had a non-zero transmembrane potential (TMP) difference and higher lipid order packing, whereas the symmetric PC bilayer had a zero TMP difference and lower lipid order packing under physiologically relevant conditions. Analysis of the adsorbed protein structures revealed weaker protein binding, more folding in the N-terminal domain, more aggregation of the N- and C-terminal domains and larger tilt angle of D42 on the PC leaflet surface of the PC/PS bilayer versus the PC bilayer. Also, analysis of protein-induced membrane structural disruption revealed more localized bilayer thinning in the PC/PS versus PC bilayer. Although the electric field profile in the non-protein-contact PS leaflet of the PC/PS bilayer differed significantly from that in the non-protein-contact PC leaflet of the PC bilayer, no significant difference in the electric field profile in the protein-contact PC leaflet of either bilayer was evident. We speculate that lipid packing has a larger effect on the surface adsorbed protein structure than the electric field for a maximally asymmetric PC/PS bilayer. Our results support the mechanism that the higher lipid packing in a lipid leaflet promotes stronger protein-protein but weaker protein-lipid interactions for a dimeric protein on membrane surfaces. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  16. Tracing Primordial Protein Evolution through Structurally Guided Stepwise Segment Elongation

    PubMed Central

    Watanabe, Hideki; Yamasaki, Kazuhiko; Honda, Shinya

    2014-01-01

    The understanding of how primordial proteins emerged has been a fundamental and longstanding issue in biology and biochemistry. For a better understanding of primordial protein evolution, we synthesized an artificial protein on the basis of an evolutionary hypothesis, segment-based elongation starting from an autonomously foldable short peptide. A 10-residue protein, chignolin, the smallest foldable polypeptide ever reported, was used as a structural support to facilitate higher structural organization and gain-of-function in the development of an artificial protein. Repetitive cycles of segment elongation and subsequent phage display selection successfully produced a 25-residue protein, termed AF.2A1, with nanomolar affinity against the Fc region of immunoglobulin G. AF.2A1 shows exquisite molecular recognition ability such that it can distinguish conformational differences of the same molecule. The structure determined by NMR measurements demonstrated that AF.2A1 forms a globular protein-like conformation with the chignolin-derived β-hairpin and a tryptophan-mediated hydrophobic core. Using sequence analysis and a mutation study, we discovered that the structural organization and gain-of-function emerged from the vicinity of the chignolin segment, revealing that the structural support served as the core in both structural and functional development. Here, we propose an evolutionary model for primordial proteins in which a foldable segment serves as the evolving core to facilitate structural and functional evolution. This study provides insights into primordial protein evolution and also presents a novel methodology for designing small sized proteins useful for industrial and pharmaceutical applications. PMID:24356963

  17. Detecting transitions in protein dynamics using a recurrence quantification analysis based bootstrap method.

    PubMed

    Karain, Wael I

    2017-11-28

    Proteins undergo conformational transitions over different time scales. These transitions are closely intertwined with the protein's function. Numerous standard techniques such as principal component analysis are used to detect these transitions in molecular dynamics simulations. In this work, we add a new method that has the ability to detect transitions in dynamics based on the recurrences in the dynamical system. It combines bootstrapping and recurrence quantification analysis. We start from the assumption that a protein has a "baseline" recurrence structure over a given period of time. Any statistically significant deviation from this recurrence structure, as inferred from complexity measures provided by recurrence quantification analysis, is considered a transition in the dynamics of the protein. We apply this technique to a 132 ns long molecular dynamics simulation of the β-Lactamase Inhibitory Protein BLIP. We are able to detect conformational transitions in the nanosecond range in the recurrence dynamics of the BLIP protein during the simulation. The results compare favorably to those extracted using the principal component analysis technique. The recurrence quantification analysis based bootstrap technique is able to detect transitions between different dynamics states for a protein over different time scales. It is not limited to linear dynamics regimes, and can be generalized to any time scale. It also has the potential to be used to cluster frames in molecular dynamics trajectories according to the nature of their recurrence dynamics. One shortcoming of this method is the need to have large enough time windows to ensure good statistical quality for the recurrence complexity measures needed to detect the transitions.
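
    The general idea of comparing a window's recurrence measure against a bootstrapped baseline can be sketched as follows. This is a deliberately simplified illustration and not the author's procedure: it uses only the recurrence rate (the simplest recurrence quantification measure) on a one-dimensional series, and the threshold `eps`, the percentile interval and all function names are assumptions.

```python
import random


def recurrence_rate(series, eps):
    """Fraction of distinct point pairs (i != j) that lie closer than eps."""
    n = len(series)
    hits = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and abs(series[i] - series[j]) < eps
    )
    return hits / (n * (n - 1))


def bootstrap_interval(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of the baseline values."""
    rng = random.Random(seed)
    means = sorted(
        sum(rng.choices(values, k=len(values))) / len(values)
        for _ in range(n_resamples)
    )
    low = means[int(alpha / 2 * n_resamples)]
    high = means[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


def is_transition(window, baseline_windows, eps):
    """Flag a window whose recurrence rate falls outside the baseline interval."""
    baseline = [recurrence_rate(w, eps) for w in baseline_windows]
    low, high = bootstrap_interval(baseline)
    rate = recurrence_rate(window, eps)
    return rate < low or rate > high
```

    In practice the series would be a collective coordinate extracted from the trajectory, and, as the abstract notes, windows must be long enough for the recurrence measures to be statistically meaningful.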

  18. Evolutionary distance from human homologs reflects allergenicity of animal food proteins.

    PubMed

    Jenkins, John A; Breiteneder, Heimo; Mills, E N Clare

    2007-12-01

    In silico analysis of allergens can identify putative relationships among protein sequence, structure, and allergenic properties. Such systematic analysis reveals that most plant food allergens belong to a restricted number of protein superfamilies, with pollen allergens behaving similarly. We have investigated the structural relationships of animal food allergens and their evolutionary relatedness to human homologs to define how closely a protein must resemble a human counterpart to lose its allergenic potential. Profile-based sequence homology methods were used to classify animal food allergens into Pfam families, and in silico analyses of their evolutionary and structural relationships were performed. Animal food allergens could be classified into 3 main families--tropomyosins, EF-hand proteins, and caseins--along with 14 minor families each composed of 1 to 3 allergens. The evolutionary relationships of each of these allergen superfamilies showed that in general, proteins with a sequence identity to a human homolog above approximately 62% were rarely allergenic. Single substitutions in otherwise highly conserved regions containing IgE epitopes in EF-hand parvalbumins may modulate allergenicity. These data support the premise that certain protein structures are more allergenic than others. Contrasting with plant food allergens, animal allergens, such as the highly conserved tropomyosins, challenge the capability of the human immune system to discriminate between foreign and self-proteins. Such immune responses run close to becoming autoimmune responses. Exploiting the closeness between animal allergens and their human homologs in the development of recombinant allergens for immunotherapy will need to consider the potential for developing unanticipated autoimmune responses.

  19. SCit: web tools for protein side chain conformation analysis

    PubMed Central

    Gautier, R.; Camproux, A.-C.; Tufféry, P.

    2004-01-01

    SCit is a web server providing services for protein side chain conformation analysis and side chain positioning. Specific services use the dependence of the side chain conformations on the local backbone conformation, which is described using a structural alphabet that describes the conformation of fragments of four-residue length in a limited library of structural prototypes. Based on this concept, SCit uses sets of rotameric conformations dependent on the local backbone conformation of each protein for side chain positioning and the identification of side chains with unlikely conformations. The SCit web server is accessible at http://bioserv.rpbs.jussieu.fr/SCit. PMID:15215438

  20. Highly branched penta-saccharide-bearing amphiphiles for membrane protein studies

    PubMed Central

    Ehsan, Muhammad; Du, Yang; Scull, Nicola J.; Tikhonova, Elena; Tarrasch, Jeffrey; Mortensen, Jonas S.; Loland, Claus J.; Skiniotis, Georgios; Guan, Lan; Byrne, Bernadette; Kobilka, Brian K.; Chae, Pil Seok

    2016-01-01

    Detergents are essential tools for membrane protein manipulation. Micelles formed by detergent molecules have the ability to encapsulate the hydrophobic domains of membrane proteins. The resulting protein-detergent complexes (PDCs) are compatible with the polar environments of aqueous media, making structural and functional analysis feasible. Although a number of novel agents have been developed to overcome the limitations of conventional detergents, most of them have traditional head groups such as glucoside or maltoside. In this study, we introduce a class of amphiphiles, the PSA’Es with a novel highly branched penta-saccharide hydrophilic group. The PSA’Es conferred markedly increased stability to a diverse range of membrane proteins compared to conventional detergents, indicating a positive role for the new hydrophilic group in maintaining the native protein integrity. In addition, PDCs formed by PSA’Es were smaller and more suitable for electron microscopic analysis than those formed by DDM, indicating that the new agents have significant potential for the structure-function studies of membrane proteins. PMID:26966956

  1. A tool for calculating binding-site residues on proteins from PDB structures.

    PubMed

    Hu, Jing; Yan, Changhui

    2009-08-03

    In the research on protein functional sites, researchers often need to identify binding-site residues on a protein. A commonly used strategy is to find a complex structure from the Protein Data Bank (PDB) that consists of the protein of interest and its interacting partner(s) and calculate binding-site residues based on the complex structure. However, since a protein may participate in multiple interactions, the binding-site residues calculated based on one complex structure usually do not reveal all binding sites on a protein. Thus, researchers must find all PDB complexes that contain the protein of interest and combine the binding-site information gleaned from them. This process is very time-consuming; in particular, combining binding-site information obtained from different PDB structures requires tedious work to align protein sequences. The process becomes overwhelmingly difficult when researchers have a large set of proteins to analyze, which is usually the case in practice. In this study, we have developed a tool for calculating binding-site residues on proteins, TCBRP (http://yanbioinformatics.cs.usu.edu:8080/ppbindingsubmit). For an input protein, TCBRP can quickly find all binding-site residues on the protein by automatically combining the information obtained from all PDB structures that contain the protein of interest. Additionally, TCBRP presents the binding-site residues in different categories according to the interaction type. TCBRP also allows researchers to set the definition of binding-site residues. The developed tool is very useful for the research on protein binding site analysis and prediction.
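
    The strategy described here (compute binding-site residues per complex, then merge them) can be sketched with a simple distance criterion. This is an illustration only, not TCBRP's implementation: the 4.0 Å cutoff, the tuple layout `(residue_id, x, y, z)` and the function names are assumptions, and residue numbers are taken to be already aligned across structures, which the abstract identifies as the laborious step.

```python
import math


def binding_site_residues(protein_atoms, partner_atoms, cutoff=4.0):
    """Residues of the protein with any atom within `cutoff` of a partner atom.

    Each atom is a tuple (residue_id, x, y, z)."""
    site = set()
    for res_id, *xyz in protein_atoms:
        if any(math.dist(xyz, atom[1:]) <= cutoff for atom in partner_atoms):
            site.add(res_id)
    return site


def combine_sites(complexes, cutoff=4.0):
    """Union of binding-site residues over several complexes of one protein.

    `complexes` is an iterable of (protein_atoms, partner_atoms) pairs."""
    combined = set()
    for protein_atoms, partner_atoms in complexes:
        combined |= binding_site_residues(protein_atoms, partner_atoms, cutoff)
    return combined
```

    Exposing `cutoff` as a parameter mirrors the abstract's point that the definition of a binding-site residue should be adjustable by the user.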

  2. Alanine and proline content modulate global sensitivity to discrete perturbations in disordered proteins

    PubMed Central

    Perez, Romel B.; Tischer, Alexander; Auton, Matthew; Whitten, Steven T.

    2014-01-01

    Molecular transduction of biological signals is understood primarily in terms of the cooperative structural transitions of protein macromolecules, providing a mechanism through which discrete local structure perturbations affect global macromolecular properties. The recognition that proteins lacking tertiary stability, commonly referred to as intrinsically disordered proteins, mediate key signaling pathways suggests that protein structures without cooperative intramolecular interactions may also have the ability to couple local and global structure changes. Presented here are results from experiments that measured and tested the ability of disordered proteins to couple local changes in structure to global changes in structure. Using the intrinsically disordered N-terminal region of the p53 protein as an experimental model, a set of proline and alanine to glycine substitution variants were designed to modulate backbone conformational propensities without introducing non-native intramolecular interactions. The hydrodynamic radius (Rh) was used to monitor changes in global structure. Circular dichroism spectroscopy showed that the glycine substitutions decreased polyproline II (PPII) propensities relative to the wild type, as expected, and fluorescence methods indicated that substitution-induced changes in Rh were not associated with folding. The experiments showed that changes in local PPII structure cause changes in Rh that are variable and that depend on the intrinsic chain propensities of proline and alanine residues, demonstrating a mechanism for coupling local and global structure changes. Molecular simulations that model our results were used to extend the analysis to other proteins and illustrate the generality of the observed proline and alanine effects on the structures of intrinsically disordered proteins. PMID:25244701

  3. Analysis of core-periphery organization in protein contact networks reveals groups of structurally and functionally critical residues.

    PubMed

    Isaac, Arnold Emerson; Sinha, Sitabhra

    2015-10-01

    The representation of proteins as networks of interacting amino acids, referred to as protein contact networks (PCN), and their subsequent analyses using graph theoretic tools, can provide novel insights into the key functional roles of specific groups of residues. We have characterized the networks corresponding to the native states of 66 proteins (belonging to different families) in terms of their core-periphery organization. The resulting hierarchical classification of the amino acid constituents of a protein arranges the residues into successive layers - having higher core order - with increasing connection density, ranging from a sparsely linked periphery to a densely intra-connected core (distinct from the earlier concept of protein core defined in terms of the three-dimensional geometry of the native state, which has least solvent accessibility). Our results show that residues in the inner cores are more conserved than those at the periphery. Underlining the functional importance of the network core, we see that the receptor sites for known ligand molecules of most proteins occur in the innermost core. Furthermore, the association of residues with structural pockets and cavities in binding or active sites increases with the core order. From mutation sensitivity analysis, we show that the probability of deleterious or intolerant mutations also increases with the core order. We also show that stabilization centre residues are in the innermost cores, suggesting that the network core is critically important in maintaining the structural stability of the protein. A publicly available Web resource for performing core-periphery analysis of any protein whose native state is known has been made available by us at http://www.imsc.res.in/~sitabhra/proteinKcore/index.html.
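
    The layering by core order described above resembles a k-core decomposition of the contact network. The sketch below shows the standard peeling algorithm on an adjacency dictionary; whether the authors use exactly this procedure is an assumption, and the function name and input layout are hypothetical.

```python
def core_numbers(adjacency):
    """Return {node: core order} by repeatedly peeling the lowest-degree node.

    `adjacency` maps each node (residue) to the set of nodes it contacts;
    the graph is assumed undirected, with every edge listed in both directions."""
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.get)
        k = max(k, degree[node])  # core order never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for nbr in adjacency[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core
```

    For a triangle of residues with one extra residue hanging off a corner, the pendant residue gets core order 1 (periphery) and the three triangle residues get core order 2 (inner core).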

  4. The role of protein structural analysis in the next generation sequencing era.

    PubMed

    Yue, Wyatt W; Froese, D Sean; Brennan, Paul E

    2014-01-01

    Proteins are macromolecules that serve a cell's myriad processes and functions in all living organisms via dynamic interactions with other proteins, small molecules and cellular components. Genetic variations in the protein-encoding regions of the human genome account for >85% of all known Mendelian diseases, and play an influential role in shaping complex polygenic diseases. Proteins also serve as the predominant target class for the design of small molecule drugs to modulate their activity. Knowledge of the shape and form of proteins, by means of their three-dimensional structures, is therefore instrumental to understanding their roles in disease and their potentials for drug development. In this chapter we outline, with the wide readership of non-structural biologists in mind, the various experimental and computational methods available for protein structure determination. We summarize how the wealth of structure information, contributed to a large extent by the technological advances in structure determination to date, serves as a useful tool to decipher the molecular basis of genetic variations for disease characterization and diagnosis, particularly in the emerging era of genomic medicine, and becomes an integral component in the modern day approach towards rational drug development.

  5. Structure-Based Analysis Reveals Cancer Missense Mutations Target Protein Interaction Interfaces.

    PubMed

    Engin, H Billur; Kreisberg, Jason F; Carter, Hannah

    2016-01-01

    Recently it has been shown that cancer mutations selectively target protein-protein interactions. We hypothesized that mutations affecting distinct protein interactions involving established cancer genes could contribute to tumor heterogeneity, and that novel mechanistic insights might be gained into tumorigenesis by investigating protein interactions under positive selection in cancer. To identify protein interactions under positive selection in cancer, we mapped over 1.2 million nonsynonymous somatic cancer mutations onto 4,896 experimentally determined protein structures and analyzed their spatial distribution. In total, 20% of mutations on the surface of known cancer genes perturbed protein-protein interactions (PPIs), and this enrichment for PPI interfaces was observed for both tumor suppressors (Odds Ratio 1.28, P-value < 10^-4) and oncogenes (Odds Ratio 1.17, P-value < 10^-3). To study this further, we constructed a bipartite network representing structurally resolved PPIs from all available human complexes in the Protein Data Bank (2,864 proteins, 3,072 PPIs). Analysis of frequently mutated cancer genes within this network revealed that tumor-suppressors, but not oncogenes, are significantly enriched with functional mutations in homo-oligomerization regions (Odds Ratio 3.68, P-value < 10^-8). We present two important examples, TP53 and beta-2-microglobulin, for which the patterns of somatic mutations at interfaces provide insights into specifically perturbed biological circuits. In patients with TP53 mutations, patient survival correlated with the specific interactions that were perturbed. Moreover, we investigated mutations at the interface of protein-nucleotide interactions and observed an unexpected number of missense mutations but not silent mutations occurring within DNA and RNA binding sites. Finally, we provide a resource of 3,072 PPI interfaces ranked according to their mutation rates. Analysis of this list highlights 282 novel candidate cancer genes that encode proteins participating in interactions that are perturbed recurrently across tumors. In summary, mutation of specific protein interactions is an important contributor to tumor heterogeneity and may have important implications for clinical outcomes.

  6. Homology-based Modeling of Rhodopsin-like Family Members in the Inactive State: Structural Analysis and Deduction of Tips for Modeling and Optimization.

    PubMed

    Pappalardo, Matteo; Rayan, Mahmoud; Abu-Lafi, Saleh; Leonardi, Martha E; Milardi, Danilo; Guccione, Salvatore; Rayan, Anwar

    2017-08-01

    Modeling G-Protein Coupled Receptors (GPCRs) is an emergent field of research, since the utility of high-quality models in receptor structure-based strategies might facilitate the discovery of interesting drug candidates. The findings from a quantitative analysis of eighteen resolved structures of rhodopsin family "A" receptors crystallized with antagonists and 153 pairs of structures are described. A strategy termed endeca-amino acids fragmentation was used to analyze the structural models, aiming to detect the relationship between sequence identity and Root Mean Square Deviation (RMSD) at each trans-membrane domain. Moreover, we have applied the leave-one-out strategy to study the shiftiness likelihood of the helices. The type of correlation between sequence identity and RMSD was studied using the aforementioned set of receptors as representatives of membrane proteins and 98 serine proteases with 4753 pairs of structures as representatives of globular proteins. Data analysis using the fragmentation strategy revealed that there is some extent of correlation between sequence identity and global RMSD of 11AA width windows. However, spatial conservation is not always close to the endoplasmic side as was reported before. A comparative study with globular proteins shows that GPCRs have higher standard deviation and higher slope in the graph with correlation between sequence identity and RMSD. The extracted information disclosed in this paper could be incorporated in the modeling protocols when using techniques for model optimization and refinement. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Structure-function analysis of the extracellular domain of the pneumococcal cell division site positioning protein MapZ

    NASA Astrophysics Data System (ADS)

    Manuse, Sylvie; Jean, Nicolas L.; Guinot, Mégane; Lavergne, Jean-Pierre; Laguri, Cédric; Bougault, Catherine M.; Vannieuwenhze, Michael S.; Grangeasse, Christophe; Simorre, Jean-Pierre

    2016-06-01

    Accurate placement of the bacterial division site is a prerequisite for the generation of two viable and identical daughter cells. In Streptococcus pneumoniae, the positive regulatory mechanism involving the membrane protein MapZ positions precisely the conserved cell division protein FtsZ at the cell centre. Here we characterize the structure of the extracellular domain of MapZ and show that it displays a bi-modular structure composed of two subdomains separated by a flexible serine-rich linker. We further demonstrate in vivo that the N-terminal subdomain serves as a pedestal for the C-terminal subdomain, which determines the ability of MapZ to mark the division site. The C-terminal subdomain displays a patch of conserved amino acids and we show that this patch defines a structural motif crucial for MapZ function. Altogether, this structure-function analysis of MapZ provides the first molecular characterization of a positive regulatory process of bacterial cell division.

  8. The utility of protein structure as a predictor of site-wise dN/dS varies widely among HIV-1 proteins.

    PubMed

    Meyer, Austin G; Wilke, Claus O

    2015-10-06

    Protein structure acts as a general constraint on the evolution of viral proteins. One widely recognized structural constraint explaining evolutionary variation among sites is the relative solvent accessibility (RSA) of residues in the folded protein. In influenza virus, the distance from functional sites has been found to explain an additional portion of the evolutionary variation in the external antigenic proteins. However, to what extent RSA and distance from a reference site in the protein can be used more generally to explain protein adaptation in other viruses and in the different proteins of any given virus remains an open question. To address this question, we have carried out an analysis of the distribution and structural predictors of site-wise dN/dS in HIV-1. Our results indicate that the distribution of dN/dS in HIV follows a smooth gamma distribution, with no special enrichment or depletion of sites with dN/dS at or above one. The variation in dN/dS can be partially explained by RSA and distance from a reference site in the protein, but these structural constraints do not act uniformly among the different HIV-1 proteins. Structural constraints are highly predictive in just one of the three enzymes and one of three structural proteins in HIV-1. For these two proteins, the protease enzyme and the gp120 structural protein, structure explains between 30 and 40% of the variation in dN/dS. Finally, for the gp120 protein of the receptor-binding complex, we also find that glycosylation sites explain just 2% of the variation in dN/dS and do not explain gp120 evolution independently of either RSA or distance from the apical surface. © 2015 The Author(s).

  9. Relevance of rhodopsin studies for GPCR activation.

    PubMed

    Deupi, Xavier

    2014-05-01

    Rhodopsin, the dim-light photoreceptor present in the rod cells of the retina, is both a retinal-binding protein and a G protein-coupled receptor (GPCR). Due to this conjunction, it benefits from an arsenal of spectroscopy techniques that can be used for its characterization, while being a model system for the important family of Class A (also referred to as "rhodopsin-like") GPCRs. For instance, rhodopsin has been a crucial player in the field of GPCR structural biology. Until 2007, it was the only GPCR for which a high-resolution crystal structure was available, so all structure-activity analyses on GPCRs, from structure-based drug discovery to studies of structural changes upon activation, were based on rhodopsin. At present, about a third of currently available GPCR structures are still from rhodopsin. In this review, I show some examples of how these structures can still be used to gain insight into general aspects of GPCR activation. First, the analysis of the third intracellular loop in rhodopsin structures allows us to gain an understanding of the structural and dynamic properties of this region, which is absent (due to protein engineering or poor electron density) in most of the currently available GPCR structures. Second, a detailed analysis of the structure of the transmembrane domains in inactive, intermediate and active rhodopsin structures allows us to detect early conformational changes in the process of ligand-induced GPCR activation. Finally, the analysis of a conserved ligand-activated transmission switch in the transmembrane bundle of GPCRs in the context of the rhodopsin activation cycle, allows us to suggest that the structures of many of the currently available agonist-bound GPCRs may correspond to intermediate active states. While the focus in GPCR structural biology is inevitably moving away from rhodopsin, in other aspects rhodopsin is still at the forefront. For instance, the first studies of the structural basis of disease mutants in GPCRs, or the most detailed analysis of cellular GPCR signal transduction networks using a systems biology approach, have been carried out in rhodopsin. Finally, due again to its unique properties among GPCRs, rhodopsin will likely play an important role in the application of X-ray free electron laser crystallography to time-resolved structural biology in membrane proteins. Rhodopsin, thus, still remains relevant as a model system to study the molecular mechanisms of GPCR activation. This article is part of a Special Issue entitled: Retinal Proteins-You can teach an old dog new tricks. © 2013 Elsevier B.V. All rights reserved.

  10. Protein-Protein Interface and Disease: Perspective from Biomolecular Networks.

    PubMed

    Hu, Guang; Xiao, Fei; Li, Yuqian; Li, Yuan; Vongsangnak, Wanwipa

    Protein-protein interactions are involved in many important biological processes and molecular mechanisms of disease association. Structural studies of interfacial residues in protein complexes provide information on protein-protein interactions. Characterizing protein-protein interfaces, including binding sites and allosteric changes, thus poses an imminent challenge. With special focus on protein complexes, approaches based on network theory are proposed to meet this challenge. In this review we pay attention to protein-protein interfaces from the perspective of biomolecular networks and their roles in disease. We first describe the different roles of protein complexes in disease through several structural aspects of interfaces. We then discuss some recent advances in predicting hot spots and communication pathway analysis in terms of amino acid networks. Finally, we highlight possible future aspects of this area with respect to both methodology development and applications for disease treatment.

  11. Structure and Calcium Binding Properties of a Neuronal Calcium-Myristoyl Switch Protein, Visinin-Like Protein 3.

    PubMed

    Li, Congmin; Lim, Sunghyuk; Braunewell, Karl H; Ames, James B

    2016-01-01

    Visinin-like protein 3 (VILIP-3) belongs to a family of Ca2+-myristoyl switch proteins that regulate signal transduction in the brain and retina. Here we analyze Ca2+ binding, characterize Ca2+-induced conformational changes, and determine the NMR structure of myristoylated VILIP-3. Three Ca2+ bind cooperatively to VILIP-3 at EF2, EF3 and EF4 (KD = 0.52 μM and Hill slope of 1.8). NMR assignments, mutagenesis and structural analysis indicate that the covalently attached myristoyl group is solvent exposed in Ca2+-bound VILIP-3, whereas Ca2+-free VILIP-3 contains a sequestered myristoyl group that interacts with protein residues (E26, Y64, V68), which are distinct from myristate contacts seen in other Ca2+-myristoyl switch proteins. The myristoyl group in VILIP-3 forms an unusual L-shaped structure that places the C14 methyl group inside a shallow protein groove, in contrast to the much deeper myristoyl binding pockets observed for recoverin, NCS-1 and GCAP1. Thus, the myristoylated VILIP-3 protein structure determined in this study is quite different from those of other known myristoyl switch proteins (recoverin, NCS-1, and GCAP1). We propose that myristoylation serves to fine tune the three-dimensional structures of neuronal calcium sensor proteins as a means of generating functional diversity.
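
    The cooperative binding reported in this record (KD = 0.52 μM, Hill slope of 1.8) can be pictured with the Hill equation. The following is a minimal sketch, assuming the reported KD is the Ca2+ concentration at half saturation; the function name hill_saturation and the example concentrations are hypothetical and not taken from the study.

```python
def hill_saturation(ca_um, kd_um=0.52, hill_slope=1.8):
    """Fractional Ca2+ saturation from the Hill equation.

    Assumes kd_um is the concentration (in micromolar) that gives
    half-maximal binding; this interpretation is an assumption.
    """
    if ca_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be non-negative and KD positive")
    ratio = (ca_um / kd_um) ** hill_slope
    return ratio / (1.0 + ratio)


# Hypothetical free Ca2+ concentrations in micromolar.
for ca in (0.1, 0.52, 2.0):
    print(f"[Ca2+] = {ca} uM -> saturation {hill_saturation(ca):.2f}")
```

    A Hill slope above 1 makes the curve steeper around KD than a non-cooperative binding curve, which is what the word "cooperatively" refers to in the abstract.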

  12. Complex Structure and Biochemical Characterization of the Staphylococcus aureus Cyclic Diadenylate Monophosphate (c-di-AMP)-binding Protein PstA, the Founding Member of a New Signal Transduction Protein Family*

    PubMed Central

    Campeotto, Ivan; Zhang, Yong; Mladenov, Miroslav G.; Freemont, Paul S.; Gründling, Angelika

    2015-01-01

    Signaling nucleotides are integral parts of signal transduction systems allowing bacteria to cope with and rapidly respond to changes in the environment. The Staphylococcus aureus PII-like signal transduction protein PstA was recently identified as a cyclic diadenylate monophosphate (c-di-AMP)-binding protein. Here, we present the crystal structures of the apo- and c-di-AMP-bound PstA protein, which is trimeric in solution as well as in the crystals. The structures combined with detailed bioinformatics analysis revealed that the protein belongs to a new family of proteins with a similar core fold but with distinct features to classical PII proteins, which usually function in nitrogen metabolism pathways in bacteria. The complex structure revealed three identical c-di-AMP-binding sites per trimer with each binding site at a monomer-monomer interface. Although distinctly different from other cyclic-di-nucleotide-binding sites, as the half-binding sites are not symmetrical, the complex structure also highlighted common features for c-di-AMP-binding sites. A comparison between the apo and complex structures revealed a series of conformational changes that result in the ordering of two anti-parallel β-strands that protrude from each monomer and allowed us to propose a mechanism on how the PstA protein functions as a signaling transduction protein. PMID:25505271

  13. Structural Integrity of Proteins under Applied Bias during Solid-State Nanopore Translocation

    NASA Astrophysics Data System (ADS)

    Hasan, Mohammad R.; Khanzada, Raja Raheel; Mahmood, Mohammed A. I.; Ashfaq, Adnan; Iqbal, Samir M.

    2015-03-01

    The translocation behavior of proteins through solid-state nanopores can be used as a new way to detect and identify proteins. The ionic current through a nanopore that flows under applied bias gets perturbed when a biomolecule traverses the nanopore. It is important for a protein detection scheme to know of any changes in the three-dimensional structure of the molecule during the process. Here we report data on the structural integrity of a protein during translocation through a nanopore under different applied biases. Nanoscale Molecular Dynamics was used to establish a framework to study the changes in protein structure as the protein travelled across the nanopore. The analysis revealed the contributions of structural changes of the protein to its ionic current signature. As a model, the thrombin crystal structure was imported and positioned inside a 6 nm diameter pore in a 6 nm thick silicon nitride membrane. The protein was solvated in 1 M KCl at 295 K and the system was equilibrated for 20 ns to attain its minimum energy state. The simulation was performed at different electric fields from 0 to 1 kcal/(mol·Å·e). RMSD, radial distribution function, movement of the center of mass and velocity of the protein were calculated. The results showed linear increments in the velocity and perturbations in ionic current profile with increasing electric potential. Support from NSF through ECCS-1201878 is acknowledged.

  14. Exploring the potential of a structural alphabet-based tool for mining multiple target conformations and target flexibility insight

    PubMed Central

    Chéron, Jean-Baptiste; Triki, Dhoha; Senac, Caroline; Flatters, Delphine; Camproux, Anne-Claude

    2017-01-01

    Protein flexibility is often implied in binding with different partners and is essential for protein function. The growing number of macromolecular structures in the Protein Data Bank entries and their redundancy has become a major source of structural knowledge of the protein universe. The analysis of structural variability through available redundant structures of a target, called multiple target conformations (MTC), obtained using experimental or modeling methods and under different biological conditions or different sources is one way to explore protein flexibility. This analysis is essential to improve the understanding of various mechanisms associated with protein target function and flexibility. In this study, we explored structural variability of three biological targets by analyzing different MTC sets associated with these targets. To facilitate the study of these MTC sets, we have developed an efficient tool, SA-conf, dedicated to capturing and linking the amino acid and local structure variability and analyzing the target structural variability space. The advantage of SA-conf is that it could be applied to diverse sets composed of MTCs available in the PDB obtained using NMR and crystallography or homology models. This tool could also be applied to analyze MTC sets obtained by dynamics approaches. Our results showed that the SA-conf tool is effective at quantifying the structural variability of an MTC set and at localizing the structurally variable positions and regions of the target. By selecting adapted MTC subsets and comparing their variability detected by SA-conf, we highlighted different sources of target flexibility, such as flexibility induced by a binding partner or by mutation, and intrinsic flexibility. Our results support mining the available structures associated with a target to gain valuable insight into target flexibility and interaction mechanisms. The SA-conf executable script, with a set of pre-compiled binaries, is available at http://www.mti.univ-paris-diderot.fr/recherche/plateformes/logiciels. PMID:28817602

  15. Exploring the potential of a structural alphabet-based tool for mining multiple target conformations and target flexibility insight.

    PubMed

    Regad, Leslie; Chéron, Jean-Baptiste; Triki, Dhoha; Senac, Caroline; Flatters, Delphine; Camproux, Anne-Claude

    2017-01-01

    Protein flexibility is often implied in binding with different partners and is essential for protein function. The growing number of macromolecular structures in the Protein Data Bank entries and their redundancy has become a major source of structural knowledge of the protein universe. The analysis of structural variability through available redundant structures of a target, called multiple target conformations (MTC), obtained using experimental or modeling methods and under different biological conditions or different sources is one way to explore protein flexibility. This analysis is essential to improve the understanding of various mechanisms associated with protein target function and flexibility. In this study, we explored structural variability of three biological targets by analyzing different MTC sets associated with these targets. To facilitate the study of these MTC sets, we have developed an efficient tool, SA-conf, dedicated to capturing and linking the amino acid and local structure variability and analyzing the target structural variability space. The advantage of SA-conf is that it could be applied to diverse sets composed of MTCs available in the PDB obtained using NMR and crystallography or homology models. This tool could also be applied to analyze MTC sets obtained by dynamics approaches. Our results showed that the SA-conf tool is effective at quantifying the structural variability of an MTC set and at localizing the structurally variable positions and regions of the target. By selecting adapted MTC subsets and comparing their variability detected by SA-conf, we highlighted different sources of target flexibility, such as flexibility induced by a binding partner or by mutation, and intrinsic flexibility. Our results support mining the available structures associated with a target to gain valuable insight into target flexibility and interaction mechanisms. The SA-conf executable script, with a set of pre-compiled binaries, is available at http://www.mti.univ-paris-diderot.fr/recherche/plateformes/logiciels.

  16. Membrane remodeling by amyloidogenic and non-amyloidogenic proteins studied by EPR.

    PubMed

    Varkey, Jobin; Langen, Ralf

    2017-07-01

    The advancement in site-directed spin labeling of proteins has enabled EPR studies to expand into newer research areas within the umbrella of protein-membrane interactions. Recently, membrane remodeling by amyloidogenic and non-amyloidogenic proteins has gained substantial interest in relation to driving and controlling vital cellular processes such as endocytosis, exocytosis, shaping of organelles like endoplasmic reticulum, Golgi and mitochondria, intracellular vesicular trafficking, formation of filopodia and multivesicular bodies, mitochondrial fusion and fission, and synaptic vesicle fusion and recycling in neurotransmission. Misregulation in any of these processes due to an aberrant protein (mutation or misfolding) or alteration of lipid metabolism can be detrimental to the cell and cause disease. Dissection of the structural basis of membrane remodeling by proteins is thus necessary for an understanding of the underlying mechanisms, but it remains a formidable task because various common biophysical tools have difficulty monitoring the dynamic process of membrane binding and bending by proteins. This is largely because membranes generally complicate protein structure analysis, and this problem is amplified for structural analysis in the presence of different types of membrane curvature. Recent EPR studies on membrane remodeling by proteins show that significant structural information can be generated to delineate the role of different protein modules, domains and individual amino acids in the generation of membrane curvature. These studies also show how EPR can complement the data obtained by high resolution techniques such as X-ray and NMR. This perspective covers the application of EPR in recent studies for understanding membrane remodeling by amyloidogenic and non-amyloidogenic proteins and should be useful for researchers interested in using or complementing EPR to gain a better understanding of membrane remodeling. We also discuss how a single protein can generate different types of membrane curvature using specific conformations for specific membrane structures, and how EPR is a versatile tool well suited to analyzing subtle alterations in structures under such modifying conditions, which would otherwise have been difficult using other biophysical tools. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. An approach to functionally relevant clustering of the protein universe: Active site profile-based clustering of protein structures and sequences.

    PubMed

    Knutson, Stacy T; Westwood, Brian M; Leuthaeuser, Janelle B; Turner, Brandon E; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D; Harper, Angela F; Brown, Shoshana D; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S

    2017-04-01

    Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow identification of mechanistic determinants: amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  18. Structure and dynamics of Ebola virus matrix protein VP40 by a coarse-grained Monte Carlo simulation

    NASA Astrophysics Data System (ADS)

    Pandey, Ras; Farmer, Barry

    Ebola virus matrix protein VP40 (consisting of 326 residues) plays a critical role in viral assembly and its functions such as regulation of viral transcription, packaging, and budding of mature virions into the plasma membrane of infected cells. How VP40 evolves structurally during the viral life cycle remains an open question. Using a coarse-grained Monte Carlo simulation we investigate the structural evolution of VP40 as a function of temperature with the input of a knowledge-based residue-residue interaction. A number of local and global physical quantities (e.g. mobility profile, contact map, radius of gyration, structure factor) are analyzed with our large-scale simulations. Our preliminary data show that the structure of the protein evolves through different states with well-defined morphologies, which can be identified and quantified via a detailed analysis of the structure factor.
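
    One of the global quantities named in this record, the radius of gyration, is straightforward to compute from bead coordinates. The sketch below assumes one equal-mass bead per residue, as is common in coarse-grained models; the function name radius_of_gyration and the coordinates are hypothetical and unrelated to the authors' simulation data.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of equal-mass beads from their centroid."""
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one coordinate")
    # Centroid of the beads (equal masses assumed).
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
                  for x, y, z in coords) / n
    return math.sqrt(mean_sq)


# Hypothetical four-bead chain on a lattice (arbitrary length units).
chain = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0)]
print(radius_of_gyration(chain))
```

    Tracking this value as a function of temperature is one simple way to tell compact conformations from extended ones.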

  19. Common structural features of cholesterol binding sites in crystallized soluble proteins

    PubMed Central

    Bukiya, Anna N.; Dopico, Alejandro M.

    2017-01-01

    Cholesterol-protein interactions are essential for the architectural organization of cell membranes and for lipid metabolism. While cholesterol-sensing motifs in transmembrane proteins have been identified, little is known about cholesterol recognition by soluble proteins. We reviewed the structural characteristics of binding sites for cholesterol and cholesterol sulfate from crystallographic structures available in the Protein Data Bank. This analysis unveiled key features of cholesterol-binding sites that are present in either all or the majority of sites: i) the cholesterol molecule is generally positioned between protein domains that have an organized secondary structure; ii) the cholesterol hydroxyl/sulfo group is often partnered by Asn, Gln, and/or Tyr, while the hydrophobic part of cholesterol interacts with Leu, Ile, Val, and/or Phe; iii) cholesterol hydrogen-bonding partners are often found on α-helices, while amino acids that interact with cholesterol’s hydrophobic core have a slight preference for β-strands and secondary structure-lacking protein areas; iv) the steroid’s C21 and C26 constitute the “hot spots” most often seen for steroid-protein hydrophobic interactions; v) common “cold spots” are C8–C10, C13, and C17, at which contacts with the proteins were not detected. Several common features we identified for soluble protein-steroid interaction appear evolutionarily conserved. PMID:28420706

  20. FTIR study of secondary structure of bovine serum albumin and ovalbumin

    NASA Astrophysics Data System (ADS)

    Abrosimova, K. V.; Shulenina, O. V.; Paston, S. V.

    2016-11-01

    Protein structure is the critical factor for protein function. Fourier transform infrared spectroscopy makes it possible to obtain information about the secondary structure of proteins in different states and also in whole biological samples. Infrared spectra of egg white from untreated and hard-boiled hen's eggs, and also of chicken ovalbumin and bovine serum albumin in lyophilized form and in aqueous solution, were studied. Lyophilization of the investigated globular proteins is accompanied by a decrease in α-helix structures and an increase in the amount of intermolecular β-sheets. Analysis of the infrared spectrum of egg white allowed us to estimate the OVA secondary structure and to observe the α-to-β structural transformation resulting from heat denaturation.

  1. Using more than 801 296 small-molecule crystal structures to aid in protein structure refinement and analysis

    PubMed Central

    Cole, Jason C.

    2017-01-01

    The Cambridge Structural Database (CSD) is the worldwide resource for the dissemination of all published three-dimensional structures of small-molecule organic and metal–organic compounds. This paper briefly describes how this collection of crystal structures can be used en masse in the context of macromolecular crystallography. Examples highlight how the CSD and associated software aid protein–ligand complex validation, and show how the CSD could be further used in the generation of geometrical restraints for protein structure refinement. PMID:28291758

  2. FragBag, an accurate representation of protein structure, retrieves structural neighbors from the entire PDB quickly and accurately.

    PubMed

    Budowski-Tal, Inbal; Nov, Yuval; Kolodny, Rachel

    2010-02-23

    Fast identification of protein structures that are similar to a specified query structure in the entire Protein Data Bank (PDB) is fundamental in structure and function prediction. We present FragBag: an ultrafast and accurate method for comparing protein structures. We describe a protein structure by the collection of its overlapping short contiguous backbone segments, and discretize this set using a library of fragments. Then, we succinctly represent the protein as a "bag-of-fragments" (a vector that counts the number of occurrences of each fragment) and measure the similarity between two structures by the similarity between their vectors. Our representation has two additional benefits: (i) it can be used to construct an inverted index, for implementing a fast structural search engine of the entire PDB, and (ii) one can specify a structure as a collection of substructures, without combining them into a single structure; this is valuable for structure prediction, when there are reliable predictions only of parts of the protein. We use receiver operating characteristic curve analysis to quantify the success of FragBag in identifying neighbor candidate sets in a dataset of over 2,900 structures. The gold standard is the set of neighbors found by six state-of-the-art structural aligners. Our best FragBag library finds more accurate candidate sets than the three other filter methods: the SGM, PRIDE, and a method by Zotenko et al. More interestingly, FragBag performs on a par with the computationally expensive, yet highly trusted structural aligners STRUCTAL and CE.
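
The bag-of-fragments idea described above can be sketched in a few lines. This is a minimal illustration, not the FragBag implementation: the fragment indices, the library size of 4, and the choice of cosine similarity as the vector comparison are all assumptions made for the example (the abstract only says that vectors are compared).

```python
from collections import Counter
from math import sqrt


def bag_of_fragments(fragment_ids, library_size):
    """Return a count vector: entry i is how often library fragment i occurs."""
    counts = Counter(fragment_ids)
    return [counts.get(i, 0) for i in range(library_size)]


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical input: each integer is the index of the library fragment that
# best matches one overlapping backbone segment of the protein.
protein_a = [0, 2, 2, 3, 1, 2]
protein_b = [2, 2, 0, 3, 3, 1]

vec_a = bag_of_fragments(protein_a, library_size=4)
vec_b = bag_of_fragments(protein_b, library_size=4)
similarity = cosine_similarity(vec_a, vec_b)
```

Because the vectors ignore the order of the fragments, two structures can be compared without any alignment step, which is what makes this kind of representation usable as a fast pre-filter.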

  3. Structural Biology of Tumor Necrosis Factor Demonstrated for Undergraduates Instruction by Computer Simulation

    ERIC Educational Resources Information Center

    Roy, Urmi

    2016-01-01

    This work presents a three-dimensional (3D) modeling exercise for undergraduate students in chemistry and health sciences disciplines, focusing on a protein-group linked to immune system regulation. Specifically, the exercise involves molecular modeling and structural analysis of tumor necrosis factor (TNF) proteins, both wild type and mutant. The…

  4. Structure-Based Phylogenetic Analysis of the Lipocalin Superfamily.

    PubMed

    Lakshmi, Balasubramanian; Mishra, Madhulika; Srinivasan, Narayanaswamy; Archunan, Govindaraju

    2015-01-01

    Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However, the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand the functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study with 39 protein domains from the lipocalin superfamily suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. The detailed analysis of each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster based on their mode of ligand binding, even though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single-member families. The structure-based phylogenetic approach has also provided pointers for assigning putative functions to the domains of unknown function in the lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families whose members show poor sequence similarity but high structural similarity.

  5. Two-step relaxation mode analysis with multiple evolution times applied to all-atom molecular dynamics protein simulation.

    PubMed

    Karasawa, N; Mitsutake, A; Takano, H

    2017-12-01

    Proteins implement their functionalities when folded into specific three-dimensional structures, and their functions are related to the protein structures and dynamics. Previously, we applied a relaxation mode analysis (RMA) method to protein systems; this method approximately estimates the slow relaxation modes and times via simulation and enables investigation of the dynamic properties underlying the protein structural fluctuations. Recently, two-step RMA with multiple evolution times has been proposed and applied to a slightly complex homopolymer system, i.e., a single [n]polycatenane. This method can be applied to more complex heteropolymer systems, i.e., protein systems, to estimate the relaxation modes and times more accurately. In two-step RMA, we first perform RMA and obtain rough estimates of the relaxation modes and times. Then, we apply RMA with multiple evolution times to a small number of the slowest relaxation modes obtained in the previous calculation. Herein, we apply this method to the results of principal component analysis (PCA). First, PCA is applied to a 2-μs molecular dynamics simulation of hen egg-white lysozyme in aqueous solution. Then, the two-step RMA method with multiple evolution times is applied to the obtained principal components. The slow relaxation modes and corresponding relaxation times for the principal components are much improved by the second RMA.

  6. Two-step relaxation mode analysis with multiple evolution times applied to all-atom molecular dynamics protein simulation

    NASA Astrophysics Data System (ADS)

    Karasawa, N.; Mitsutake, A.; Takano, H.

    2017-12-01

    Proteins implement their functionalities when folded into specific three-dimensional structures, and their functions are related to the protein structures and dynamics. Previously, we applied a relaxation mode analysis (RMA) method to protein systems; this method approximately estimates the slow relaxation modes and times via simulation and enables investigation of the dynamic properties underlying the protein structural fluctuations. Recently, two-step RMA with multiple evolution times has been proposed and applied to a slightly complex homopolymer system, i.e., a single [n]polycatenane. This method can be applied to more complex heteropolymer systems, i.e., protein systems, to estimate the relaxation modes and times more accurately. In two-step RMA, we first perform RMA and obtain rough estimates of the relaxation modes and times. Then, we apply RMA with multiple evolution times to a small number of the slowest relaxation modes obtained in the previous calculation. Herein, we apply this method to the results of principal component analysis (PCA). First, PCA is applied to a 2-μs molecular dynamics simulation of hen egg-white lysozyme in aqueous solution. Then, the two-step RMA method with multiple evolution times is applied to the obtained principal components. The slow relaxation modes and corresponding relaxation times for the principal components are much improved by the second RMA.

  7. Basic Tilted Helix Bundle - a new protein fold in human FKBP25/FKBP3 and HectD1.

    PubMed

    Helander, Sara; Montecchio, Meri; Lemak, Alexander; Farès, Christophe; Almlöf, Jonas; Yi, Yanjun; Yee, Adelinda; Arrowsmith, Cheryl; DhePaganon, Sirano; Sunnerhagen, Maria

    2014-04-25

    In this paper, we describe the structure of an N-terminal domain motif in nuclear-localized FKBP25(1-73), a member of the FKBP family, together with the structure of a sequence-related subdomain of the E3 ubiquitin ligase HectD1 that we show belongs to the same fold. This motif adopts a compact 5-helix bundle which we name the Basic Tilted Helix Bundle (BTHB) domain. A positively charged surface patch, structurally centered around the tilted helix H4, is present in both FKBP25 and HectD1 and is conserved in both proteins, suggesting a conserved functional role. We provide detailed comparative analysis of the structures of the two proteins and their sequence similarities, and analysis of the interaction of the proposed FKBP25 binding protein YY1. We suggest that the basic motif in BTHB is involved in the observed DNA binding of FKBP25, and that the function of this domain can be affected by regulatory YY1 binding and/or interactions with adjacent domains. Copyright © 2014 Elsevier Inc. All rights reserved.

  8. Integrating Mass Spectrometry of Intact Protein Complexes into Structural Proteomics

    PubMed Central

    Hyung, Suk-Joon; Ruotolo, Brandon T.

    2013-01-01

    Summary Mass spectrometry analysis of intact protein complexes has emerged as an established technology for assessing the composition and connectivity within dynamic, heterogeneous multiprotein complexes at low concentrations and in the context of mixtures. As this technology continues to move forward, one of the main challenges is to integrate the information content of such intact protein complex measurements with other mass spectrometry approaches in structural biology. Methods such as H/D exchange, oxidative foot-printing, chemical cross-linking, affinity purification, and ion mobility separation add complementary information that allows access to every level of protein structure and organization. Here, we survey the structural information that can be retrieved by such experiments, demonstrate the applicability of integrative mass spectrometry approaches in structural proteomics, and look to the future to explore upcoming innovations in this rapidly-advancing area. PMID:22611037

  9. Analysis of Functional Dynamics of Modular Multidomain Proteins by SAXS and NMR.

    PubMed

    Thompson, Matthew K; Ehlinger, Aaron C; Chazin, Walter J

    2017-01-01

    Multiprotein machines drive virtually all primary cellular processes. Modular multidomain proteins are widely distributed within these dynamic complexes because they provide the flexibility needed to remodel structure as well as rapidly assemble and disassemble components of the machinery. Understanding the functional dynamics of modular multidomain proteins is a major challenge confronting structural biology today because their structure is not fixed in time. Small-angle X-ray scattering (SAXS) and nuclear magnetic resonance (NMR) spectroscopy have proven particularly useful for the analysis of the structural dynamics of modular multidomain proteins because they provide highly complementary information for characterizing the architectural landscape accessible to these proteins. SAXS provides a global snapshot of all architectural space sampled by a molecule in solution. Furthermore, SAXS is sensitive to conformational changes, organization and oligomeric states of protein assemblies, and the existence of flexibility between globular domains in multiprotein complexes. The power of NMR to characterize dynamics provides uniquely complementary information to the global snapshot of the architectural ensemble provided by SAXS because it can directly measure domain motion. In particular, NMR parameters can be used to define the diffusion of domains within modular multidomain proteins, connecting the amplitude of interdomain motion to the architectural ensemble derived from SAXS. Our laboratory has been studying the roles of modular multidomain proteins involved in human DNA replication using SAXS and NMR. Here, we present the procedure for acquiring and analyzing SAXS and NMR data, using DNA primase and replication protein A as examples. © 2017 Elsevier Inc. All rights reserved.

  10. Integration of Molecular Dynamics Based Predictions into the Optimization of De Novo Protein Designs: Limitations and Benefits.

    PubMed

    Carvalho, Henrique F; Barbosa, Arménio J M; Roque, Ana C A; Iranzo, Olga; Branco, Ricardo J F

    2017-01-01

    Recent advances in de novo protein design have gained considerable insight from the intrinsic dynamics of proteins, based on the integration of molecular dynamics simulation protocols into the state-of-the-art de novo protein design protocols in current use. With this protocol we illustrate how to set up and run a molecular dynamics simulation followed by a functional protein dynamics analysis. New users will be introduced to some useful open-source computational tools, including the GROMACS molecular dynamics simulation software package and ProDy for protein structural dynamics analysis.

  11. The costa of trichomonads: A complex macromolecular cytoskeleton structure made of uncommon proteins.

    PubMed

    de Andrade Rosa, Ivone; Caruso, Marjolly Brigido; de Oliveira Santos, Eidy; Gonzaga, Luiz; Zingali, Russolina Benedeta; de Vasconcelos, Ana Tereza R; de Souza, Wanderley; Benchimol, Marlene

    2017-06-01

    The costa is a prominent striated fibre that is found in protozoa of the Trichomonadidae family that present an undulating membrane. It is composed primarily of proteins that have not yet been explored. In this study, we used cell fractionation to obtain a highly enriched costa fraction whose structure and composition were further analysed by electron microscopy and mass spectrometry. Electron microscopy of negatively stained samples revealed that the costa, which is a periodic structure with alternating electron-dense and electron-lucent bands, displays three distinct regions, named the head, neck and body. Fourier transform analysis showed that the electron-lucent bands present sub-bands with a regular pattern. An analysis of the costa fraction via one- and two-dimensional electrophoresis and liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) allowed the identification of 54 hypothetical proteins. Fourteen of those proteins were considered to be major components of the fraction. The costa of T. foetus is a complex and organised cytoskeleton structure made of a large number of proteins which are assembled into filamentous structures. Some of these proteins exhibit uncharacterised domains and no related function according to gene ontology, suggesting that the costa structure may be formed by a new class of proteins that differ from those previously described in other organisms. Seven of these proteins contain prefoldin domains displaying coiled-coil regions. This property is shared with proteins of the striated fibres of other protozoa as well as with intermediate filaments. Our observations suggest the presence of a new class of cytoskeleton filaments in T. foetus. We believe that our data could help in determining the specific locations of these proteins in the distinct regions that compose the costa, as well as in defining the functional roles of each component. Therefore, our study will help in the better understanding of the organisation and function of this structure in unicellular organisms. © 2017 Société Française des Microscopies and Société de Biologie Cellulaire de France. Published by John Wiley & Sons Ltd.

  12. Analysis of free modeling predictions by RBO aleph in CASP11.

    PubMed

    Mabrouk, Mahmoud; Werner, Tim; Schneider, Michael; Putz, Ines; Brock, Oliver

    2016-09-01

    The CASP experiment is a biennial benchmark for assessing protein structure prediction methods. In CASP11, RBO Aleph ranked as one of the top-performing automated servers in the free modeling category. This category consists of targets for which structural templates are not easily retrievable. We analyze the performance of RBO Aleph and show that its success in CASP was a result of its ab initio structure prediction protocol. A detailed analysis of this protocol demonstrates that two components unique to our method greatly contributed to prediction quality: residue-residue contact prediction by EPC-map and contact-guided conformational space search by model-based search (MBS). Interestingly, our analysis also points to a possible fundamental problem in evaluating the performance of protein structure prediction methods: improvements in components of the method do not necessarily lead to improvements of the entire method. This points to the fact that these components interact in ways that are poorly understood. This problem, if indeed true, represents a significant obstacle to community-wide progress. Proteins 2016; 84(Suppl 1):87-104. © 2015 Wiley Periodicals, Inc.

  13. Hidden Structural Codes in Protein Intrinsic Disorder.

    PubMed

    Borkosky, Silvia S; Camporeale, Gabriela; Chemes, Lucía B; Risso, Marikena; Noval, María Gabriela; Sánchez, Ignacio E; Alonso, Leonardo G; de Prat Gay, Gonzalo

    2017-10-17

    Intrinsic disorder is a major structural category in biology, accounting for more than 30% of coding regions across the domains of life, yet consists of conformational ensembles in equilibrium, a major challenge in protein chemistry. Anciently evolved papillomavirus genomes constitute an unparalleled case for sequence to structure-function correlation in cases in which there are no folded structures. E7, the major transforming oncoprotein of human papillomaviruses, is a paradigmatic example among the intrinsically disordered proteins. Analysis of a large number of sequences of the same viral protein allowed for the identification of a handful of residues with absolute conservation, scattered along the sequence of its N-terminal intrinsically disordered domain, which intriguingly are mostly leucine residues. Mutation of these led to a pronounced increase in both α-helix and β-sheet structural content, reflected by drastic effects on equilibrium propensities and oligomerization kinetics, and uncovered the existence of local structural elements that oppose canonical folding. These folding relays suggest the existence of yet undefined hidden structural codes behind intrinsic disorder in this model protein. Thus, evolution pinpoints conformational hot spots that could not have been identified by direct experimental methods for analyzing or perturbing the equilibrium of an intrinsically disordered protein ensemble.

  14. uQlust: combining profile hashing with linear-time ranking for efficient clustering and analysis of big macromolecular data.

    PubMed

    Adamczak, Rafal; Meller, Jarek

    2016-12-28

    Advances in computing have enabled current protein and RNA structure prediction and molecular simulation methods to dramatically increase their sampling of conformational spaces. The quickly growing number of experimentally resolved structures, and databases such as the Protein Data Bank, also implies large scale structural similarity analyses to retrieve and classify macromolecular data. Consequently, the computational cost of structure comparison and clustering for large sets of macromolecular structures has become a bottleneck that necessitates further algorithmic improvements and development of efficient software solutions. uQlust is a versatile and easy-to-use tool for ultrafast ranking and clustering of macromolecular structures. uQlust makes use of structural profiles of proteins and nucleic acids, while combining a linear-time algorithm for implicit comparison of all pairs of models with profile hashing to enable efficient clustering of large data sets with a low memory footprint. In addition to ranking and clustering of large sets of models of the same protein or RNA molecule, uQlust can also be used in conjunction with fragment-based profiles in order to cluster structures of arbitrary length. For example, hierarchical clustering of the entire PDB using profile hashing can be performed on a typical laptop, thus opening an avenue for structural explorations previously limited to dedicated resources. The uQlust package is freely available under the GNU General Public License at https://github.com/uQlust . uQlust represents a drastic reduction in the computational complexity and memory requirements with respect to existing clustering and model quality assessment methods for macromolecular structure analysis, while yielding results on par with traditional approaches for both proteins and RNAs.
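
The general idea of hashing structural profiles, so that similar models land in the same bucket without any pairwise comparison, can be sketched as follows. This is an illustrative toy under stated assumptions and not the uQlust algorithm: the profile strings (per-residue secondary-structure letters), the model names, and the rule that only identical profiles share a bucket are all hypothetical.

```python
from collections import defaultdict


def cluster_by_profile(profiles):
    """Group model names by identical profile string in one pass over the data.

    A dict lookup hashes each profile, so the cost grows linearly with the
    number of models instead of quadratically as in all-pairs comparison.
    """
    buckets = defaultdict(list)
    for name, profile in profiles.items():
        buckets[profile].append(name)
    # Largest cluster first; ties keep their first-seen order (stable sort).
    return sorted(buckets.values(), key=len, reverse=True)


# Hypothetical profiles: H = helix, E = strand, C = coil, one letter per residue.
models = {
    "model1": "HHHHCCEEEE",
    "model2": "HHHHCCEEEE",
    "model3": "HHHCCCEEEE",
    "model4": "HHHHCCEEEE",
}
clusters = cluster_by_profile(models)
```

A practical tool would need a coarser key than the full string so that near-identical profiles also collide; that refinement is left out here.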

  15. 3D RNA and functional interactions from evolutionary couplings

    PubMed Central

    Weinreb, Caleb; Riesselman, Adam; Ingraham, John B.; Gross, Torsten; Sander, Chris; Marks, Debora S.

    2016-01-01

    Summary Non-coding RNAs are ubiquitous, but the discovery of new RNA gene sequences far outpaces research on their structure and functional interactions. We mine the evolutionary sequence record to derive precise information about function and structure of RNAs and RNA-protein complexes. As in protein structure prediction, we use maximum entropy global probability models of sequence co-variation to infer evolutionarily constrained nucleotide-nucleotide interactions within RNA molecules, and nucleotide-amino acid interactions in RNA-protein complexes. The predicted contacts allow all-atom blinded 3D structure prediction at good accuracy for several known RNA structures and RNA-protein complexes. For unknown structures, we predict contacts in 160 non-coding RNA families. Beyond 3D structure prediction, evolutionary couplings help identify important functional interactions, e.g., at switch points in riboswitches and at a complex nucleation site in HIV. Aided by accelerating sequence accumulation, evolutionary coupling analysis can accelerate the discovery of functional interactions and 3D structures involving RNA. PMID:27087444

  16. Meeting Report: Structural Determination of Environmentally Responsive Proteins

    PubMed Central

    Reinlib, Leslie

    2005-01-01

    The three-dimensional structure of gene products continues to be a missing lynchpin between linear genome sequences and our understanding of the normal and abnormal function of proteins and pathways. Enhanced activity in this area is likely to lead to better understanding of how discrete changes in molecular patterns and conformation underlie functional changes in protein complexes and, with it, sensitivity of an individual to an exposure. The National Institute of Environmental Health Sciences convened a workshop of experts in structural determination and environmental health to solicit advice for future research in structural resolution relative to environmentally responsive proteins and pathways. The highest priorities recommended by the workshop were to support studies of structure, analysis, control, and design of conformational and functional states at molecular resolution for environmentally responsive molecules and complexes; promote understanding of dynamics, kinetics, and ligand responses; investigate the mechanisms and steps in posttranslational modifications, protein partnering, impact of genetic polymorphisms on structure/function, and ligand interactions; and encourage integrated experimental and computational approaches. The workshop participants also saw value in improving the throughput and purity of protein samples and macromolecular assemblies; developing optimal processes for design, production, and assembly of macromolecular complexes; encouraging studies on protein–protein and macromolecular interactions; and examining assemblies of individual proteins and their functions in pathways of interest for environmental health. PMID:16263521

  17. Membrane protein separation and analysis by supercritical fluid chromatography-mass spectrometry.

    PubMed

    Zhang, Xu; Scalf, Mark; Westphall, Michael S; Smith, Lloyd M

    2008-04-01

    Membrane proteins comprise 25-30% of the human genome and play critical roles in a wide variety of important biological processes. However, their hydrophobic nature has compromised efforts at structural characterization by both X-ray crystallography and mass spectrometry. The detergents that are generally used to solubilize membrane proteins interfere with the crystallization process essential to X-ray studies and cause severe ion suppression effects that hinder mass spectrometric analysis. In this report, the use of supercritical fluid chromatography-mass spectrometry for the separation and analysis of integral membrane proteins and hydrophobic peptides is investigated. It is shown that detergents are rapidly and effectively separated from the proteins and peptides, yielding them in a state suitable for direct mass spectrometric analysis.

  18. GAP Final Technical Report 12-14-04

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Andrew J. Bordner, PhD, Senior Research Scientist

    2004-12-14

    The Genomics Annotation Platform (GAP) was designed to develop new tools for high throughput functional annotation and characterization of protein sequences and structures resulting from genomics and structural proteomics, benchmarking and application of those tools. Furthermore, this platform integrated the genomic scale sequence and structural analysis and prediction tools with the advanced structure prediction and bioinformatics environment of ICM. The development of GAP was primarily oriented towards the annotation of new biomolecular structures using both structural and sequence data. Even though the amount of protein X-ray crystal data is growing exponentially, the volume of sequence data is growing even more rapidly. This trend was exploited by leveraging the wealth of sequence data to provide functional annotation for protein structures. The additional information provided by GAP is expected to assist the majority of the commercial users of ICM, who are involved in drug discovery, in identifying promising drug targets as well as in devising strategies for the rational design of therapeutics directed at the protein of interest. The GAP also provided valuable tools for biochemistry education, and structural genomics centers. In addition, GAP incorporates many novel prediction and analysis methods not available in other molecular modeling packages. This development led to signing the first Molsoft agreement in the structural genomics annotation area with the University of Oxford Structural Genomics Center. This commercial agreement validated the Molsoft efforts under the GAP project and provided the basis for further development of the large scale functional annotation platform.

  19. Quantum-mechanics-derived 13Cα chemical shift server (CheShift) for protein structure validation

    PubMed Central

    Vila, Jorge A.; Arnautova, Yelena A.; Martin, Osvaldo A.; Scheraga, Harold A.

    2009-01-01

    A server (CheShift) has been developed to predict 13Cα chemical shifts of protein structures. It is based on the generation of 696,916 conformations as a function of the φ, ψ, ω, χ1 and χ2 torsional angles for all 20 naturally occurring amino acids. Their 13Cα chemical shifts were computed at the DFT level of theory with a small basis set and extrapolated, with an empirically-determined linear regression formula, to reproduce the values obtained with a larger basis set. Analysis of the accuracy and sensitivity of the CheShift predictions, in terms of both the correlation coefficient R and the conformational-averaged rmsd between the observed and predicted 13Cα chemical shifts, was carried out for 3 sets of conformations: (i) 36 x-ray-derived protein structures solved at 2.3 Å or better resolution, for which sets of 13Cα chemical shifts were available; (ii) 15 pairs of x-ray and NMR-derived sets of protein conformations; and (iii) a set of decoys for 3 proteins showing an rmsd with respect to the x-ray structure from which they were derived of up to 3 Å. Comparative analysis carried out with 4 popular servers, namely SHIFTS, SHIFTX, SPARTA, and PROSHIFT, for these 3 sets of conformations demonstrated that CheShift is the most sensitive server with which to detect subtle differences between protein models and, hence, to validate protein structures determined by either x-ray or NMR methods, if the observed 13Cα chemical shifts are available. CheShift is available as a web server. PMID:19805131
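
The two accuracy measures named above, the correlation coefficient R and the rmsd between observed and predicted chemical shifts, can be computed as in the sketch below. It is a plain single-conformation version (no conformational averaging, which the abstract mentions), and the shift values in ppm are made-up numbers for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rmsd(x, y):
    """Root-mean-square deviation between paired values."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical per-residue 13C-alpha shifts in ppm (not real data).
observed = [58.0, 62.0, 55.0, 60.0]
predicted = [57.0, 63.0, 56.0, 60.0]

r_value = pearson_r(observed, predicted)
shift_rmsd = rmsd(observed, predicted)
```

A high R with a low rmsd indicates that a model reproduces the observed shifts well; comparing these numbers across candidate models is the kind of validation the abstract describes.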

  20. Structure of the apo form of the catabolite control protein A (CcpA) from Bacillus megaterium with a DNA-binding domain

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singh, Rajesh Kumar; Palm, Gottfried J.; Panjikar, Santosh

    2007-04-01

    Crystal structure analysis of the apo form of catabolite control protein A reveals the three-helix bundle of the DNA-binding domain. In the crystal packing, this domain interacts with the binding site for the corepressor protein. Crystal structure determination of catabolite control protein A (CcpA) at 2.6 Å resolution reveals for the first time the structure of a full-length apo-form LacI-GalR family repressor protein. In the crystal structures of these transcription regulators, the three-helix bundle of the DNA-binding domain has only been observed in cognate DNA complexes; it has not been observed in other crystal structures owing to its mobility. In the crystal packing of apo-CcpA, the protein–protein contacts between the N-terminal three-helix bundle and the core domain consisted of interactions between the homodimers that were similar to those between the corepressor protein HPr and the CcpA N-subdomain in the ternary DNA complex. In contrast to the DNA complex, the apo-CcpA structure reveals large subdomain movements in the core, resulting in a complete loss of contacts between the N-subdomains of the homodimer.

  1. A scoring function based on solvation thermodynamics for protein structure prediction

    PubMed Central

    Du, Shiqiao; Harano, Yuichi; Kinoshita, Masahiro; Sakurai, Minoru

    2012-01-01

    We predict protein structure using our recently developed free energy function for describing protein stability, which is focused on solvation thermodynamics. The function is combined with the current most reliable sampling methods, i.e., fragment assembly (FA) and comparative modeling (CM). The prediction is tested using 11 small proteins for which high-resolution crystal structures are available. For 8 of these proteins, sequence similarities are found in the database, and the prediction is performed with CM. Fairly accurate models with average Cα root mean square deviation (RMSD) ∼ 2.0 Å are successfully obtained for all cases. For the rest of the target proteins, we perform the prediction following FA protocols. For 2 cases, we obtain predicted models with an RMSD ∼ 3.0 Å as the best-scored structures. For the other case, the RMSD remains larger than 7 Å. For all the 11 target proteins, our scoring function identifies the experimentally determined native structure as the best structure. Starting from the predicted structure, replica exchange molecular dynamics is performed to further refine the structures. However, we are unable to improve its RMSD toward the experimental structure. The exhaustive sampling by coarse-grained normal mode analysis around the native structures reveals that our function has a linear correlation with RMSDs < 3.0 Å. These results suggest that the function is quite reliable for the protein structure prediction while the sampling method remains one of the major limiting factors in it. The aspects through which the methodology could further be improved are discussed. PMID:27493529
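
For reference, the Cα RMSD quoted above is conventionally defined as follows. The notation is an assumption introduced here, since the abstract does not state the superposition procedure: $\mathbf{x}_i$ and $\mathbf{y}_i$ are the Cα positions of residue $i$ in the model and in the experimental structure, $N$ is the number of residues compared, and the minimum is taken over rigid-body rotations $R$ and translations $\mathbf{t}$.

```latex
\mathrm{RMSD} \;=\; \min_{R,\,\mathbf{t}}
\sqrt{\frac{1}{N}\sum_{i=1}^{N}
\bigl\lVert R\,\mathbf{x}_i + \mathbf{t} - \mathbf{y}_i \bigr\rVert^{2}}
```

On this scale, the reported averages of about 2.0 Å (comparative modeling) and about 3.0 Å (fragment assembly) are root-mean-square Cα displacements after optimal superposition.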

  2. Sequential protein unfolding through a carbon nanotube pore

    NASA Astrophysics Data System (ADS)

    Xu, Zhonghe; Zhang, Shuang; Weber, Jeffrey K.; Luan, Binquan; Zhou, Ruhong; Li, Jingyuan

    2016-06-01

    An assortment of biological processes, like protein degradation and the transport of proteins across membranes, depend on protein unfolding events mediated by nanopore interfaces. In this work, we exploit fully atomistic simulations of an artificial, CNT-based nanopore to investigate the nature of ubiquitin unfolding. With one end of the protein subjected to an external force, we observe non-canonical unfolding behaviour as ubiquitin is pulled through the pore opening. Secondary structural elements are sequentially detached from the protein and threaded into the nanotube; interestingly, the remaining part maintains native-like characteristics. The constraints of the nanopore interface thus facilitate the formation of stable "unfoldon" motifs above the nanotube aperture that can exist in the absence of specific native contacts with the other secondary structure. Destruction of these unfoldons gives rise to distinct force peaks in our simulations, providing us with a sensitive probe for studying the kinetics of serial unfolding events. Our detailed analysis of nanopore-mediated protein unfolding events not only provides insight into how related processes might proceed in the cell, but also serves to deepen our understanding of structural arrangements which form the basis for protein conformational stability. Electronic supplementary information (ESI) available. See DOI: 10.1039/c6nr00410e

  3. Impact of Protein-Metal Ion Interactions on the Crystallization of Silk Fibroin Protein

    NASA Astrophysics Data System (ADS)

    Hu, Xiao; Lu, Qiang; Kaplan, David; Cebe, Peggy

    2009-03-01

    Proteins can easily form bonds with a variety of metal ions, which provides many unique biological functions for the protein structures, and therefore controls the overall structural transformation of proteins. We use advanced thermal analysis methods such as temperature modulated differential scanning calorimetry and quasi-isothermal TMDSC, combined with Fourier transform infrared spectroscopy and scanning electron microscopy, to investigate the protein-metallic ion interactions in Bombyx mori silk fibroin proteins. Silk samples were mixed with different metal ions (Ca^2+, K^+, Mg^2+, Na^+, Cu^2+, Mn^2+) at different mass ratios, and compared with the physical conditions in the silkworm gland. Results show that all metallic ions can directly affect the crystallization behavior and glass transition of silk fibroin. However, different ions tend to have different structural impact, including their role as plasticizer or anti-plasticizer. Detailed studies reveal important information allowing us to better understand the natural silk spinning and crystallization process.

  4. Functional Advantages of Conserved Intrinsic Disorder in RNA-Binding Proteins.

    PubMed

    Varadi, Mihaly; Zsolyomi, Fruzsina; Guharoy, Mainak; Tompa, Peter

    2015-01-01

    Proteins form large macromolecular assemblies with RNA that govern essential molecular processes. RNA-binding proteins have often been associated with conformational flexibility, yet the extent and functional implications of their intrinsic disorder have never been fully assessed. Here, through large-scale analysis of comprehensive protein sequence and structure datasets we demonstrate the prevalence of intrinsic structural disorder in RNA-binding proteins and domains. We addressed their functionality through a quantitative description of the evolutionary conservation of disordered segments involved in binding, and investigated the structural implications of flexibility in terms of conformational stability and interface formation. We conclude that the functional role of intrinsically disordered protein segments in RNA-binding is two-fold: first, these regions establish extended, conserved electrostatic interfaces with RNAs via induced fit. Second, conformational flexibility enables them to target different RNA partners, providing multi-functionality, while also ensuring specificity. These findings emphasize the functional importance of intrinsically disordered regions in RNA-binding proteins.

  5. Charge-density analysis of a protein structure at subatomic resolution: the human aldose reductase case.

    PubMed

    Guillot, Benoît; Jelsch, Christian; Podjarny, Alberto; Lecomte, Claude

    2008-05-01

    The valence electron density of the protein human aldose reductase was analyzed at 0.66 angstroms resolution. The methodological developments in the software MoPro to adapt standard charge-density techniques from small molecules to macromolecular structures are described. The deformation electron density visible in initial residual Fourier difference maps was significantly enhanced after high-order refinement. The protein structure was refined after transfer of the experimental library multipolar atom model (ELMAM). The effects on the crystallographic statistics, on the atomic thermal displacement parameters and on the structure stereochemistry are analyzed. Constrained refinements of the transferred valence populations Pval and multipoles Plm were performed against the X-ray diffraction data on a selected substructure of the protein with low thermal motion. The resulting charge densities are of good quality, especially for chemical groups with many copies present in the polypeptide chain. To check the effect of the starting point on the result of the constrained multipolar refinement, the same charge-density refinement strategy was applied but using an initial neutral spherical atom model, i.e. without transfer from the ELMAM library. The best starting point for a protein multipolar refinement is the structure with the electron density transferred from the database. This can be assessed by the crystallographic statistical indices, including Rfree, and the quality of the static deformation electron-density maps, notably on the oxygen electron lone pairs. The analysis of the main-chain bond lengths suggests that stereochemical dictionaries would benefit from a revision based on recently determined unrestrained atomic resolution protein structures.

  6. A resource for benchmarking the usefulness of protein structure models.

    PubMed

    Carbajo, Daniel; Tramontano, Anna

    2012-08-02

    Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However, it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific applications. This paper describes a database and related software tools that allow testing of a given structure-based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess the specific threshold of accuracy required to perform the task effectively. The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by non-academics: No.

  7. The Papillomavirus E2 proteins

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McBride, Alison A., E-mail: amcbride@nih.gov

    2013-10-15

    The papillomavirus E2 proteins are pivotal to the viral life cycle and have well characterized functions in transcriptional regulation, initiation of DNA replication and partitioning the viral genome. The E2 proteins also function in vegetative DNA replication, post-transcriptional processes and possibly packaging. This review describes structural and functional aspects of the E2 proteins and their binding sites on the viral genome. It is intended to be a reference guide to this viral protein. - Highlights: • Overview of E2 protein functions. • Structural domains of the papillomavirus E2 proteins. • Analysis of E2 binding sites in different genera of papillomaviruses. • Compilation of E2 associated proteins. • Comparison of key mutations in distinct E2 functions.

  8. Conserved nucleation sites reinforce the significance of Phi value analysis in protein-folding studies.

    PubMed

    Gianni, Stefano; Jemth, Per

    2014-07-01

    The only experimental strategy to address the structure of folding transition states, the so-called Φ value analysis, relies on the synergy between site directed mutagenesis and the measurement of reaction kinetics. Despite its importance, the Φ value analysis has been often criticized and its power to pinpoint structural information has been questioned. In this hypothesis, we demonstrate that comparing the Φ values between proteins not only allows highlighting the robustness of folding pathways but also provides per se a strong validation of the method. © 2014 International Union of Biochemistry and Molecular Biology.

  9. Insights into the Antimicrobial Mechanism of Action of Human RNase6: Structural Determinants for Bacterial Cell Agglutination and Membrane Permeation

    PubMed Central

    Pulido, David; Arranz-Trullén, Javier; Prats-Ejarque, Guillem; Velázquez, Diego; Torrent, Marc; Moussaoui, Mohammed; Boix, Ester

    2016-01-01

    Human Ribonuclease 6 is a secreted protein belonging to the ribonuclease A (RNaseA) superfamily, a vertebrate specific family suggested to arise with an ancestral host defense role. Tissue distribution analysis revealed its expression in innate cell types, showing abundance in monocytes and neutrophils. Recent evidence of induction of the protein expression by bacterial infection suggested an antipathogen function in vivo. In our laboratory, the antimicrobial properties of the protein have been evaluated against Gram-negative and Gram-positive species and its mechanism of action was characterized using a membrane model. Interestingly, our results indicate that RNase6, as previously reported for RNase3, is able to specifically agglutinate Gram-negative bacteria as a main trait of its antimicrobial activity. Moreover, a side by side comparative analysis with the RN6(1–45) derived peptide highlights that the antimicrobial activity is mostly retained at the protein N-terminus. Further work by site directed mutagenesis and structural analysis has identified two residues involved in the protein antimicrobial action (Trp1 and Ile13) that are essential for the cell agglutination properties. This is the first structure-functional characterization of RNase6 antimicrobial properties, supporting its contribution to the infection focus clearance. PMID:27089320

  10. Multiscale molecular dynamics simulations of rotary motor proteins.

    PubMed

    Ekimoto, Toru; Ikeguchi, Mitsunori

    2018-04-01

    Protein functions require specific structures frequently coupled with conformational changes. The scale of the structural dynamics of proteins spans from the atomic to the molecular level. Theoretically, all-atom molecular dynamics (MD) simulation is a powerful tool to investigate protein dynamics because the MD simulation is capable of capturing conformational changes obeying the intrinsically structural features. However, to study long-timescale dynamics, efficient sampling techniques and coarse-grained (CG) approaches coupled with all-atom MD simulations, termed multiscale MD simulations, are required to overcome the timescale limitation in all-atom MD simulations. Here, we review two examples of rotary motor proteins examined using free energy landscape (FEL) analysis and CG-MD simulations. In the FEL analysis, FEL is calculated as a function of reaction coordinates, and the long-timescale dynamics corresponding to conformational changes is described as transitions on the FEL surface. Another approach is the utilization of the CG model, in which the CG parameters are tuned using the fluctuation matching methodology with all-atom MD simulations. The long-timespan dynamics is then elucidated straightforwardly by using CG-MD simulations.
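
    As a rough illustration of the FEL analysis described above, the sketch below estimates a one-dimensional free energy profile from samples of a single reaction coordinate. The abstract does not say how the FEL is estimated; the Boltzmann inversion F(x) = -k_B T ln[P(x)/P_max] used here is a common textbook estimate and is an assumption, as are the function name free_energy_profile, the bin width, the temperature and the toy samples.

```python
import math
from collections import Counter

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_profile(samples, bin_width, temperature=300.0):
    """Histogram a reaction coordinate and return {bin_center: F}.

    F = -kT * ln(p / p_max) in kcal/mol, so the most populated bin has F = 0.
    Assumption: samples are drawn from an equilibrium ensemble.
    """
    # Assign every sample to an integer bin index.
    counts = Counter(math.floor(x / bin_width) for x in samples)
    max_count = max(counts.values())
    kt = K_B * temperature
    return {
        (idx + 0.5) * bin_width: -kt * math.log(n / max_count)
        for idx, n in sorted(counts.items())
    }


# Hypothetical toy data: six samples in one basin, two in a second basin.
samples = [0.1, 0.2, 0.15, 0.3, 1.1, 1.2, 0.25, 0.05]
profile = free_energy_profile(samples, bin_width=1.0)
for center, energy in profile.items():
    print(f"x = {center:.1f}  F = {energy:.3f} kcal/mol")
```

    In this picture, a long-timescale conformational change corresponds to a transition between low-F bins; a real analysis would use two or more reaction coordinates and far more sampling than this toy example.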

  11. Genomic Organization and Molecular Analysis of Virulent Bacteriophage 2972 Infecting an Exopolysaccharide-Producing Streptococcus thermophilus Strain

    PubMed Central

    Lévesque, Céline; Duplessis, Martin; Labonté, Jessica; Labrie, Steve; Fremaux, Christophe; Tremblay, Denise; Moineau, Sylvain

    2005-01-01

    The Streptococcus thermophilus virulent pac-type phage 2972 was isolated from a yogurt made in France in 1999. It is a representative of several phages that have emerged with the industrial use of the exopolysaccharide-producing S. thermophilus strain RD534. The genome of phage 2972 has 34,704 bp with an overall G+C content of 40.15%, making it the shortest S. thermophilus phage genome analyzed so far. Forty-four open reading frames (ORFs) encoding putative proteins of 40 or more amino acids were identified, and bioinformatic analyses led to the assignment of putative functions to 23 ORFs. Comparative genomic analysis of phage 2972 with the six other sequenced S. thermophilus phage genomes confirmed that the replication module is conserved and that cos- and pac-type phages have distinct structural and packaging genes. Two group I introns were identified in the genome of 2972. They interrupted the genes coding for the putative endolysin and the terminase large subunit. Phage mRNA splicing was demonstrated for both introns, and the secondary structures were predicted. Eight structural proteins were also identified by N-terminal sequencing and/or matrix-assisted laser desorption ionization—time-of-flight mass spectrometry. Detailed analysis of the putative minor tail proteins ORF19 and ORF21 as well as the putative receptor-binding protein ORF20 showed the following interesting features: (i) ORF19 is a hybrid protein, because it displays significant identity with both pac- and cos-type phages; (ii) ORF20 is unique; and (iii) a protein similar to ORF21 of 2972 was also found in the structure of the cos-type phage DT1, indicating that this structural protein is present in both S. thermophilus phage groups. The implications of these findings for phage classification are discussed. PMID:16000821

  12. Systematic analysis of human kinase genes: a large number of genes and alternative splicing events result in functional and structural diversity

    PubMed Central

    Milanesi, Luciano; Petrillo, Mauro; Sepe, Leandra; Boccia, Angelo; D'Agostino, Nunzio; Passamano, Myriam; Di Nardo, Salvatore; Tasco, Gianluca; Casadio, Rita; Paolella, Giovanni

    2005-01-01

    Background: Protein kinases are a well defined family of proteins, characterized by the presence of a common kinase catalytic domain and playing a significant role in many important cellular processes, such as proliferation, maintenance of cell shape and apoptosis. In many members of the family, additional non-kinase domains contribute further specialization, resulting in subcellular localization, protein binding and regulation of activity, among others. About 500 genes encode members of the kinase family in the human genome, and although many of them represent well known genes, a larger number of genes code for proteins of more recent identification, or for unknown proteins identified as kinases only after computational studies. Results: A systematic in silico study performed on the human genome led to the identification of 5 genes, on chromosomes 1, 11, 13, 15 and 16 respectively, and 1 pseudogene on chromosome X; some of these genes are reported as kinases by NCBI but are absent from other databases, such as KinBase. Comparative analysis of 483 gene regions and subsequent computational analysis, aimed at identifying unannotated exons, indicates that a large number of kinases may code for alternatively spliced forms or be incorrectly annotated. An InterProScan automated analysis was performed to study domain distribution and combination in the various families. At the same time, other structural features were also added to the annotation process, including the putative presence of transmembrane alpha helices, and the cysteine propensity to participate in a disulfide bridge. Conclusion: The predicted human kinome was extended by identifying both additional genes and potential splice variants, resulting in a varied panorama where functionality may be searched at the gene and protein level. Structural analysis of kinase protein domains as defined in multiple sources, together with transmembrane alpha helix and signal peptide prediction, provides hints for function assignment. The results of the human kinome analysis are collected in the KinWeb database, available for browsing and searching over the internet, where all results from the comparative analysis and the gene structure annotation are made available, alongside the domain information. Kinases may be searched by domain combinations and the relative genes may be viewed in a graphic browser at various levels of magnification up to gene organization on the full chromosome set. PMID:16351747

  13. Mapping of ligand-binding cavities in proteins.

    PubMed

    Andersson, C David; Chen, Brian Y; Linusson, Anna

    2010-05-01

    The complex interactions between proteins and small organic molecules (ligands) are intensively studied because they play key roles in biological processes and drug activities. Here, we present a novel approach to characterize and map the ligand-binding cavities of proteins without direct geometric comparison of structures, based on Principal Component Analysis of cavity properties (related mainly to size, polarity, and charge). This approach can provide valuable information on the similarities and dissimilarities of binding cavities due to mutations, between-species differences and flexibility upon ligand binding. The presented results show that information on ligand-binding cavity variations can complement information on protein similarity obtained from sequence comparisons. The predictive aspect of the method is exemplified by successful predictions of serine proteases that were not included in the model construction. The presented strategy to compare ligand-binding cavities of related and unrelated proteins has many potential applications within protein and medicinal chemistry, for example in the characterization and mapping of "orphan structures", selection of protein structures for docking studies in structure-based design, and identification of proteins for selectivity screens in drug design programs. 2009 Wiley-Liss, Inc.

  14. Structure prediction and analysis of MxaF from obligate, facultative and restricted facultative methylobacterium.

    PubMed

    Singh, Raghvendra Pratap; Singh, Ram Nageena; Srivastava, Manish K; Srivastava, Alok Kumar; Kumar, Sudheer; Dubey, Ramesh Chandra; Sharma, Arun Kumar

    2012-01-01

    Methylobacteria are ubiquitous in the biosphere and are capable of growing on C1 compounds such as formate, formaldehyde, methanol and methylamine, as well as on a wide range of multi-carbon growth substrates such as C2, C3 and C4 compounds, owing to the methylotrophic enzyme methanol dehydrogenase (MDH). MDH performs these functions with the help of a key protein, mxaF. Unfortunately, detailed structural analysis and homology modeling of mxaF remain undefined. Hence, the objective of this research is the characterization and three-dimensional modeling of the mxaF protein from three different methylotrophs using the I-TASSER server. The predicted models were further optimized and validated with the Profile 3D, Errat, Verify3D and PROCHECK servers. The predicted and best-evaluated models have been successfully deposited in the PMDB database with PMDB IDs PM0077505, PM0077506 and PM0077507. Active site identification revealed 11, 13 and 14 putative functional site residues in the respective models. They may play a major role in protein-protein and protein-cofactor interactions. This study provides ab initio and detailed information for understanding the structure, mechanism of action and regulation of the mxaF protein.

  15. Structure prediction and analysis of MxaF from obligate, facultative and restricted facultative methylobacterium

    PubMed Central

    Singh, Raghvendra Pratap; Singh, Ram Nageena; Srivastava, Manish K; Srivastava, Alok Kumar; Kumar, Sudheer; Dubey, Ramesh Chandra; Sharma, Arun Kumar

    2012-01-01

    Methylobacteria are ubiquitous in the biosphere and are capable of growing on C1 compounds such as formate, formaldehyde, methanol and methylamine, as well as on a wide range of multi-carbon growth substrates such as C2, C3 and C4 compounds, owing to the methylotrophic enzyme methanol dehydrogenase (MDH). MDH performs these functions with the help of a key protein, mxaF. Unfortunately, detailed structural analysis and homology modeling of mxaF remain undefined. Hence, the objective of this research is the characterization and three-dimensional modeling of the mxaF protein from three different methylotrophs using the I-TASSER server. The predicted models were further optimized and validated with the Profile 3D, Errat, Verify3D and PROCHECK servers. The predicted and best-evaluated models have been successfully deposited in the PMDB database with PMDB IDs PM0077505, PM0077506 and PM0077507. Active site identification revealed 11, 13 and 14 putative functional site residues in the respective models. They may play a major role in protein-protein and protein-cofactor interactions. This study provides ab initio and detailed information for understanding the structure, mechanism of action and regulation of the mxaF protein. PMID:23275704

  16. Insight into the Structure of Amyloid Fibrils from the Analysis of Globular Proteins

    PubMed Central

    Trovato, Antonio; Chiti, Fabrizio; Maritan, Amos; Seno, Flavio

    2006-01-01

    The conversion from soluble states into cross-β fibrillar aggregates is a property shared by many different proteins and peptides and was hence conjectured to be a generic feature of polypeptide chains. Increasing evidence is now accumulating that such fibrillar assemblies are generally characterized by a parallel in-register alignment of β-strands contributed by distinct protein molecules. Here we assume a universal mechanism is responsible for β-structure formation and deduce sequence-specific interaction energies between pairs of protein fragments from a statistical analysis of the native folds of globular proteins. The derived fragment–fragment interaction was implemented within a novel algorithm, prediction of amyloid structure aggregation (PASTA), to investigate the role of sequence heterogeneity in driving specific aggregation into ordered self-propagating cross-β structures. The algorithm predicts that the parallel in-register arrangement of sequence portions that participate in the fibril cross-β core is favoured in most cases. However, the antiparallel arrangement is correctly discriminated when present in fibrils formed by short peptides. The predictions of the most aggregation-prone portions of initially unfolded polypeptide chains are also in excellent agreement with available experimental observations. These results corroborate the recent hypothesis that the amyloid structure is stabilised by the same physicochemical determinants as those operating in folded proteins. They also suggest that side chain–side chain interaction across neighbouring β-strands is a key determinant of amyloid fibril formation and of their self-propagating ability. PMID:17173479

  17. Characterization of the Bm61 of the Bombyx mori nucleopolyhedrovirus.

    PubMed

    Shen, Hongxing; Chen, Keping; Yao, Qin; Zhou, Yang

    2009-07-01

    orf61 (bm61) of Bombyx mori Nucleopolyhedrovirus (BmNPV) is a highly conserved baculovirus gene of unknown function; its conservation suggests that it plays an important role in the virus life cycle. In this study, we describe the characterization of bm61. Quantitative polymerase chain reaction (qPCR) and western blot analysis demonstrated that bm61 was expressed as a late gene. Immunofluorescence analysis by confocal microscopy showed that BM61 protein was localized on the nuclear membrane and in the intranuclear ring zone of infected cells. Structural localization of BM61 in BV and ODV by western analysis demonstrated that BM61 was a protein of both BV and ODV. In addition, our data indicated that BM61 was a late structural protein localized in the nucleus.

  18. Challenges in the Development of Functional Assays of Membrane Proteins

    PubMed Central

    Tiefenauer, Louis; Demarche, Sophie

    2012-01-01

    Lipid bilayers are natural barriers of biological cells and cellular compartments. Membrane proteins integrated in biological membranes enable vital cell functions such as signal transduction and the transport of ions or small molecules. In order to determine the activity of a protein of interest at defined conditions, the membrane protein has to be integrated into artificial lipid bilayers immobilized on a surface. For the fabrication of such biosensors expertise is required in material science, surface and analytical chemistry, molecular biology and biotechnology. Specifically, techniques are needed for structuring surfaces in the micro- and nanometer scale, chemical modification and analysis, lipid bilayer formation, protein expression, purification and solubilization, and most importantly, protein integration into engineered lipid bilayers. Electrochemical and optical methods are suitable to detect membrane activity-related signals. The importance of structural knowledge to understand membrane protein function is obvious. Presently only a few structures of membrane proteins are solved at atomic resolution. Functional assays together with known structures of individual membrane proteins will contribute to a better understanding of vital biological processes occurring at biological membranes. Such assays will be utilized in the discovery of drugs, since membrane proteins are major drug targets.

  19. Template-Based Modeling of Protein-RNA Interactions.

    PubMed

    Zheng, Jinfang; Kundrotas, Petras J; Vakser, Ilya A; Liu, Shiyong

    2016-09-01

    Protein-RNA complexes formed by specific recognition between RNA and RNA-binding proteins play an important role in biological processes. More than a thousand such proteins in humans are curated, and many novel RNA-binding proteins remain to be discovered. Due to limitations of experimental approaches, computational techniques are needed for characterization of protein-RNA interactions. Although much progress has been made, adequate methodologies reliably providing atomic resolution structural details are still lacking. Although protein-RNA free docking approaches have proved useful, template-based approaches in general provide higher quality predictions. Templates are key to building a high quality model. Sequence/structure relationships were studied based on a representative set of binary protein-RNA complexes from PDB. Several approaches were tested for pairwise target/template alignment. The analysis revealed a transition point between random and correct binding modes. The results showed that structural alignment is better than sequence alignment in identifying good templates, suitable for generating protein-RNA complexes close to the native structure, and outperforms free docking, successfully predicting complexes where free docking fails, including cases of significant conformational change upon binding. A template-based protein-RNA interaction modeling protocol, PRIME, was developed and benchmarked on a representative set of complexes.

  20. Molecular and Structural Characterization of the Tegumental 20.6-kDa Protein in Clonorchis sinensis as a Potential Druggable Target.

    PubMed

    Kim, Yu-Jung; Yoo, Won Gi; Lee, Myoung-Ro; Kang, Jung-Mi; Na, Byoung-Kuk; Cho, Shin-Hyeong; Park, Mi-Yeoun; Ju, Jung-Won

    2017-03-04

    The tegument, representing the membrane-bound outer surface of platyhelminth parasites, plays an important role for the regulation of the host immune response and parasite survival. A comprehensive understanding of tegumental proteins can provide drug candidates for use against helminth-associated diseases, such as clonorchiasis caused by the liver fluke Clonorchis sinensis. However, little is known regarding the physicochemical properties of C. sinensis teguments. In this study, a novel 20.6-kDa tegumental protein of the C. sinensis adult worm (CsTegu20.6) was identified and characterized by molecular and in silico methods. The complete coding sequence of 525 bp was derived from cDNA clones and encodes a protein of 175 amino acids. Homology search using BLASTX showed CsTegu20.6 identity ranging from 29% to 39% with previously-known tegumental proteins in C. sinensis. Domain analysis indicated the presence of a calcium-binding EF-hand domain containing a basic helix-loop-helix structure and a dynein light chain domain exhibiting a ferredoxin fold. We used a modified method to obtain the accurate tertiary structure of the CsTegu20.6 protein because of the unavailability of appropriate templates. The CsTegu20.6 protein sequence was split into two domains based on the disordered region, and then, the structure of each domain was modeled using I-TASSER. A final full-length structure was obtained by combining two structures and refining the whole structure. A refined CsTegu20.6 structure was used to identify a potential CsTegu20.6 inhibitor based on protein structure-compound interaction analysis. The recombinant proteins were expressed in Escherichia coli and purified by nickel-nitrilotriacetic acid affinity chromatography. In C. sinensis, CsTegu20.6 mRNAs were abundant in adult and metacercariae, but not in the egg. Immunohistochemistry revealed that CsTegu20.6 localized to the surface of the tegument in the adult fluke. Collectively, our results contribute to a better understanding of the structural and functional characteristics of CsTegu20.6 and homologs of flukes. One compound is proposed as a putative inhibitor of CsTegu20.6 to facilitate further studies for anthelmintics.

  1. Blood proteins analysis by Raman spectroscopy method

    NASA Astrophysics Data System (ADS)

    Artemyev, D. N.; Bratchenko, I. A.; Khristoforova, Yu. A.; Lykina, A. A.; Myakinin, O. O.; Kuzmina, T. P.; Davydkin, I. L.; Zakharov, V. P.

    2016-04-01

    This work is devoted to studying the possibility of measuring the concentration of plasma proteins (albumin, globulins) using a Raman spectroscopy setup. Blood plasma and whole blood were studied in this research. The obtained Raman spectra showed significant variation in the intensities of certain spectral bands (940, 1005, 1330, 1450 and 1650 cm^-1) for different protein fractions. Partial least squares regression analysis was used for determination of correlation coefficients. We have shown that the proposed method represents the structure and biochemical composition of major blood proteins.

  2. Maltose-neopentyl glycol (MNG) amphiphiles for solubilization, stabilization and crystallization of membrane proteins.

    PubMed

    Chae, Pil Seok; Rasmussen, Søren G F; Rana, Rohini R; Gotfryd, Kamil; Chandra, Richa; Goren, Michael A; Kruse, Andrew C; Nurva, Shailika; Loland, Claus J; Pierre, Yves; Drew, David; Popot, Jean-Luc; Picot, Daniel; Fox, Brian G; Guan, Lan; Gether, Ulrik; Byrne, Bernadette; Kobilka, Brian; Gellman, Samuel H

    2010-12-01

    The understanding of integral membrane protein (IMP) structure and function is hampered by the difficulty of handling these proteins. Aqueous solubilization, necessary for many types of biophysical analysis, generally requires a detergent to shield the large lipophilic surfaces of native IMPs. Many proteins remain difficult to study owing to a lack of suitable detergents. We introduce a class of amphiphiles, each built around a central quaternary carbon atom derived from neopentyl glycol, with hydrophilic groups derived from maltose. Representatives of this maltose-neopentyl glycol (MNG) amphiphile family show favorable behavior relative to conventional detergents, as manifested in multiple membrane protein systems, leading to enhanced structural stability and successful crystallization. MNG amphiphiles are promising tools for membrane protein science because of the ease with which they may be prepared and the facility with which their structures may be varied.

  3. Algorithm to find distant repeats in a single protein sequence

    PubMed Central

    Banerjee, Nirjhar; Sarani, Rangarajan; Ranjani, Chellamuthu Vasuki; Sowmiya, Govindaraj; Michael, Daliah; Balakrishnan, Narayanasamy; Sekar, Kanagaraj

    2008-01-01

    Distant repeats in protein sequences play an important role in various aspects of protein analysis. A careful analysis of the distant repeats would make it possible to establish a firm relation between the repeats and their function and three-dimensional structure during the evolutionary process. Further, it sheds light on the diversity of duplication during evolution. To this end, an algorithm has been developed to find all distant repeats in a protein sequence. Scores from the Point Accepted Mutation (PAM) matrix are used to identify amino acid substitutions while detecting the distant repeats. Due to the biological importance of distant repeats, the proposed algorithm will be of importance to structural biologists, molecular biologists, biochemists and researchers involved in phylogenetic and evolutionary studies. PMID:19052663
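
    The idea in this abstract can be illustrated with a rough sketch. This is not the authors' algorithm: the similarity groups below are a hypothetical stand-in for real PAM scores, and the names `GROUPS`, `score`, `find_repeats`, `window` and `min_score` are assumptions made for illustration only. The sketch compares every pair of non-overlapping fixed-length windows and reports pairs whose summed substitution score reaches a threshold.

```python
from itertools import combinations

# Hypothetical similarity groups standing in for a PAM matrix.
# Real PAM scores are NOT reproduced here.
GROUPS = ["AVLIMC", "FWY", "KRH", "DE", "STNQ", "G", "P"]
GROUP_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def score(a, b):
    """Toy substitution score: 2 for identity, 1 for same group, -1 otherwise."""
    if a == b:
        return 2
    if a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
        return 1
    return -1


def find_repeats(seq, window=5, min_score=8):
    """Return (i, j, score) for non-overlapping window pairs scoring >= min_score."""
    hits = []
    starts = range(len(seq) - window + 1)
    for i, j in combinations(starts, 2):
        if j - i < window:  # skip overlapping windows
            continue
        s = sum(score(seq[i + k], seq[j + k]) for k in range(window))
        if s >= min_score:
            hits.append((i, j, s))
    return hits


# "MKVLA" (start 0) and "MRVLA" (start 10) differ by one conservative K/R change.
print(find_repeats("MKVLAGGGGGMRVLA"))
```

    Comparing all window pairs is quadratic in sequence length, which is acceptable for a single protein sequence but would need indexing or pruning for large-scale use.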

  4. CAVER 3.0: A Tool for the Analysis of Transport Pathways in Dynamic Protein Structures

    PubMed Central

    Strnad, Ondrej; Brezovsky, Jan; Kozlikova, Barbora; Gora, Artur; Sustr, Vilem; Klvana, Martin; Medek, Petr; Biedermannova, Lada; Sochor, Jiri; Damborsky, Jiri

    2012-01-01

    Tunnels and channels facilitate the transport of small molecules, ions and water solvent in a large variety of proteins. Characteristics of individual transport pathways, including their geometry, physico-chemical properties and dynamics are instrumental for understanding of structure-function relationships of these proteins, for the design of new inhibitors and construction of improved biocatalysts. CAVER is a software tool widely used for the identification and characterization of transport pathways in static macromolecular structures. Herein we present a new version of CAVER enabling automatic analysis of tunnels and channels in large ensembles of protein conformations. CAVER 3.0 implements new algorithms for the calculation and clustering of pathways. A trajectory from a molecular dynamics simulation serves as the typical input, while detailed characteristics and summary statistics of the time evolution of individual pathways are provided in the outputs. To illustrate the capabilities of CAVER 3.0, the tool was applied for the analysis of molecular dynamics simulation of the microbial enzyme haloalkane dehalogenase DhaA. CAVER 3.0 safely identified and reliably estimated the importance of all previously published DhaA tunnels, including the tunnels closed in DhaA crystal structures. Obtained results clearly demonstrate that analysis of molecular dynamics simulation is essential for the estimation of pathway characteristics and elucidation of the structural basis of the tunnel gating. CAVER 3.0 paves the way for the study of important biochemical phenomena in the area of molecular transport, molecular recognition and enzymatic catalysis. The software is freely available as a multiplatform command-line application at http://www.caver.cz. PMID:23093919

  5. CAVER 3.0: a tool for the analysis of transport pathways in dynamic protein structures.

    PubMed

    Chovancova, Eva; Pavelka, Antonin; Benes, Petr; Strnad, Ondrej; Brezovsky, Jan; Kozlikova, Barbora; Gora, Artur; Sustr, Vilem; Klvana, Martin; Medek, Petr; Biedermannova, Lada; Sochor, Jiri; Damborsky, Jiri

    2012-01-01

    Tunnels and channels facilitate the transport of small molecules, ions and water solvent in a large variety of proteins. Characteristics of individual transport pathways, including their geometry, physico-chemical properties and dynamics are instrumental for understanding of structure-function relationships of these proteins, for the design of new inhibitors and construction of improved biocatalysts. CAVER is a software tool widely used for the identification and characterization of transport pathways in static macromolecular structures. Herein we present a new version of CAVER enabling automatic analysis of tunnels and channels in large ensembles of protein conformations. CAVER 3.0 implements new algorithms for the calculation and clustering of pathways. A trajectory from a molecular dynamics simulation serves as the typical input, while detailed characteristics and summary statistics of the time evolution of individual pathways are provided in the outputs. To illustrate the capabilities of CAVER 3.0, the tool was applied for the analysis of molecular dynamics simulation of the microbial enzyme haloalkane dehalogenase DhaA. CAVER 3.0 safely identified and reliably estimated the importance of all previously published DhaA tunnels, including the tunnels closed in DhaA crystal structures. Obtained results clearly demonstrate that analysis of molecular dynamics simulation is essential for the estimation of pathway characteristics and elucidation of the structural basis of the tunnel gating. CAVER 3.0 paves the way for the study of important biochemical phenomena in the area of molecular transport, molecular recognition and enzymatic catalysis. The software is freely available as a multiplatform command-line application at http://www.caver.cz.

  6. Structural Basis for Antifreeze Activity of Ice-binding Protein from Arctic Yeast

    PubMed Central

    Lee, Jun Hyuck; Park, Ae Kyung; Do, Hackwon; Park, Kyoung Sun; Moh, Sang Hyun; Chi, Young Min; Kim, Hak Jun

    2012-01-01

    Arctic yeast Leucosporidium sp. produces a glycosylated ice-binding protein (LeIBP) with a molecular mass of ∼25 kDa, which can lower the freezing point below the melting point once it binds to ice. LeIBP is a member of a large class of ice-binding proteins, the structures of which are unknown. Here, we report the crystal structures of non-glycosylated LeIBP and glycosylated LeIBP at 1.57- and 2.43-Å resolution, respectively. Structural analysis of the LeIBPs revealed a dimeric right-handed β-helix fold, which is composed of three parts: a large coiled structural domain, a long helix region (residues 96–115 form a long α-helix that packs along one face of the β-helix), and a C-terminal hydrophobic loop region (243PFVPAPEVV251). Unexpectedly, the C-terminal hydrophobic loop region has an extended conformation pointing away from the body of the coiled structural domain and forms intertwined dimer interactions. In addition, structural analysis of glycosylated LeIBP with sugar moieties attached to Asn185 provides a basis for interpreting previous biochemical analyses as well as the increased stability and secretion of glycosylated LeIBP. We also determined that the aligned Thr/Ser/Ala residues are critical for ice binding within the B face of LeIBP using site-directed mutagenesis. Although LeIBP has a common β-helical fold similar to that of canonical hyperactive antifreeze proteins, the ice-binding site is more complex and does not have a simple ice-binding motif. In conclusion, we could identify the ice-binding site of LeIBP and discuss differences in the ice-binding modes compared with other known antifreeze proteins and ice-binding proteins. PMID:22303017

  7. Structural and Functional Analysis of VQ Motif-Containing Proteins in Arabidopsis as Interacting Proteins of WRKY Transcription Factors

    PubMed Central

    Cheng, Yuan; Zhou, Yuan; Yang, Yan; Chi, Ying-Jun; Zhou, Jie; Chen, Jian-Ye; Wang, Fei; Fan, Baofang; Shi, Kai; Zhou, Yan-Hong; Yu, Jing-Quan; Chen, Zhixiang

    2012-01-01

    WRKY transcription factors are encoded by a large gene superfamily with a broad range of roles in plants. Recently, several groups have reported that proteins containing a short VQ (FxxxVQxLTG) motif interact with WRKY proteins. We have recently discovered that two VQ proteins from Arabidopsis (Arabidopsis thaliana), SIGMA FACTOR-INTERACTING PROTEIN1 and SIGMA FACTOR-INTERACTING PROTEIN2, act as coactivators of WRKY33 in plant defense by specifically recognizing the C-terminal WRKY domain and stimulating the DNA-binding activity of WRKY33. In this study, we have analyzed the entire family of 34 structurally divergent VQ proteins from Arabidopsis. Yeast (Saccharomyces cerevisiae) two-hybrid assays showed that Arabidopsis VQ proteins interacted specifically with the C-terminal WRKY domains of group I and the sole WRKY domains of group IIc WRKY proteins. Using site-directed mutagenesis, we identified structural features of these two closely related groups of WRKY domains that are critical for interaction with VQ proteins. Quantitative reverse transcription polymerase chain reaction revealed that expression of a majority of Arabidopsis VQ genes was responsive to pathogen infection and salicylic acid treatment. Functional analysis using both knockout mutants and overexpression lines revealed strong phenotypes in growth, development, and susceptibility to pathogen infection. Altered phenotypes were substantially enhanced through cooverexpression of genes encoding interacting VQ and WRKY proteins. These findings indicate that VQ proteins play an important role in plant growth, development, and response to environmental conditions, most likely by acting as cofactors of group I and IIc WRKY transcription factors. PMID:22535423

  8. Structural and functional analysis of VQ motif-containing proteins in Arabidopsis as interacting proteins of WRKY transcription factors.

    PubMed

    Cheng, Yuan; Zhou, Yuan; Yang, Yan; Chi, Ying-Jun; Zhou, Jie; Chen, Jian-Ye; Wang, Fei; Fan, Baofang; Shi, Kai; Zhou, Yan-Hong; Yu, Jing-Quan; Chen, Zhixiang

    2012-06-01

    WRKY transcription factors are encoded by a large gene superfamily with a broad range of roles in plants. Recently, several groups have reported that proteins containing a short VQ (FxxxVQxLTG) motif interact with WRKY proteins. We have recently discovered that two VQ proteins from Arabidopsis (Arabidopsis thaliana), SIGMA FACTOR-INTERACTING PROTEIN1 and SIGMA FACTOR-INTERACTING PROTEIN2, act as coactivators of WRKY33 in plant defense by specifically recognizing the C-terminal WRKY domain and stimulating the DNA-binding activity of WRKY33. In this study, we have analyzed the entire family of 34 structurally divergent VQ proteins from Arabidopsis. Yeast (Saccharomyces cerevisiae) two-hybrid assays showed that Arabidopsis VQ proteins interacted specifically with the C-terminal WRKY domains of group I and the sole WRKY domains of group IIc WRKY proteins. Using site-directed mutagenesis, we identified structural features of these two closely related groups of WRKY domains that are critical for interaction with VQ proteins. Quantitative reverse transcription polymerase chain reaction revealed that expression of a majority of Arabidopsis VQ genes was responsive to pathogen infection and salicylic acid treatment. Functional analysis using both knockout mutants and overexpression lines revealed strong phenotypes in growth, development, and susceptibility to pathogen infection. Altered phenotypes were substantially enhanced through cooverexpression of genes encoding interacting VQ and WRKY proteins. These findings indicate that VQ proteins play an important role in plant growth, development, and response to environmental conditions, most likely by acting as cofactors of group I and IIc WRKY transcription factors.

  9. Structural and Functional Studies of H. seropedicae RecA Protein – Insights into the Polymerization of RecA Protein as Nucleoprotein Filament

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leite, Wellington C.; Galvão, Carolina W.; Saab, Sérgio C.

    The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein. HsRecA also promotes DNA strand exchange more efficiently. The three dimensional structure of HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, that are similar to homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins are also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. In conclusion, our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament.

  10. Structural and Functional Studies of H. seropedicae RecA Protein – Insights into the Polymerization of RecA Protein as Nucleoprotein Filament

    PubMed Central

    Galvão, Carolina W.; Saab, Sérgio C.; Iulek, Jorge; Etto, Rafael M.; Steffens, Maria B. R.; Chitteni-Pattu, Sindhu; Stanage, Tyler; Keck, James L.; Cox, Michael M.

    2016-01-01

    The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein. HsRecA also promotes DNA strand exchange more efficiently. The three dimensional structure of HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, that are similar to homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins are also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. Our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament. PMID:27447485

  11. PROVAT: a tool for Voronoi tessellation analysis of protein structures and complexes.

    PubMed

    Gore, Swanand P; Burke, David F; Blundell, Tom L

    2005-08-01

    Voronoi tessellation has proved to be a useful tool in protein structure analysis. We have developed PROVAT, a versatile public domain software that enables computation and visualization of Voronoi tessellations of proteins and protein complexes. It is a set of Python scripts that integrate freely available specialized software (Qhull, Pymol etc.) into a pipeline. The calculation component of the tool computes Voronoi tessellation of a given protein system in a way described by a user-supplied XML recipe and stores resulting neighbourhood information as text files with various styles. The Python pickle file generated in the process is used by the visualization component, a Pymol plug-in, that offers a GUI to explore the tessellation visually. PROVAT source code can be downloaded from http://raven.bioc.cam.ac.uk/~swanand/Provat1, which also provides a webserver for its calculation component, documentation and examples.

  12. Local Structural Differences in Homologous Proteins: Specificities in Different SCOP Classes

    PubMed Central

    Joseph, Agnel Praveen; Valadié, Hélène; Srinivasan, Narayanaswamy; de Brevern, Alexandre G.

    2012-01-01

    The constant increase in the number of solved protein structures is of great help in understanding the basic principles behind protein folding and evolution. 3-D structural knowledge is valuable in designing and developing methods for comparison, modelling and prediction of protein structures. These approaches for structure analysis can be directly implicated in studying protein function and for drug design. The backbone of a protein structure favours certain local conformations which include α-helices, β-strands and turns. Libraries of a limited number of local conformations (Structural Alphabets) were developed in the past to obtain a useful categorization of backbone conformation. Protein Block (PB) is one such Structural Alphabet that gave a reasonable structure approximation of 0.42 Å. In this study, we use PB description of local structures to analyse conformations that are preferred sites for structural variations and insertions, among groups of related folds. This knowledge can be utilized in improving tools for structure comparison that work by analysing local structure similarities. Conformational differences between homologous proteins are known to occur often in the regions comprising turns and loops. Interestingly, these differences are found to have specific preferences depending upon the structural classes of proteins. Such class-specific preferences are mainly seen in the all-β class with changes involving short helical conformations and hairpin turns. A test carried out on a benchmark dataset also indicates that the use of knowledge on the class-specific variations can improve the performance of a PB based structure comparison approach. The preference for the indel sites also seems to be confined to a few backbone conformations involving β-turns and helix C-caps. These are mainly associated with short loops joining the regular secondary structures that mediate a reversal in the chain direction. Rare β-turns of type I’ and II’ are also identified as preferred sites for insertions. PMID:22745680

  13. Structural analysis of Bacillus pumilus phenolic acid decarboxylase, a lipocalin-fold enzyme

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Matte, Allan; Grosse, Stephan; Bergeron, Hélène

    The decarboxylation of phenolic acids, including ferulic and p-coumaric acids, to their corresponding vinyl derivatives is of importance in the flavoring and polymer industries. Here, the crystal structure of phenolic acid decarboxylase (PAD) from Bacillus pumilus strain UI-670 is reported. The enzyme is a 161-residue polypeptide that forms dimers both in the crystal and in solution. The structure of PAD as determined by X-ray crystallography revealed a β-barrel structure and two α-helices, with a cleft formed at one edge of the barrel. The PAD structure resembles those of the lipocalin-fold proteins, which often bind hydrophobic ligands. Superposition of structurally related proteins bound to their cognate ligands shows that they and PAD bind their ligands in a conserved location within the β-barrel. Analysis of the residue-conservation pattern for PAD-related sequences mapped onto the PAD structure reveals that the conservation mainly includes residues found within the hydrophobic core of the protein, defining a common lipocalin-like fold for this enzyme family. A narrow cleft containing several conserved amino acids was observed as a structural feature and a potential ligand-binding site.

  14. Protein secondary structure and stability determined by combining exoproteolysis and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry.

    PubMed

    Villanueva, Josep; Villegas, Virtudes; Querol, Enrique; Avilés, Francesc X; Serrano, Luis

    2002-09-01

    In the post-genomic era, several projects focused on the massive experimental resolution of the three-dimensional structures of all the proteins of different organisms have been initiated. Simultaneously, significant progress has been made in the ab initio prediction of protein three-dimensional structure. One of the keys to the success of such a prediction is the use of local information (i.e. secondary structure). Here we describe a new limited proteolysis methodology, based on the use of unspecific exoproteases coupled with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS), to map quickly secondary structure elements of a protein from both ends, the N- and C-termini. We show that the proteolytic patterns (mass spectra series) obtained can be interpreted in the light of the conformation and local stability of the analyzed proteins, a direct correlation being observed between the predicted and the experimentally derived protein secondary structure. Further, this methodology can be easily applied to check rapidly the folding state of a protein and characterize mutational effects on protein conformation and stability. Moreover, given global stability information, this methodology allows one to locate the protein regions of increased or decreased conformational stability. All of this can be done with a small fraction of the amount of protein required by most of the other methods for conformational analysis. Thus limited exoproteolysis, together with MALDI-TOF MS, can be a useful tool to achieve quickly the elucidation of protein structure and stability. Copyright 2002 John Wiley & Sons, Ltd.

  15. Strand-like structures and the nonstructural proteins 5, 3 and 1 are present in the nucleus of mosquito cells infected with dengue virus.

    PubMed

    Reyes-Ruiz, José M; Osuna-Ramos, Juan F; Cervantes-Salazar, Margot; Lagunes Guillen, Anel E; Chávez-Munguía, Bibiana; Salas-Benito, Juan S; Del Ángel, Rosa M

    2018-02-01

    Dengue virus (DENV) is an arbovirus, which replicates in the endoplasmic reticulum. Although the replicative cycle takes place in the cytoplasm, some viral proteins such as NS5 and C are translocated to the nucleus during infection in mosquito and mammalian cells. To localize viral proteins in DENV-infected C6/36 cells, immunofluorescence (IF) and immunoelectron microscopy (IEM) analyses were performed. Our results indicated that C, NS1, NS3 and NS5 proteins were found in the nucleus of DENV-infected C6/36 cells. Additionally, complex structures named strand-like structures (Ss) were observed in the nucleus of infected cells. Interestingly, the NS5 protein was located in these structures. Ss were absent in mock-infected cells, suggesting that DENV induces their formation in the nucleus of infected mosquito cells. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Correlation of fitness landscapes from three orthologous TIM barrels originates from sequence and structure constraints

    PubMed Central

    Chan, Yvonne H.; Venev, Sergey V.; Zeldovich, Konstantin B.; Matthews, C. Robert

    2017-01-01

    Sequence divergence of orthologous proteins enables adaptation to environmental stresses and promotes evolution of novel functions. Limits on evolution imposed by constraints on sequence and structure were explored using a model TIM barrel protein, indole-3-glycerol phosphate synthase (IGPS). Fitness effects of point mutations in three phylogenetically divergent IGPS proteins during adaptation to temperature stress were probed by auxotrophic complementation of yeast with prokaryotic, thermophilic IGPS. Analysis of beneficial mutations pointed to an unexpected, long-range allosteric pathway towards the active site of the protein. Significant correlations between the fitness landscapes of distant orthologues implicate both sequence and structure as primary forces in defining the TIM barrel fitness landscape and suggest that fitness landscapes can be translocated in sequence space. Exploration of fitness landscapes in the context of a protein fold provides a strategy for elucidating the sequence-structure-fitness relationships in other common motifs. PMID:28262665

  17. The Inner Membrane Complex Sub-compartment Proteins Critical for Replication of the Apicomplexan Parasite Toxoplasma gondii Adopt a Pleckstrin Homology Fold

    PubMed Central

    Tonkin, Michelle L.; Beck, Josh R.; Bradley, Peter J.; Boulanger, Martin J.

    2014-01-01

    Toxoplasma gondii, an apicomplexan parasite prevalent in developed nations, infects up to one-third of the human population. The success of this parasite depends on several unique structures including an inner membrane complex (IMC) that lines the interior of the plasma membrane and contains proteins important for gliding motility and replication. Of these proteins, the IMC sub-compartment proteins (ISPs) have recently been shown to play a role in asexual T. gondii daughter cell formation, yet the mechanism is unknown. Complicating mechanistic characterization of the ISPs is a lack of sequence identity with proteins of known structure or function. In support of elucidating the function of ISPs, we first determined the crystal structures of representative members TgISP1 and TgISP3 to a resolution of 2.10 and 2.32 Å, respectively. Structural analysis revealed that both ISPs adopt a pleckstrin homology fold often associated with phospholipid binding or protein-protein interactions. Substitution of basic for hydrophobic residues in the region that overlays with phospholipid binding in related pleckstrin homology domains, however, suggests that ISPs do not retain phospholipid binding activity. Consistent with this observation, biochemical assays revealed no phospholipid binding activity. Interestingly, mapping of conserved surface residues combined with crystal packing analysis indicates that TgISPs have functionally repurposed the phospholipid-binding site likely to coordinate protein partners. Recruitment of larger protein complexes may also be aided through avidity-enhanced interactions resulting from multimerization of the ISPs. Overall, we propose a model where TgISPs recruit protein partners to the IMC to ensure correct progression of daughter cell formation. PMID:24675080

  18. Rift Valley Fever Virus Structural and Nonstructural Proteins: Recombinant Protein Expression and Immunoreactivity Against Antisera from Sheep

    PubMed Central

    Faburay, Bonto; Wilson, William; McVey, D. Scott; Drolet, Barbara S.; Weingartl, Hana; Madden, Daniel; Young, Alan; Ma, Wenjun

    2013-01-01

    The Rift Valley fever virus (RVFV) encodes the structural proteins nucleoprotein (N), aminoterminal glycoprotein (Gn), carboxyterminal glycoprotein (Gc), and L protein, 78-kD, and the nonstructural proteins NSm and NSs. Using the baculovirus system, we expressed the full-length coding sequence of N, NSs, NSm, Gc, and the ectodomain of the coding sequence of the Gn glycoprotein derived from the virulent strain of RVFV ZH548. Western blot analysis using anti-His antibodies and monoclonal antibodies against Gn and N confirmed expression of the recombinant proteins, and in vitro biochemical analysis showed that the two glycoproteins, Gn and Gc, were expressed in glycosylated form. Immunoreactivity profiles of the recombinant proteins in western blot and in indirect enzyme-linked immunosorbent assay against a panel of antisera obtained from vaccinated or wild type (RVFV)-challenged sheep confirmed the results obtained with anti-His antibodies and demonstrated the suitability of the baculo-expressed antigens for diagnostic assays. In addition, these recombinant proteins could be valuable for the development of diagnostic methods that differentiate infected from vaccinated animals (DIVA). PMID:23962238

  19. Characterization of KCNE1 inside Lipodisq Nanoparticles for EPR Spectroscopic Studies of Membrane Proteins.

    PubMed

    Sahu, Indra D; Zhang, Rongfu; Dunagan, Megan M; Craig, Andrew F; Lorigan, Gary A

    2017-06-01

    EPR spectroscopic studies of membrane proteins in a physiologically relevant native membrane-bound state are extremely challenging due to the complexity observed in inhomogeneity sample preparation and dynamic motion of the spin-label. Traditionally, detergent micelles are the most widely used membrane mimetics for membrane proteins due to their smaller size and homogeneity, providing high-resolution structure analysis by solution NMR spectroscopy. However, it is often difficult to examine whether the protein structure in a micelle environment is the same as that of the respective membrane-bound state. Recently, lipodisq nanoparticles have been introduced as a potentially good membrane mimetic system for structural studies of membrane proteins. However, a detailed characterization of a spin-labeled membrane protein incorporated into lipodisq nanoparticles is still lacking. In this work, lipodisq nanoparticles were used as a membrane mimic system for probing the structural and dynamic properties of the integral membrane protein KCNE1 using site-directed spin labeling EPR spectroscopy. The characterization of spin-labeled KCNE1 incorporated into lipodisq nanoparticles was carried out using CW-EPR titration experiments for the EPR spectral line shape analysis and pulsed EPR titration experiment for the phase memory time (Tm) measurements. The CW-EPR titration experiment indicated an increase in spectral line broadening with the addition of the SMA polymer which approaches close to the rigid limit at a lipid to polymer weight ratio of 1:1, providing a clear solubilization of the protein-lipid complex. Similarly, the Tm titration experiment indicated an increase in Tm values with the addition of SMA polymer and approaches ∼2 μs at a lipid to polymer weight ratio of 1:2. Additionally, CW-EPR spectral line shape analysis was performed on six inside and six outside the membrane spin-label probes of KCNE1 in lipodisq nanoparticles. The results indicated significant differences in EPR spectral line broadening and a corresponding inverse central line width between spin-labeled KCNE1 residues located inside and outside of the membrane for lipodisq nanoparticle samples when compared to lipid vesicle samples. These results are consistent with the solution NMR structure of KCNE1. This study will be beneficial for researchers working on studying the structural and dynamic properties of membrane proteins.

  20. Principal component similarity analysis of Raman spectra to study the effects of pH, heating, and kappa-carrageenan on whey protein structure.

    PubMed

    Alizadeh-Pasdar, Nooshin; Nakai, Shuryo; Li-Chan, Eunice C Y

    2002-10-09

    Raman spectroscopy was used to elucidate structural changes of beta-lactoglobulin (BLG), whey protein isolate (WPI), and bovine serum albumin (BSA), at 15% concentration, as a function of pH (5.0, 7.0, and 9.0), heating (80 degrees C, 30 min), and presence of 0.24% kappa-carrageenan. Three data-processing techniques were used to assist in identifying significant changes in Raman spectral data. Analysis of variance showed that of 12 characteristics examined in the Raman spectra, only a few were significantly affected by pH, heating, kappa-carrageenan, and their interactions. These included amide I (1658 cm(-1)) for WPI and BLG, alpha-helix for BLG and BSA, beta-sheet for BSA, CH stretching (2880 cm(-1)) for BLG and BSA, and CH stretching (2930 cm(-1)) for BSA. Principal component analysis reduced dimensionality of the characteristics. Heating and its interaction with kappa-carrageenan were identified as the most influential in overall structure of the whey proteins, using principal component similarity analysis.

  1. Molecular Cloning and Characterization of cDNA Encoding a Putative Stress-Induced Heat-Shock Protein from Camelus dromedarius

    PubMed Central

    Elrobh, Mohamed S.; Alanazi, Mohammad S.; Khan, Wajahatullah; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Bazzi, Mohammad D.

    2011-01-01

    Heat shock proteins are ubiquitous, induced under a number of environmental and metabolic stresses, with highly conserved DNA sequences among mammalian species. Camelus dromedarius (the Arabian camel), domesticated under semi-desert environments, is well adapted to tolerate and survive severe drought and high temperatures for extended periods. This is the first report of molecular cloning and characterization of a full-length cDNA encoding a putative stress-induced heat shock HSPA6 protein (also called HSP70B′) from the Arabian camel. A full-length cDNA (2417 bp) was obtained by rapid amplification of cDNA ends (RACE) and cloned in pET-b expression vector. Sequence analysis of the HSPA6 gene showed a 1932 bp-long open reading frame encoding 643 amino acids. The complete cDNA sequence of the Arabian camel HSPA6 gene was submitted to NCBI GenBank (accession number HQ214118.1). BLAST analysis indicated that the C. dromedarius HSPA6 gene nucleotide sequence shared high similarity (77–91%) with heat shock gene nucleotide sequences of other mammals. The deduced 643 amino acid sequence (accession number ADO12067.1) showed that the predicted protein has an estimated molecular weight of 70.5 kDa with a predicted isoelectric point (pI) of 6.0. Comparative analyses of the camel HSPA6 protein sequence with other mammalian heat shock proteins (HSPs) showed high identity (80–94%). The camel HSPA6 protein structure predicted using Protein 3D structural analysis showed high similarity to human and mouse HSPs. Taken together, this study indicates that the cDNA sequence of the HSPA6 gene and its amino acid sequence and protein structure from the Arabian camel are highly conserved and have similarities with other mammalian species. PMID:21845074
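    The reported figures can be cross-checked with simple arithmetic: an open reading frame that encodes 643 amino acids and ends in a stop codon spans (643 + 1) × 3 = 1932 bp, which matches the stated ORF length. The short Python sketch below makes that check explicit. The function name `orf_length_bp` is a hypothetical helper introduced here for illustration, and the assumption that the 1932 bp figure includes the stop codon is ours, not a statement from the abstract.

```python
def orf_length_bp(n_amino_acids: int, include_stop: bool = True) -> int:
    """Length in base pairs of an ORF encoding n_amino_acids residues.

    Hypothetical helper: each residue is encoded by one 3-nt codon, and a
    complete ORF ends with one stop codon (assumed to be counted here).
    """
    codons = n_amino_acids + (1 if include_stop else 0)
    return 3 * codons


if __name__ == "__main__":
    # 643 residues plus one stop codon
    print(orf_length_bp(643))                      # 1932
    # coding codons only, without the stop codon
    print(orf_length_bp(643, include_stop=False))  # 1929
```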

  2. Interplay between Peptide Bond Geometrical Parameters in Nonglobular Structural Contexts

    PubMed Central

    Esposito, Luciana; De Simone, Alfonso; Vitagliano, Luigi

    2013-01-01

    Several investigations performed in the last two decades have unveiled that geometrical parameters of protein backbone show a remarkable variability. Although these studies have provided interesting insights into one of the basic aspects of protein structure, they have been conducted on globular and water-soluble proteins. We report here a detailed analysis of backbone geometrical parameters in nonglobular proteins/peptides. We considered membrane proteins and two distinct fibrous systems (amyloid-forming and collagen-like peptides). Present data show that in these systems the local conformation plays a major role in dictating the amplitude of the bond angle N-Cα-C and the propensity of the peptide bond to adopt planar/nonplanar states. Since the trends detected here are in line with the concept of the mutual influence of local geometry and conformation previously established for globular and water-soluble proteins, our analysis demonstrates that the interplay of backbone geometrical parameters is an intrinsic and general property of protein/peptide structures that is preserved also in nonglobular contexts. For amyloid-forming peptides significant distortions of the N-Cα-C bond angle, indicative of sterical hidden strain, may occur in correspondence with side chain interdigitation. The correlation between the dihedral angles Δω/ψ in collagen-like models may have interesting implications for triple helix stability. PMID:24455689
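    As a minimal illustration of the central quantity in this record, the N-Cα-C bond angle of a residue can be computed from the Cartesian coordinates of its three backbone atoms. The sketch below uses only the Python standard library. The function name and the toy coordinates are assumptions made for illustration and are not taken from the study.

```python
import math


def bond_angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex (e.g. N, CA, C)."""
    v1 = [a[i] - b[i] for i in range(3)]  # vector from vertex to first atom
    v2 = [c[i] - b[i] for i in range(3)]  # vector from vertex to third atom
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cos_t = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_t))


if __name__ == "__main__":
    # Toy coordinates (not real protein data): a right angle at the vertex.
    n_atom, ca_atom, c_atom = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    print(bond_angle_deg(n_atom, ca_atom, c_atom))
```

    Applied residue by residue to backbone coordinates, a function of this kind yields the per-residue angle values whose dependence on local conformation the abstract discusses.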

  3. Interplay between peptide bond geometrical parameters in nonglobular structural contexts.

    PubMed

    Esposito, Luciana; Balasco, Nicole; De Simone, Alfonso; Berisio, Rita; Vitagliano, Luigi

    2013-01-01

    Several investigations performed in the last two decades have unveiled that geometrical parameters of protein backbone show a remarkable variability. Although these studies have provided interesting insights into one of the basic aspects of protein structure, they have been conducted on globular and water-soluble proteins. We report here a detailed analysis of backbone geometrical parameters in nonglobular proteins/peptides. We considered membrane proteins and two distinct fibrous systems (amyloid-forming and collagen-like peptides). Present data show that in these systems the local conformation plays a major role in dictating the amplitude of the bond angle N-C(α)-C and the propensity of the peptide bond to adopt planar/nonplanar states. Since the trends detected here are in line with the concept of the mutual influence of local geometry and conformation previously established for globular and water-soluble proteins, our analysis demonstrates that the interplay of backbone geometrical parameters is an intrinsic and general property of protein/peptide structures that is preserved also in nonglobular contexts. For amyloid-forming peptides significant distortions of the N-C(α)-C bond angle, indicative of sterical hidden strain, may occur in correspondence with side chain interdigitation. The correlation between the dihedral angles Δω/ψ in collagen-like models may have interesting implications for triple helix stability.

  4. Towards fully automated structure-based function prediction in structural genomics: a case study.

    PubMed

    Watson, James D; Sanderson, Steve; Ezersky, Alexandra; Savchenko, Alexei; Edwards, Aled; Orengo, Christine; Joachimiak, Andrzej; Laskowski, Roman A; Thornton, Janet M

    2007-04-13

    As the global Structural Genomics projects have picked up pace, the number of structures annotated in the Protein Data Bank as hypothetical protein or unknown function has grown significantly. A major challenge now involves the development of computational methods to assign functions to these proteins accurately and automatically. As part of the Midwest Center for Structural Genomics (MCSG) we have developed a fully automated functional analysis server, ProFunc, which performs a battery of analyses on a submitted structure. The analyses combine a number of sequence-based and structure-based methods to identify functional clues. After the first stage of the Protein Structure Initiative (PSI), we review the success of the pipeline and the importance of structure-based function prediction. As a dataset, we have chosen all structures solved by the MCSG during the 5 years of the first PSI. Our analysis suggests that two of the structure-based methods are particularly successful and provide examples of local similarity that is difficult to identify using current sequence-based methods. No one method is successful in all cases, so, through the use of a number of complementary sequence and structural approaches, the ProFunc server increases the chances that at least one method will find a significant hit that can help elucidate function. Manual assessment of the results is a time-consuming process and subject to individual interpretation and human error. We present a method based on the Gene Ontology (GO) schema using GO-slims that can allow the automated assessment of hits with a success rate approaching that of expert manual assessment.

  5. Accurate protein structure modeling using sparse NMR data and homologous structure information.

    PubMed

    Thompson, James M; Sgourakis, Nikolaos G; Liu, Gaohua; Rossi, Paolo; Tang, Yuefeng; Mills, Jeffrey L; Szyperski, Thomas; Montelione, Gaetano T; Baker, David

    2012-06-19

    While information from homologous structures plays a central role in X-ray structure determination by molecular replacement, such information is rarely used in NMR structure determination because it can be incorrect, both locally and globally, when evolutionary relationships are inferred incorrectly or there has been considerable evolutionary structural divergence. Here we describe a method that allows robust modeling of protein structures of up to 225 residues by combining (1)H(N), (13)C, and (15)N backbone and (13)Cβ chemical shift data, distance restraints derived from homologous structures, and a physically realistic all-atom energy function. Accurate models are distinguished from inaccurate models generated using incorrect sequence alignments by requiring that (i) the all-atom energies of models generated using the restraints are lower than those of models generated in unrestrained calculations and (ii) the low-energy structures converge to within 2.0 Å backbone rmsd over 75% of the protein. Benchmark calculations on known structures and blind targets show that the method can accurately model protein structures, even with very remote homology information, to a backbone rmsd of 1.2-1.9 Å relative to the conventionally determined NMR ensembles and of 0.9-1.6 Å relative to X-ray structures for well-defined regions of the protein structures. This approach facilitates the accurate modeling of protein structures using backbone chemical shift data without need for side-chain resonance assignments and extensive analysis of NOESY cross-peak assignments.

  6. Objective identification of residue ranges for the superposition of protein structures

    PubMed Central

    2011-01-01

    Background: The automation of objectively selecting amino acid residue ranges for structure superpositions is important for meaningful and consistent protein structure analyses. So far there is no widely-used standard for choosing these residue ranges for experimentally determined protein structures, where the manual selection of residue ranges or the use of suboptimal criteria remain commonplace. Results: We present an automated and objective method for finding amino acid residue ranges for the superposition and analysis of protein structures, in particular for structure bundles resulting from NMR structure calculations. The method is implemented in an algorithm, CYRANGE, that yields, without protein-specific parameter adjustment, appropriate residue ranges in most commonly occurring situations, including low-precision structure bundles, multi-domain proteins, symmetric multimers, and protein complexes. Residue ranges are chosen to comprise as many residues of a protein domain that increasing their number would lead to a steep rise in the RMSD value. Residue ranges are determined by first clustering residues into domains based on the distance variance matrix, and then refining for each domain the initial choice of residues by excluding residues one by one until the relative decrease of the RMSD value becomes insignificant. A penalty for the opening of gaps favours contiguous residue ranges in order to obtain a result that is as simple as possible, but not simpler. Results are given for a set of 37 proteins and compared with those of commonly used protein structure validation packages. We also provide residue ranges for 6351 NMR structures in the Protein Data Bank. Conclusions: The CYRANGE method is capable of automatically determining residue ranges for the superposition of protein structure bundles for a large variety of protein structures. The method correctly identifies ordered regions. Global structure superpositions based on the CYRANGE residue ranges allow a clear presentation of the structure, and unnecessary small gaps within the selected ranges are absent. In the majority of cases, the residue ranges from CYRANGE contain fewer gaps and cover considerably larger parts of the sequence than those from other methods without significantly increasing the RMSD values. CYRANGE thus provides an objective and automatic method for standardizing the choice of residue ranges for the superposition of protein structures. PMID:21592348
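    The first step named in this abstract, a distance variance matrix over a structure bundle, can be sketched in a few lines. The Python example below is a simplified illustration under our own assumptions: one representative atom per residue, population variance, and toy coordinates. It is not the CYRANGE implementation and does not reproduce its clustering, RMSD-based refinement, or gap penalty.

```python
import math
from statistics import pvariance


def distance_variance_matrix(conformers):
    """Variance, across conformers, of each inter-residue distance.

    conformers: list of models; each model is a list of (x, y, z) tuples,
    one per residue (for example CA atoms), in the same residue order.
    Residue pairs inside a rigid domain give values near zero.
    """
    n = len(conformers[0])
    dvm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dists = [math.dist(model[i], model[j]) for model in conformers]
            dvm[i][j] = dvm[j][i] = pvariance(dists)
    return dvm


if __name__ == "__main__":
    # Toy bundle: residues 0 and 1 keep their distance, residue 2 moves.
    bundle = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    ]
    for row in distance_variance_matrix(bundle):
        print(row)
```

    Because inter-residue distances do not depend on how the conformers are superimposed, a matrix like this can be computed before any residue range has been chosen, which is presumably why it is a convenient starting point for domain clustering.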

  7. Intermolecular detergent-membrane protein noes for the characterization of the dynamics of membrane protein-detergent complexes.

    PubMed

    Eichmann, Cédric; Orts, Julien; Tzitzilonis, Christos; Vögeli, Beat; Smrt, Sean; Lorieau, Justin; Riek, Roland

    2014-12-11

    The interaction between membrane proteins and lipids or lipid mimetics such as detergents is key for the three-dimensional structure and dynamics of membrane proteins. In NMR-based structural studies of membrane proteins, qualitative analysis of intermolecular nuclear Overhauser enhancements (NOEs) or paramagnetic resonance enhancement are used in general to identify the transmembrane segments of a membrane protein. Here, we employed a quantitative characterization of intermolecular NOEs between (1)H of the detergent and (1)H(N) of (2)H-perdeuterated, (15)N-labeled α-helical membrane protein-detergent complexes following the exact NOE (eNOE) approach. Structural considerations suggest that these intermolecular NOEs should show a helical-wheel-type behavior along a transmembrane helix or a membrane-attached helix within a membrane protein as experimentally demonstrated for the complete influenza hemagglutinin fusion domain HAfp23. The partial absence of such a NOE pattern along the amino acid sequence as shown for a truncated variant of HAfp23 and for the Escherichia coli inner membrane protein YidH indicates the presence of large tertiary structure fluctuations such as an opening between helices or the presence of large rotational dynamics of the helices. Detergent-protein NOEs thus appear to be a straightforward probe for a qualitative characterization of structural and dynamical properties of membrane proteins embedded in detergent micelles.

  8. Integration of electrochemistry with ultra-performance liquid chromatography/mass spectrometry.

    PubMed

    Cai, Yi; Zheng, Qiuling; Liu, Yong; Helmy, Roy; Loo, Joseph A; Chen, Hao

    2015-01-01

    This study presents the development of ultra-performance liquid chromatography (UPLC) mass spectrometry (MS) combined with electrochemistry (EC) for the first time and its application for the structural analysis of proteins/peptides that contain disulfide bonds. In our approach, a protein/peptide mixture sample undergoes a fast UPLC separation and subsequent electrochemical reduction in an electrochemical flow cell followed by online MS and tandem mass spectrometry (MS/MS) analyses. The electrochemical cell is coupled to the mass spectrometer using our recently developed desorption electrospray ionization (DESI) interface. Using this UPLC/EC/DESI-MS method, peptides that contain disulfide bonds can be differentiated from those without disulfide bonds, as the former are electroactive and reducible. MS/MS analysis of the disulfide-reduced peptide ions provides increased information on the sequence and disulfide-linkage pattern. In a reactive DESI-MS detection experiment in which a supercharging reagent was used to dope the DESI spray solvent, increased charging was obtained for the UPLC-separated proteins. Strikingly, upon online electrolytic reduction, supercharged proteins (e.g., α-lactalbumin) showed even higher charging, which will be useful in top-down protein structure MS analysis as increased charges are known to promote protein ion dissociation. Also, the separation speed and sensitivity are enhanced by approximately 1-2 orders of magnitude by using UPLC for the liquid chromatography (LC)/EC/MS platform, in comparison to the previously used high-performance liquid chromatography (HPLC). This UPLC/EC/DESI-MS method combines the power of fast UPLC separation, fast electrochemical conversion, and online MS structural analysis for a potentially valuable tool for proteomics research and bioanalysis.

  9. Integration of Electrochemistry with Ultra Performance Liquid Chromatography/Mass Spectrometry (UPLC/MS)

    PubMed Central

    Cai, Yi; Zheng, Qiuling; Liu, Yong; Helmy, Roy; Loo, Joseph A.; Chen, Hao

    2015-01-01

    This study presents the development of ultra-performance liquid chromatography/mass spectrometry (UPLC/MS) combined with electrochemistry (EC) for the first time and its application for the structural analysis of disulfide bond-containing proteins/peptides. In our approach, a protein/peptide mixture sample undergoes fast UPLC separation and subsequent electrochemical reduction in an electrochemical flow cell followed by online MS and MS/MS analyses. The electrochemical cell is coupled to MS using our recently developed desorption electrospray ionization (DESI) interface. Using this UPLC/EC/DESI-MS method, disulfide bond-containing peptides can be differentiated from those without disulfide bonds as the former are electroactive and reducible. Tandem MS analysis of the disulfide-reduced peptide ions provides increased sequence and disulfide linkage pattern information. In a reactive DESI-MS detection experiment in which a supercharging reagent was used to dope the DESI spray solvent, increased charging was obtained for the UPLC-separated proteins. Strikingly, upon online electrolytic reduction, supercharged proteins (e.g., α-lactalbumin) showed even higher charging, which would be useful in top-down protein structure analysis as increased charges are known to promote protein ion dissociation. Also, the separation speed and sensitivity are enhanced by approximately 1~2 orders of magnitude by using UPLC for the LC/EC/MS platform, in comparison to the previously used high performance liquid chromatography (HPLC). This UPLC/EC/DESI-MS method combines the power of fast UPLC separation, fast electrochemical conversion and online MS structural analysis for a potentially valuable tool for proteomics research and bioanalysis. PMID:26307715

  10. Genome-wide analysis of the homeodomain-leucine zipper (HD-ZIP) gene family in peach (Prunus persica).

    PubMed

    Zhang, C H; Ma, R J; Shen, Z J; Sun, X; Korir, N K; Yu, M L

    2014-04-08

    In this study, 33 homeodomain-leucine zipper (HD-ZIP) genes were identified in peach using the HD-ZIP amino acid sequences of Arabidopsis thaliana as a probe. Based on the phylogenetic analysis and the individual gene or protein characteristics, the HD-ZIP gene family in peach can be classified into 4 subfamilies, HD-ZIP I, II, III, and IV, containing 14, 7, 4, and 8 members, respectively. The most closely related peach HD-ZIP members within the same subfamilies shared very similar gene structure in terms of either intron/exon numbers or lengths. Almost all members of the same subfamily shared common motif compositions, thereby implying that the HD-ZIP proteins within the same subfamily may have functional similarity. The 33 peach HD-ZIP genes were distributed across scaffolds 1 to 7. Although the primary structure varied among HD-ZIP family proteins, their tertiary structures were similar. The results from this study will be useful in selecting candidate genes from specific subfamilies for functional analysis.

  11. Effects of urea induced protein conformational changes on ion exchange chromatographic behavior.

    PubMed

    Hou, Ying; Hansen, Thomas B; Staby, Arne; Cramer, Steven M

    2010-11-19

    Urea is widely employed to facilitate protein separations in ion exchange chromatography at various scales. In this work, five model proteins were used to examine the chromatographic effects of protein conformational changes induced by urea in ion exchange chromatography. Linear gradient experiments were carried out at various urea concentrations and the protein secondary and tertiary structures were evaluated by far UV CD and fluorescence measurements, respectively. The results indicated that chromatographic retention times were well correlated with structural changes and that they were more sensitive to tertiary structural change. Steric Mass Action (SMA) isotherm parameters were also examined and the results indicated that urea induced protein conformational changes could affect both the characteristic charge and equilibrium constants in these systems. Dynamic light scattering analysis of changes in protein size due to urea-induced unfolding indicated that the size of the protein was not correlated with SMA parameter changes. These results indicate that while urea-induced structural changes can have a marked effect on protein chromatographic behavior in IEX, this behavior can be quite complicated and protein specific. These differences in protein behavior may provide insight into how these partially unfolded proteins are interacting with the resin material. Copyright © 2010 Elsevier B.V. All rights reserved.

  12. Alanine and proline content modulate global sensitivity to discrete perturbations in disordered proteins.

    PubMed

    Perez, Romel B; Tischer, Alexander; Auton, Matthew; Whitten, Steven T

    2014-12-01

    Molecular transduction of biological signals is understood primarily in terms of the cooperative structural transitions of protein macromolecules, providing a mechanism through which discrete local structure perturbations affect global macromolecular properties. The recognition that proteins lacking tertiary stability, commonly referred to as intrinsically disordered proteins (IDPs), mediate key signaling pathways suggests that protein structures without cooperative intramolecular interactions may also have the ability to couple local and global structure changes. Presented here are results from experiments that measured and tested the ability of disordered proteins to couple local changes in structure to global changes in structure. Using the intrinsically disordered N-terminal region of the p53 protein as an experimental model, a set of proline (PRO) and alanine (ALA) to glycine (GLY) substitution variants were designed to modulate backbone conformational propensities without introducing non-native intramolecular interactions. The hydrodynamic radius (R(h)) was used to monitor changes in global structure. Circular dichroism spectroscopy showed that the GLY substitutions decreased polyproline II (PP(II)) propensities relative to the wild type, as expected, and fluorescence methods indicated that substitution-induced changes in R(h) were not associated with folding. The experiments showed that changes in local PP(II) structure cause changes in R(h) that are variable and that depend on the intrinsic chain propensities of PRO and ALA residues, demonstrating a mechanism for coupling local and global structure changes. Molecular simulations that model our results were used to extend the analysis to other proteins and illustrate the generality of the observed PRO and alanine effects on the structures of IDPs. © 2014 Wiley Periodicals, Inc.

  13. Atomic Force Microscopy Analysis of the Role of Major DNA-Binding Proteins in Organization of the Nucleoid in Escherichia coli

    PubMed Central

    Ohniwa, Ryosuke L.; Muchaku, Hiroki; Saito, Shinji; Wada, Chieko; Morikawa, Kazuya

    2013-01-01

    Bacterial genomic DNA is packed within the nucleoid of the cell along with various proteins and RNAs. We previously showed that the nucleoid in log phase cells consists of fibrous structures with diameters ranging from 30 to 80 nm, and that these structures, upon RNase A treatment, are converted into homogeneous thinner fibers with a diameter of 10 nm. In this study, we investigated the role of major DNA-binding proteins in nucleoid organization by analyzing the nucleoid of mutant Escherichia coli strains lacking HU, IHF, H-NS, StpA, Fis, or Hfq using atomic force microscopy. Deletion of particular DNA-binding protein genes altered the nucleoid structure in different ways, but did not release the naked DNA even after the treatment with RNase A. This suggests that major DNA-binding proteins are involved in the formation of higher order structure once the 10-nm fiber structure is built up from naked DNA. PMID:23951337

  14. Insights into Fanconi Anaemia from the structure of human FANCE

    PubMed Central

    Nookala, Ravi K.; Hussain, Shobbir; Pellegrini, Luca

    2007-01-01

    Fanconi Anaemia (FA) is a cancer predisposition disorder characterized by spontaneous chromosome breakage and high cellular sensitivity to genotoxic agents. In response to DNA damage, a multi-subunit assembly of FA proteins, the FA core complex, monoubiquitinates the downstream FANCD2 protein. The FANCE protein plays an essential role in the FA process of DNA repair as the FANCD2-binding component of the FA core complex. Here we report a crystallographic and biological study of human FANCE. The first structure of a FA protein reveals the presence of a repeated helical motif that provides a template for the structural rationalization of other proteins defective in Fanconi Anaemia. The portion of FANCE defined by our crystallographic analysis is sufficient for interaction with FANCD2, yielding structural information into the mode of FANCD2 recruitment to the FA core complex. Disease-associated mutations disrupt the FANCE–FANCD2 interaction, providing structural insight into the molecular mechanisms of FA pathogenesis. PMID:17308347

  15. New assessment of a structural alphabet

    PubMed Central

    de Brevern, Alexandre G.

    2005-01-01

    Summary: A statistical analysis of Protein Data Bank (PDB) structures led us to define a set of small 3D structural prototypes called Protein Blocks (PBs). This structural alphabet includes 16 PBs, each one defined by the (Φ, Ψ) dihedral angles of 5 consecutive residues. Here, we analyze the effect of the enlargement of the PDB on the PBs’ definition. The results highlight the quality of the 3D approximation ensured by the PBs. The latter could be of great interest in ab initio modeling. PMID:15996119
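    Since each Protein Block is defined by backbone dihedral angles, the basic building blocks of such an alphabet are a dihedral-angle calculation and a way to compare two sets of angles while respecting the 360° periodicity. The Python sketch below shows both. The function names are hypothetical, the use of an angular root-mean-square deviation as the comparison measure is an assumption made here for illustration, and no PB prototype values are included because the abstract does not list them.

```python
import math


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle (degrees, in (-180, 180]) defined by four points."""
    b0 = [p0[i] - p1[i] for i in range(3)]
    b1 = [p2[i] - p1[i] for i in range(3)]
    b2 = [p3[i] - p2[i] for i in range(3)]
    norm = math.sqrt(sum(x * x for x in b1))
    b1 = [x / norm for x in b1]  # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond.
    d0 = sum(x * y for x, y in zip(b0, b1))
    d2 = sum(x * y for x, y in zip(b2, b1))
    v = [b0[i] - d0 * b1[i] for i in range(3)]
    w = [b2[i] - d2 * b1[i] for i in range(3)]
    cross = [
        b1[1] * v[2] - b1[2] * v[1],
        b1[2] * v[0] - b1[0] * v[2],
        b1[0] * v[1] - b1[1] * v[0],
    ]
    x = sum(a * b for a, b in zip(v, w))
    y = sum(a * b for a, b in zip(cross, w))
    return math.degrees(math.atan2(y, x))


def angular_rmsd(angles_a, angles_b):
    """RMS deviation between two equal-length angle lists, with wrap-around."""
    diffs = [((a - b + 180.0) % 360.0) - 180.0 for a, b in zip(angles_a, angles_b)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


if __name__ == "__main__":
    # Toy geometry (not real protein data): a planar trans arrangement.
    print(abs(dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))))
    # 179 and -179 degrees are only 2 degrees apart on the circle.
    print(angular_rmsd([179.0], [-179.0]))
```

    With real prototypes in hand, a local backbone fragment would be assigned to the block whose angle vector gives the smallest deviation. That assignment step is left out here because it depends on values the abstract does not provide.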

  16. Structural analysis of a highly glycosylated and unliganded gp120-based antigen using mass spectrometry

    PubMed Central

    Wang, Liwen; Qin, Yali; Ilchenko, Serguei; Bohon, Jen; Shi, Wuxian; Cho, Michael W.; Takamoto, Keiji; Chance, Mark R.

    2010-01-01

    Structural characterization of the HIV envelope protein gp120 is very important to provide an understanding of the protein's immunogenicity and its binding to cell receptors. So far, crystallographic structure determination of gp120 with an intact V3 loop (in the absence of CD4 co-receptor or antibody) has not been achieved. The third variable region (V3) of the gp120 is immunodominant and contains glycosylation signatures that are essential for co-receptor binding and viral entry to T-cells. In this study, we characterized the structure of the outer domain of gp120 with an intact V3 loop (gp120-OD8) purified from Drosophila S2 cells utilizing mass spectrometry-based approaches. We mapped the glycosylation sites and calculated glycosylation occupancy of gp120-OD8; eleven sites from fifteen glycosylation motifs were determined as having high mannose or hybrid glycosylation structures. The specific glycan moieties of nine glycosylation sites from eight unique glycopeptides were determined by a combination of ECD and CID MS approaches. Hydroxyl radical-mediated protein footprinting coupled with mass spectrometry analysis was employed to provide detailed information on protein structure of gp120-OD8 by directly identifying accessible and hydroxyl radical-reactive side chain residues. Comparison of gp120-OD8 experimental footprinting data with a homology model derived from the ligated CD4/gp120-OD8 crystal structure revealed a flexible V3 loop structure where the V3 tip may provide contacts with the rest of the protein while residues in the V3 base remain solvent accessible. In addition, the data illustrate interactions between specific sugar moieties and amino acid side chains potentially important to the gp120-OD8 structure. PMID:20825246

  17. Common Evolutionary Origin for the Rotor Domain of Rotary Atpases and Flagellar Protein Export Apparatus

    PubMed Central

    Kishikawa, Jun-ichi; Ibuki, Tatsuya; Nakamura, Shuichi; Nakanishi, Astuko; Minamino, Tohru; Miyata, Tomoko; Namba, Keiichi; Konno, Hiroki; Ueno, Hiroshi; Imada, Katsumi; Yokoyama, Ken

    2013-01-01

    The V1- and F1- rotary ATPases contain a rotor that rotates against a catalytic A3B3 or α3β3 stator. The rotor F1-γ or V1-DF is composed of both anti-parallel coiled coil and globular-loop parts. The bacterial flagellar type III export apparatus contains a V1/F1-like ATPase ring structure composed of FliI6 homo-hexamer and FliJ which adopts an anti-parallel coiled coil structure without the globular-loop part. Here we report that FliJ of Salmonella enterica serovar Typhimurium shows a rotor like function in Thermus thermophilus A3B3 based on both biochemical and structural analysis. Single molecular analysis indicates that an anti-parallel coiled-coil structure protein (FliJ structure protein) functions as a rotor in A3B3. A rotary ATPase possessing an F1-γ-like protein generated by fusion of the D and F subunits of V1 rotates, suggesting F1-γ could be the result of a fusion of the genes encoding two separate rotor subunits. Together with sequence comparison among the globular part proteins, the data strongly suggest that the rotor domains of the rotary ATPases and the flagellar export apparatus share a common evolutionary origin. PMID:23724081

  18. A structural analysis of the AAA+ domains in Saccharomyces cerevisiae cytoplasmic dynein.

    PubMed

    Gleave, Emma S; Schmidt, Helgo; Carter, Andrew P

    2014-06-01

    Dyneins are large protein complexes that act as microtubule based molecular motors. The dynein heavy chain contains a motor domain which is a member of the AAA+ protein family (ATPases Associated with diverse cellular Activities). Proteins of the AAA+ family show a diverse range of functionalities, but share a related core AAA+ domain, which often assembles into hexameric rings. Dynein is unusual because it has all six AAA+ domains linked together, in one long polypeptide. The dynein motor domain generates movement by coupling ATP driven conformational changes in the AAA+ ring to the swing of a motile element called the linker. Dynein binds to its microtubule track via a long antiparallel coiled-coil stalk that emanates from the AAA+ ring. Recently the first high resolution structures of the dynein motor domain were published. Here we provide a detailed structural analysis of the six AAA+ domains using our Saccharomyces cerevisiae crystal structure. We describe how structural similarities in the dynein AAA+ domains suggest they share a common evolutionary origin. We analyse how the different AAA+ domains have diverged from each other. We discuss how this is related to the function of dynein as a motor protein and how the AAA+ domains of dynein compare to those of other AAA+ proteins. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  19. Theory and Normal Mode Analysis of Change in Protein Vibrational Dynamics on Ligand Binding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moritsugu, Kei; Njunda, Brigitte; Smith, Jeremy C

    2009-12-01

    The change of protein vibrations on ligand binding is of functional and thermodynamic importance. Here, this process is characterized using a simple analytical 'ball-and-spring' model and all-atom normal-mode analysis (NMA) of the binding of the cancer drug, methotrexate (MTX) to its target, dihydrofolate reductase (DHFR). The analytical model predicts that the coupling between protein vibrations and ligand external motion generates entropy-rich, low-frequency vibrations in the complex. This is consistent with the atomistic NMA which reveals vibrational softening in forming the DHFR-MTX complex, a result also in qualitative agreement with neutron-scattering experiments. Energy minimization of the atomistic bound-state (B) structure while gradually decreasing the ligand interaction to zero allows the generation of a hypothetical 'intermediate' (I) state, without the ligand force field but with a structure similar to that of B. In going from I to B, it is found that the vibrational entropies of both the protein and MTX decrease while the complex structure becomes enthalpically stabilized. However, the relatively weak DHFR:MTX interaction energy results in the net entropy gain arising from coupling between the protein and MTX external motion being larger than the loss of vibrational entropy on complex formation. This, together with the I structure being more flexible than the unbound structure, results in the observed vibrational softening on ligand binding.

  20. Slow dynamics in protein fluctuations revealed by time-structure based independent component analysis: The case of domain motions

    NASA Astrophysics Data System (ADS)

    Naritomi, Yusuke; Fuchigami, Sotaro

    2011-02-01

    Protein dynamics on a long time scale was investigated using all-atom molecular dynamics (MD) simulation and time-structure based independent component analysis (tICA). We selected the lysine-, arginine-, ornithine-binding protein (LAO) as a target protein and focused on its domain motions in the open state. A MD simulation of the LAO in explicit water was performed for 600 ns, in which slow and large-amplitude domain motions of the LAO were observed. After extracting domain motions by rigid-body domain analysis, the tICA was applied to the obtained rigid-body trajectory, yielding slow modes of the LAO's domain motions in order of decreasing time scale. The slowest mode detected by the tICA represented not a closure motion described by a largest-amplitude mode determined by the principal component analysis but a twist motion with a time scale of tens of nanoseconds. The slow dynamics of the LAO were well described by only the slowest mode and were characterized by transitions between two basins. The results show that tICA is promising for describing and analyzing slow dynamics of proteins.
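    The contrast drawn in this abstract between the slowest mode (found by tICA) and the largest-amplitude mode (found by principal component analysis) rests on one quantity: the autocorrelation of a coordinate at a chosen lag time. tICA looks for linear combinations of coordinates that maximize it. The Python sketch below computes that quantity for a single coordinate using synthetic signals. It is an illustration of the underlying idea only, not an implementation of tICA, which solves a generalized eigenvalue problem over many coordinates. The function name and the toy signals are assumptions.

```python
def lagged_autocorrelation(x, lag):
    """Normalized autocorrelation of the series x at the given lag (in frames)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    # Covariance between the series and a copy of itself shifted by `lag`.
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / (n - lag)
    return cov / var


if __name__ == "__main__":
    # Synthetic coordinates with identical variance but different time scales:
    slow = [0.0] * 50 + [1.0] * 50  # a single transition between two basins
    fast = [0.0, 1.0] * 50          # flips at every frame
    print(lagged_autocorrelation(slow, 1))
    print(lagged_autocorrelation(fast, 1))
```

    Both toy signals have the same amplitude, so a variance-based ranking cannot separate them, whereas the lagged autocorrelation clearly favors the slow two-basin coordinate.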
